LoFP / testing updates to compliance policies

Techniques

Sample rules

Unusual High Denied Sensitive Information Policy Blocks Detected

Description

Detects users who repeatedly trigger compliance violation ‘BLOCKED’ actions under the ‘sensitive_information_policy’ policy (more than five per user), which may indicate persistent misuse or attempts to probe the model’s sensitive information filters.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-valued policy name field
| mv_expand gen_ai.policy.name

// Filter for blocked actions related to sensitive info policy
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "sensitive_information_policy"

// keep only relevant fields
| keep user.id

// count how many times each user triggered a sensitive info block
| stats
    Esql.ml_policy_blocked_sensitive_info_count = count()
  by user.id

// Filter for users with more than 5 violations
| where Esql.ml_policy_blocked_sensitive_info_count > 5

// sort highest to lowest
| sort Esql.ml_policy_blocked_sensitive_info_count desc
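
The counting logic of this rule can be traced outside Elasticsearch with a small Python sketch. It is an illustration only, not the rule itself. It assumes events arrive as flat dictionaries keyed by the field names used in the query and that `gen_ai.policy.action` is single-valued. The helper names `expand` and `users_over_threshold` and the sample users are hypothetical.

```python
from collections import Counter


def expand(event, field):
    """Mimic mv_expand: emit one row per value of a multi-valued field."""
    value = event.get(field)
    values = value if isinstance(value, list) else [value]
    return [{**event, field: v} for v in values]


def users_over_threshold(events, policy_name, threshold=5):
    """Count blocked violations per user and keep users above the threshold."""
    counts = Counter()
    for event in events:
        for row in expand(event, "gen_ai.policy.name"):
            if (
                row.get("gen_ai.policy.action") == "BLOCKED"
                and row.get("gen_ai.compliance.violation_detected") == "true"
                and row.get("gen_ai.policy.name") == policy_name
            ):
                counts[row.get("user.id")] += 1
    # Strictly greater than the threshold, as in `where ... > 5`.
    flagged = [(user, n) for user, n in counts.items() if n > threshold]
    # Highest count first, as in `sort ... desc`.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Invented sample data: alice is blocked six times, bob exactly five times.
blocked = {
    "gen_ai.policy.action": "BLOCKED",
    "gen_ai.compliance.violation_detected": "true",
}
events = (
    [{**blocked, "user.id": "alice",
      "gen_ai.policy.name": ["sensitive_information_policy", "word_policy"]}] * 6
    + [{**blocked, "user.id": "bob",
        "gen_ai.policy.name": "sensitive_information_policy"}] * 5
)

print(users_over_threshold(events, "sensitive_information_policy"))
```

Because the comparison is strict, a user with exactly five blocks is not reported, while six blocked prompts are enough to raise the alert. That is why someone testing updates to compliance policies can plausibly trigger it. Calling the same function with "word_policy" or "topic_policy" mirrors the other single-policy rules on this page.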

Unusual High Confidence Content Filter Blocks Detected

Description

Detects repeated high-confidence ‘BLOCKED’ actions under the ‘Content Filter’ policy with violation codes such as ‘MISCONDUCT’, ‘HATE’, ‘SEXUAL’, ‘INSULTS’, ‘PROMPT_ATTACK’, or ‘VIOLENCE’, indicating persistent misuse or attempts to probe the model’s ethical boundaries.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-value fields
| mv_expand gen_ai.compliance.violation_code
| mv_expand gen_ai.policy.confidence
| mv_expand gen_ai.policy.name

// Filter for high-confidence content policy blocks with targeted violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.policy.name == "content_policy"
  and gen_ai.policy.confidence like "HIGH"
  and gen_ai.compliance.violation_code in ("HATE", "MISCONDUCT", "SEXUAL", "INSULTS", "PROMPT_ATTACK", "VIOLENCE")

// keep ECS + compliance fields
| keep
  user.id,
  gen_ai.compliance.violation_code

// count blocked violations per user per violation type
| stats
    Esql.ml_policy_blocked_violation_count = count()
  by
    user.id,
    gen_ai.compliance.violation_code

// Aggregate all violation types per user
| stats
    Esql.ml_policy_blocked_violation_total_count = sum(Esql.ml_policy_blocked_violation_count)
  by
    user.id

// Filter for users with more than 5 total violations
| where Esql.ml_policy_blocked_violation_total_count > 5

// sort by violation volume
| sort Esql.ml_policy_blocked_violation_total_count desc
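
This query chains three mv_expand commands, so one invocation with several values per field becomes one row per combination of values before the filter runs. The Python sketch below illustrates that effect and the two-stage aggregation. It is a simplified model, not the rule itself: events are assumed to be flat dictionaries, `like "HIGH"` is treated as an exact match because the pattern has no wildcards, and the names `expand_all`, `total_blocks_per_user`, and the sample event are hypothetical.

```python
from collections import Counter
from itertools import product

FIELDS = (
    "gen_ai.compliance.violation_code",
    "gen_ai.policy.confidence",
    "gen_ai.policy.name",
)
CODES = {"HATE", "MISCONDUCT", "SEXUAL", "INSULTS", "PROMPT_ATTACK", "VIOLENCE"}


def as_list(value):
    return value if isinstance(value, list) else [value]


def expand_all(event):
    """Mimic three chained mv_expand commands: one row per value combination."""
    for combo in product(*(as_list(event.get(f)) for f in FIELDS)):
        yield {**event, **dict(zip(FIELDS, combo))}


def total_blocks_per_user(events):
    # Stage 1: count matching rows per (user, violation code).
    per_user_and_code = Counter()
    for event in events:
        for row in expand_all(event):
            if (
                row.get("gen_ai.policy.action") == "BLOCKED"
                and row["gen_ai.policy.name"] == "content_policy"
                and row["gen_ai.policy.confidence"] == "HIGH"
                and row["gen_ai.compliance.violation_code"] in CODES
            ):
                key = (row.get("user.id"), row["gen_ai.compliance.violation_code"])
                per_user_and_code[key] += 1
    # Stage 2: sum the per-code counts into one total per user.
    totals = Counter()
    for (user, _code), n in per_user_and_code.items():
        totals[user] += n
    return dict(totals)


# Invented sample: a single invocation carrying two codes and two confidences.
sample = {
    "user.id": "carol",
    "gen_ai.policy.action": "BLOCKED",
    "gen_ai.policy.name": "content_policy",
    "gen_ai.policy.confidence": ["HIGH", "MEDIUM"],
    "gen_ai.compliance.violation_code": ["HATE", "INSULTS"],
}

print(total_blocks_per_user([sample]))      # one invocation
print(total_blocks_per_user([sample] * 3))  # three identical invocations
```

Under these assumptions the single sample invocation expands to four rows, two of which pass the filter, so it contributes two to the user's total. Three such invocations give a total of six and cross the threshold of five. The expansion also pairs every code with every confidence value, so the original association between a code and its own confidence is not preserved in this model.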

Unusual High Word Policy Blocks Detected

Description

Detects users who repeatedly trigger compliance violation ‘BLOCKED’ actions under the ‘word_policy’ policy (more than five per user), which may indicate persistent misuse or attempts to probe the model’s word filters.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multivalued policy names
| mv_expand gen_ai.policy.name

// Filter for blocked profanity-related policy violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "word_policy"

// keep relevant user field
| keep user.id

// count blocked profanity attempts per user
| stats
    Esql.ml_policy_blocked_profanity_count = count()
  by user.id

// Filter for excessive policy violations
| where Esql.ml_policy_blocked_profanity_count > 5

// sort by violation volume
| sort Esql.ml_policy_blocked_profanity_count desc

Unusual High Denied Topic Blocks Detected

Description

Detects users who repeatedly trigger compliance violation ‘BLOCKED’ actions under the ‘topic_policy’ policy (more than five per user), which may indicate persistent misuse or attempts to probe the model’s denied topics.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-value policy name field
| mv_expand gen_ai.policy.name

// Filter for blocked topic policy violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "topic_policy"

// keep only user info
| keep user.id

// count how many times each user triggered a blocked topic policy
| stats
    Esql.ml_policy_blocked_topic_count = count()
  by user.id

// Filter for excessive violations
| where Esql.ml_policy_blocked_topic_count > 5

// sort highest to lowest
| sort Esql.ml_policy_blocked_topic_count desc