new model deployments.

Techniques

Sample rules

Unusual High Denied Sensitive Information Policy Blocks Detected

source: elastic
techniques:

Description

Detects users who trigger more than five ‘BLOCKED’ compliance-violation actions under the ‘sensitive_information_policy’ policy, indicating persistent misuse or attempts to probe the model’s denied topics.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-valued policy name and action fields
| mv_expand gen_ai.policy.name
| mv_expand gen_ai.policy.action

// Filter for blocked actions related to sensitive info policy
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "sensitive_information_policy"

// keep only relevant fields
| keep user.id

// count how many times each user triggered a sensitive info block
| stats
    Esql.ml_policy_blocked_sensitive_info_count = count()
  by user.id

// Filter for users with more than 5 violations
| where Esql.ml_policy_blocked_sensitive_info_count > 5

// sort highest to lowest
| sort Esql.ml_policy_blocked_sensitive_info_count desc
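
To try the threshold against sample data without a cluster, the query above can be approximated in plain Python. This is a minimal sketch, not Elastic's implementation: the helper names (`mv_expand`, `sensitive_info_blocks`) and the sample events are hypothetical, and `mv_expand` only imitates the one-row-per-value behavior of the ES|QL command.

```python
from collections import Counter


def as_list(value):
    """Treat a single value as a one-element list, like a single-valued field."""
    return value if isinstance(value, list) else [value]


def mv_expand(rows, field):
    """Emit one row per value of a multi-valued field, copying the other fields."""
    for row in rows:
        for value in as_list(row.get(field)):
            yield {**row, field: value}


def sensitive_info_blocks(events, threshold=5):
    """Return (user, count) pairs above the threshold, highest count first."""
    rows = mv_expand(events, "gen_ai.policy.name")
    rows = mv_expand(rows, "gen_ai.policy.action")
    counts = Counter(
        row.get("user.id")
        for row in rows
        if row.get("gen_ai.policy.action") == "BLOCKED"
        and row.get("gen_ai.compliance.violation_detected") == "true"
        and row.get("gen_ai.policy.name") == "sensitive_information_policy"
    )
    flagged = [(user, n) for user, n in counts.items() if n > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def make_event(user, policy_names):
    """Build a hypothetical blocked invocation event for illustration only."""
    return {
        "user.id": user,
        "gen_ai.policy.name": policy_names,
        "gen_ai.policy.action": "BLOCKED",
        "gen_ai.compliance.violation_detected": "true",
    }


events = (
    [make_event("alice", ["sensitive_information_policy", "word_policy"])] * 6
    + [make_event("bob", "topic_policy")] * 9
    + [make_event("carol", "sensitive_information_policy")] * 5
)

print(sensitive_info_blocks(events))  # [('alice', 6)]
```

Only alice is reported: carol has exactly five matching blocks, which does not satisfy the strict `> 5` comparison, and bob's blocks belong to a different policy.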

Unusual High Confidence Content Filter Blocks Detected

source: elastic
techniques:

Description

Detects users who trigger more than five high-confidence ‘BLOCKED’ actions under the ‘Content Filter’ policy with violation codes ‘MISCONDUCT’, ‘HATE’, ‘SEXUAL’, ‘INSULTS’, ‘PROMPT_ATTACK’, or ‘VIOLENCE’, indicating persistent misuse or attempts to probe the model’s ethical boundaries.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-value fields
| mv_expand gen_ai.compliance.violation_code
| mv_expand gen_ai.policy.confidence
| mv_expand gen_ai.policy.name
| mv_expand gen_ai.policy.action

// Filter for high-confidence content policy blocks with targeted violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.policy.name == "content_policy"
  and gen_ai.policy.confidence like "HIGH"
  and gen_ai.compliance.violation_code in ("HATE", "MISCONDUCT", "SEXUAL", "INSULTS", "PROMPT_ATTACK", "VIOLENCE")

// keep ECS + compliance fields
| keep
  user.id,
  gen_ai.compliance.violation_code

// count blocked violations per user per violation type
| stats
    Esql.ml_policy_blocked_violation_count = count()
  by
    user.id,
    gen_ai.compliance.violation_code

// Aggregate all violation types per user
| stats
    Esql.ml_policy_blocked_violation_total_count = sum(Esql.ml_policy_blocked_violation_count)
  by
    user.id

// Filter for users with more than 5 total violations
| where Esql.ml_policy_blocked_violation_total_count > 5

// sort by violation volume
| sort Esql.ml_policy_blocked_violation_total_count desc
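
The two `stats` stages above first count per user and violation code, then sum those counts per user. A minimal Python sketch of that aggregation follows, assuming the rows have already been expanded and filtered; the function name and the sample rows are hypothetical.

```python
from collections import Counter

TARGET_CODES = {"HATE", "MISCONDUCT", "SEXUAL", "INSULTS", "PROMPT_ATTACK", "VIOLENCE"}


def total_blocks_per_user(rows, threshold=5):
    """rows: (user_id, violation_code) pairs that already passed the filter."""
    # Stage 1: count blocks per user and violation code.
    per_user_and_code = Counter(
        (user, code) for user, code in rows if code in TARGET_CODES
    )
    # Stage 2: sum the per-code counts into one total per user.
    totals = Counter()
    for (user, _code), n in per_user_and_code.items():
        totals[user] += n
    flagged = [(user, n) for user, n in totals.items() if n > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


rows = (
    [("alice", "HATE")] * 3
    + [("alice", "VIOLENCE")] * 3
    + [("bob", "INSULTS")] * 5
    + [("carol", "PROMPT_ATTACK")] * 7
)

print(total_blocks_per_user(rows))  # [('carol', 7), ('alice', 6)]
```

Because the final total ignores the violation code, alice crosses the threshold with three blocks in each of two categories, while bob's five blocks in one category do not.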

Unusual High Denied Topic Blocks Detected

source: elastic
techniques:

Description

Detects users who trigger more than five ‘BLOCKED’ compliance-violation actions under the ‘topic_policy’ policy, indicating persistent misuse or attempts to probe the model’s denied topics.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-valued policy name and action fields
| mv_expand gen_ai.policy.name
| mv_expand gen_ai.policy.action

// Filter for blocked topic policy violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "topic_policy"

// keep only user info
| keep user.id

// count how many times each user triggered a blocked topic policy
| stats
    Esql.ml_policy_blocked_topic_count = count()
  by user.id

// Filter for excessive violations
| where Esql.ml_policy_blocked_topic_count > 5

// sort highest to lowest
| sort Esql.ml_policy_blocked_topic_count desc

Unusual High Word Policy Blocks Detected

source: elastic
techniques:

Description

Detects users who trigger more than five ‘BLOCKED’ compliance-violation actions under the ‘word_policy’ policy, indicating persistent misuse or attempts to probe the model’s denied topics.

Detection logic

from logs-aws_bedrock.invocation-*

// Expand multi-valued policy name and action fields
| mv_expand gen_ai.policy.name
| mv_expand gen_ai.policy.action

// Filter for blocked profanity-related policy violations
| where
  gen_ai.policy.action == "BLOCKED"
  and gen_ai.compliance.violation_detected == "true"
  and gen_ai.policy.name == "word_policy"

// keep relevant user field
| keep user.id

// count blocked profanity attempts per user
| stats
    Esql.ml_policy_blocked_profanity_count = count()
  by user.id

// Filter for excessive policy violations
| where Esql.ml_policy_blocked_profanity_count > 5

// sort by violation volume
| sort Esql.ml_policy_blocked_profanity_count desc