LoFP / new model deployments

Techniques

Sample rules

Unusual High Confidence Misconduct Blocks Detected

Description

Detects users who accumulate more than five high-confidence ‘BLOCKED’ actions carrying the ‘MISCONDUCT’ violation code, which may indicate persistent misuse or attempts to probe the model’s ethical boundaries.

Detection logic

from logs-aws_bedrock.invocation-*
| where gen_ai.policy.confidence == "HIGH" and gen_ai.policy.action == "BLOCKED" and gen_ai.compliance.violation_code == "MISCONDUCT"
| stats high_confidence_blocks = count() by user.id
| where high_confidence_blocks > 5
| sort high_confidence_blocks desc
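
To make the filter, count, and threshold steps easier to follow, the query can be mirrored in a few lines of Python over in-memory records. This is only an illustrative sketch, not part of the rule: it assumes each event is a flat dict keyed by the same dotted field names the query uses. The `flag_users` function and the `BLOCK_THRESHOLD` constant are hypothetical names introduced here. As a simplification, records without a `user.id` are skipped.

```python
from collections import Counter

# Hypothetical constant mirroring `high_confidence_blocks > 5` in the query.
BLOCK_THRESHOLD = 5


def flag_users(events, threshold=BLOCK_THRESHOLD):
    """Return (user_id, block_count) pairs above the threshold, highest count first.

    Assumes each event is a flat dict keyed by dotted field names,
    e.g. {"user.id": "alice", "gen_ai.policy.action": "BLOCKED", ...}.
    """
    # Equivalent of the `where` clause followed by `stats count() by user.id`.
    counts = Counter(
        event["user.id"]
        for event in events
        if event.get("user.id") is not None  # simplification: skip records with no user
        and event.get("gen_ai.policy.confidence") == "HIGH"
        and event.get("gen_ai.policy.action") == "BLOCKED"
        and event.get("gen_ai.compliance.violation_code") == "MISCONDUCT"
    )
    # Equivalent of `where high_confidence_blocks > 5`.
    flagged = [(user, n) for user, n in counts.items() if n > threshold]
    # Equivalent of `sort high_confidence_blocks desc`.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


def make_event(user, confidence="HIGH", action="BLOCKED", code="MISCONDUCT"):
    """Build a sample event for experimentation (hypothetical helper)."""
    return {
        "user.id": user,
        "gen_ai.policy.confidence": confidence,
        "gen_ai.policy.action": action,
        "gen_ai.compliance.violation_code": code,
    }
```

Because the comparison is strictly greater than the threshold, a user with exactly five matching blocks is not flagged. A burst of blocks shortly after a new model deployment, the false-positive source this entry describes, would be counted the same way as deliberate probing, so the threshold may need tuning during rollouts.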