authorized heavy usage of the system that is business justified and monitored.

Techniques

Sample rules

Potential Abuse of Resources by High Token Count and Large Response Sizes

source: elastic
techniques:

Description

Detects potential resource exhaustion or data breach attempts by monitoring for users who consistently generate high input token counts, submit numerous requests, and receive large responses. This behavior could indicate an attempt to overload the system or extract an unusually large amount of data, possibly revealing sensitive information or causing service disruptions.

Detection logic

from logs-aws_bedrock.invocation-*
| keep user.id, gen_ai.usage.prompt_tokens, gen_ai.usage.completion_tokens
| stats max_tokens = max(gen_ai.usage.prompt_tokens),
         total_requests = count(*),
         avg_response_size = avg(gen_ai.usage.completion_tokens)
  by user.id
// Token counts depend on the specific LLM, as they relate to how embeddings are generated.
| where max_tokens > 5000 and total_requests > 10 and avg_response_size > 500
| eval risk_factor = (max_tokens / 1000) * total_requests * (avg_response_size / 500)
| where risk_factor > 10
| sort risk_factor desc
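
To make the scoring arithmetic concrete, the following is a minimal Python sketch of the same aggregation and filtering, not part of the rule itself. The function name `risk_by_user` and the tuple layout of `events` are hypothetical. It assumes the token fields are integers with no missing values, and that `max_tokens / 1000` behaves as integer division in ES|QL (hence `//` below), while `avg` yields a floating-point value.

```python
from collections import defaultdict


def risk_by_user(events):
    """Score users from (user_id, prompt_tokens, completion_tokens) tuples.

    Hypothetical helper mirroring the query's stats / where / eval / sort steps.
    """
    # Group rows per user, like `stats ... by user.id`.
    per_user = defaultdict(list)
    for user_id, prompt_tokens, completion_tokens in events:
        per_user[user_id].append((prompt_tokens, completion_tokens))

    results = []
    for user_id, rows in per_user.items():
        max_tokens = max(p for p, _ in rows)
        total_requests = len(rows)
        avg_response_size = sum(c for _, c in rows) / total_requests

        # First filter: all three thresholds must be exceeded.
        if not (max_tokens > 5000 and total_requests > 10 and avg_response_size > 500):
            continue

        # Assumption: integer operands divide as integers, so 6000 / 1000 -> 6.
        risk_factor = (max_tokens // 1000) * total_requests * (avg_response_size / 500)

        # Second filter on the combined score.
        if risk_factor > 10:
            results.append((user_id, risk_factor))

    # Highest risk first, like `sort risk_factor desc`.
    return sorted(results, key=lambda r: r[1], reverse=True)


# "alice" exceeds every threshold; "bob" sends too few requests to qualify.
sample = [("alice", 6000, 1000)] * 11 + [("bob", 9000, 2000)] * 3
print(risk_by_user(sample))  # [('alice', 132.0)]
```

With the thresholds shown, any user who passes the first filter has at least 5 * 11 * (a value above 1), so risk_factor is always above 55. The final `risk_factor > 10` check therefore only starts excluding results if the earlier thresholds are lowered for a particular environment.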