Web Application Firewall (WAF) logs are a rich source of security telemetry, capturing detailed information about HTTP/S requests and the actions taken to protect your applications.
Each log entry typically includes:
- Client IP Address: The origin of the request.
- URI: The requested resource path.
- Headers: Including User-Agent and other identifying information.
- Action: The WAF’s decision (ALLOW, BLOCK, COUNT).
- Rule Group: The specific rule or policy that matched.
Example Log Entry:
{
  "timestamp": 1713354000000,
  "httpRequest": {
    "clientIp": "203.0.113.12",
    "uri": "/login.php",
    "headers": [{"name": "User-Agent", "value": "sqlmap"}]
  },
  "action": "BLOCK",
  "ruleGroupList": [{
    "ruleGroupId": "AWS-AWSManagedRulesSQLiRuleSet",
    "terminatingRule": "SQLi_BODY"
  }]
}
In this example, the WAF blocked a request to /login.php from a suspicious user agent (sqlmap), triggered by an SQL injection detection rule.
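To make the field layout concrete, here is a minimal Python sketch that parses the sample entry and pulls out the fields listed earlier. The function name `summarize_entry` and the output keys are illustrative choices, not part of any WAF tooling, and the sketch assumes the simplified entry shape shown in this article.

```python
import json
from datetime import datetime, timezone

# The sample entry from this article, embedded as a string for illustration.
SAMPLE_ENTRY = """
{
  "timestamp": 1713354000000,
  "httpRequest": {
    "clientIp": "203.0.113.12",
    "uri": "/login.php",
    "headers": [{"name": "User-Agent", "value": "sqlmap"}]
  },
  "action": "BLOCK",
  "ruleGroupList": [{
    "ruleGroupId": "AWS-AWSManagedRulesSQLiRuleSet",
    "terminatingRule": "SQLi_BODY"
  }]
}
"""


def summarize_entry(raw: str) -> dict:
    """Flatten one log entry into the fields an analyst usually needs first."""
    entry = json.loads(raw)
    request = entry["httpRequest"]
    # HTTP header names are case-insensitive, so normalize before lookup.
    headers = {h["name"].lower(): h["value"] for h in request.get("headers", [])}
    # The timestamp is in epoch milliseconds; convert it to an ISO 8601 UTC string.
    when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
    return {
        "time": when.isoformat(),
        "client_ip": request["clientIp"],
        "uri": request["uri"],
        "user_agent": headers.get("user-agent", ""),
        "action": entry["action"],
        "rule_groups": [g["ruleGroupId"] for g in entry.get("ruleGroupList", [])],
    }


summary = summarize_entry(SAMPLE_ENTRY)
print(summary)
```

Looking the header up by lowercased name avoids depending on the order or casing in which the client sent its headers.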
Architecting the Threat Detection Pipeline
A robust threat detection pipeline transforms raw WAF logs into actionable insights. The architecture typically involves log collection, storage, processing, and alerting.
Data Collection and Storage
Log Ingestion: Configure your WAF to stream logs to a centralized storage location, such as an S3 bucket or a log analytics service. For distributed environments, aggregate logs from multiple regions or accounts.
Schema Definition: Use schema discovery tools to define the structure of your logs. For example, with AWS Glue, you can automatically create a schema for querying logs with Athena.
Example Table Definition:
CREATE EXTERNAL TABLE waf_logs (
  `timestamp` BIGINT,
  httpRequest STRUCT<
    clientIp:STRING,
    uri:STRING,
    headers:ARRAY<STRUCT<name:STRING, value:STRING>>
  >,
  action STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://your-bucket/waf-logs/';
Integration with SIEM:
Forward logs to a Security Information and Event Management (SIEM) platform for correlation with other security data, enabling advanced analytics and alerting.
Querying and Analyzing Logs
With your logs structured and accessible, you can use SQL-based tools or log analytics platforms to search for threats.
Example 1: Top Blocked IPs
SELECT httpRequest.clientIp, COUNT(*) AS blocked_count
FROM waf_logs
WHERE action = 'BLOCK'
GROUP BY httpRequest.clientIp
ORDER BY blocked_count DESC
LIMIT 10;
This query identifies the IP addresses most frequently blocked by your WAF, which may indicate scanning or attack activity.
Example 2: Bot Traffic Analysis
-- Look the header up by name; its position in the array is not guaranteed.
SELECT element_at(filter(httpRequest.headers, h -> lower(h.name) = 'user-agent'), 1).value AS user_agent,
       COUNT(*) AS request_count
FROM waf_logs
WHERE action = 'BLOCK'
GROUP BY 1
ORDER BY request_count DESC
LIMIT 10;
This surfaces user agents commonly associated with blocked requests, helping to distinguish between legitimate bots and malicious automation.
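For quick checks outside the query engine, the same two aggregations can be approximated in a few lines of Python. This is a rough sketch that assumes the logs are available as newline-delimited JSON with the simplified shape used in this article; `top_blocked` and `make_line` are hypothetical helper names, and the sample records are invented for the demonstration.

```python
import json
from collections import Counter


def top_blocked(lines, n=10):
    """Return (top client IPs, top user agents) among BLOCK entries."""
    ips, agents = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        entry = json.loads(line)
        if entry.get("action") != "BLOCK":
            continue
        request = entry.get("httpRequest", {})
        ips[request.get("clientIp", "unknown")] += 1
        # Find the User-Agent header by name, wherever it sits in the list.
        user_agent = next(
            (h["value"] for h in request.get("headers", [])
             if h.get("name", "").lower() == "user-agent"),
            "(none)",
        )
        agents[user_agent] += 1
    return ips.most_common(n), agents.most_common(n)


def make_line(ip, user_agent, action):
    """Build one synthetic log line (demo data only)."""
    return json.dumps({
        "httpRequest": {
            "clientIp": ip,
            "headers": [
                {"name": "Host", "value": "example.com"},
                {"name": "user-agent", "value": user_agent},
            ],
        },
        "action": action,
    })


sample_lines = [
    make_line("203.0.113.12", "sqlmap", "BLOCK"),
    make_line("203.0.113.12", "sqlmap", "BLOCK"),
    make_line("198.51.100.7", "curl/8.0", "BLOCK"),
    make_line("192.0.2.44", "Mozilla/5.0", "ALLOW"),
]
top_ips, top_agents = top_blocked(sample_lines)
print(top_ips)
print(top_agents)
```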
Advanced Detection and Automated Response
A mature pipeline doesn’t just detect threats—it responds to them. Here are strategies for advanced detection and automated mitigation.
Real-Time Anomaly Detection
Behavioral Baselines:
Establish normal patterns for request rates, endpoints accessed, and error rates. Use anomaly detection algorithms to flag significant deviations, such as a sudden spike in blocked requests or unusual access patterns.
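One simple way to express such a baseline is a z-score over recent per-minute counts of blocked requests. The sketch below is only one possible approach, not a recommendation of a specific algorithm; the function name `is_spike`, the threshold of three standard deviations, and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def is_spike(history, current, threshold=3.0):
    """Flag `current` if it sits more than `threshold` standard deviations
    above the mean of `history` (e.g. blocked requests per minute)."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any increase counts as a deviation.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical blocked-request counts for the previous ten minutes.
recent_counts = [12, 9, 11, 10, 13, 8, 10, 11, 9, 12]
print(is_spike(recent_counts, 40))  # far above the baseline
print(is_spike(recent_counts, 13))  # within normal variation
```

A z-score assumes the metric is roughly stable; traffic with strong daily or weekly cycles would likely need a seasonal baseline instead.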
Session Tracking:
Detect rapid endpoint scanning or brute-force attempts by tracking unique endpoints accessed per IP in a short timeframe.
Example Query:
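SELECT httpRequest.clientIp, COUNT(DISTINCT httpRequest.uri) AS unique_uris
FROM waf_logs
-- "timestamp" holds epoch milliseconds, so convert it before comparing.
WHERE from_unixtime("timestamp" / 1000) > current_timestamp - interval '5' minute
GROUP BY httpRequest.clientIp
HAVING COUNT(DISTINCT httpRequest.uri) > 20;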
This flags IPs accessing more than 20 unique URIs in five minutes, a common sign of reconnaissance.
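The same rule can also be evaluated on a stream of events rather than in a batch query. The following sketch keeps a sliding window of recent requests per IP; the class name `ScanDetector` is hypothetical, and it assumes events for a given IP arrive in timestamp order with timestamps in epoch milliseconds.

```python
from collections import defaultdict, deque


class ScanDetector:
    """Flag IPs that touch too many distinct URIs within a sliding window."""

    def __init__(self, window_ms=5 * 60 * 1000, max_unique_uris=20):
        self.window_ms = window_ms
        self.max_unique_uris = max_unique_uris
        self._events = defaultdict(deque)  # client IP -> deque of (timestamp_ms, uri)

    def observe(self, timestamp_ms, client_ip, uri):
        """Record one request; return True if the IP now exceeds the limit."""
        events = self._events[client_ip]
        events.append((timestamp_ms, uri))
        # Drop events that have fallen out of the window.
        cutoff = timestamp_ms - self.window_ms
        while events and events[0][0] <= cutoff:
            events.popleft()
        unique_uris = {seen_uri for _, seen_uri in events}
        return len(unique_uris) > self.max_unique_uris


detector = ScanDetector()
start = 1713354000000
# One request per second to a new URI each time: a fast scan.
fast_flags = [
    detector.observe(start + i * 1000, "203.0.113.12", f"/page{i}")
    for i in range(21)
]
# One request per minute: never more than five URIs inside the window.
slow_flags = [
    detector.observe(start + i * 60_000, "198.51.100.7", f"/page{i}")
    for i in range(30)
]
print(fast_flags[-1], any(slow_flags))
```

Rebuilding the set on every call keeps the sketch short; a production version would more likely maintain per-URI counts incrementally and evict idle IPs.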
Automated Mitigation
- Dynamic Blocklists: Use Lambda functions or similar automation to update WAF IP sets based on analytics. For example, automatically block IPs with more than 100 blocked requests in an hour.
- SIEM Integration: Configure your SIEM to trigger incidents for high-confidence threat patterns, such as repeated SQL injection attempts, and notify your security team for rapid response.
- Enriching with Threat Intelligence: Cross-reference WAF logs with external threat intelligence feeds (such as known malicious IPs or Tor exit nodes) to prioritize alerts and blocklists.
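The selection logic behind a dynamic blocklist might look like the sketch below, which could run inside a Lambda function or any scheduled job. It applies the 100-blocks-per-hour rule from the example above; the lower threshold for IPs found in a threat intelligence feed is an assumption added here to illustrate prioritization, and `blocklist_candidates` is a hypothetical name. The step that actually updates the WAF IP set is deliberately left out.

```python
import ipaddress
from collections import Counter

HOUR_MS = 60 * 60 * 1000


def blocklist_candidates(events, now_ms, threat_intel=frozenset(),
                         threshold=100, intel_threshold=10, window_ms=HOUR_MS):
    """Return CIDR strings for IPs whose BLOCK count in the window exceeds
    the threshold. `events` is an iterable of (timestamp_ms, ip, action)."""
    counts = Counter(
        ip for ts, ip, action in events
        if action == "BLOCK" and now_ms - window_ms < ts <= now_ms
    )
    candidates = []
    for ip, count in counts.items():
        # Assumption: IPs already on a threat feed are blocked sooner.
        limit = intel_threshold if ip in threat_intel else threshold
        if count > limit:
            address = ipaddress.ip_address(ip)  # also validates the string
            candidates.append(f"{address}/{address.max_prefixlen}")
    return sorted(candidates)


now = 1713354000000
events = (
    [(now - 1000, "203.0.113.12", "BLOCK")] * 101      # over the limit
    + [(now - 1000, "192.0.2.44", "BLOCK")] * 100      # exactly at the limit
    + [(now - 1000, "198.51.100.7", "BLOCK")] * 11     # on the threat feed
    + [(now - 2 * HOUR_MS, "192.0.2.99", "BLOCK")] * 150  # too old
)
candidates = blocklist_candidates(events, now, threat_intel={"198.51.100.7"})
print(candidates)
```

Single addresses are emitted in CIDR form (for example `/32` for IPv4) because IP sets are generally expressed that way. When pushing the result to a WAF, check whether the update call replaces the whole address list, in which case new entries need to be merged with existing ones first.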
Optimizing and Evolving Your Pipeline
Reduce False Positives:
Continuously refine WAF rules and detection logic based on log analysis. For example, exclude known good bots (like Googlebot) from certain rules to prevent unnecessary blocking.
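A User-Agent string alone is easy to spoof, so an exclusion for good bots is safer when it also checks where the request came from. The sketch below assumes you maintain a mapping from bot name to that operator's published crawler IP ranges; the range used here is a documentation placeholder, not a real Googlebot range, and `is_known_good_bot` is a hypothetical helper.

```python
import ipaddress

# Placeholder data: 192.0.2.0/24 is a reserved documentation range.
# In practice, load the ranges the bot operator actually publishes.
GOOD_BOT_RANGES = {
    "googlebot": [ipaddress.ip_network("192.0.2.0/24")],
}


def is_known_good_bot(user_agent, client_ip, ranges=GOOD_BOT_RANGES):
    """True only if the User-Agent names a known bot AND the client IP
    falls inside a range registered for that bot."""
    agent = user_agent.lower()
    address = ipaddress.ip_address(client_ip)
    for token, networks in ranges.items():
        if token in agent:
            return any(address in network for network in networks)
    return False


claimed = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(is_known_good_bot(claimed, "192.0.2.10"))    # matching range
print(is_known_good_bot(claimed, "203.0.113.12"))  # same UA, unknown source
```

Entries that pass this check can be dropped from detection queries or blocklist candidates, while a request that merely claims to be a crawler is still evaluated like any other.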
Machine Learning Integration:
Train models on historical WAF logs to identify subtle or novel attack patterns that static rules may miss, such as unusual payload encodings or attack sequences.
Regular Audits and Updates:
Periodically review detection rules, automation logic, and response playbooks to adapt to evolving threats and business requirements.
A well-architected threat detection pipeline using WAF logs enables proactive defense against web threats.
By combining structured log analysis, real-time anomaly detection, and automated response, organizations can rapidly identify and mitigate attacks, turning raw logs into a powerful security asset.