Intrusion detection systems (IDS) and intrusion prevention systems (IPS) tend to be expensive and complicated. In AWS, you can go for a much simpler solution: WAF. But that requires you to use an Application Load Balancer or CloudFront, and even with WAF you have to manage a list of attacker IP addresses that should be blocked. If you only ever need to block single IPs for short periods of time, NACLs may be a much easier option! Here’s a walkthrough on how you can implement a terribly simple (yet very powerful) intrusion detection and prevention setup in AWS with Lambda and DynamoDB Streams for a web application.
First things first: I’m going to show you how to block IPs which send “naughty” requests to your server using NACLs in a VPC. Note that you can only ever have 20 rules in one NACL, and that includes the default deny rules. Assuming you only have IPv4 support enabled in your VPC, with the default DENY rule plus one ALLOW rule, that leaves only 18 rules to use. With IPv6 support enabled, 16 are left. That means, if you’re going to be blocking single IPs, you can only block 18 at a time (16 with IPv6). You can potentially extend the code below to deny access to whole CIDR blocks, but if you have that kind of need, I’d advise you to use WAF and use the Lambdas below to curate your IP lists for WAF. I’m using NACLs because the server I wrote this code for used a Classic Load Balancer and there was no appetite to use CloudFront or an Application Load Balancer. Plus, it only ever needed to block 2-3 IPs at one time.
What does the setup below accomplish? It detects repeated requests blocked by Mod_Security. Loads of details are logged to Datadog Events. Each offending IP is stored in Dynamo with a 24-hour time-to-live (TTL). If the IP causes another error within the next 24 hours, the counter next to it is increased and the TTL is extended for another 24 hours (from the time of the latest blocked request). If the counter reaches 10, the IP is blocked by the NACL. If the IP is “quiet” for 24 hours, it will be deleted by Dynamo (the TTL expires) and removed from the NACL (if it was added before). Obviously, IPs blocked by the NACL will become “quiet”, so they will be removed from the NACL after the 24-hour ban. This means we are capable of blocking not only IPs committing offences within a short period of time, but also those waiting up to 24 hours between attacks.
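To make the counting and expiry rules concrete, here is a minimal in-memory sketch of the same logic. It does not touch AWS at all; the class and method names are hypothetical and only the numbers (24 hours, 10 events) come from the description above.

```python
from typing import Dict, List

TTL_SECONDS = 24 * 60 * 60   # quiet period after which a record expires
BLOCK_THRESHOLD = 10         # blocked requests before the IP goes into the NACL


class OffenderTable:
    """In-memory stand-in for the Dynamo table (hypothetical helper)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, int]] = {}

    def expire(self, now: int) -> List[str]:
        """Drop rows whose TTL has passed; return the IPs to remove from the NACL."""
        expired = [ip for ip, row in self.rows.items() if row["last_event"] <= now]
        for ip in expired:
            del self.rows[ip]
        return expired

    def record_blocked_request(self, ip: str, now: int) -> bool:
        """Count one blocked request; return True if the IP should be in the NACL."""
        self.expire(now)
        row = self.rows.setdefault(ip, {"event_count": 0, "last_event": 0})
        row["event_count"] += 1
        row["last_event"] = now + TTL_SECONDS   # the TTL restarts on every offence
        return row["event_count"] >= BLOCK_THRESHOLD
```

Calling record_blocked_request ten times for the same IP, with less than 24 hours between calls, returns True on the tenth call. Once 24 hours pass without a new call, expire returns that IP so it can be unblocked.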
OK, let’s get to it. First of all, you’re going to need a way to find the “bad” IPs. You can send your access/error logs to CloudWatch and use a Lambda to analyze each row. The webserver I was working with already had Mod_Security installed as a module for Apache. You can use the capability of Mod_Security to block the IPs, but: a) there’s a list of IPs that Mod_Security has to manage, which will be tricky if you’re running on more than one EC2 instance (how would you share this database?) b) the block would come from the server, so the request would still generate load on your machine. Still, use Mod_Security to block any attack attempts that reach your server.
If you block the request before it even reaches your server, the attacker cannot launch a DoS attack (DDoS is still possible, I hope you understand the difference ;) ).
So, here’s what we did: Apache error logs are pushed to CloudWatch and streamed from there to a Lambda function. The Lambda recognizes errors and performs various actions.
An example of a Mod_Security error log (about a request that was blocked) (modified to obscure time and addresses):
[Mon Jan 01 01:01:01.000000 2017] [:error] [pid 1234:tid 123456789012345] [client 192.0.2.1:1234] [client 192.0.2.1] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/etc/httpd/modsecurity.d/activated_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "11"] [id "960009"] [rev "1"] [msg "Request Missing a User Agent Header"] [severity "NOTICE"] [ver "OWASP_CRS/2.2.9"] [maturity "9"] [accuracy "9"] [tag "OWASP_CRS/PROTOCOL_VIOLATION/MISSING_HEADER_UA"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "192.0.2.2"] [uri "/"] [unique_id "abc"]
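A line like this can be picked apart with a couple of regular expressions. The sketch below is only one way to do it: the function name and the choice of fields are assumptions, and it handles IPv4 client addresses only.

```python
import re
from typing import Dict, Optional

# Matches "[client 192.0.2.1]" as well as "[client 192.0.2.1:1234]" (IPv4 only).
CLIENT_RE = re.compile(r"\[client (\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\]")
# Picks a few of the quoted ModSecurity fields, e.g. [id "960009"].
FIELD_RE = re.compile(r'\[(id|msg|severity|uri) "([^"]*)"\]')


def parse_modsec_line(line: str) -> Optional[Dict[str, str]]:
    """Return client IP and rule details for a blocked request, else None."""
    if "ModSecurity: Access denied" not in line:
        return None
    client = CLIENT_RE.search(line)
    if client is None:
        return None
    fields = dict(FIELD_RE.findall(line))
    fields["client"] = client.group(1)
    return fields
```

The id and msg fields are what you would use to tell the fairly common errors apart from the ones worth a notification.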
The Lambda below analyses each error. For errors that are fairly common (empty User-Agent, numeric Host header, etc.) it simply logs the attempt to Dynamo. For other errors, it also sends an event to Datadog (you can modify this to send the info to SNS or anywhere else).
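As a rough illustration of the Dynamo half of such a function, here is a minimal sketch. It assumes the standard CloudWatch Logs subscription payload (base64-encoded, gzipped JSON under awslogs.data) and the client, event_count and last_event attribute names used in this post. The helper names are hypothetical and the Datadog notification is left out entirely.

```python
import base64
import gzip
import json
import re
import time

TABLE_NAME = "YOUR_DYNAMODB_TABLE"
TTL_SECONDS = 24 * 60 * 60
CLIENT_RE = re.compile(r"\[client (\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\]")


def decode_log_events(event):
    """Unpack the base64-encoded, gzipped payload CloudWatch Logs sends to Lambda."""
    raw = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    return [e["message"] for e in json.loads(raw)["logEvents"]]


def offending_ips(messages):
    """Return the client IP of every ModSecurity 'Access denied' line."""
    ips = []
    for message in messages:
        match = CLIENT_RE.search(message)
        if match and "ModSecurity: Access denied" in message:
            ips.append(match.group(1))
    return ips


def update_args(ip, now):
    """Arguments for Table.update_item: bump the counter, push the TTL out 24 hours."""
    return {
        "Key": {"client": ip},
        "UpdateExpression": "SET last_event = :ttl ADD event_count :one",
        "ExpressionAttributeValues": {":ttl": now + TTL_SECONDS, ":one": 1},
    }


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above work without the AWS SDK

    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    now = int(time.time())
    for ip in offending_ips(decode_log_events(event)):
        table.update_item(**update_args(ip, now))
```

Using ADD for the counter means the first offence creates the item with event_count set to 1, so no separate "insert" path is needed.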
Set the environment variable api_key for the Lambda to your Datadog API key.
Once you have your logs in CloudWatch and the Lambda above, you can create a subscription from CloudWatch to Lambda. You can do it from the AWS Console (choose a log stream and “Stream to AWS Lambda”, use an empty log format to stream the data as it is), or from the CLI:
Give CloudWatch Logs permission to invoke Lambda:
aws lambda add-permission --function-name YOUR_LAMBDA_NAME --statement-id logs-httpd --principal "logs.REGION.amazonaws.com" --action "lambda:InvokeFunction" --source-arn "arn:aws:logs:REGION:ACCOUNT_ID:log-group:LOG_GROUP:*" --source-account ACCOUNT_ID
Create a subscription for logs to Lambda:
aws logs put-subscription-filter --log-group-name "LOG_GROUP" --filter-name LambdaStream_logs_http --filter-pattern "" --destination-arn "arn:aws:lambda:REGION:ACCOUNT_ID:function:YOUR_LAMBDA_NAME"
You’ll also need:
- A DynamoDB table (put its name in YOUR_DYNAMODB_TABLE in the Lambda code) with client as the primary key (String).
- A role for the Lambda that allows it to write to logs (the usual role for Lambda) and perform dynamodb:UpdateItem on the Dynamo table.
Now, this Lambda will only log the IPs to Dynamo and send various notifications to Datadog. That’s the IDS part of it.
Here’s where the “magic” happens:
- Set the last_event key as TTL on the DynamoDB table.
- Create another Lambda (code to follow) and create a DynamoDB Stream (you can do it under Triggers in AWS Console for Dynamo) to stream each event from Dynamo to that new Lambda.
This will trigger your new Lambda whenever there’s an action performed against your DynamoDB table (an item is added, modified or removed). last_event is actually the time of the last event + 24 hours. It is set as the TTL attribute for the Dynamo table and is extended every time this IP makes another naughty request. If the TTL expires, Dynamo will remove the record.
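If you prefer to script the two settings above rather than click through the Console, a boto3 sketch could look like this. The function names are hypothetical, and the NEW_AND_OLD_IMAGES stream view type is an assumption (the stream needs to carry the new image so the second Lambda can read event_count). Note that DynamoDB expects the TTL attribute to be a Number holding epoch seconds.

```python
import time

TTL_ATTRIBUTE = "last_event"
TTL_SECONDS = 24 * 60 * 60


def next_expiry(now=None):
    """Value to store in last_event: time of the last event + 24 hours, in epoch seconds."""
    if now is None:
        now = int(time.time())
    return now + TTL_SECONDS


def ttl_specification():
    """TimeToLiveSpecification argument for update_time_to_live."""
    return {"Enabled": True, "AttributeName": TTL_ATTRIBUTE}


def stream_specification():
    """StreamSpecification argument for update_table (view type is an assumption)."""
    return {"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"}


def configure_table(table_name):
    import boto3  # imported here so the helpers above work without the AWS SDK

    dynamodb = boto3.client("dynamodb")
    dynamodb.update_time_to_live(
        TableName=table_name, TimeToLiveSpecification=ttl_specification()
    )
    dynamodb.update_table(
        TableName=table_name, StreamSpecification=stream_specification()
    )
```

Connecting the stream to the Lambda (the trigger itself) is a separate step and is not covered by this sketch.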
Here’s the second Lambda:
This, as we said, will be triggered every time there’s an update on the Dynamo table. What this code does:
- If the row is modified (it should only ever happen from the first Lambda) and event_count is at least 10, the IP address will be added to the NACL (defined at the top of the code).
- If the row is deleted (either manually or by Dynamo TTL), the Lambda will make sure the NACL entry for this IP is removed (if it existed).
Tip from #2 above: you can un-block an IP by simply deleting it from Dynamo.
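The two rules above can be sketched as follows. This is a minimal illustration rather than the original function: the helper name plan_action, the placeholder NACL ID and the choice of rule numbers 1 to 99 (assuming your ALLOW rule sits at 100 or above) are all assumptions. The stream is assumed to include the new image of each item.

```python
NACL_ID = "acl-YOUR_NACL_ID"   # placeholder: the NACL protecting your subnets
BLOCK_THRESHOLD = 10
RULE_RANGE = range(1, 100)     # assumption: the ALLOW rule sits at 100 or above


def plan_action(record, entries):
    """Decide what to do with the NACL for one DynamoDB Streams record.

    entries is the list of inbound NACL entries as returned by describe_network_acls.
    Returns ("create", rule_number, cidr), ("delete", rule_number) or None.
    """
    cidr = record["dynamodb"]["Keys"]["client"]["S"] + "/32"
    denied = {e["CidrBlock"]: e["RuleNumber"] for e in entries
              if e.get("RuleAction") == "deny" and "CidrBlock" in e}
    if record["eventName"] == "MODIFY":
        count = int(record["dynamodb"]["NewImage"]["event_count"]["N"])
        if count >= BLOCK_THRESHOLD and cidr not in denied:
            used = {e["RuleNumber"] for e in entries}
            free = next((n for n in RULE_RANGE if n not in used), None)
            if free is not None:   # None means the NACL is full
                return ("create", free, cidr)
    elif record["eventName"] == "REMOVE" and cidr in denied:
        return ("delete", denied[cidr])
    return None


def lambda_handler(event, context):
    import boto3  # imported here so plan_action works without the AWS SDK

    ec2 = boto3.client("ec2")
    for record in event["Records"]:
        acl = ec2.describe_network_acls(NetworkAclIds=[NACL_ID])["NetworkAcls"][0]
        inbound = [e for e in acl["Entries"] if not e["Egress"]]
        action = plan_action(record, inbound)
        if action is None:
            continue
        if action[0] == "create":
            ec2.create_network_acl_entry(
                NetworkAclId=NACL_ID, RuleNumber=action[1], Protocol="-1",
                RuleAction="deny", Egress=False, CidrBlock=action[2])
        else:
            ec2.delete_network_acl_entry(
                NetworkAclId=NACL_ID, RuleNumber=action[1], Egress=False)
```

Keeping the decision in a pure function makes it easy to check the behaviour against sample stream records before letting it touch a real NACL.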
This Lambda requires the following permissions on its role:
Note: at the time of writing, those EC2 permissions cannot be restricted to specific NACLs (sic!), so you need to set the resource to “*”.
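A policy along the following lines should cover what the second Lambda does. Treat it as an assumption-laden sketch rather than the exact policy used here: the EC2 actions match the NACL calls (describe, create entry, delete entry), the DynamoDB actions are the ones Lambda needs to read a stream, and the log actions are the usual Lambda ones. REGION, ACCOUNT_ID and YOUR_DYNAMODB_TABLE are placeholders.

```python
import json

POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {   # managing NACL entries; resource is "*" as per the note above
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkAcls",
                "ec2:CreateNetworkAclEntry",
                "ec2:DeleteNetworkAclEntry",
            ],
            "Resource": "*",
        },
        {   # reading the DynamoDB stream that triggers the function
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeStream",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:ListStreams",
            ],
            "Resource": "arn:aws:dynamodb:REGION:ACCOUNT_ID:table/YOUR_DYNAMODB_TABLE/stream/*",
        },
        {   # the usual Lambda logging permissions
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        },
    ],
}


def policy_json():
    """Render the policy document as JSON, ready to paste into IAM."""
    return json.dumps(POLICY, indent=2)
```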
Possible improvements that you can make:
- Extend the ban by performing an update on Dynamo in the second Lambda if you want to block the IP for 48, 72, … hours (this will re-trigger the second Lambda and be detected as a MODIFY event, so modify the code accordingly).
- You can obviously tweak the numbers - 24 hours and 10 attempts can be changed to whatever you want.
- Use WAF update-ip-set instead of NACLs for larger sets of IPs.
- Items with TTL in DynamoDB are not guaranteed to be deleted at the exact time TTL expires. In this use-case this is not a problem (the IP will simply be blocked for longer), but take it into account if you’re using the TTL for other things.
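For the WAF variant mentioned in the list above, the second Lambda would push the whole list of offending IPs into an IP set instead of creating NACL entries. The sketch below assumes the newer WAFv2 API in boto3 (get_ip_set and update_ip_set, which uses a lock token); the function names, the REGIONAL scope default and the IPv4-only handling are assumptions.

```python
import ipaddress


def to_cidrs(ips):
    """Turn single IPv4 addresses into sorted, de-duplicated /32 CIDR strings."""
    return sorted({str(ipaddress.ip_network(ip)) for ip in ips})


def replace_ip_set(name, ip_set_id, ips, scope="REGIONAL"):
    """Overwrite the addresses of an existing WAFv2 IP set with the given IPs."""
    import boto3  # imported here so to_cidrs works without the AWS SDK

    waf = boto3.client("wafv2")
    # update_ip_set needs the lock token returned by a preceding get_ip_set.
    current = waf.get_ip_set(Name=name, Scope=scope, Id=ip_set_id)
    waf.update_ip_set(
        Name=name,
        Scope=scope,
        Id=ip_set_id,
        Addresses=to_cidrs(ips),
        LockToken=current["LockToken"],
    )
```

Because update_ip_set replaces the full address list, the simplest approach is to rebuild the list from the Dynamo table (all items with event_count of at least 10) on every change.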