When using AWS Elasticsearch Service, you have multiple ways of controlling access to your cluster. The AWS Console gives you some pre-set policies you can use, but the access options can be confusing. Using IP-based access, you can allow access from your EC2 instances. But what if you want to access your cluster from a Lambda?
The IP-based policy is the most popular one. It solves the problem if your application is running on EC2 instances within a VPC, as you can assign them Elastic IPs and add those IPs to the policy. Alternatively, if you run your instances within a private subnet, you can use a NAT Gateway (which has an Elastic IP) to access the cluster. An example of such a policy (allowing full access to the cluster) looks like this:
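As a rough sketch, a policy of this kind could be generated in Python as shown here. The account ID, region, domain name and IP address are placeholders chosen for illustration, not values from a real setup; the printed JSON is what you would paste into the access policy editor.

```python
import json

# Placeholder values for illustration only.
DOMAIN_ARN = "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"
ALLOWED_IPS = ["203.0.113.10"]  # e.g. an Elastic IP of an instance or NAT Gateway

ip_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Anyone may call the cluster...
            "Principal": {"AWS": "*"},
            # ...with any Elasticsearch Service action (full access)...
            "Action": "es:*",
            "Resource": DOMAIN_ARN,
            # ...as long as the request comes from one of the listed IPs.
            "Condition": {"IpAddress": {"aws:SourceIp": ALLOWED_IPS}},
        }
    ],
}

print(json.dumps(ip_policy, indent=2))
```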
More and more people are using AWS Lambda and getting involved in the serverless way of running their software. When running your code in Lambda, your function will by default access the internet from within an AWS-managed VPC with a random IP from their pool. This pool is not publicly known, but even if you were to use it with the IP-based policy, you would allow access to your cluster for anyone running their code in Lambda! Obviously, that’s not much better than simply using the allow-all policy.
To limit access to the cluster to only your own Lambdas, you can use one of the following approaches:
IP-based policy
You can use an IP-based policy, but you have to somehow limit the IPs that are used by your Lambdas. Lambdas can run within your own VPC, so you can still use the NAT Gateway approach. That means you have an additional piece of infrastructure (the NAT Gateway) to pay for, but it is probably the easiest way of managing access.
IAM access
Elasticsearch Service also provides an IAM-based access policy - you can allow IAM users to access the cluster. To do that, each user has to sign their requests with an AWS signature (preferably version 4). This is the signature that is used by the AWS Console, CLI and SDKs whenever they make requests to the AWS API on your behalf. You can create it yourself as well; the process is described in the documentation and there are even examples available.
To use IAM-based access, use a policy like the following:
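A minimal sketch of such a policy, again generated from Python, might look like the following. It assumes you want to delegate access decisions to IAM within a single account; the account ID, region and domain name are placeholders. Naming the account root as the principal is one common choice, and you could name a specific user or role ARN instead.

```python
import json

# Placeholder values for illustration only.
ACCOUNT_ID = "123456789012"
DOMAIN_ARN = f"arn:aws:es:us-east-1:{ACCOUNT_ID}:domain/my-domain/*"

iam_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Trust the account; which users and roles actually get in
            # is then decided by their own IAM policies.
            "Principal": {"AWS": f"arn:aws:iam::{ACCOUNT_ID}:root"},
            "Action": "es:*",
            "Resource": DOMAIN_ARN,
        }
    ],
}

print(json.dumps(iam_policy, indent=2))
```

Note that there is no IP condition here: requests are accepted or rejected based on who signed them, not where they come from.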
You then need to grant specific users or roles access to the cluster, via the IAM policies assigned to those users/roles (just like you grant access to other resources).
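For example, an IAM policy attached to a Lambda execution role could allow only the HTTP methods the function needs. This is a sketch under assumptions: the domain ARN is a placeholder, and the choice of GET and DELETE is just an illustration of a housekeeping-style function.

```python
import json

# Placeholder domain ARN for illustration only.
DOMAIN_ARN = "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"

role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Only signed GET and DELETE requests are permitted for this role.
            "Action": ["es:ESHttpGet", "es:ESHttpDelete"],
            "Resource": DOMAIN_ARN,
        }
    ],
}

print(json.dumps(role_policy, indent=2))
```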
Your code (from your Lambda, or any server) needs to sign each request made to the cluster and does not have to use any specific IP address. This approach does require the signature to be implemented within the code, but there’s no additional infrastructure needed.
If you’re interested in how to generate the signature, check out the examples in the documentation or take a look at my GitHub repository - it contains an example Lambda function that can be used for housekeeping of Elasticsearch indices. It performs GET and DELETE requests, which are easier to sign as they don’t have a payload. For signing POST requests, check out the examples - the payload needs to be included in the signing process.
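To give an idea of what signing involves, here is a minimal sketch of Signature Version 4 using only the Python standard library. It is not the code from the repository; the function name `sign_request` and its parameters are made up for this illustration, and it assumes the request path contains only unreserved characters and slashes (no extra URL encoding is applied to it). The host, keys and index name in the usage lines are placeholders.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_request(method, host, path, region, access_key, secret_key,
                 payload=b"", query=None, session_token=None,
                 service="es", now=None):
    """Return the headers to add to a request so that it is SigV4-signed."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Headers covered by the signature: lowercase names, sorted by name.
    headers = {"host": host, "x-amz-date": amz_date}
    if session_token:  # temporary credentials, e.g. from a Lambda role
        headers["x-amz-security-token"] = session_token
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{k}:{headers[k]}\n" for k in sorted(headers))

    # Query parameters are percent-encoded and sorted by name.
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(str(v), safe='-_.~')}"
        for k, v in sorted((query or {}).items())
    )

    # GET and DELETE sign the hash of an empty body; POST signs the real one.
    payload_hash = hashlib.sha256(payload).hexdigest()
    canonical_request = "\n".join([
        method, path, canonical_query,
        canonical_headers, signed_headers, payload_hash,
    ])

    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: secret -> date -> region -> service -> terminator.
    key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    # The HTTP client sets Host itself, so only the extra headers are returned.
    result = {k: v for k, v in headers.items() if k != "host"}
    result["Authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return result


# Usage with placeholder values: sign a DELETE for a hypothetical old index.
extra_headers = sign_request(
    "DELETE", "search-my-domain.us-east-1.es.amazonaws.com",
    "/logstash-2017.01.01", "us-east-1", "AKIDEXAMPLE", "not-a-real-secret",
)
print(extra_headers["Authorization"])
```

For a POST request you would pass the request body as `payload` (as bytes), so that its hash becomes part of the signature. Inside a Lambda function, the credentials of the execution role are available in the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` environment variables; the session token has to be passed as well, since those credentials are temporary.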