Connecting to a whitelisted API from a Lambda inside a private subnet with a NAT gateway

My goal is to send requests from a Lambda with a static IP address. This is my current setup.

I have created the following:

  • Route table for the private subnet (all traffic -> NAT gateway)
  • Route table for the public subnet (all traffic -> internet gateway)
  • Multiple APIs inside the private subnet
  • ALB pointing to the private subnet, with an IP whitelist
  • Route 53 record pointing to the ALB
  • Lambda inside the VPC, within the private subnet

However, when I make requests from my Lambda, I get the following error:

ERROR! HTTPSConnectionPool(host='test-api', port=443): Max retries exceeded with url: /test 
(Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f1434cb9450>: 
Failed to establish a new connection: [Errno -2] Name or service not known'))

I am able to make API requests from other APIs to this 'test-api' domain with no issues (so that traffic is clearly being sent through the NAT gateway), and I confirmed that other IPs are unable to make requests to this domain.

I am also able to make requests to outside URLs (https://google.com was fine) from inside the Lambda.

I looked at the ENI flow logs for this Lambda and I don't see the NAT gateway IPs anywhere.

But I also confirmed that an ENI was created, and the Lambda is (seemingly) connected to the VPC and inside the private subnets. I can see this under 'VPC configuration' in the console.

Is there something I may be missing? It seems as though my Lambda is not sending traffic through my NAT gateway and I'm not sure why.

    # lambda
    resource "aws_lambda_function" "this" {
      function_name = "${var.project_name}-lambda"
      role          = aws_iam_role.this.arn
      package_type  = "Image"
      timeout       = 120
      image_uri     = "${resource.aws_ecr_repository.this.repository_url}:latest"
    
      vpc_config {
        security_group_ids = [aws_security_group.this.id]
        subnet_ids         = data.aws_subnets.private.ids
      }
    
    }
    
    # security group
    resource "aws_security_group" "this" {
      name        = var.project_name
      description = "SG for data loader lambda"
      vpc_id      = one(data.aws_vpcs.main.ids)
      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }
    
    # IAM for lambda
    data "aws_iam_policy_document" "assume_role_lambda" {
      statement {
        effect = "Allow"
        principals {
          type        = "Service"
          identifiers = ["lambda.amazonaws.com"]
        }
        actions = ["sts:AssumeRole"]
      }
    }
    
    resource "aws_iam_role" "this" {
      name                 = "${var.project_name}-lambda"
      assume_role_policy   = data.aws_iam_policy_document.assume_role_lambda.json
    }
    
    resource "aws_iam_policy" "lambda_vpc_access" {
      name        = "${var.project_name}-lambda-vpc-access"
      description = "Allow lambda to create, describe and delete network interfaces so that it can access VPC resources"
      policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            Action = [
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface",
              "ec2:AssignPrivateIpAddresses",
              "ec2:UnassignPrivateIpAddresses"
            ]
            Effect   = "Allow"
            Resource = "*"
          },
          {
            Action = [
              "logs:CreateLogGroup",
              "logs:CreateLogStream",
              "logs:PutLogEvents"
            ],
            Effect   = "Allow",
            Resource = "*"
          }
        ]
      })
    }

I then call my API from inside the Lambda like so:

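    import requests


    def handler(event, context):
        try:
            # An explicit timeout keeps a stuck request from running
            # until the Lambda timeout (120 s) is reached.
            requests.get("https://my-api.com/test", timeout=10)
        except Exception as e:
            print("ERROR!", e)
            # Bare raise re-raises with the original traceback intact.
            raise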

3 Answers

I suggest you first check the URL for typos. If it's certain to be correct, then check in the DHCP option set for your VPC which DNS resolvers it's telling clients to use. Is it set to AmazonProvidedDNS or something else?

If you run dig or nslookup in the private subnets of your VPC, do they return IPv4 addresses (A records), IPv6 addresses (AAAA records), or both for the name? If the API is only available over IPv6 and your VPC is IPv4-only, the DNS resolver in the Lambda runtime likely wouldn't return any addresses, because no IPv4 addresses matching the client's network interface would be available. Connections would work fine from IPv6-enabled sources, however.
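Since dig and nslookup may not be available inside the Lambda runtime, a rough equivalent of that check can be done from Python with the standard library. The following is only a sketch: the function name `resolve_by_family` and the injectable `resolver` parameter are illustrative choices of mine, not anything from the original setup.

```python
import socket


def resolve_by_family(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the distinct IPv4 (A) and IPv6 (AAAA) addresses for a hostname."""
    results = {"A": [], "AAAA": []}
    families = (("A", socket.AF_INET), ("AAAA", socket.AF_INET6))
    for record_type, family in families:
        try:
            infos = resolver(hostname, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No address of this family, or the name does not resolve at all.
            continue
        for info in infos:
            address = info[4][0]  # sockaddr tuple starts with the IP address
            if address not in results[record_type]:
                results[record_type].append(address)
    return results
```

Calling `print(resolve_by_family("my-api.com"))` at the top of the handler would show in the function's logs which address families, if any, the name resolves to from inside the VPC. Two empty lists would point to a resolution problem rather than a routing problem.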

If there's nothing obviously wrong, you could consider enabling Route 53 Resolver logging for your VPC. It would give you detailed records of exactly which DNS names are being queried and which responses are returned: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-query-logs.html

EXPERT
Leo K
answered 2 months ago
EXPERT
reviewed 2 months ago
  • Thank you again! So I checked what you mentioned:

    1. API URL is definitely correct - copy pasted into postman and got a response (I'm on an approved IP)
    2. DHCP option is set to AmazonProvidedDNS
    3. ALB is IPv4, and the record I created in Route 53 is an A record.

    I went ahead and enabled Route 53 query logging and confirmed my suspicions. I see that the ALB is getting requests from the ENI IP created by my Lambda, NOT the NAT gateway IP, which explains why it doesn't resolve. Thank you for this tip, I had no idea this existed!

    Now my problem is how to get my Lambda to actually go through the NAT gateway... I double-checked the VPC config on my Lambda and I see it is connected to the VPC within the private subnets. The fact that it has internet access also suggests it should be going through the NAT gateway.

    EDIT: I saw the query to CloudWatch and thought this was for the API; that was a misunderstanding on my part.

  • After testing more, this is what I see:

        {
          "version": "1.100000",
          "account_id": "xx",
          "region": "ap-northeast-1",
          "vpc_id": "xx",
          "query_timestamp": "2024-08-08T00:19:26Z",
          "query_name": "test-api",
          "query_type": "A",
          "query_class": "IN",
          "rcode": "NXDOMAIN",
          "answers": [],
          "srcaddr": "LAMBDA_ENI_IP",
          "srcport": "60230",
          "transport": "UDP",
          "srcids": { "instance": "xx" }
        }

    So it is sending an A query, and it is the correct URL, but from the wrong IP (the Lambda ENI IP).

Based on the error message, it sounds like the hostname my-api.com (whatever the actual name is) in the URL https://my-api.com/test isn't resolving to an IP address. That's why the VPC flow logs also wouldn't show any attempts made to connect to the destination.
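To confirm that the failure happens at the DNS stage, before any connection is attempted, the lookup can be separated from the HTTP request. This is a minimal sketch using only the standard library; `check_dns` and its `resolver` parameter are hypothetical names introduced here for illustration.

```python
import socket
from urllib.parse import urlsplit


def check_dns(url, resolver=socket.getaddrinfo):
    """Report whether the hostname in a URL resolves, without connecting to it."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = resolver(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Resolution failed, so no packet was ever sent to the destination.
        return {"host": host, "resolved": False, "error": str(exc), "addresses": []}
    addresses = sorted({info[4][0] for info in infos})
    return {"host": host, "resolved": True, "error": None, "addresses": addresses}
```

If `check_dns("https://my-api.com/test")` reports `resolved: False` inside the Lambda, the request never left the function, which would be consistent with the flow logs showing no traffic to the destination.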

EXPERT
Leo K
answered 2 months ago
  • Thank you for the response! Do you know why this would be the case? I am able to make requests to this URL from my approved IPs (with a curl command, and also from inside my other API, which is in ECS in the private subnet). It seems like the Lambda is somehow behaving differently from my APIs even though they are all inside my private subnet, so all requests should be going through my NAT gateway and have the correct IP.

    Also, as I mentioned before, I am able to send requests to other URLs with no issue (Google, for example).

Does the URL you're connecting to from inside the VPC actually point to the internet-facing ALB located in the same VPC? If so, I didn't catch that before. The Route 53 Resolver log shows an NXDOMAIN response, which means the name didn't resolve. Did you query the full domain name, such as test-api.mydomain.com, or just the hostname part, such as "test-api"?
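As a rough illustration of how to read such a log record, the sketch below pulls out the fields that matter for this question. The field names (`query_name`, `rcode`, `answers`, `srcaddr`) are the ones visible in the record posted above; the function name and the returned keys are my own. Note that `srcaddr` is the address of the client that sent the DNS query, so seeing the Lambda's ENI IP there is expected and says nothing about which IP the later HTTPS request would use.

```python
import json


def summarize_query_log(record_json):
    """Extract the fields of a Resolver query log record relevant to a failed lookup."""
    record = json.loads(record_json)
    # Logged names may carry a trailing dot; drop it before inspecting labels.
    name = record["query_name"].rstrip(".")
    return {
        "name": name,
        "rcode": record["rcode"],
        # A lookup only succeeded if it returned NOERROR with at least one answer.
        "resolved": record["rcode"] == "NOERROR" and bool(record.get("answers")),
        # A name without any dot is a bare hostname, not a full domain name.
        "single_label": "." not in name,
        "source": record.get("srcaddr"),
    }
```

For the record shown above, the summary would flag the queried name as a single label with an NXDOMAIN result, which is the situation the question about "test-api" versus "test-api.mydomain.com" is getting at.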

In any case, if the Lambda is accessing an internet-facing ALB in the same VPC, it's a complex setup. The public DNS name would point to the public IPs of the ALB, which don't actually exist inside the VPC. If the ALB doesn't need to be reachable from outside the VPC, the simplest solution would be to make the ALB internal-only instead of internet-facing.

If the ALB needs to be accessible both from the public internet and from inside the VPC, one workaround would be to leave the current ALB as internet-facing and to create a second, internal ALB in the same VPC. Create a Route 53 private hosted zone (PHZ) for the hostname (not the domain name) of the API, such as "api.mydomain.com" (and not mydomain.com), and attach the PHZ to the VPC. Then create an alias A record at the zone apex (empty name) in the PHZ that points to the internal-only ALB. https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-private.html

This way, when queried from the outside world, "api.mydomain.com" would point to the public IPs of the original, internet-facing ALB, but when queried from inside the VPC by the Lambda or other internal workloads, it would resolve to the private IPs of the new, internal ALB based on the PHZ attached to the VPC.
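The private hosted zone and its apex alias record could be created in several ways (console, Terraform, CLI). As one possible sketch, the helpers below build the request parameters for the boto3 Route 53 calls `create_hosted_zone` and `change_resource_record_sets`. The function names are hypothetical, and every value passed in (zone name, VPC ID, region, the internal ALB's DNS name and its canonical hosted zone ID) is a placeholder you would need to supply from your own environment.

```python
import uuid


def private_zone_request(zone_name, vpc_id, vpc_region):
    """Build keyword arguments for route53.create_hosted_zone (private zone)."""
    return {
        "Name": zone_name,
        # Supplying a VPC makes the zone private and attaches it to that VPC.
        "VPC": {"VPCRegion": vpc_region, "VPCId": vpc_id},
        # Any unique string; it makes retried requests idempotent.
        "CallerReference": str(uuid.uuid4()),
        "HostedZoneConfig": {"Comment": "Internal view of the API", "PrivateZone": True},
    }


def apex_alias_change(zone_name, alb_dns_name, alb_hosted_zone_id):
    """Build the ChangeBatch for an alias A record at the zone apex."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": zone_name,  # record name equals zone name: the apex
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": alb_hosted_zone_id,  # the ALB's own zone ID
                    "DNSName": alb_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }


def apply(zone_name, vpc_id, vpc_region, alb_dns_name, alb_hosted_zone_id):
    """Create the zone and the record. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without it

    route53 = boto3.client("route53")
    zone = route53.create_hosted_zone(
        **private_zone_request(zone_name, vpc_id, vpc_region)
    )
    zone_id = zone["HostedZone"]["Id"]
    route53.change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch=apex_alias_change(zone_name, alb_dns_name, alb_hosted_zone_id),
    )
    return zone_id
```

Keeping the request builders separate from `apply` makes it possible to inspect exactly what would be sent before touching the account. The zone is scoped to "api.mydomain.com" alone so that other names under mydomain.com keep resolving through public DNS.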

Many other workarounds exist, such as creating a separate VPC for your Lambda function and otherwise using your original design, or placing an internal NLB in front of the ALB and using the PHZ to point to the NLB, but I think creating the second, internal-only ALB would be the least convoluted option.

EXPERT
Leo K
answered 2 months ago
