ALB 502 Bad Gateway Error and EC2 refused to connect


Hi AWS, I am writing the infrastructure code for a web application using Terraform. The requirements are:

  1. It must include a VPC which enables future growth / scale.
  2. It must include both a public and private subnet – where the private subnet is used for compute and the public is used for the load balancers.
  3. Assuming that the end-users only contact the load balancers and the underlying instances are accessed for management purposes, design a security group scheme which supports the minimal set of ports required for communication.
  4. The AWS-generated load balancer hostname will be used for requests to the public-facing web application.
  5. An autoscaling group should be created which utilizes the latest AWS AMI.
  6. The instances in the ASG must contain both a root volume to store the application / services and a secondary volume meant to store any log data bound from /var/log. They must include a web server of your choice.
  7. Create a self-signed certificate for test.example.com and use this hostname with the load balancer; this DNS name should resolve internally within the VPC network via a Route 53 private hosted zone.

Also, the code should not be tightly coupled to your AWS account; it should be designed so that it can be deployed to any arbitrary AWS account.

Here is the code:

vpc.tf

# Resources to be created:
// Create a VPC
// public route table and routes
// private route table and routes
// public and private subnets
// internet gateway
// NAT gateway
// EIP for NAT gateway

resource "aws_vpc" "main" {
  cidr_block = var.cidr_block
  tags = {
    Name = "terraform_aws_vpc"
  }
  assign_generated_ipv6_cidr_block = true
  instance_tenancy                 = "default"
  enable_dns_hostnames             = true
  enable_dns_support               = true
}

# Create a Public Subnet
resource "aws_subnet" "public_subnet" {
  count                           = var.public_subnet_count
  vpc_id                          = aws_vpc.main.id
  cidr_block                      = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index)
  ipv6_cidr_block                 = cidrsubnet(aws_vpc.main.ipv6_cidr_block, 8, count.index)
  map_public_ip_on_launch         = true
  assign_ipv6_address_on_creation = true
  tags = {
    Name = "${var.default_tags.project_name}-public-${data.aws_availability_zones.available.names[count.index]}"
  }
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# Create a Public Route table
resource "aws_route_table" "public_rt" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-public-route-table"
  }
}

# Create an Internet Gateway to access the route from internet
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-internet-gateway"
  }
}

// Create a Public route
resource "aws_route" "public_route" {
  route_table_id         = aws_route_table.public_rt.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.igw.id
}

// Associate Route table with Subnet
resource "aws_route_table_association" "public_rt_subnet_associate" {
  count          = var.public_subnet_count
  subnet_id      = element(aws_subnet.public_subnet.*.id, count.index)
  route_table_id = aws_route_table.public_rt.id
}

// Create a Private Subnet
resource "aws_subnet" "private_subnet" {
  count      = var.private_subnet_count
  vpc_id     = aws_vpc.main.id
  cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index + var.public_subnet_count)
  # ipv6_cidr_block = cidrsubnet(aws_vpc.main.ipv6_cidr_block, 8, count.index)
  # map_public_ip_on_launch = true
  # assign_ipv6_address_on_creation = true
  tags = {
    Name = "${var.default_tags.project_name}-private-${data.aws_availability_zones.available.names[count.index]}"
  }
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# Create a Private Route Table
resource "aws_route_table" "private_rt" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-private-route-table"
  }
}

# Create an EIP for NAT Gateway
resource "aws_eip" "nat_gateway" {
  vpc = true
  tags = {
    Name = "${var.default_tags.project_name}-nat-eip"
  }
}

# Create a NAT Gateway
resource "aws_nat_gateway" "nat_gw" {
  allocation_id = aws_eip.nat_gateway.id
  subnet_id     = aws_subnet.public_subnet.0.id

  tags = {
    Name = "${var.default_tags.project_name}-nat-gw"
  }

  # To ensure proper ordering, it is recommended to add an explicit dependency
  # on the Internet Gateway for the VPC.
  depends_on = [aws_eip.nat_gateway, aws_internet_gateway.igw]
}

// Create a Private route
resource "aws_route" "private_internet_access" {
  route_table_id         = aws_route_table.private_rt.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.nat_gw.id
}

// Associate Private Route table with Subnet
resource "aws_route_table_association" "private_rt_subnet_associate" {
  count          = var.private_subnet_count
  subnet_id      = element(aws_subnet.private_subnet.*.id, count.index)
  route_table_id = aws_route_table.private_rt.id
}

asg.tf

# Launch Template
resource "aws_launch_template" "web_app_lt" {
  name = "web-app-launch-template"

  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size = 30
    }
  }

  # Add a secondary volume for log data
  block_device_mappings {
    device_name = "/dev/xvdb"
    ebs {
      volume_size = 50
    }
  }

  image_id = data.aws_ami.amazon_linux.id

  instance_type = var.instance_type

  network_interfaces {
    associate_public_ip_address = true
    security_groups = [aws_security_group.client_alb.id]
    subnet_id = aws_subnet.public_subnet.0.id
  }

#   vpc_security_group_ids = [aws_security_group.client_alb.id]

  tags = {
      Name = "${var.default_tags.project_name}-lt"
    }

}

resource "aws_autoscaling_group" "web_app_asg" {
  availability_zones = ["us-east-1a"]
  desired_capacity   = 1
  max_size           = 2
  min_size           = 1

  launch_template {
    id      = aws_launch_template.web_app_lt.id
    version = aws_launch_template.web_app_lt.latest_version
  }

  tag {
    key                 = "Name"
    value               = "${var.default_tags.project_name}-asg"
    propagate_at_launch = true
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
    triggers = ["tag"]
  }
}

security-groups.tf

resource "aws_security_group" "client_alb" {
  name        = "${var.default_tags.project_name}-alb"
  description = "security group for web application load balancer"
  vpc_id      = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-sg"
  }
}

resource "aws_security_group_rule" "client_alb_allow_80" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 80
  to_port           = 80
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow HTTP traffic."
}

resource "aws_security_group_rule" "client_alb_allow_22" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 22
  to_port           = 22
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow SSH Login."
}

resource "aws_security_group_rule" "client_alb_allow_443" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow HTTP traffic."
}

# resource "aws_security_group_rule" "client_alb_allow_outbound" {
#   security_group_id = aws_security_group.client_alb.id
#   type              = "egress"
#   protocol          = "-1"
#   from_port         = 0
#   to_port           = 0
#   cidr_blocks       = ["0.0.0.0/0"]
#   ipv6_cidr_blocks  = ["::/0"]
#   description       = "Allow any outbound traffic."
# }

ec2-instance.tf

resource "aws_instance" "web_app" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = var.instance_type
  subnet_id              = aws_subnet.public_subnet.0.id
  vpc_security_group_ids = [aws_security_group.client_alb.id]
  user_data              = <<EOF
#!/bin/bash
sudo yum update -y
sudo yum install -y httpd
sudo systemctl start httpd
sudo systemctl enable httpd
sudo echo '<center><h1>Web App!!!</h1></center>' > /var/www/html/index.html
EOF
  tags = {
    Name = "${var.default_tags.project_name}-ec2-instance"
  }
  key_name                    = var.generated_key_name
  associate_public_ip_address = true
  monitoring                  = true
}

// Create key-pair for EC2 instance

resource "tls_private_key" "web_app_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.generated_key_name
  public_key = tls_private_key.web_app_key.public_key_openssh
  provisioner "local-exec" {
    command = <<-EOT
      echo '${tls_private_key.web_app_key.private_key_pem}' > web-app-keypair.pem
      chmod 400 web-app-keypair.pem
    EOT
  }
}

tls-cert.tf

resource "tls_self_signed_cert" "self_signed" {
#   key_algorithm   = tls_private_key.web_app_key.algorithm
  private_key_pem = tls_private_key.web_app_key.private_key_pem
  subject {
    common_name  = "test.example.com"
  }
  validity_period_hours = 8760

   allowed_uses = [
    "key_encipherment",
    "digital_signature",
    "server_auth",
  ]
  dns_names = [ "test.example.com" ]
}

route53.tf

# Route53 PHZ
resource "aws_route53_zone" "private" {
  name = "example.com"

  vpc {
    vpc_id = aws_vpc.main.id
  }
}

alb.tf

# User Facing Client Application Load Balancer
resource "aws_lb" "web_app_lb" {
  name               = "${var.default_tags.project_name}-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.client_alb.id]
  subnets            = [for subnet in aws_subnet.public_subnet : subnet.id]

  enable_deletion_protection = false
  tags = {
    "Name" = "${var.default_tags.project_name}-client-alb"
  }
}

// ALB Target Groups
resource "aws_lb_target_group" "alb_tg" {
  name        = "${var.default_tags.project_name}-tg"
  port        = 80
  protocol    = "HTTP"
  target_type = "instance"
  vpc_id      = aws_vpc.main.id
  health_check {
    healthy_threshold   = var.health_check["healthy_threshold"]
    interval            = var.health_check["interval"]
    unhealthy_threshold = var.health_check["unhealthy_threshold"]
    timeout             = var.health_check["timeout"]
    path                = var.health_check["path"]
  }
}


// Target Group Attachment
resource "aws_lb_target_group_attachment" "tg_attachment" {
  target_group_arn = aws_lb_target_group.alb_tg.arn
  target_id        = aws_instance.web_app.id
  port             = 80
}


// ALB Listener Rules
resource "aws_lb_listener" "http_rule" {
  load_balancer_arn = aws_lb.web_app_lb.arn
  port              = "80"
  protocol          = "HTTP"
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb_tg.arn
  }
}

# resource "aws_lb_listener" "https_rule" {
#   load_balancer_arn = aws_lb.web_app_lb.arn
#   port              = "443"
#   protocol          = "HTTPS"
#   ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
#   certificate_arn   = aws_acm_certificate.web_app_acm_cert.arn

#   default_action {
#     type             = "forward"
#     target_group_arn = aws_lb_target_group.alb_tg.arn
#   }
# }

data.tf

# Data Resource
data "aws_availability_zones" "available" {
  state = "available"
}

# AWS Linux2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true

  owners = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.3.20240131.0-kernel-6.1-x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

outputs.tf:

output "load_balancer_dns" {
  value = aws_lb.web_app_lb.dns_name
}

With that said, I have a few questions:

  1. Have I missed anything from the requirements?
  2. I am seeing two EC2 instances, one created by ec2-instance.tf and the other by asg.tf. Am I doing something wrong?
  3. When I uncommented this code in alb.tf:
# resource "aws_lb_listener" "https_rule" {
#   load_balancer_arn = aws_lb.web_app_lb.arn
#   port              = "443"
#   protocol          = "HTTPS"
#   ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
#   certificate_arn   = aws_acm_certificate.web_app_acm_cert.arn

#   default_action {
#     type             = "forward"
#     target_group_arn = aws_lb_target_group.alb_tg.arn
#   }
# }

I got this error: error creating ELBv2 Listener UnsupportedCertificate: The certificate must have a fully-qualified domain name, a supported signature, and a supported key size.

  4. When I try to hit the AWS load balancer DNS endpoint, I get the error shown in the screenshot below. I checked the target group health status: it shows unhealthy, and the reason is Request Timeout. I tried updating the NACLs and the security group inbound/outbound rules, but no luck.

[Screenshot: ALB 502 error]

  5. When I try to hit the EC2 public DNS, I get a "refused to connect" error. This happens with the instance I created using Terraform (screenshot below). Why is this happening?

[Screenshot: EC2 refused to connect]

  6. Last but not least, why is the key pair not getting created properly for EC2?

Please help me out.

2 Answers

A few things:

  1. It must include both a public and private subnet – where the private subnet is used for compute and the public is used for the load balancers.

The EC2 instance is provisioned in a public subnet.

resource "aws_instance" "web_app" {
.
.
  subnet_id              = aws_subnet.public_subnet.0.id

Is it intentional that the load balancer and the EC2 have the same security group associated with them? Usually the LB would have a much looser rule (e.g. allow traffic from all addresses on the internet, and pass it to the EC2) and the EC2 sitting behind it would have a much tighter rule (only accept traffic from the LB).

resource "aws_lb" "web_app_lb" {
.
.
  security_groups    = [aws_security_group.client_alb.id]

resource "aws_instance" "web_app" {
.
.
  vpc_security_group_ids = [aws_security_group.client_alb.id]
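
As a rough sketch of that split, the load balancer's group could accept HTTP from anywhere while the instance's group accepts port 80 only from the load balancer's group. The resource names alb and web and the vpc_id variable are hypothetical, not taken from the question. Explicit egress rules are included because Terraform removes the default allow-all egress rule from security groups it creates.

```hcl
# Sketch only: resource names and the vpc_id variable are assumptions for illustration.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC that holds the load balancer and the instances"
}

# Security group for the load balancer.
resource "aws_security_group" "alb" {
  name   = "example-alb"
  vpc_id = var.vpc_id
}

# Security group for the instances behind the load balancer.
resource "aws_security_group" "web" {
  name   = "example-web"
  vpc_id = var.vpc_id
}

# Loose rule: anyone on the internet may reach the load balancer on port 80.
resource "aws_security_group_rule" "alb_http_in" {
  security_group_id = aws_security_group.alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 80
  to_port           = 80
  cidr_blocks       = ["0.0.0.0/0"]
  description       = "HTTP from the internet to the load balancer"
}

# The load balancer forwards requests and health checks to the instances on port 80.
resource "aws_security_group_rule" "alb_to_web_out" {
  security_group_id        = aws_security_group.alb.id
  type                     = "egress"
  protocol                 = "tcp"
  from_port                = 80
  to_port                  = 80
  source_security_group_id = aws_security_group.web.id
  description              = "HTTP from the load balancer to the instances"
}

# Tight rule: the instances accept port 80 only from the load balancer's group.
resource "aws_security_group_rule" "web_from_alb_in" {
  security_group_id        = aws_security_group.web.id
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = 80
  to_port                  = 80
  source_security_group_id = aws_security_group.alb.id
  description              = "HTTP from the load balancer only"
}

# Outbound access so the instances can reach package repositories.
resource "aws_security_group_rule" "web_all_out" {
  security_group_id = aws_security_group.web.id
  type              = "egress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["0.0.0.0/0"]
  description       = "Allow any outbound traffic"
}
```

The rules are written as separate aws_security_group_rule resources rather than inline blocks so that the two groups can reference each other without a dependency cycle. The load balancer would then use aws_security_group.alb.id in security_groups, and the instance aws_security_group.web.id in vpc_security_group_ids.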

The client_alb security group has no outbound rules. This means that (i) the LB will not be able to pass any traffic onto any instances that sit behind it; and (ii) the EC2 has no outbound access to the internet (whether that's through an Internet Gateway or NAT Gateway, depending on whether the EC2 needs to live in a public or private subnet - see earlier comment).

This is also significant because the EC2 instance's user data script tries to install Apache, but as there is no outbound rule in the client_alb security group, the yum commands have no way of contacting the repo, and will just sit there until they eventually time out.

  vpc_security_group_ids = [aws_security_group.client_alb.id]
  user_data              = <<EOF
#!/bin/bash
sudo yum update -y
sudo yum install -y httpd

The update to /var/www/html/index.html should also be made before starting Apache, not while it's running.

sudo systemctl start httpd
sudo systemctl enable httpd
sudo echo '<center><h1>Web App!!!</h1></center>' > /var/www/html/index.html
EOF
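
One way to order the script, as a sketch: write the page first, then enable and start Apache in a single step. The local name web_user_data is hypothetical. User data runs as root, so sudo is left out here; with sudo echo ... > file the redirection is performed by the calling shell rather than by sudo in any case.

```hcl
# Sketch only: local.web_user_data is a hypothetical name.
locals {
  # The <<- form strips the common leading indentation, so the shebang
  # ends up at the start of the first line of the rendered script.
  web_user_data = <<-EOF
    #!/bin/bash
    # Install Apache (needs a working outbound path to the package repo).
    yum install -y httpd
    # Write the page before Apache starts serving.
    echo '<center><h1>Web App!!!</h1></center>' > /var/www/html/index.html
    # Enable Apache at boot and start it now.
    systemctl enable --now httpd
  EOF
}
```

It would be passed to the instance as user_data = local.web_user_data.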
EXPERT
Steve_M
answered 7 months ago

Hi Steve, first of all, thanks for answering. I have made the changes for EC2 and ALB as suggested. Here are the code snippets:

ec2-instance.tf:

resource "aws_instance" "web_app" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = var.instance_type
  subnet_id              = aws_subnet.private_subnet.0.id
  vpc_security_group_ids = [aws_security_group.client_alb.id]
  user_data              = <<EOF
#!/bin/bash
sudo systemctl start httpd
sudo systemctl enable httpd
sudo echo '<center><h1>Web App!!!</h1></center>' > /var/www/html/index.html
EOF
  tags = {
    Name = "${var.default_tags.project_name}-ec2-instance"
  }
  key_name                    = var.generated_key_name
  associate_public_ip_address = true
  monitoring                  = true
}

// Create key-pair for EC2 instance

resource "tls_private_key" "web_app_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.generated_key_name
  public_key = tls_private_key.web_app_key.public_key_openssh
}

resource "local_sensitive_file" "pem_file" {
  filename             = pathexpand("~/.ssh/${local.ssh_key_name}.pem")
  file_permission      = "600"
  directory_permission = "700"
  content              = tls_private_key.web_app_key.private_key_pem
  provisioner "local-exec" {
    command = <<-EOT
      touch ec2-keypair.pem
      cp "~/.ssh/${local.ssh_key_name}.pem" ./ec2-keypair.pem
      chmod 400 ec2-keypair.pem
    EOT
  }
}

security-groups.tf:

resource "aws_security_group" "client_alb" {
  name        = "${var.default_tags.project_name}-alb"
  description = "security group for web application load balancer"
  vpc_id      = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-sg"
  }
}

resource "aws_security_group_rule" "client_alb_allow_80" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 80
  to_port           = 80
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow HTTP traffic."
}

resource "aws_security_group_rule" "client_alb_allow_22" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 22
  to_port           = 22
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow SSH Login."
}

resource "aws_security_group_rule" "client_alb_allow_443" {
  security_group_id = aws_security_group.client_alb.id
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow HTTP traffic."
}

resource "aws_security_group_rule" "client_alb_allow_outbound" {
  security_group_id = aws_security_group.client_alb.id
  type              = "egress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
  description       = "Allow any outbound traffic."
}


resource "aws_security_group" "ec2_security_group" {
  name        = "${var.default_tags.project_name}-ec2"
  description = "Security group for EC2 instances in the target group"
  vpc_id      = aws_vpc.main.id
  tags = {
    Name = "${var.default_tags.project_name}-ec2-sg"
  }

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.client_alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

But now I am facing this error when accessing the ALB DNS:

[Screenshot: ALB error 502]

[Screenshot: target group error]

Regarding your next point: I also noticed that the EC2 instance and the ALB are in the same security group. It was not intentional; I had only created one SG, which is used by both the ALB and the EC2 instance. So as good practice I should have one SG for each. What do you recommend?

One last thing: the local SSH key is not getting copied to my .pem file by the Terraform code. Any reason why?

answered 7 months ago
  • (1 of 2) There are two ways of approaching this. The first is to have a firm and fixed idea of what the end state needs to be, and then build it by hand in the AWS Console. Know how everything fits together: subnets, route tables, security groups, listeners, target groups, internet & NAT gateways, as well as how the load balancer and EC2 will talk to each other, and to the outside world.

    This might take an hour or two to get fully working, but it will be time well spent, and there are plenty of resources on AWS and the wider internet that explain how to build this far better than I ever could here.

    Once you know every element that is required, and how it all fits together, provision it in Terraform. You could even have separate VPCs, one of them populated by resources in AWS Console and the other populated with Terraform, and you can compare these as you go along.

  • (2 of 2) The other approach, if you want to use Terraform alone, is to start small and not try to build the whole stack at once. Have a VPC, one public subnet with an internet gateway attached, and deploy your EC2 in there. Get the User Data script working properly so that Apache displays your updated index.html. Once that works, provision a new private subnet with its route table pointing to a NAT Gateway and amend your code so that the EC2 is provisioned in the private subnet. Okay, you won't be able to hit it directly from a browser any more, but you can spin up a temporary bastion host (or just use Instance Connect) to check that Apache is properly installed and is listening on the right port(s), for example with sudo netstat -tulpn.

    Once you're happy with that, move onto the target group, then the load balancer and its listeners, and so on.

    Use git to commit your code at each big milestone, and if you make a mistake later you can always terraform destroy and then go back to your previous commit.

    And don't get hung up on things like the SSH key and cert having to be provisioned in Terraform if it's holding you up. Create these normally and just import them into your scripts initially. You can always go back and tweak these settings at the end.
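
For that last point, a sketch of pointing Terraform at a key pair and certificate that were created outside it (for example with ssh-keygen and openssl). The file paths, the key name, and the resource names are all hypothetical.

```hcl
# Sketch only: paths and names below are assumptions for illustration.

# Register an existing SSH public key as an EC2 key pair.
resource "aws_key_pair" "existing" {
  key_name   = "web-app-key"
  public_key = file(pathexpand("~/.ssh/web-app-key.pub"))
}

# Import an existing self-signed certificate and its private key into ACM.
resource "aws_acm_certificate" "imported" {
  private_key      = file("${path.module}/certs/test.example.com.key")
  certificate_body = file("${path.module}/certs/test.example.com.crt")
}
```

An instance could then set key_name = aws_key_pair.existing.key_name, and an HTTPS listener could use aws_acm_certificate.imported.arn as its certificate_arn. The private key contents end up in the Terraform state, so the state should be treated as sensitive.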
