Deploying an accessible RabbitMQ container to Fargate


I have a Python project that requires a Celery Beat service and I want to use RabbitMQ as the broker. I want to put this all in an ECS Cluster and use Fargate, and hopefully minimize my security risk. I'm using CDK and have the following configuration:

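# Imports assumed for CDK v2; the original post omitted them.
import aws_cdk as cdk
from aws_cdk import (
    aws_ec2 as ec2,
    aws_ecr as ecr,
    aws_ecs as ecs,
    aws_ecs_patterns as ecs_patterns,
    aws_elasticloadbalancingv2 as elbv2,
    aws_iam as iam,
)
from constructs import Construct


class Cluster(cdk.NestedStack):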
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        vpc = ec2.Vpc(
            self,
            VPC_NAME,
            max_azs=2,
            enable_dns_hostnames=True,
            enable_dns_support=True,
        )
        sg = ec2.SecurityGroup(
            self,
            SECURITY_GROUP_NAME,
            vpc=vpc,
        )
        sg.connections.allow_from_any_ipv4(
            ec2.Port.tcp(5672),
        )
        cluster = ecs.Cluster(
            self,
            CLUSTER_NAME,
            vpc=vpc,
            cluster_name=CLUSTER_NAME,
            container_insights=True,
        )

        role = iam.Role(
            self,
            ROLE_NAME,
            assumed_by=iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
            managed_policies=[...],
        )

        beat_repository = ecr.Repository(
            self,
            BEAT_IMAGE_NAME,
            repository_name=BEAT_IMAGE_NAME,
        )
        beat_task_definition = ecs.FargateTaskDefinition(
            self,
            BEAT_TASK_DEFINITION_NAME,
            cpu=1024,
            memory_limit_mib=2048,
            family=BEAT_TASK_DEFINITION_NAME,
            task_role=role,
        )
        beat_task_definition.add_container(
            BEAT_CONTAINER_NAME,
            image=ecs.ContainerImage.from_ecr_repository(beat_repository),
            command=...,
            logging=ecs.LogDrivers.aws_logs(stream_prefix="ecs"),
            port_mappings=[ecs.PortMapping(container_port=8000)],
            # health_check=TODO
        )
        beat_service = ecs.FargateService(
            self,
            BEAT_SERVICE_NAME,
            cluster=cluster,
            task_definition=beat_task_definition,
            service_name=BEAT_SERVICE_NAME,
            security_groups=[sg],
        )

        rabbit_repository = ecr.Repository(
            self,
            RABBIT_IMAGE_NAME,
            repository_name=RABBIT_IMAGE_NAME,
        )
        rabbit_task_definition = ecs.FargateTaskDefinition(
            self,
            RABBIT_TASK_DEFINITION_NAME,
            cpu=1024,
            memory_limit_mib=2048,
            family=RABBIT_TASK_DEFINITION_NAME,
            task_role=role,
        )
        rabbit_task_definition.add_container(
            RABBIT_CONTAINER_NAME,
            image=ecs.ContainerImage.from_ecr_repository(rabbit_repository),
            logging=ecs.LogDrivers.aws_logs(stream_prefix="ecs"),
            port_mappings=[ecs.PortMapping(container_port=5672)],
            health_check=ecs.HealthCheck(
                command=["CMD-SHELL", "rabbitmq-diagnostics -q ping || exit 1"],
            ),
        )

        rabbit_service = ecs_patterns.NetworkLoadBalancedFargateService(
            self,
            RABBIT_SERVICE_NAME,
            cluster=cluster,
            task_definition=rabbit_task_definition,
            service_name=RABBIT_SERVICE_NAME,
            listener_port=5672,
            public_load_balancer=True,
            assign_public_ip=True,
        )
        rabbit_service.target_group.configure_health_check(
            protocol=elbv2.Protocol.TCP,
            port="5672",
        )

The Dockerfile for the RabbitMQ container exposes port 5672 and runs rabbitmq-server. I have a few questions:

  • Is there a way for me to access the containers in the rabbit_service from containers in the beat_service WITHOUT exposing my RabbitMQ to the internet? Can I shrink the security group ingress rule?
  • The RabbitMQ service deploys fine and the containers come up in a healthy state. However, the containers then get killed, and in the console I see:
    Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:...
    
    Why would the health checks for the target group fail? As far as I can tell, they are configured to ping TCP:5672.
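
For reference, a TCP health check only verifies that a connection to the port can be opened. The following is a minimal sketch of an equivalent probe, assuming it is run from a host or container inside the same VPC; the function name `tcp_port_open` is hypothetical and not part of the project above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration: it mimics what a TCP-level
    health check does, namely attempt to open a connection and report
    whether that attempt succeeded.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (which covers refusals, timeouts, and DNS failures) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `tcp_port_open(task_private_ip, 5672)` with the private IP of a running RabbitMQ task (a placeholder here) shows whether the port is reachable over the network. If the probe fails while the container-level `rabbitmq-diagnostics` check passes, that would suggest a network-level restriction, such as a security group rule, rather than RabbitMQ itself being down. This is something to verify, not a diagnosis.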

Please let me know if there are optimizations I can make. Thanks!

  • I'll leave this as a comment because it is not a full answer; it is more of a direction to follow in your research.

    For the first question, there are two alternatives,

No Answers