Service Discovery over regional VPCs with Transit Gateway peering for ECS services using Service Connect?


For setup, I've deployed two ECS services with Service Connect enabled, one in us-east-1 and the other in us-west-2. Each runs in its own regional VPC, and the VPCs are connected to each other through a Transit Gateway peering attachment.

According to the docs, I should be able to get the services to communicate across VPCs if they're "configured to use the same namespace", and the "two services can resolve every endpoint in the namespace without any VPC DNS configuration". Despite having done this, I've not been able to get the services to communicate, either by calling Service Discovery DiscoverInstances from the other region or by directly calling the service endpoints provided by Service Connect.

My VPC network ACLs already allow traffic on the containerPorts on both sides, the subnet route tables in each region point to the other VPC's CIDR block, and the Transit Gateway route tables are configured. Am I missing something that's preventing them from communicating?

1 Answer

Wow, that's quite complex. Since you've already checked several things that might go wrong, I would suggest these additional checks:

  • Service Discovery Configuration: Ensure that the service discovery configurations for both ECS services are correctly set up. This includes the namespace being the same and correctly configured in both regions. Double-check that the service discovery service name is unique within the namespace and correctly referenced by your ECS services.
  • Security Groups: Review the security group settings for your ECS service tasks. (Transit Gateway attachments do not have security groups of their own, so the task security groups are the ones to check.) Ensure that they allow inbound and outbound traffic for the necessary ports and protocols used by your services, including traffic from the other VPC's CIDR block. Sometimes, the issue might be due to restrictive security group rules.
  • Transit Gateway Peering Configuration: Verify the Transit Gateway peering attachment itself and the route tables associated with the Transit Gateway in both regions. Ensure that the route tables have routes that direct traffic destined for the other VPC's CIDR block to the peering attachment. Peering attachments do not propagate routes, so these need to be static routes on each side.
  • DNS Resolution: Although Service Connect is supposed to handle DNS resolution without additional VPC DNS configuration, it's worth verifying that the ECS tasks are able to resolve the DNS names of the services in the other region. You might need to test DNS resolution from within your ECS tasks to ensure they can resolve the service endpoints across regions.
  • IAM Permissions: Check that the IAM roles associated with your ECS tasks and services have the necessary permissions for Service Connect, Service Discovery, and any other AWS services they interact with. Insufficient permissions can sometimes lead to communication failures.
  • Service Connect Configuration: Revisit your Service Connect configuration to ensure that it's correctly set up for cross-region communication. This includes the setup on both the consumer and provider sides, ensuring that the Service Connect endpoints are correctly configured and reachable.
  • Monitoring and Logging: Utilize CloudWatch Logs and VPC Flow Logs to monitor and log the traffic between the services. VPC Flow Logs can help identify if the traffic is reaching the intended destination and where it might be getting dropped.
  • Test Connectivity: If possible, test the connectivity between the two regions using simpler resources (e.g., EC2 instances) to ensure that basic inter-region communication over the Transit Gateway peering connection is working as expected. This can help isolate whether the issue is with the ECS/Service Connect setup or the underlying network infrastructure.
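
For the DNS resolution and connectivity checks above, a small probe run from inside a task (or from a test EC2 instance) can help separate name resolution problems from network path problems. The following is only a minimal sketch using the Python standard library; the function name `probe` and the shape of the returned dictionary are made up for illustration and are not part of any AWS tooling.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> dict:
    """Resolve `host`, then try a TCP connection to each resolved address.

    Returns a dict describing what resolved and what accepted a connection,
    so a DNS failure can be told apart from a blocked network path.
    """
    result = {
        "host": host,
        "port": port,
        "addresses": [],   # every address the name resolved to
        "reachable": [],   # subset that accepted a TCP connection
        "error": None,
    }

    # Step 1: name resolution. A failure here points at DNS / namespace setup.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    # De-duplicate addresses while preserving order.
    addresses = list(dict.fromkeys(info[4][0] for info in infos))
    result["addresses"] = addresses

    # Step 2: TCP connect. A failure here points at routing, NACLs,
    # security groups, or the service not listening on that port.
    for address in addresses:
        try:
            with socket.create_connection((address, port), timeout=timeout):
                result["reachable"].append(address)
        except OSError:
            # Timed out or refused; leave this address out of "reachable".
            pass

    if not result["reachable"]:
        result["error"] = "resolved, but no address accepted a TCP connection"
    return result
```

Calling something like `probe("backend.my-namespace", 8080)` (a hypothetical endpoint name and port; substitute your own) from a task in each region shows which stage fails. If the name does not resolve, look at the namespace and Service Connect configuration. If it resolves but times out, look at the Transit Gateway routes, network ACLs, and security groups, and cross-check with VPC Flow Logs.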
answered 4 months ago
