Inquiry Regarding Spark Application Performance Discrepancy Across AWS Accounts in the Same Region


Overview: The Spark application in question is deployed within AWS Account A, specifically in the us-west-2 region. This application reads data from and writes data to Amazon S3 buckets hosted in Account B, also located in the us-west-2 region.

Issue: When the Spark application reads from and writes to S3 buckets in its own AWS account (Account A), the job completes in approximately 1 hour and 10 minutes. When the same application accesses S3 buckets in a different AWS account (Account B) within the same region (us-west-2), the job execution time increases significantly, to 4 hours and 40 minutes.

Expectation: My understanding, based on AWS documentation and best practices, was that data transfer within the same AWS region, regardless of AWS account, would not incur additional latency or cost.

Request for Clarification: I am seeking clarification on why such a significant performance discrepancy occurs when the Spark application accesses S3 buckets in a different AWS account within the same region. Specifically, I would like to understand if there are any inherent limitations or factors that contribute to increased latency or reduced performance in this scenario.

I appreciate your assistance in addressing this matter and providing insights into potential reasons for the observed performance difference. Any recommendations or suggestions for optimizing performance in this multi-account scenario would be highly valuable.

Vikrant
asked 4 months ago · 231 views
1 Answer

Hello,

Finding the bottleneck requires a deeper analysis, but you can start with the Spark UI. Compare the two runs to confirm that the input and output datasets are the same, and look for any jobs or stages that account for most of the difference in execution time.
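As a rough sketch of that comparison, the snippet below ranks stages by how much extra executor time the slow run spent. It assumes you have exported per-stage data for both runs as lists of dictionaries, for example from the Spark monitoring REST API (`/api/v1/applications/[app-id]/stages`), and that the same job produces matching stage IDs in both runs, which may not hold if the job plans differ. The function name `rank_stage_slowdowns` is hypothetical; the field names `stageId`, `executorRunTime`, `inputBytes`, and `outputBytes` follow the REST API's stage output.

```python
def rank_stage_slowdowns(fast_stages, slow_stages):
    """Pair stages by stageId and sort by extra executor run time (ms).

    Assumption: both runs execute the same plan, so stage IDs line up.
    Stages present in only one run are skipped.
    """
    fast_by_id = {s["stageId"]: s for s in fast_stages}
    rows = []
    for slow in slow_stages:
        fast = fast_by_id.get(slow["stageId"])
        if fast is None:
            continue  # no counterpart in the fast run
        rows.append({
            "stageId": slow["stageId"],
            "extra_run_ms": slow["executorRunTime"] - fast["executorRunTime"],
            # A mismatch here means the two runs did not process the same data.
            "input_bytes_match": slow["inputBytes"] == fast["inputBytes"],
            "output_bytes_match": slow["outputBytes"] == fast["outputBytes"],
        })
    # Largest slowdown first.
    return sorted(rows, key=lambda r: r["extra_run_ms"], reverse=True)
```

If one or two stages dominate the list, those are the ones to inspect first in the Spark UI (task durations, S3 read/write time, retries).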

You can also run both jobs against a subset of the same data and compare the execution times. Finally, check that both runs use the same resources (executors, cores, memory) and that their Spark configurations are identical.
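One way to check the configurations is to dump the effective settings from each run and diff them. The helper below is a minimal sketch; `diff_spark_conf` is a hypothetical name, and it assumes each configuration is available as key/value pairs, such as the list returned by `spark.sparkContext.getConf().getAll()` in PySpark.

```python
def diff_spark_conf(conf_a, conf_b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs.

    Each argument is an iterable of (key, value) pairs or a dict.
    A setting missing from one side is reported as None on that side.
    """
    a, b = dict(conf_a), dict(conf_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

For example, calling it with the settings captured from the Account A run and the Account B run returns only the keys whose values differ, so an empty result means the two runs were configured identically.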

AWS
Support Engineer
answered 3 months ago
