Questions tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

317 results
Good morning. A vulnerability in Resource Manager was recently exploited, and we are worried and want to confirm the impact with you. (https://thehackernews.com/2024/01/cryptominers-target...
2 answers, 0 votes, 400 views, asked a year ago
I am trying to install the happybase package (or, for that matter, any package) in a Zeppelin notebook. How do I run pip install from a Zeppelin cell? Neither %pip nor !pip is recognized.
2 answers, 0 votes, 467 views, asked a year ago
Is there a way to check the integrity of files copied with S3DistCp at the end of the copy, like DistCp checksum?
1 answer, 0 votes, 405 views, asked a year ago
The EMR cluster had 1 primary, 1 core, and 5 task nodes. All 3 node groups were On-Demand (including the task group). I didn't use Spot purchasing for the task group, in order to avoid unexpected termination. But still EMR term...
1 answer, 0 votes, 736 views, asked a year ago
In AWS EMR, I encountered the following error message when running a PySpark job that ran successfully on my local machine: "[System Error] Fail to delete the temp folder". Is there a way to troub...
Accepted Answer (tags: Amazon EMR)
1 answer, 0 votes, 371 views, asked a year ago
When using EMR 7.0.0 in EMR Serverless (I have not tried EKS or EC2), after connecting to the application through an EMR Studio workspace, the PySpark kernel doesn't work in a notebook. It stays in statu...
1 answer, 0 votes, 515 views, asked a year ago
Hi, we have an EMR cluster on which multiple concurrent steps run seamlessly. We are not sure exactly what happened, but the step logs and application logs have not been published to S3 since yesterday. Howev...
Accepted Answer (tags: Amazon EMR)
3 answers, 0 votes, 933 views, asked a year ago
I am trying to use the aws emr-serverless get-dashboard-for-job-run CLI command to pull information from EMR Serverless, but I am stumped. This command returns a URL and an auth token. If I go to the URL, it ...
0 answers, 0 votes, 157 views, asked a year ago
Hi, after EMR 7.0.0 was released last week, we wanted to start using it. Problem: We have shell script EMR steps that are executed when the cluster starts. These EMR steps never ...
Accepted Answer (tags: Amazon EC2, Amazon EMR)
2 answers, 0 votes, 575 views, asked a year ago
Hi, one of my dev team members is asking for the S3 location of the EMR Spark artifacts for building a Java application. I referred to this doc: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-artifa...
Accepted Answer (tags: Amazon EMR)
1 answer, 1 vote, 377 views, asked a year ago
We have an Airflow setup that runs EMR jobs on a daily basis. I noticed odd behavior: when I resubmit a job to calculate the ad hoc reports, the Spark application fails with the error below, arguments seems...
Accepted Answer (tags: Amazon EMR)
1 answer, 0 votes, 288 views, asked a year ago
When I try to create a new workspace for an AWS EMR Studio in the AWS Console, I get a blank page and a JavaScript error in the console ("Failed to execute 'mark' on 'Performance': Symbol(react.elemen...
0 answers, 0 votes, 238 views, asked a year ago