Questions tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

317 results
I need to load data from Kinesis Data Streams into EMR via EMR Studio. I followed this sample, but it doesn't work: https://github.com/awslabs/spark-sql-kinesis-connector
1 answer, 0 votes, 1.6K views, asked 9 months ago
I want to run EMR on premises, without Spark. Is it possible to run EMR (https://aws.amazon.com/emr/) on EKS Anywhere (https://aws.amazon.com/es/eks/eks-anywhere/)? Also, we don't have support ...
1 answer, 0 votes, 497 views, asked 9 months ago
I am working with Step Functions, and I have a Map state to which I pass an S3 path containing a CSV file that the Map state has to iterate over. In each iteration of the map, a script is executed with the c...
2 answers, 0 votes, 731 views, asked 9 months ago
**Overview:** The Spark application in question is deployed within AWS Account A, specifically in the us-west-2 region. This application reads data from and writes data to Amazon S3 buckets hosted in ...
1 answer, 0 votes, 315 views, asked 10 months ago
EMR API call? I am trying to determine whether there is an API call that shows if "Automatically apply latest Amazon Linux updates" was checked for an EMR cluster.
1 answer, 0 votes, 435 views, asked 10 months ago
Hi team, we want to use an EMR cluster to process data with Spark jobs. We have 30,000 files per day and approximately 2 GB of information, and this is expected to grow later. We have a small cluste...
Accepted Answer. Tags: Amazon EC2, Amazon EMR
1 answer, 0 votes, 457 views, asked 10 months ago
I am using a Jupyter notebook within Amazon EMR Studio. When I try to run my notebook code, I get a kernel-related error (see attached screenshot). Also, my EMR instance is using an EC2 cluster. I...
1 answer, 0 votes, 625 views, asked 10 months ago
* EMR Version: 6.15.0
* Spark Conf
  * "spark.sql.catalog.spark_catalog": "org.apache.iceberg.spark.SparkSessionCatalog"
  * "spark.sql.catalog.spark_catalog.catalog-impl": "org.apache.iceberg.aws...
1 answer, 0 votes, 611 views, asked 10 months ago
Hi team, we are trying to set up Hive with an external metastore running in Aurora MySQL 8. We are using EMR 6.15.0 and followed the instructions from the AWS documentation. We are able to successfully ...
1 answer, 0 votes, 428 views, asked 10 months ago
The zero-ETL integration for replicating data from Aurora PostgreSQL to Redshift is currently in "Preview", as [this post specifies](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/zero-...
1 answer, 0 votes, 562 views, asked a year ago
Is there a way to use s3-dist-cp to copy files from a bucket that has Requester Pays enabled?
2 answers, 0 votes, 444 views, asked a year ago
After upgrading from EMR 6.11 to 6.12 (I even tried 7.0.0), I'm seeing these errors on the exact same job with the same resources. Has something changed in how EMRFS is implemented? What is ...
1 answer, 0 votes, 2.8K views, asked a year ago