All Content tagged with Amazon EMR
Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.
449 results
We have a Lambda function that creates a cluster on demand. The cluster logs go by default to the S3 bucket specified in loguri; now we have a requirement to get application logs in CloudWatch wheneve...
I have
2. 1. Issue Summary
My EMR cluster fails with errors indicating "data source not found" in logs.
Cluster apps (Spark, Hive, Livy) seem unable to locate input data, but the exact cause is uncl...
Is it possible to reuse a single Load Balancer when deploying multiple interactive endpoints that will be associated with different users using EMR Studio? Implementation is on Amazon EMR on EKS. Refe...
Hello,
I am attempting to write to AWS Neptune using Neo4j Connector for Spark, as stated in the [compatibility document](https://docs.aws.amazon.com/neptune/latest/userguide/migration-compatibility....
In my account, I have two Glue Catalogs (one is the default catalog, AWSDataCatalog, and another catalog is shared from a different account). How can I access the databases in both catalogs from EMR E...
Hi,
I have been looking into a solution option that uses the Athena invoker_principal to get the ARN of the IAM role being used into the SQL query.
Is there a way to do the same if EMR or Redshift...
We're currently running EMR clusters with release version 6.10.0 where instances are patched using SSM "AWS-RunPatchBaseline" during bootstrap. We're experiencing several critical issues: cluster fail...
Approximately how long should EMR batch processing and storing the data in Redshift take for 1 TB of data with simple transformations? My data has the following characteristics:
* File size varies from...
I have a use case with
* 60 MB/sec data volume
* Near real time use cases of AI/Data science as downstream applications should be supported
* It's not an ultra-low-latency use case, even 60 seconds of...
After upgrading EMR from 6.5 to 7.5, I am getting the following error:
OpensslCipher: Failed to load OpenSSL Cipher.
java.lang.UnsatisfiedLinkError: EVP_CIPHER_CTX_block_size
Based on the HADOOP-18994
Failed...
I would like to confirm whether it is possible to configure an Amazon EMR cluster with mixed instance types, combining both Graviton-based and non-Graviton instances within the same cluster. I'm going...
I'm trying to run an EMR notebook to create a delta table in S3.
EMR Cluster Version: emr-7.7.0
Installed Applications: Hadoop 3.4.0, Hive 3.1.3, JupyterEnterpriseGateway 2.6.0, Livy 0.8.0, Spark 3.5...