EMR Serverless on arm64


I want to use a JAR (deequ-2.0.1-spark-3.2.jar) with an EMR Serverless application on arm64. It works on x86_64 but not on the arm64 architecture. Could you please look into this? The job fails with the following error:

  https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar

	::::::::::::::::::::::::::::::::::::::::::::::

	::          UNRESOLVED DEPENDENCIES         ::

	::::::::::::::::::::::::::::::::::::::::::::::

	:: com.amazon.deequ#deequ;2.0.1-spark-3.2: not found

:::: ERRORS Server access error at url https://repo1.maven.org/maven2/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repo1.maven.org/maven2/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.amazon.deequ#deequ;2.0.1-spark-3.2: not found]
	at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1494)
	at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:311)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:944)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1090)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1099)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

  • Make sure you've configured your EMR Serverless application with VPC connectivity. By default, EMR Serverless only has access to a few AWS services in the same region.
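As a rough illustration of the VPC suggestion, the sketch below builds the arguments for the EMR Serverless `CreateApplication` call with a `networkConfiguration` block and sends them through boto3. The application name, release label, subnet ID and security group ID are placeholders, not values from this thread, and the helper names `build_application_request` and `create_application` are made up for this example.

```python
from typing import Any, Dict, List


def build_application_request(
    name: str,
    release_label: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
    architecture: str = "ARM64",
) -> Dict[str, Any]:
    """Build keyword arguments for the EMR Serverless CreateApplication call.

    Hypothetical helper: it only assembles a dict, so it can be inspected
    without AWS credentials.
    """
    if not subnet_ids:
        raise ValueError("at least one subnet is required for VPC connectivity")
    if architecture not in ("ARM64", "X86_64"):
        raise ValueError(f"unsupported architecture: {architecture}")
    return {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        "architecture": architecture,
        # Without this block the application has no route to Maven Central
        # or repos.spark-packages.org, which matches the timeouts above.
        "networkConfiguration": {
            "subnetIds": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
        },
    }


def create_application(request: Dict[str, Any]) -> str:
    """Send the request with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    client = boto3.client("emr-serverless")
    response = client.create_application(**request)
    return response["applicationId"]


if __name__ == "__main__":
    # Placeholder values; replace them with your own before calling AWS.
    request = build_application_request(
        name="deequ-arm64",
        release_label="emr-6.9.0",
        subnet_ids=["subnet-aaaa"],
        security_group_ids=["sg-bbbb"],
    )
    print(request)
```

For outbound internet access the subnets would typically need to be private subnets whose route table points at a NAT gateway; that part of the setup lives in the VPC itself and is not shown here.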

Asked 1 year ago, 546 views
2 Answers
Accepted Answer

Hi,

The error shows that Spark could not resolve the dependency com.amazon.deequ#deequ;2.0.1-spark-3.2. The "not found" message is a consequence rather than the cause: every attempt to download the .pom and .jar files from Maven Central and Spark Packages ended in "Connection timed out", so the application could not reach the repositories at all.

To address this issue, you can try the following solutions:

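Check Internet Connectivity: Ensure that the EMR Serverless arm64 application has outbound internet connectivity to reach the Maven repositories. EMR Serverless has no instances to log in to; an application reaches the internet only through the VPC configuration attached to it, so compare the network settings of the arm64 application with those of the x86_64 application that works.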

Update Repository Configuration: If the application has internet access, verify that the repository configuration is correct. Spark resolves package coordinates through Ivy, so additional repositories are normally supplied with the `--repositories` option or the `spark.jars.repositories` property rather than through Maven's settings.xml. Maven Central and Spark Packages, the two repositories in the error output, are tried by default and only need to be reachable.

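Try a Different Repository: If the default repositories are inaccessible, you can point Spark at an alternative repository that hosts the artifact. Any external repository still requires outbound connectivity from the application.

Alternatively, you can download the required JAR file and its dependencies yourself, upload them to an S3 bucket that the job's execution role can read, and reference them directly (for example with `--jars` or `spark.jars`) so that nothing has to be resolved from a remote repository at submit time.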
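The last option can be sketched as follows, assuming the Deequ JAR has already been uploaded to S3. The bucket, role ARN, application ID and script path are placeholders, and `build_job_run_request` and `start_job` are hypothetical helper names; the request shape follows the EMR Serverless `StartJobRun` call in boto3.

```python
from typing import Any, Dict, List


def build_job_run_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    jar_uris: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for the EMR Serverless StartJobRun call.

    The JARs are passed through spark.jars as S3 URIs, so Spark does not
    have to resolve Maven coordinates over the network.
    """
    for uri in jar_uris:
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got: {uri}")
    spark_params = "--conf spark.jars=" + ",".join(jar_uris)
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": spark_params,
            }
        },
    }


def start_job(request: Dict[str, Any]) -> str:
    """Submit the job with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    client = boto3.client("emr-serverless")
    response = client.start_job_run(**request)
    return response["jobRunId"]


if __name__ == "__main__":
    # Placeholder values; none of these come from the original question.
    request = build_job_run_request(
        application_id="00example",
        execution_role_arn="arn:aws:iam::123456789012:role/example-role",
        entry_point="s3://example-bucket/scripts/job.py",
        jar_uris=["s3://example-bucket/jars/deequ-2.0.1-spark-3.2.jar"],
    )
    print(request["jobDriver"]["sparkSubmit"]["sparkSubmitParameters"])
```

Any transitive dependencies that Deequ needs and that are not already on the cluster classpath would have to be uploaded and listed in `jar_uris` as well.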

Answered 1 year ago (AWS)
Reviewed 7 months ago (Expert)

Thanks. "Check Internet Connectivity" was the issue: I hadn't configured a VPC for the application.

Answered 1 year ago
