Glue running in Docker not able to find com.mysql.cj.jdbc.Driver


Following along with this blog post, I'm trying to debug my Glue jobs with breakpoints in VS Code, using the amazon/aws-glue-libs:glue_libs_3.0.0_image_01 image.

I can get to the point where the job executes, and I can step through the code right up until I try to connect to RDS to fetch data. As soon as I do, I get back:

An error occurred while calling o47.getDynamicFrame.
: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at com.amazonaws.services.glue.util.JDBCUtils.loadDriver(JDBCUtils.scala:214)
	at com.amazonaws.services.glue.util.JDBCUtils.loadDriver$(JDBCUtils.scala:212)
	at com.amazonaws.services.glue.util.MySQLUtils$.loadDriver(JDBCUtils.scala:490)
	at com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:746)
	at com.amazonaws.services.glue.JDBCDataSource.getPrimaryKeys(DataSource.scala:1006)
	at com.amazonaws.services.glue.JDBCDataSource.$anonfun$getJdbcJobBookmark$1(DataSource.scala:878)
	at scala.collection.MapLike.getOrElse(MapLike.scala:131)
	at scala.collection.MapLike.getOrElse$(MapLike.scala:129)
	at scala.collection.AbstractMap.getOrElse(Map.scala:63)
	at com.amazonaws.services.glue.JDBCDataSource.getJdbcJobBookmark(DataSource.scala:878)
	at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:953)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame(DataSource.scala:99)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame$(DataSource.scala:99)
	at com.amazonaws.services.glue.SparkSQLDataSource.getDynamicFrame(DataSource.scala:714)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:750)

I'm not sure how to solve this problem. The blog post mentions that I can pass in extra libraries; however, when I look in /home/glue_user/aws-glue-libs/jars I can see a jar named mssql-jdbc-7.0.0.jre8.jar, so I'm not sure that's the problem. I should mention that this job runs without a problem when deployed to AWS.

I'm currently starting amazon/aws-glue-libs:glue_libs_3.0.0_image_01 with a very basic docker-compose file:
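One thing worth checking here: mssql-jdbc is the Microsoft SQL Server driver, not MySQL Connector/J, so its presence does not show that a MySQL driver is on the classpath. A minimal sketch for listing what the jars directory actually contains, run inside the container, might look like the following. The helper name `find_driver_jars` is hypothetical, and the default path is simply the one quoted in the question.

```python
from pathlib import Path

# Directory quoted in the question; adjust it if your image lays things out differently.
GLUE_JARS_DIR = Path("/home/glue_user/aws-glue-libs/jars")


def find_driver_jars(jars_dir=GLUE_JARS_DIR, keyword="mysql"):
    """Return the sorted names of .jar files whose name contains `keyword`.

    The match is case-insensitive. A missing directory yields an empty list.
    """
    jars_dir = Path(jars_dir)
    if not jars_dir.is_dir():
        return []
    needle = keyword.lower()
    return sorted(p.name for p in jars_dir.glob("*.jar") if needle in p.name.lower())


if __name__ == "__main__":
    # Compare what is present for MySQL versus SQL Server.
    print("mysql jars:", find_driver_jars(keyword="mysql"))
    print("mssql jars:", find_driver_jars(keyword="mssql"))
```

If the MySQL list comes back empty, that would be consistent with the ClassNotFoundException above.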

version: "3.8"
services:
  glue:
    container_name: "glue-local-development"
    image: amazon/aws-glue-libs:glue_libs_3.0.0_image_01
    ports:
      - "4040:4040"
      - "18080:18080"
    environment:
      - DISABLE_SSL=true
      - AWS_PROFILE=my_profile
    volumes:
      - ~/.aws:/home/glue_user/.aws
      - ${PWD}:/home/glue_user/workspace/
    stdin_open: true

I then connect as described in the blog post. Is there something else I have to do here?

I don't think I should have to load the MySQL jars manually?

I've been stuck at this point for a while, so I would really appreciate any help or suggestions.

Edit:

Interestingly, when I try to run amazon/aws-glue-libs:glue_libs_2.0.0_image_01, it fails with a similar but different error:

: An error occurred while calling o49.getDynamicFrame.
: java.io.FileNotFoundException:  (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at com.amazonaws.glue.jdbc.commons.CustomCertificateManager.importCustomJDBCCert(CustomCertificateManager.java:127)
        at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:947)
        at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:734)
        at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:734)
        at com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:747)
        at com.amazonaws.services.glue.JDBCDataSource.getPrimaryKeys(DataSource.scala:996)
        at com.amazonaws.services.glue.JDBCDataSource$$anonfun$33.apply(DataSource.scala:868)
        at com.amazonaws.services.glue.JDBCDataSource$$anonfun$33.apply(DataSource.scala:868)
        at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
        at scala.collection.AbstractMap.getOrElse(Map.scala:59)
        at com.amazonaws.services.glue.JDBCDataSource.getJdbcJobBookmark(DataSource.scala:868)
        at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:943)
        at com.amazonaws.services.glue.DataSource$class.getDynamicFrame(DataSource.scala:97)
        at com.amazonaws.services.glue.SparkSQLDataSource.getDynamicFrame(DataSource.scala:707)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:750)
Asked a year ago, 1051 views
1 Answer

Hi,

I see that you are receiving the following error when connecting to RDS while following the blog post “Develop and test AWS Glue version 3.0 jobs locally using a Docker container”:

==========

An error occurred while calling o47.getDynamicFrame. : java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver

==========

I believe the issue here is with the driver class name “com.mysql.cj.jdbc.Driver”. You can provide a custom driver by uploading the connector "MySQL Connector/J 5.1.49" to S3 and referencing it in the parameter "customJdbcDriverS3Path", together with the matching class name in "customJdbcDriverClassName". For example, the Glue code snippet would look similar to:

==========

# Replace the <...> placeholders with your own connection details.
connection_mysql_options = {
    "url": "jdbc:mysql://<host>:3306/dbName",
    "dbtable": "<dbTableName>",
    "user": "<userName>",
    "password": "<password>",
    "customJdbcDriverS3Path": "s3://s3bucket/path/mysql-connector-java-5.1.49.jar",
    "customJdbcDriverClassName": "com.mysql.jdbc.Driver",
}

datasource0 = glueContext.create_dynamic_frame.from_options(
    connection_type="mysql",
    connection_options=connection_mysql_options,
    transformation_ctx="datasource0",
)

==========
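The class name has to match the connector version: Connector/J 5.1 exposes com.mysql.jdbc.Driver, while Connector/J 8.x exposes com.mysql.cj.jdbc.Driver. As a rough sketch of keeping the two in step, the hypothetical helpers below (`driver_class_for_jar` and `build_mysql_options` are illustrative names, not part of the Glue API) derive the class name from the jar file name and assemble the options dictionary shown above. The file-name pattern and the version cutoff are assumptions based on the usual Connector/J naming.

```python
import re

LEGACY_DRIVER = "com.mysql.jdbc.Driver"     # Connector/J 5.1
MODERN_DRIVER = "com.mysql.cj.jdbc.Driver"  # Connector/J 8.x


def driver_class_for_jar(jar_name):
    """Guess the JDBC driver class from a Connector/J jar file name.

    Assumes names such as mysql-connector-java-5.1.49.jar or
    mysql-connector-j-8.0.33.jar; raises ValueError for anything else.
    """
    match = re.search(r"mysql-connector-(?:java|j)-(\d+)\.", jar_name)
    if match is None:
        raise ValueError(f"unrecognised Connector/J jar name: {jar_name}")
    major = int(match.group(1))
    # Assumption: versions below 6 use the legacy class name.
    return LEGACY_DRIVER if major < 6 else MODERN_DRIVER


def build_mysql_options(url, table, user, password, jar_s3_path):
    """Assemble connection options with a consistent jar path and class name."""
    jar_name = jar_s3_path.rsplit("/", 1)[-1]
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "customJdbcDriverS3Path": jar_s3_path,
        "customJdbcDriverClassName": driver_class_for_jar(jar_name),
    }
```

The resulting dictionary can be passed as connection_options to create_dynamic_frame.from_options in the same way as the snippet above.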

If the issue still persists, please open a support case with AWS, providing the connection details and the code snippet used: https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Thank you.

AWS
Support Engineer
Answered a year ago
  • Seems a bit odd that given that driver already exists in the container, and I'm trying to connect to a catalog via glue_context.create_dynamic_frame.from_catalog which works in production, that I'd have to go and change all my code just to debug a glue job in a docker container?
