Glue connection to Redshift Successful, but Glue job timed out


Hi, I have set up a connection from Glue to Redshift using the Connections page. I tested it and the test was successful. Then I created a Visual ETL job and chose the above-mentioned connection from the dropdown list. When I ran the job, the connection timed out with the error below:

2023-05-16 18:16:52,896 ERROR [main] glueexceptionanalysis.GlueExceptionAnalysisListener (Logging.scala:logError(9)): [Glue Exception Analysis] {
    "Event": "GlueETLJobExceptionEvent",
    "Timestamp": 1684261012891,
    "Failure Reason": "Traceback (most recent call last):\n  
	     File \"/tmp/rds_order_to_redshift\", line 275, in <module>\n    transformation_ctx=\"AmazonRedshift_node3\",\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 802, in from_options\n    
		 format_options, transformation_ctx)\n  File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 331, in write_dynamic_frame_from_options\n
		 format, format_options, transformation_ctx)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 347, in write_from_options\n    
		 sink = self.getSink(connection_type, format, transformation_ctx, **new_options)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 310, in getSink\n    
		 makeOptions(self._sc, options), transformation_ctx)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py\", line 1305, in __call__\n
		 answer, self.gateway_client, self.target_id, self.name)\n  
		 File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 111, in deco\n    
		 return f(*a, **kw)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py\", line 328, in get_return_value\n    
		 format(target_id, \".\", name), value)\npy4j.protocol.Py4JJavaError: An error occurred while calling o89.getSink.\n: java.sql.SQLException: [Amazon](500150) 
		 Error setting/closing connection: Connection timed out.\n\t
		 at com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\t
		 at com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\t
		 at com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\t
		 at com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\t
		 at com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\t
		 at com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\t
		 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\t
		 at java.lang.reflect.Method.invoke(Method.java:498)\n\t
		 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\t
		 at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\t
		 at py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\t
		 at py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n
		 Caused by: com.amazon.support.exceptions.GeneralException: [Amazon](500150) Error setting/closing connection: Connection timed out.\n\t
		 ... 28 more\nCaused by: java.net.ConnectException: Connection timed out\n\t
		 at sun.nio.ch.Net.connect0(Native Method)\n\tat sun.nio.ch.Net.connect(Net.java:482)\n\tat sun.nio.ch.Net.connect(Net.java:474)\n\tat sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)\n\tat sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:107)\n\tat com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\tat com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\tat com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\tat com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\tat com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\tat com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\tat com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\tat com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat 
java.lang.reflect.Method.invoke(Method.java:498)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n",
    "Stack Trace": [
        {
            "Declaring Class": "get_return_value",
            "Method Name": "format(target_id, \".\", name), value)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py",
            "Line Number": 328
        },
        {
            "Declaring Class": "deco",
            "Method Name": "return f(*a, **kw)",
            "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
            "Line Number": 111
        },
        {
            "Declaring Class": "__call__",
            "Method Name": "answer, self.gateway_client, self.target_id, self.name)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py",
            "Line Number": 1305
        },
        {
            "Declaring Class": "getSink",
            "Method Name": "makeOptions(self._sc, options), transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 310
        },
        {
            "Declaring Class": "write_from_options",
            "Method Name": "sink = self.getSink(connection_type, format, transformation_ctx, **new_options)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 347
        },
        {
            "Declaring Class": "write_dynamic_frame_from_options",
            "Method Name": "format, format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 331
        },
        {
            "Declaring Class": "from_options",
            "Method Name": "format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
            "Line Number": 802
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "transformation_ctx=\"AmazonRedshift_node3\",",
            "File Name": "/tmp/rds_order_to_redshift",
            "Line Number": 275
        }
    ],
    "Last Executed Line number": 275,
    "script": "rds_order_to_redshift"
}

asked a year ago · 1131 views
1 Answer
Accepted Answer

I've fixed this. My ETL job moves data from RDS to Redshift. Since the Redshift connection works on its own, and the RDS connection works on its own, I suspected that Redshift was not allowing inbound traffic from RDS.

So I added the RDS instance's security group to the inbound rules of the Redshift security group.
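For anyone hitting the same error, the equivalent change can be sketched with the AWS CLI. The security group IDs below are placeholders for your own RDS and Redshift groups, and 5439 is the default Redshift port (adjust if your cluster uses a different one):

```shell
# Allow inbound traffic on the Redshift port from the security group
# attached to the RDS instance. The group IDs here are placeholders;
# substitute the actual IDs from your VPC.
aws ec2 authorize-security-group-ingress \
  --group-id sg-redshift-placeholder \
  --protocol tcp \
  --port 5439 \
  --source-group sg-rds-placeholder
```

Referencing the source security group (rather than a CIDR range) keeps the rule valid even if the RDS instance's private IP changes.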

answered a year ago

