Glue connection to Redshift successful, but Glue job timed out

Hi, I have set up a connection from Glue to Redshift using the Connections page, and the connection test was successful. I then created a Visual ETL job and chose that connection from the dropdown list. When I ran the job, the connection timed out with the error below:

2023-05-16 18:16:52,896 ERROR [main] glueexceptionanalysis.GlueExceptionAnalysisListener (Logging.scala:logError(9)): [Glue Exception Analysis] {
    "Event": "GlueETLJobExceptionEvent",
    "Timestamp": 1684261012891,
    "Failure Reason": "Traceback (most recent call last):\n  
	     File \"/tmp/rds_order_to_redshift\", line 275, in <module>\n    transformation_ctx=\"AmazonRedshift_node3\",\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 802, in from_options\n    
		 format_options, transformation_ctx)\n  File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 331, in write_dynamic_frame_from_options\n
		 format, format_options, transformation_ctx)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 347, in write_from_options\n    
		 sink = self.getSink(connection_type, format, transformation_ctx, **new_options)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 310, in getSink\n    
		 makeOptions(self._sc, options), transformation_ctx)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py\", line 1305, in __call__\n
		 answer, self.gateway_client, self.target_id, self.name)\n  
		 File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 111, in deco\n    
		 return f(*a, **kw)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py\", line 328, in get_return_value\n    
		 format(target_id, \".\", name), value)\npy4j.protocol.Py4JJavaError: An error occurred while calling o89.getSink.\n: java.sql.SQLException: [Amazon](500150) 
		 Error setting/closing connection: Connection timed out.\n\t
		 at com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\t
		 at com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\t
		 at com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\t
		 at com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\t
		 at com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\t
		 at com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\t
		 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\t
		 at java.lang.reflect.Method.invoke(Method.java:498)\n\t
		 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\t
		 at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\t
		 at py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\t
		 at py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n
		 Caused by: com.amazon.support.exceptions.GeneralException: [Amazon](500150) Error setting/closing connection: Connection timed out.\n\t
		 ... 28 more\nCaused by: java.net.ConnectException: Connection timed out\n\t
		 at sun.nio.ch.Net.connect0(Native Method)\n\tat sun.nio.ch.Net.connect(Net.java:482)\n\tat sun.nio.ch.Net.connect(Net.java:474)\n\tat sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)\n\tat sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:107)\n\tat com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\tat com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\tat com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\tat com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\tat com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\tat com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\tat com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\tat com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n",
    "Stack Trace": [
        {
            "Declaring Class": "get_return_value",
            "Method Name": "format(target_id, \".\", name), value)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py",
            "Line Number": 328
        },
        {
            "Declaring Class": "deco",
            "Method Name": "return f(*a, **kw)",
            "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
            "Line Number": 111
        },
        {
            "Declaring Class": "__call__",
            "Method Name": "answer, self.gateway_client, self.target_id, self.name)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py",
            "Line Number": 1305
        },
        {
            "Declaring Class": "getSink",
            "Method Name": "makeOptions(self._sc, options), transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 310
        },
        {
            "Declaring Class": "write_from_options",
            "Method Name": "sink = self.getSink(connection_type, format, transformation_ctx, **new_options)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 347
        },
        {
            "Declaring Class": "write_dynamic_frame_from_options",
            "Method Name": "format, format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 331
        },
        {
            "Declaring Class": "from_options",
            "Method Name": "format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
            "Line Number": 802
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "transformation_ctx=\"AmazonRedshift_node3\",",
            "File Name": "/tmp/rds_order_to_redshift",
            "Line Number": 275
        }
    ],
    "Last Executed Line number": 275,
    "script": "rds_order_to_redshift"
}
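
For context, the call that fails at line 275 of the generated script is the Redshift sink step. A rough sketch of what the Visual ETL job generates for that step is shown below; the transformation_ctx name AmazonRedshift_node3 comes from the trace, while the connection name, target table, S3 temp directory, and sample data are placeholders assumed for illustration:

# Sketch of the Redshift sink step a Glue Studio (Visual ETL) job generates.
# "AmazonRedshift_node3" matches the transformation_ctx in the trace above;
# the connection name, table, S3 temp dir, and sample data are placeholders.
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Stand-in for the DynamicFrame read from RDS earlier in the job.
df = spark.createDataFrame([(1, "shipped")], ["order_id", "status"])
orders_dyf = DynamicFrame.fromDF(df, glue_context, "orders")

# The write that times out: Glue opens a JDBC connection to the Redshift cluster here.
AmazonRedshift_node3 = glue_context.write_dynamic_frame.from_options(
    frame=orders_dyf,
    connection_type="redshift",
    connection_options={
        "useConnectionProperties": "true",
        "connectionName": "my-redshift-connection",         # placeholder Glue connection name
        "dbtable": "public.orders",                         # placeholder target table
        "redshiftTmpDir": "s3://my-temp-bucket/redshift/",  # placeholder temp dir
    },
    transformation_ctx="AmazonRedshift_node3",
)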

asked a year ago · 1,164 views
1 Answer
Accepted Answer

I've fixed this. My ETL job moves data from RDS to Redshift. Since the Redshift connection works on its own and the RDS connection works on its own, I suspected that Redshift needed to allow traffic from the RDS side.

So I added the RDS instance's security group to the inbound rules of the Redshift security group.
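
If you'd rather script that security group change than click through the console, it is roughly the boto3 call below; the security group IDs are placeholders and the port assumes the Redshift default of 5439:

# Sketch: add an inbound rule so the Redshift security group accepts traffic
# from the RDS security group. Group IDs are placeholders; 5439 is the default
# Redshift port (adjust if your cluster uses a different one).
import boto3

ec2 = boto3.client("ec2")

REDSHIFT_SG = "sg-0aaaaaaaaaaaaaaaa"  # placeholder: Redshift cluster's security group
RDS_SG = "sg-0bbbbbbbbbbbbbbbb"       # placeholder: RDS instance's security group

ec2.authorize_security_group_ingress(
    GroupId=REDSHIFT_SG,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5439,
        "ToPort": 5439,
        "UserIdGroupPairs": [{
            "GroupId": RDS_SG,
            "Description": "Allow traffic from the RDS security group",
        }],
    }],
)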

answered a year ago
