Glue connection to Redshift successful, but Glue job timed out


Hi, I have set up a connection from Glue to Redshift using the connections page. I tested it and it was successful. Then I created a Visual ETL job and chose the above-mentioned connection from the dropdown list. When I ran the job, the connection timed out and gave the error below:

2023-05-16 18:16:52,896 ERROR [main] glueexceptionanalysis.GlueExceptionAnalysisListener (Logging.scala:logError(9)): [Glue Exception Analysis] {
    "Event": "GlueETLJobExceptionEvent",
    "Timestamp": 1684261012891,
    "Failure Reason": "Traceback (most recent call last):\n  
	     File \"/tmp/rds_order_to_redshift\", line 275, in <module>\n    transformation_ctx=\"AmazonRedshift_node3\",\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py\", line 802, in from_options\n    
		 format_options, transformation_ctx)\n  File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 331, in write_dynamic_frame_from_options\n
		 format, format_options, transformation_ctx)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 347, in write_from_options\n    
		 sink = self.getSink(connection_type, format, transformation_ctx, **new_options)\n  
		 File \"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py\", line 310, in getSink\n    
		 makeOptions(self._sc, options), transformation_ctx)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py\", line 1305, in __call__\n
		 answer, self.gateway_client, self.target_id, self.name)\n  
		 File \"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 111, in deco\n    
		 return f(*a, **kw)\n  File \"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py\", line 328, in get_return_value\n    
		 format(target_id, \".\", name), value)\npy4j.protocol.Py4JJavaError: An error occurred while calling o89.getSink.\n: java.sql.SQLException: [Amazon](500150) 
		 Error setting/closing connection: Connection timed out.\n\t
		 at com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\t
		 at com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\t
		 at com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\t
		 at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\t
		 at com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\t
		 at scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\t
		 at com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\t
		 at com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\t
		 at com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\t
		 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\t
		 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\t
		 at java.lang.reflect.Method.invoke(Method.java:498)\n\t
		 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\t
		 at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\t
		 at py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\t
		 at py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n
		 Caused by: com.amazon.support.exceptions.GeneralException: [Amazon](500150) Error setting/closing connection: Connection timed out.\n\t
		 ... 28 more\nCaused by: java.net.ConnectException: Connection timed out\n\t
		 at sun.nio.ch.Net.connect0(Native Method)\n\tat sun.nio.ch.Net.connect(Net.java:482)\n\tat sun.nio.ch.Net.connect(Net.java:474)\n\tat sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)\n\tat sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:107)\n\tat com.amazon.redshift.client.PGClient.connect(Unknown Source)\n\tat com.amazon.redshift.client.PGClient.<init>(Unknown Source)\n\tat com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)\n\tat com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)\n\tat com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)\n\tat com.amazon.redshift.jdbc.Driver.connect(Unknown Source)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectionProperties$5(JDBCUtils.scala:973)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$2(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$connectWithSSLAttempt$1(JDBCUtils.scala:925)\n\tat scala.Option.getOrElse(Option.scala:121)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectWithSSLAttempt(JDBCUtils.scala:925)\n\tat com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:969)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)\n\tat com.amazonaws.services.glue.util.JDBCWrapper.getRawConnection(JDBCUtils.scala:752)\n\tat com.amazonaws.services.glue.RedshiftDataSink.<init>(RedshiftDataSink.scala:41)\n\tat com.amazonaws.services.glue.GlueContext.getSink(GlueContext.scala:1059)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n",
    "Stack Trace": [
        {
            "Declaring Class": "get_return_value",
            "Method Name": "format(target_id, \".\", name), value)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py",
            "Line Number": 328
        },
        {
            "Declaring Class": "deco",
            "Method Name": "return f(*a, **kw)",
            "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
            "Line Number": 111
        },
        {
            "Declaring Class": "__call__",
            "Method Name": "answer, self.gateway_client, self.target_id, self.name)",
            "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py",
            "Line Number": 1305
        },
        {
            "Declaring Class": "getSink",
            "Method Name": "makeOptions(self._sc, options), transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 310
        },
        {
            "Declaring Class": "write_from_options",
            "Method Name": "sink = self.getSink(connection_type, format, transformation_ctx, **new_options)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 347
        },
        {
            "Declaring Class": "write_dynamic_frame_from_options",
            "Method Name": "format, format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
            "Line Number": 331
        },
        {
            "Declaring Class": "from_options",
            "Method Name": "format_options, transformation_ctx)",
            "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
            "Line Number": 802
        },
        {
            "Declaring Class": "<module>",
            "Method Name": "transformation_ctx=\"AmazonRedshift_node3\",",
            "File Name": "/tmp/rds_order_to_redshift",
            "Line Number": 275
        }
    ],
    "Last Executed Line number": 275,
    "script": "rds_order_to_redshift"
}
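
The step the traceback points at (line 275, transformation_ctx="AmazonRedshift_node3") is the Redshift sink write. Below is a minimal sketch of what that generated call looks like; the connection name, table, and S3 temp dir are placeholders I made up, and only the transformation_ctx value comes from the traceback:

import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext.getOrCreate())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Stand-in for the frame produced by the upstream source/transform nodes,
# just so this sketch is self-contained.
df = glueContext.spark_session.createDataFrame([(1, "example")], ["id", "name"])
transformed_frame = DynamicFrame.fromDF(df, glueContext, "transformed_frame")

# This is the call that times out: Glue opens a JDBC connection to the
# Redshift cluster at this point, so the job's network path (VPC, subnet,
# security groups) has to allow it, not just the connection test.
AmazonRedshift_node3 = glueContext.write_dynamic_frame.from_options(
    frame=transformed_frame,
    connection_type="redshift",
    connection_options={
        "useConnectionProperties": "true",
        "connectionName": "my-redshift-connection",   # placeholder
        "dbtable": "public.orders",                    # placeholder
        "redshiftTmpDir": "s3://my-temp-bucket/tmp/",  # placeholder
    },
    transformation_ctx="AmazonRedshift_node3",
)

job.commit()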

asked a year ago · 1,164 views
1 Answer
Accepted Answer

I've fixed this. My ETL job moves data from RDS to Redshift. Since the Redshift connection works on its own, and the RDS connection works on its own, I suspected that Redshift needed to allow the RDS side to talk to it.

So I added the security group of the RDS instance to the inbound rules of the Redshift security group.
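
For anyone who wants to do the same thing programmatically rather than in the console, this is roughly the rule I added. A sketch using boto3; the group IDs are placeholders and 5439 is the default Redshift port, so adjust both to your setup:

import boto3

ec2 = boto3.client("ec2")  # assumes credentials/region are already configured

# Add an inbound rule on the Redshift cluster's security group that allows
# traffic from the RDS security group on the Redshift port (5439 by default).
ec2.authorize_security_group_ingress(
    GroupId="sg-0redshift0000000",  # placeholder: Redshift cluster's security group
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 5439,
            "ToPort": 5439,
            "UserIdGroupPairs": [
                {
                    "GroupId": "sg-0rds0000000000",  # placeholder: RDS instance's security group
                    "Description": "Allow RDS side to reach Redshift",
                }
            ],
        }
    ],
)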

answered a year ago
