Glue connection to DocumentDB with SSL enabled times out

0

My AWS Glue connection to DocumentDB always times out when SSL is enabled. When I disable the certificate, the connection works fine.

I am not sure where to pass the SSL certificate when I create a DocumentDB connection in Glue.

According to the AWS documentation (https://docs.aws.amazon.com/glue/latest/dg/connection-properties.html#connection-properties-SSL), MongoDB is supported. I did not see DocumentDB in that list, but DocumentDB supports the MongoDB APIs. However, I cannot find any instructions for passing the SSL certificate. I would appreciate any input!

Ravi
asked 2 months ago, 363 views
4 Answers
1

Have you verified that your DocumentDB connection settings are properly configured?

EXPERT
answered 2 months ago
  • Yes, my connection is properly configured, and Test Connection works fine. I tried both options: 1) when I use Visual ETL with Glue 4.0, the data preview option times out because of the SSL cert; 2) when I write a PySpark script in the editor and run it, the script fails for the same reason.

  • When you enable TLS/SSL configuration in the cluster parameter group, a cluster reboot is necessary for the changes to take effect. This is because TLS/SSL is a static parameter that requires a reboot to apply. You can find more information about this process here.

    To configure your Glue connection to allow SSL, set "ssl": "true" and "ssl.domain_match": "false". Detailed instructions can be found here.

    After configuring your Glue connection, navigate to your DocumentDB and modify the cluster parameter group to enable TLS/SSL. Remember to reboot your cluster for the changes to take effect.

    Finally, verify if the setup works as intended.
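
As a rough illustration of the two settings mentioned in the comment above, here is a minimal sketch of the connection options a PySpark Glue script might pass for a TLS-enabled DocumentDB cluster. The helper name `docdb_connection_options`, the host name, database, collection, and credentials are all hypothetical placeholders; only the `"ssl"` and `"ssl.domain_match"` keys come from the comment.

```python
def docdb_connection_options(host, port, database, collection, username, password):
    """Build a connection-options dict for a TLS-enabled DocumentDB cluster.

    Hypothetical helper for illustration; all argument values are placeholders.
    """
    return {
        "uri": f"mongodb://{host}:{port}",
        "database": database,
        "collection": collection,
        "username": username,
        "password": password,
        # The two settings suggested in the comment above.
        "ssl": "true",
        "ssl.domain_match": "false",
    }


options = docdb_connection_options(
    host="my-cluster.cluster-example.us-west-2.docdb.amazonaws.com",  # placeholder
    port=27017,
    database="mydb",          # placeholder
    collection="mycoll",      # placeholder
    username="myuser",        # placeholder
    password="mypassword",    # placeholder
)

# Inside a Glue job (where a GlueContext named glueContext exists), the dict
# would typically be used like this:
# dyf = glueContext.create_dynamic_frame.from_options(
#     connection_type="documentdb",
#     connection_options=options,
# )
```

Building the dict in a small function keeps the TLS flags in one place, so they are easy to toggle while testing with and without SSL.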

0

To pass a CA certificate, you can add the PEM file to the job's extra files and then append the parameter ssl_ca_certs=/tmp/yourcert.pem to the MongoDB URL.
However, I don't think that is the issue: that setting validates the server certificate, whereas what you are seeing is the server not responding. It is not clear whether the job is able to connect at all (check that the port is the right one when using TLS).
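
A minimal sketch of how the URL described in this answer could be assembled, using only the Python standard library. The helper name `build_docdb_uri`, the host, and the credentials are hypothetical; the `ssl_ca_certs` parameter name and the `/tmp/yourcert.pem` path are taken from the answer as written, and whether the Glue connector honors that parameter is an assumption, not something confirmed in this thread.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(host, port, username, password, ca_path):
    """Assemble a MongoDB-style URI with the CA file parameter appended.

    Hypothetical helper: the parameter name ssl_ca_certs follows the answer
    above and is not verified against the Glue connector.
    """
    # Keep "/" unescaped so the certificate path stays readable in the URI.
    query = urlencode({"ssl": "true", "ssl_ca_certs": ca_path}, safe="/")
    # Credentials must be percent-encoded in case they contain ":" or "@".
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}/?{query}"


uri = build_docdb_uri(
    host="my-cluster.cluster-example.us-west-2.docdb.amazonaws.com",  # placeholder
    port=27017,
    username="myuser",       # placeholder
    password="p@ss:word",    # placeholder
    ca_path="/tmp/yourcert.pem",
)
```

Percent-encoding the credentials matters here because an unescaped `@` or `:` in a password would change how the URI is parsed.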

AWS
EXPERT
answered 2 months ago
  • Here are the DocumentDB configuration details. I am using the default port, 27017.

    Port: 27017
    Instance status: available
    Instance role: primary
    Instance class: db.r5.large
    Promotion tier: tier-1
    Certificate authority: rds-ca-ras12-g1

    Do I need to configure anything in the VPC and security groups to allow a specific port to accept Glue connections?

0

Here is the error message:

Py4JJavaError - An error occurred while calling o100.getSampleDynamicFrame.
: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=docdb-2022-07-05-09-03-49.cluster-sss.us-west-2.docdb.amazonaws.com:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.net.SocketTimeoutException: Read timed out}}]
    at com.mongodb.internal.connection.BaseCluster.getDescription(BaseCluster.java:181)
    at com.mongodb.internal.connection.SingleServerCluster.getDescription(SingleServerCluster.java:44)
    at com.mongodb.client.internal.MongoClientDelegate.getConnectedClusterDescription(MongoClientDelegate.java:144)
    at com.mongodb.client.internal.MongoClientDelegate.createClientSession(MongoClientDelegate.java:101)
    at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.getClientSession(MongoClientDelegate.java:291)
    at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:183)
    at com.mongodb.client.internal.MongoIterableImpl.execute(MongoIterableImpl.java:135)
    at com.mongodb.client.internal.MongoIterableImpl.iterator(MongoIterableImpl.java:92)
    at com.mongodb.client.internal.MongoIterableImpl.forEach(MongoIterableImpl.java:121)
    at com.mongodb.client.internal.MongoIterableImpl.into(MongoIterableImpl.java:130)
    at com.mongodb.spark.sql.connector.schema.InferSchema.lambda$inferSchema$0(InferSchema.java:85)
    at com.mongodb.spark.sql.connector.config.AbstractMongoConfig.withCollection(AbstractMongoConfig.java:173)
    at com.mongodb.spark.sql.connector.config.ReadConfig.withCollection(ReadConfig.java:45)
    at com.mongodb.spark.sql.connector.schema.InferSchema.inferSchema(InferSchema.java:81)
    at com.mongodb.spark.sql.connector.MongoTableProvider.inferSchema(MongoTableProvider.java:62)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.getTableFromProvider(DataSourceV2Utils.scala:90)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.loadV2Source(DataSourceV2Utils.scala:132)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:209)
    at scala.Option.flatMap(Option.scala:271)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:207)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
    at com.amazonaws.services.glue.connections.MongoConnection.getDynamicFrame(MongoConnection.scala:20)
    at com.amazonaws.services.glue.MongoDataSource.getDynamicFrame(DataSource.scala:701)
    at com.amazonaws.services.glue.DataSource.getSampleDynamicFrame(DataSource.scala:111)
    at com.amazonaws.services.glue.DataSource.getSampleDynamicFrame$(DataSource.scala:109)
    at com.amazonaws.services.glue.MongoDataSource.getSampleDynamicFrame(DataSource.scala:697)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

Ravi
answered 2 months ago
0

I tested with the socket library and was able to reach the DocumentDB host and port, but the job still fails to communicate with the server.

I tried the pymongo client as well, still with no luck. Without a connection attached to the job, the pymongo library is installed and I can print its installed version. When I select the connection, the job immediately reports that the library is not installed.
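
The gap described here (the port is reachable, yet the server never answers) can be narrowed down by testing the TCP connection and the TLS handshake separately. The following is a minimal sketch using only the standard library; the function names `tcp_reachable` and `tls_handshake_ok` are hypothetical, and the CA bundle path is whatever file you have downloaded for your cluster.

```python
import socket
import ssl


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def tls_handshake_ok(host, port, ca_file=None, timeout=5.0):
    """Return True if a TLS handshake with host:port completes.

    ca_file is an optional path to a CA bundle in PEM format; when it is
    None, the system default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() is not None
    except OSError:
        # Covers timeouts, refused connections, and ssl.SSLError
        # (including certificate verification failures).
        return False


# Example usage (placeholder host and CA path):
# print(tcp_reachable("my-cluster.example.docdb.amazonaws.com", 27017))
# print(tls_handshake_ok("my-cluster.example.docdb.amazonaws.com", 27017,
#                        ca_file="/tmp/yourcert.pem"))
```

If the first check succeeds and the second fails, the problem is more likely in the TLS layer (certificate, CA bundle, or a TLS mismatch between client and cluster) than in routing or security groups.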

INFO 2024-02-27T20:57:09,175 408849 com.amazonaws.services.glue.PrepareLaunch [main] Checking pymodule installation result for List(pymongo==4.6.2): PythonModuleInstallOutput(1,,
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdab1db7730>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pymongo/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdab1db7a60>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pymongo/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdab1db7d00>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pymongo/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdab1db7ee0>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pymongo/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fdab1db7fa0>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pymongo/
ERROR: Could not find a version that satisfies the requirement pymongo==4.6.2 (from versions: none)
ERROR: No matching distribution found for pymongo==4.6.2).exitCode}

Ravi
answered 2 months ago
