Glue: SnowFlake table cataloged, send to S3 after some transformations.

0

Hi, i have ran a crawler that connects by jdbc to SnowFlake and it creates a table into the Glue catalog database, that works nice. now i want to use glue studio to take the source (AWS Glue Data Catalog, and the table created by the above crawler) and do some transformations and later write to a S3 bucket. the flow is: AWS Glue Data Catalog | Filter (C_CUSTKEY=1) | S3 in Cloudwathc it's showing the next errors:

2023-05-17T14:12:10.801+02:00

Copy 23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] { "Event": "GlueETLJobExceptionEvent", "Timestamp": 1684325530798, "Failure Reason": "Traceback (most recent call last):\n File "/tmp/SFCatalogToS3.py", line 20, in <module>\n transformation_ctx="AWSGlueDataCatalog_node1684308467551",\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id),\n File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in call\n answer, self.gateway_client, self.target_id, self.name)\n File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>", "Stack Trace": [ { "Declaring Class": "deco", "Method Name": "raise converted from None", "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", "Line Number": 117 }, { "Declaring Class": "call", "Method Name": "answer, self.gateway_client, self.target_id, self.name)", "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", "Line Number": 1305 }, { "Declaring Class": "create_dynamic_frame_from_catalog", "Method Name": "makeOptions(self._sc, additional_options), catalog_id),", "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", "Line Number": 186 }, { "Declaring Class": "from_catalog", "Method Name": "return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)", "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", "Line Number": 787 }, { "Declaring Class": "<module>", "Method Name": "transformation_ctx="AWSGlueDataCatalog_node1684308467551",", "File Name": "/tmp/SFCatalogToS3.py", "Line Number": 20 } ], "Last Executed Line number": 20, "script": "SFCatalogToS3.py" }

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] {"Event":"GlueETLJobExceptionEvent","Timestamp":1684325530798,"Failure Reason":"Traceback (most recent call last):\n File "/tmp/SFCatalogToS3.py", line 20, in <module>\n transformation_ctx="AWSGlueDataCatalog_node1684308467551",\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id),\n File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in call\n answer, self.gateway_client, self.target_id, self.name)\n File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>","Stack Trace":[{"Declaring Class":"deco","Method Name":"raise converted from None","File Name":"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py","Line Number":117},{"Declaring Class":"call","Method Name":"answer, self.gateway_client, self.target_id, self.name)","File Name":"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py","Line Number":1305},{"Declaring Class":"create_dynamic_frame_from_catalog","Method Name":"makeOptions(self._sc, additional_options), catalog_id),","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py","Line Number":186},{"Declaring Class":"from_catalog","Method Name":"return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py","Line Number":787},{"Declaring Class":"<module>","Method Name":"transformation_ctx="AWSGlueDataCatalog_node1684308467551",","File Name":"/tmp/SFCatalogToS3.py","Line Number":20}],"Last Executed Line number":20,"script":"SFCatalogToS3.py"}

2023-05-17T14:12:10.934+02:00

Copy 23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20 23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20

is it possible to do what i try to do ?, cheers

  • Did you solve this issue. Im getting the same error. Can you post your answer here?

Willi5
posta un anno fa463 visualizzazioni
2 Risposte
0

Sounds like you are passing some option to create_dynamic_frame_from_catalog that is incorrect, are you specifying something in addition to the catalog and table?

profile pictureAWS
ESPERTO
con risposta un anno fa
  • Hi, nop i'm not passing nothing, only mi source is "AWS Glue Data Catalog" apply one filter and the target is S3, the table was created by a crawler and exist in the catalog: Name Database snowflake_sample_data_tpch_sf1_customer testsf

    Location Connection
    SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER snowflake-glue-jdbc-connection2

  • Please open a support ticket, sounds there is something in the table or connection configuration that the job is not able to handle correctly

0

@Willi5 Are you able to fix this. Im getting the same error and can you post your answer here?

sb
con risposta un mese fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande