Multiple Catalog Access from an ETL Glue Job

Hi all,

Thanks to this article https://repost.aws/de/knowledge-center/query-glue-data-catalog-cross-account I know that with AWS EMR I can access the Glue catalog of my current account and, with the right permissions in place, the Glue catalog of another account at the same time.

My question is whether this is also possible with a Glue ETL job. I know that a cross-account Glue catalog can be configured with the --conf spark.hadoop.hive.metastore.glue.catalogid.. parameter, but that only points the job at a single catalog. If I want to access tables from two other accounts, I have a problem. Does anyone have an idea?

Thanks for the help.

Best

asked 1 year ago · 424 views
1 Answer

Hi,

I understand that you are trying to access tables in the Glue catalogs of two different accounts from a single Glue job. You can set up the access policies in the source and target accounts and then use two separate dynamic frames to read these tables. You don't need the "--conf spark.hadoop.hive.metastore.glue.catalogid" option for this use case. The step-by-step process for setting up cross-account access is described here:

https://repost.aws/knowledge-center/glue-tables-cross-accounts
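As a rough illustration of the access policy side, here is a minimal sketch that builds a Glue Data Catalog resource policy granting read access on one table to a job role in another account. It assumes the job runs in one account and the tables live in the catalogs of the other accounts. The helper name `build_catalog_read_policy`, the account IDs, and the role name are hypothetical, and the action list is a plausible minimum rather than the exact set from the linked article.

```python
import json


def build_catalog_read_policy(catalog_account_id, job_role_arn, region, database, table):
    """Return a Glue resource policy (as a JSON string) for read access to one table.

    All arguments are illustrative placeholders supplied by the caller.
    """
    arn_prefix = f"arn:aws:glue:{region}:{catalog_account_id}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The IAM role used by the Glue job in the other account.
                "Principal": {"AWS": job_role_arn},
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:GetPartitions",
                ],
                # Catalog, database, and table ARNs in the catalog-owning account.
                "Resource": [
                    f"{arn_prefix}:catalog",
                    f"{arn_prefix}:database/{database}",
                    f"{arn_prefix}:table/{database}/{table}",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical account IDs and role name, for illustration only.
policy_json = build_catalog_read_policy(
    catalog_account_id="111122223333",
    job_role_arn="arn:aws:iam::444455556666:role/example-glue-job-role",
    region="us-east-1",
    database="doc_example_DB",
    table="doc_example_table",
)
print(policy_json)
```

A policy like this would be attached in each catalog-owning account, for example with the boto3 Glue client's `put_resource_policy(PolicyInJson=policy_json)`. The job role also needs matching IAM permissions in its own account and read access to the underlying S3 data.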

First, set up the cross-account permissions from Account A [catalog account] to Account B and Account C separately. Then, in the Glue job, create two dynamic frames [df1 and df2] to access the tables from accounts B and C.

For example:

# catalog_id takes the 12-digit AWS account ID that owns the catalog; "Account B" and "Account C" are placeholders.
df1 = glueContext.create_dynamic_frame.from_catalog(database="doc_example_DB", table_name="doc_example_table", catalog_id="Account B", region="us-east-1")

df2 = glueContext.create_dynamic_frame.from_catalog(database="doc_example_DB", table_name="doc_example_table", catalog_id="Account C", region="us-east-1")
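If more than two catalogs are involved, the same pattern can be wrapped in a small helper. This is only a sketch: `read_tables_from_catalogs` and the `SOURCES` mapping are hypothetical names, the account IDs are placeholders, and the helper passes only `database`, `table_name`, and `catalog_id` to `from_catalog`.

```python
def read_tables_from_catalogs(glue_context, sources):
    """Read one table per entry in `sources`, each from its own catalog.

    `sources` maps a label to a dict with the keys `catalog_id`,
    `database`, and `table_name`. Returns a dict of label -> DynamicFrame.
    """
    frames = {}
    for label, source in sources.items():
        frames[label] = glue_context.create_dynamic_frame.from_catalog(
            database=source["database"],
            table_name=source["table_name"],
            catalog_id=source["catalog_id"],  # account ID that owns the catalog
        )
    return frames


# Hypothetical account IDs; replace with the real catalog-owning accounts.
SOURCES = {
    "account_b": {
        "catalog_id": "111122223333",
        "database": "doc_example_DB",
        "table_name": "doc_example_table",
    },
    "account_c": {
        "catalog_id": "777788889999",
        "database": "doc_example_DB",
        "table_name": "doc_example_table",
    },
}

# Inside a Glue job script, where awsglue and pyspark are available:
#   from awsglue.context import GlueContext
#   from pyspark.context import SparkContext
#   glue_context = GlueContext(SparkContext.getOrCreate())
#   frames = read_tables_from_catalogs(glue_context, SOURCES)
#   df_b = frames["account_b"].toDF()
#   df_c = frames["account_c"].toDF()
```

Keeping the catalog IDs in one mapping makes it easy to add a further account later without touching the read logic.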

Thank you.

AWS
Support Engineer
answered 1 year ago
