AWS Glue to Azure Data Lake


Hello All,

We need to build a small POC in which we pick up data from Salesforce and push it to Azure Data Lake using Glue. Can we connect to Azure Data Lake from Glue?

  • Hi, @Purnima.

    What do you want to connect to in Azure Data Lake?
    Is it Azure Data Lake Storage Gen2?

  • Hi @iwasa,

    Yes, it is Azure Data Lake Storage Gen2.

Purnima
asked 1 year ago, 358 views
3 Answers

Hi, @Purnima.

From Glue, you may only be able to connect over JDBC, either through Azure Synapse or with a third-party product such as CData.
However, that route is probably intended for reading data into Glue, so it is probably not suitable for writing.

So with Glue, I think you would need to write a custom script that sends objects directly to Azure Storage using an Azure authentication token, or handle the write workflow with something like Lambda or Step Functions.
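As a rough illustration of the custom-script route, here is a minimal Python sketch that writes one file through the ADLS Gen2 REST endpoint (create, append, flush) using only the standard library. It is a sketch under assumptions, not a tested Glue job: the function names, the example account and path values, and the chosen `x-ms-version` are mine, and it assumes you already hold a valid OAuth bearer token with write access to the storage account (obtaining the token is not shown).

```python
import urllib.request
from urllib.parse import quote

# Assumption: any x-ms-version supported by your storage account should work.
API_VERSION = "2021-08-06"


def build_adls_upload_requests(account, filesystem, path, data, token):
    """Return the three REST calls (create, append, flush) for one file.

    Hypothetical helper: it only builds the requests, so it can be
    inspected or tested without touching the network.
    """
    base = (
        f"https://{account}.dfs.core.windows.net/"
        f"{quote(filesystem)}/{quote(path)}"
    )
    headers = {"Authorization": f"Bearer {token}", "x-ms-version": API_VERSION}
    return [
        # 1. Create an empty file at the target path.
        ("PUT", f"{base}?resource=file", dict(headers), b""),
        # 2. Append the payload starting at byte offset 0.
        (
            "PATCH",
            f"{base}?action=append&position=0",
            {**headers, "Content-Type": "application/octet-stream"},
            data,
        ),
        # 3. Flush, committing everything up to the final length.
        ("PATCH", f"{base}?action=flush&position={len(data)}", dict(headers), b""),
    ]


def upload_to_adls(account, filesystem, path, data, token):
    """Send the requests in order; urllib raises HTTPError on a non-2xx reply."""
    for method, url, headers, body in build_adls_upload_requests(
        account, filesystem, path, data, token
    ):
        req = urllib.request.Request(url, data=body, headers=headers, method=method)
        with urllib.request.urlopen(req) as resp:
            resp.read()
```

In a Glue Python shell or Spark job you would call something like `upload_to_adls("myaccount", "raw", "salesforce/accounts.csv", csv_bytes, token)` after extracting the Salesforce data, where all of those argument values are placeholders. For large files you would append in chunks rather than in one call, and the Glue job needs outbound network access to the Azure endpoint.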

In this case, I think it would be smarter to use Azure Data Factory (ADF), the ETL service on Microsoft Azure.
ADF also supports Salesforce as a source.

https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview

Expert
iwasa
answered 1 year ago

Hi,

I understand that you need to build an ETL pipeline that copies data from Salesforce to Azure Data Lake using AWS Glue, and that you would like to know how to connect to Azure Data Lake from Glue.

I looked into your question, and unfortunately I could not find any official documentation for connecting Glue to an Azure Data Lake Storage Gen2 (ADLS Gen2) container, nor any official JDBC driver. I did find an official article [1] that explains how to access and analyze on-premises data stores using AWS Glue. Although it does not cover your use case specifically, it may give you an idea of how to set up the connection. Please refer to the documentation [2] for details on setting up a JDBC connection and the additional properties that can be configured.

I also found a third-party article [3] that explains how to connect to Azure Data Lake Storage data in AWS Glue jobs using JDBC. Although it is not an official document, I suggest giving it a read to see if it helps.

[1] https://aws.amazon.com/blogs/big-data/how-to-access-and-analyze-on-premises-data-stores-using-aws-glue/
[2] https://docs.aws.amazon.com/glue/latest/dg/connection-properties.html#connection-properties-jdbc
[3] https://www.cdata.com/kb/tech/azuredatalake-jdbc-aws-glue.rst

Thank you.

AWS
Support Engineer
answered 1 year ago

I am not getting an option to accept the answer, but I have upvoted it.

answered 1 year ago
