AWS Glue to Azure Data Lake

Hello All,

We need to build a small POC in which we pick up data from Salesforce and push it to Azure Data Lake using Glue. Can we connect to Azure Data Lake from Glue?

  • Hi, @Purnima.

    What do you want to connect to in Azure Data Lake?
    Is it Azure Data Lake Storage Gen2?

  • Hi @iwasa,

    Yes, it is Azure Data Lake Storage Gen2.

Purnima
asked a year ago · 358 views
3 Answers

Hi, @Purnima.

From Glue, you may only be able to connect over JDBC, either through Azure Synapse or with a third-party product such as CData.
However, those connections are probably intended for reading from Glue, so they are probably not suitable for writing.

So, for Glue, I think you'll need to write a custom script that sends objects directly to Azure Storage using an Azure authentication token, or handle the write workflow with something like Lambda or Step Functions.
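
As a rough illustration of the custom-script route, here is a minimal sketch that writes one file to ADLS Gen2 through its REST endpoint (create, append, flush) using only the Python standard library. The account, filesystem, and path values, the helper names `build_adls_upload_requests` and `upload`, and the `x-ms-version` value are all assumptions for illustration. It also assumes you have already obtained an OAuth bearer token for Azure Storage by some other means; token acquisition is not shown.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed service version; check which versions your storage account accepts.
API_VERSION = "2021-08-06"


def build_adls_upload_requests(account, filesystem, path, data, token):
    """Build the create, append, and flush calls that write one file.

    All arguments are placeholders supplied by the caller; `token` is a
    bearer token for Azure Storage obtained elsewhere.
    """
    base = (
        f"https://{account}.dfs.core.windows.net/"
        f"{quote(filesystem)}/{quote(path)}"
    )
    headers = {"Authorization": f"Bearer {token}", "x-ms-version": API_VERSION}
    body_headers = {**headers, "Content-Type": "application/octet-stream"}
    return [
        # 1. Create an empty file at the target path.
        Request(f"{base}?resource=file", method="PUT", headers=headers),
        # 2. Append the bytes starting at offset 0.
        Request(
            f"{base}?action=append&position=0",
            data=data,
            method="PATCH",
            headers=body_headers,
        ),
        # 3. Flush, passing the final file length, to commit the data.
        Request(
            f"{base}?action=flush&position={len(data)}",
            method="PATCH",
            headers=headers,
        ),
    ]


def upload(requests):
    """Send the prepared calls in order; urlopen raises HTTPError on failure."""
    for req in requests:
        with urlopen(req) as resp:
            resp.read()
```

In a Glue Python shell or Spark job, you would call `upload(build_adls_upload_requests(...))` with the bytes extracted from Salesforce. For large datasets, several append calls at increasing positions, or the Azure SDK for Python, would likely be a better fit than a single append.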

In this case, I think it would be simpler to use Azure Data Factory (ADF), the ETL service on Microsoft Azure.
ADF also supports Salesforce as a source.

https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview

EXPERT
iwasa
answered a year ago

Hi,

I understand that you need to build an ETL pipeline that copies data from Salesforce and pushes it to Azure Data Lake using the AWS Glue service, and that you would like to know how to connect to Azure Data Lake.

I investigated your concern, and unfortunately I could not find any official documentation for connecting Glue to an Azure Data Lake Storage Gen2 (ADLS Gen2) container, nor any available JDBC drivers. However, I found an official article [1] explaining how to access and analyze on-premises data stores using AWS Glue. Although it doesn't cover your use case specifically, it may give you an idea of how to set up the connection. Please refer to article [2] to understand how to set up a JDBC connection and which additional properties can be configured.

I have also found a third-party article [3] that explains how to connect to Azure Data Lake Storage data in AWS Glue jobs using JDBC. Although this is not an official document, I suggest you give it a read and see if it helps.

[1] https://aws.amazon.com/blogs/big-data/how-to-access-and-analyze-on-premises-data-stores-using-aws-glue/
[2] https://docs.aws.amazon.com/glue/latest/dg/connection-properties.html#connection-properties-jdbc
[3] https://www.cdata.com/kb/tech/azuredatalake-jdbc-aws-glue.rst

Thank you.

AWS
SUPPORT ENGINEER
answered a year ago

I am not getting an option to accept the answer, but I have upvoted it.

answered a year ago
