How to move data from Redshift to OpenSearch using AWS Glue?


Hi re:Post community,

We are trying to find a way to automatically move data from Redshift to OpenSearch using AWS Glue.

In our initial research, we found that Glue can connect to Redshift, but connecting to OpenSearch is not natively supported.

The suggested workaround is to use this connector from the marketplace: https://aws.amazon.com/marketplace/pp/prodview-v5ygernwn2gb6

However, this open-source connector appears to support only legacy Elasticsearch versions, not OpenSearch.

Is it possible to do this in Glue, or should we be using a different service altogether?

Thanks in advance!

2 Answers

The AWS article at [1] is relevant to this requirement. An excerpt:

"You can use OpenSearch as a data store for your extract, transform, and load (ETL) jobs by configuring the Elasticsearch Connector for AWS Glue in AWS Glue Studio. This connector is available for free from AWS Marketplace."

To use Redshift as the data source in a Glue job, you just need to create a connection to Redshift and a catalog database in advance [2]. For more information about moving data to and from Redshift, see [3].

In summary, you can use this connector as part of a solution for moving data from Redshift to OpenSearch.

Another option that may interest you is AWS Database Migration Service (DMS), described in [4]. That topic is outside the scope of Glue, however.
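A Glue job along these lines might look like the following minimal sketch. It assumes the source table is already cataloged from Redshift and that the Marketplace connector has been activated as a Glue connection. The function names, the endpoint, the index name, and the connection name are hypothetical placeholders; the `es.*` keys are standard elasticsearch-hadoop settings, and the exact option set your connector version expects may differ.

```python
def opensearch_sink_options(endpoint, index, connection_name):
    """Build connection options for the Marketplace Elasticsearch connector.

    All argument values are placeholders supplied by the caller.
    """
    return {
        "path": index,                    # target index name
        "es.nodes": endpoint,             # domain endpoint, e.g. https://...
        "es.port": "443",
        "es.nodes.wan.only": "true",      # talk only to the declared endpoint
        "es.net.ssl": "true",
        "connectionName": connection_name,  # Glue connection for the connector
    }


def run_job(database, table, tmp_dir, sink_options):
    """Read a cataloged Redshift table and write it through the connector."""
    # awsglue and pyspark are only available inside the Glue job runtime,
    # so they are imported here rather than at module level.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Redshift reads are staged through a temporary S3 directory.
    frame = glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table,
        redshift_tmp_dir=tmp_dir,
    )

    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="marketplace.spark",
        connection_options=sink_options,
    )
```

Inside a Glue job, this would be called as `run_job(database, table, tmp_dir, opensearch_sink_options(endpoint, index, connection_name))` with your own values.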

References:

[1] https://docs.aws.amazon.com/glue/latest/ug/tutorial-elastisearch-connector.html
[2] https://docs.aws.amazon.com/glue/latest/dg/console-connections.html
[3] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-redshift.html
[4] https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Elasticsearch.html

AWS
answered 2 years ago
  • As I already mentioned in my question, the connector is open source and does not support newer Elasticsearch versions or OpenSearch.

    DMS is only for connecting RDBMS services, so it does not help in my case.

    I am looking for an end-to-end connection between Redshift and OpenSearch, which I guess AWS does not offer at the moment, and Glue does not seem robust enough either.


Hi,

The Glue connector you mention will work, but you need to enable compatibility mode on the OpenSearch cluster side.
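As a rough illustration of the compatibility mode mentioned above: to my understanding, it is controlled by the `compatibility.override_main_response_version` cluster setting, which makes the cluster report a legacy version number to Elasticsearch clients. The sketch below only builds the HTTP request; the function name and endpoint are hypothetical, and authentication (for example SigV4 signing or basic auth) is deliberately left out and must be added for a real domain.

```python
import json
import urllib.request


def compatibility_mode_request(endpoint):
    """Build a PUT request that enables compatibility mode on a cluster.

    `endpoint` is a placeholder for your domain endpoint. No credentials
    are attached here; add whatever authentication your domain requires.
    """
    body = {
        "persistent": {
            "compatibility.override_main_response_version": True
        }
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

The request can then be sent with `urllib.request.urlopen(compatibility_mode_request(endpoint))` once the appropriate authentication has been added.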

At the same time, work has started on a native opensearch-hadoop connector. Once it is ready, we will be able to create a native OpenSearch Glue connector.

Hope this helps.

AWS
EXPERT
answered 2 years ago
  • I tried running the connector with all of the configuration, but the Glue job fails with "An error occurred while calling o98.pyWriteDynamicFrame. scala/Product$class".

    It looks like Glue breaks very easily. Is Glue robust enough to be used in production with massive datasets?
