Best way to move data from Athena to OpenSearch

What is the best way to get data from an Athena query result into an OpenSearch index?

Right now we use a combination of Athena writing to S3, then Step Functions, Lambdas, and so on, but it's too fragile and costly.

Are there any better ways to do this? I am surprised there isn't some native service that can do this. Or maybe I am overlooking it.

m0ltar
asked a year ago
602 views
2 Answers
Accepted Answer

To ingest data from S3 into OpenSearch, there are two options:

  • Use the S3 source plugin (part of Data Prepper). This requires setting up an SQS queue that receives S3 Event Notifications when new objects arrive.
  • Use a Lambda-driven approach. A Lambda function is triggered every time a new object arrives and loads it into OpenSearch. You can see an example reference architecture here (look at the log analytics use case) and an example of the Lambda function here.

The second option suits cases where you need to transform incoming data before loading it into OpenSearch, whereas the first option loads the data without transformation.
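
As a rough illustration of the Lambda-driven option, here is a minimal sketch, assuming the Athena results land in S3 as CSV with a header row and that the function is packaged with the `opensearch-py` client. The index name `athena-results` and the `OPENSEARCH_ENDPOINT` environment variable are hypothetical, and authentication (for example request signing) is left out, so treat this as a starting point rather than the linked example.

```python
import csv
import io
import os
from urllib.parse import unquote_plus

# Hypothetical index name, not taken from the thread.
INDEX_NAME = "athena-results"


def s3_objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def csv_to_bulk_actions(csv_text, index_name=INDEX_NAME):
    """Turn CSV text with a header row into one index action per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"_index": index_name, "_source": dict(row)} for row in reader]


def handler(event, context):
    # Imported here so the helpers above can be tried without AWS libraries.
    import boto3
    from opensearchpy import OpenSearch, helpers

    s3 = boto3.client("s3")
    # Assumption: endpoint URL comes from a hypothetical environment variable;
    # authentication settings are omitted from this sketch.
    client = OpenSearch(hosts=[os.environ["OPENSEARCH_ENDPOINT"]])

    indexed = 0
    for bucket, key in s3_objects_from_event(event):
        if not key.endswith(".csv"):
            continue  # skip Athena's .csv.metadata companion files
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        success, _errors = helpers.bulk(client, csv_to_bulk_actions(body))
        indexed += success
    return {"indexed": indexed}
```

Any transformation step (renaming fields, casting types, dropping columns) would go inside `csv_to_bulk_actions`, which is the main reason to pick this option over the S3 source.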

AWS
EXPERT
answered a year ago
  • The S3 source looks great. When we developed our solution, Data Prepper had not been released yet, so we had no idea about it. Great pointer! Thanks.

Amazon Athena is an ad hoc, interactive query service and does not provide the machinery to sit in the middle of your data pipeline. In short, something has to trigger Athena queries (another service through the SDK, or a human working interactively). If you want to run SQL queries automatically whenever new data arrives in your S3 data source, you can use Glue jobs with Spark SQL, running the same SQL query to assemble the data set and write it to S3; from there you can use OpenSearch's native S3 integration to pull the data. Depending on the velocity of your data (batch vs. streaming), you may need different components, e.g. Glue streaming jobs instead of Glue jobs, or Kinesis Data Firehose delivery to OpenSearch instead of the S3 integration.
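
To make the "something has to trigger Athena" point concrete, here is a minimal sketch of starting a query through the SDK and waiting for it to finish. It assumes a boto3 Athena client is passed in; the function name `run_athena_query` and the polling interval are illustrative choices, not part of any AWS API.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(athena, sql, database, output_location, poll_seconds=2.0):
    """Start an Athena query, poll until it ends, and return the S3 result path.

    `athena` is expected to behave like boto3.client("athena").
    """
    start = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")
    # Athena reports where it wrote the result file for this execution.
    return execution["ResultConfiguration"]["OutputLocation"]
```

A caller would pass `boto3.client("athena")` as the first argument. The returned S3 path is the object that the downstream step (S3 integration, Lambda, or similar) then picks up.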

AWS
EXPERT
answered a year ago
  • OK, I think I miscommunicated my question. We do trigger Athena, and the results land in S3. That works well, and we have no issues there. The question is what happens after that: how do we get these Athena results into OpenSearch? We can output any Athena-supported format.
