Best way to move data from Athena to OpenSearch

What is the best way to get the data from an Athena query result to the OpenSearch index?

Right now we use a combination of Athena writing to S3, then Step Functions, Lambdas, and so on, but it is too fragile and costly.

Are there any better ways to do this? I am surprised there isn't some native service that can do this. Or maybe I am overlooking it.

m0ltar
asked 1 year ago · 604 views
2 Answers
Accepted Answer

To ingest data from S3 into OpenSearch, there are two options:

  • Using the S3 source plugin (part of Data Prepper). This requires setting up an SQS queue that receives S3 Event Notifications when new objects arrive.
  • Using a Lambda-driven approach. A Lambda function is triggered every time a new object arrives and loads it into OpenSearch. You can see an example of a reference architecture here (look at the log analytics use case) and an example of the Lambda function here.
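As a rough illustration of the Lambda-driven option, the sketch below pulls the bucket and key of each new object out of an S3 event notification. The function name `objects_from_s3_event` and the `suffix` parameter are hypothetical, and the `.csv` filter assumes the Athena results are written as CSV; adjust it if you output another format.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event, suffix=".csv"):
    """Return (bucket, key) pairs for newly created objects in an S3 event.

    `suffix` is an assumption: it keeps only CSV result files and skips
    other objects written to the same prefix.
    """
    pairs = []
    for record in event.get("Records", []):
        # Ignore notifications that are not object-created events.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes '+'), so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(suffix):
            pairs.append((bucket, key))
    return pairs
```

A Lambda handler could call this on the incoming `event` and then read each object from S3 before loading it into OpenSearch.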

The second option is suitable when you need to transform incoming data before loading it into OpenSearch, whereas the first option loads the data with no transformation.
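To make the transform-then-load step concrete, here is a minimal sketch that converts CSV text (for example, an Athena result file already read from S3) into a request body for the OpenSearch `_bulk` API. The helper name `csv_to_bulk_body`, the index name, and the `id_field` and `transform` parameters are assumptions for illustration; sending the body (with request signing, retries, and batching) is left out.

```python
import csv
import io
import json


def csv_to_bulk_body(csv_text, index_name, id_field=None, transform=None):
    """Build a newline-delimited _bulk body from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Optional per-row transformation (type casting, renaming, ...).
        doc = transform(row) if transform else dict(row)
        action = {"index": {"_index": index_name}}
        if id_field:
            # A stable document ID makes reprocessing the same file idempotent.
            action["index"]["_id"] = doc[id_field]
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

For example, `csv_to_bulk_body(text, "orders", id_field="id")` yields one action line and one document line per CSV row; the result would be posted to the `_bulk` endpoint with `Content-Type: application/x-ndjson`.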

AWS
EXPERT
answered 1 year ago
  • The S3 source looks great. When we developed our solution, Data Prepper had not been released yet, so we had no idea about it. Great pointer! Thanks.


Amazon Athena is an ad hoc, interactive query service and does not provide the machinery to sit in the middle of a data pipeline. In short, something has to trigger Athena queries: either another service through the SDK or a person working interactively. If you want SQL to run automatically whenever new data arrives in your S3 data source, you can use Glue jobs that run the same query with Spark SQL, write the assembled data set to S3, and then use the native OpenSearch integration with S3 to pull the data in. Depending on the velocity of your data (batch vs. streaming), you may need different components, for example Glue streaming jobs instead of batch Glue jobs, or Kinesis Data Firehose delivery to OpenSearch instead of the S3 integration.
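For the "another service through the SDK" case, a minimal sketch might look like the following. It assumes `athena` is a boto3 Athena client (for example `boto3.client("athena")`) passed in by the caller; the function name `run_athena_query` and the polling interval are hypothetical.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=2.0):
    """Start an Athena query, wait for it to finish, and return the S3 result path."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            # Location of the result file that a downstream loader can pick up.
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        # Still QUEUED or RUNNING: wait before polling again.
        time.sleep(poll_seconds)
```

Polling in a loop is the simplest choice for a sketch; in a Lambda or Step Functions setup you would more likely check the state on a schedule or react to the result object landing in S3.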

AWS
EXPERT
answered 1 year ago
  • Ok, I think I miscommunicated my question. We do trigger Athena and the results land in S3. That works well, and we have no issues there. The question is what happens after. How do we get these Athena results into OpenSearch? We can output any Athena-supported format.
