Time series feature extraction in AWS


Hello everyone!

I'm seeking advice on architecture design using AWS, specifically regarding the feature store process. I'm currently in the prototyping phase and use the tsfresh library for feature extraction. My goal is to incorporate this process into a deployment pipeline on AWS. If anyone has experience with tsfresh, I would greatly appreciate your recommendations on the most suitable AWS resources to use.

I've considered Lambda functions and Glue, but both have limitations that may make them a poor fit for my needs. Here's the AWS architecture I'm planning to deploy. I'm unsure whether Glue is the right choice for tsfresh because of its slow boot time and the difficulty of installing additional libraries. Lambda, on the other hand, has a payload size limit. For now, I'm looking for an easy and fast deployment solution to validate the process, even if it is not the optimal one.

Thank you in advance!

[Image: planned AWS architecture diagram]

Ali
asked 1 year ago, viewed 239 times
1 Answer

tsfresh is not a library built for Spark, so it won't distribute the processing, which defeats the point of using a Glue ETL cluster.
There is a middle option: a Glue Python shell job lets you run single-process native Python libraries, as in Lambda, but with more resources and without Lambda's 15-minute execution limit.
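
As a rough illustration of that middle option, here is a minimal sketch of what a Python shell job script could look like. It is not a tested deployment: it assumes tsfresh, pandas and boto3 are available in the job environment, that the input is a long-format CSV with columns named `id`, `time` and `value`, and that the S3 locations are passed in by you. The names `split_s3_uri` and `run_job` are hypothetical helpers made up for this example.

```python
import os
import tempfile
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.csv' into ('bucket', 'some/key.csv')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid S3 object URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def run_job(input_uri, output_uri):
    """Download a long-format CSV, extract tsfresh features, upload the result."""
    # Third-party imports are kept local so the helper above can be used
    # without these packages installed.
    import boto3
    import pandas as pd
    from tsfresh import extract_features
    from tsfresh.feature_extraction import MinimalFCParameters

    s3 = boto3.client("s3")
    in_bucket, in_key = split_s3_uri(input_uri)
    out_bucket, out_key = split_s3_uri(output_uri)

    with tempfile.TemporaryDirectory() as tmp:
        local_in = os.path.join(tmp, "input.csv")
        local_out = os.path.join(tmp, "features.csv")
        s3.download_file(in_bucket, in_key, local_in)

        # Assumed columns: id (series identifier), time, value.
        df = pd.read_csv(local_in)
        features = extract_features(
            df,
            column_id="id",
            column_sort="time",
            column_value="value",
            # Small feature set, chosen here only to keep a first validation run fast.
            default_fc_parameters=MinimalFCParameters(),
            n_jobs=0,  # no multiprocessing: run in a single process
            disable_progressbar=True,
        )
        features.to_csv(local_out)
        s3.upload_file(local_out, out_bucket, out_key)
    return features.shape
```

The job script would end with a call such as `run_job("s3://my-bucket/raw/series.csv", "s3://my-bucket/features/series.csv")`, where the bucket and keys are placeholders. Swapping `MinimalFCParameters` for a larger feature set is a one-line change once the pipeline itself has been validated.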

AWS
Expert
answered 1 year ago
