How to save a file while running an ETL job in a Fargate task?


I have Python code for an ETL job that I plan to run as a Fargate task. I push a Docker image to ECR and pull it to run the task, but I need to temporarily save a text file while the container is running. I know that in Lambda you can write to a temporary folder, e.g. /tmp/somefile.txt. Since Lambda and a Fargate task both ultimately run on some EC2-managed infrastructure, I assume it works the same way?

Asked 2 years ago · 1817 views

2 Answers

Yes, you are right. You can store files temporarily in the /tmp folder just as you would on any EC2 instance, but with default settings the task's ephemeral storage cannot exceed 20 GiB. Please see the Fargate task storage documentation for details.
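For instance, a minimal sketch of what this can look like inside the container; the file name and contents below are made up for illustration:

```python
import tempfile

# Fargate tasks get writable ephemeral storage, so /tmp behaves
# like it does on EC2. Path and contents here are hypothetical.
with tempfile.NamedTemporaryFile(
    mode="w", suffix=".txt", dir="/tmp", delete=False
) as f:
    f.write("intermediate ETL output\n")
    staging_path = f.name

print(f"Staged file at {staging_path}")
```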

Also, if you want the files to persist beyond the life of the task, you can mount an Amazon EFS file system.
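If you go the EFS route, the volume is declared in the task definition. Here is a hedged sketch using boto3; the file system ID, names, and image URI are placeholders, not values from this thread:

```python
import boto3

ecs = boto3.client("ecs")

# Register a Fargate task definition with an EFS volume mounted
# into the container. All identifiers below are hypothetical.
ecs.register_task_definition(
    family="etl-job",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="512",
    memory="1024",
    volumes=[
        {
            "name": "etl-data",
            "efsVolumeConfiguration": {
                "fileSystemId": "fs-0123456789abcdef0",
                "transitEncryption": "ENABLED",
            },
        }
    ],
    containerDefinitions=[
        {
            "name": "etl",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/etl:latest",
            "mountPoints": [
                {"sourceVolume": "etl-data", "containerPath": "/mnt/efs"}
            ],
        }
    ],
)
```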

AWS Support Engineer
Answered 2 years ago

To add to the answer: yes, you can store files on the local filesystem in much the same way as on Lambda or EC2. Your image plus local content can use up to 20 GiB of ephemeral storage "for free" with every task. You can go up to 200 GiB of that NVMe-backed storage, but you pay for anything above 20 GiB, and you have to declare the size at the task definition level; it doesn't magically add storage for you.
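A sketch of what declaring the extra storage looks like when registering the task definition with boto3; the family name, image URI, and the 100 GiB size are placeholder values:

```python
import boto3

ecs = boto3.client("ecs")

# Raise ephemeral storage above the 20 GiB default via the
# ephemeralStorage parameter (valid range is 21-200 GiB).
ecs.register_task_definition(
    family="etl-job-big-disk",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",
    memory="2048",
    ephemeralStorage={"sizeInGiB": 100},
    containerDefinitions=[
        {
            "name": "etl",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/etl:latest",
        }
    ],
)
```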

But instead of EFS I'd recommend something a little more modern and use S3: if you need any form of automation (e.g. triggering a Lambda function when that temporary file is created or updated), that's easy to do with S3 event notifications, whereas EFS won't give you that. Whether S3 fits does depend very much on your I/O pattern, though.
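With S3 the pattern is simply: write to /tmp as usual, then upload. A minimal boto3 sketch, where the bucket and key names are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Stage the file locally, then push it to S3 so downstream
# automation (e.g. a Lambda trigger) can pick it up.
local_path = "/tmp/somefile.txt"
s3.upload_file(local_path, "my-etl-staging-bucket", "etl/somefile.txt")
```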

EFS also requires a bit more involvement on the infrastructure side (mount targets, security groups, the task definition volume). Not a huge amount, but noticeably more than S3.

Answered 2 years ago
