Hacker News

I need this specifically as part of a state machine. Most of the steps involve a Lambda loading and unloading CSV data between S3, Redshift, and Aurora, where no local storage is needed. The last step, where we had to download the files locally and compress multiple files together, was done manually via a script because they were greater than 512 MB.

We were just about to put the script in Fargate (serverless Docker) and run an ECS task as part of the state machine. Now we don’t have to.
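With a larger mount available (say, an EFS path instead of the 512 MB /tmp), the compress-multiple-files step described above could be a plain `zipfile` call. A minimal sketch; the function name and paths are hypothetical, not from the original script:

```python
import os
import zipfile

def compress_files(paths, archive_path):
    """Combine several downloaded files into one zip archive.

    With only 512 MB in /tmp, an archive larger than that fails;
    pointing archive_path at a bigger mount (e.g. /mnt/efs/out.zip,
    hypothetical) lifts the limit without leaving Lambda.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in paths:
            # Store each file under its base name, not its full path
            zf.write(p, arcname=os.path.basename(p))
    return archive_path
```

In the Lambda itself, the inputs would first be fetched from S3 (e.g. with boto3's `download_file`) onto the same mount before zipping.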



This - if you have to fetch data from or output data to outside the AWS ecosystem, the 512 MB /tmp limit pushes you into the additional (relative) complexity of running on Fargate pretty quickly. I just had to deal with this for a content ingest job that pulls a couple GB of data from an FTP server hourly, processes it, and pushes it into an RDS database. It would have been super simple if the file were already on S3.
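The fetch-and-process shape of that hourly job could look roughly like this: `ftplib.FTP.retrbinary` streams the download in chunks so memory stays flat even for multi-GB files, and the CSV is then read back in batches for insertion into RDS. All names here (host, paths, batch size) are hypothetical, not from the actual job:

```python
import csv
import ftplib

def fetch_to_scratch(host, user, password, remote_path, local_path):
    """Stream a file from FTP to scratch space (e.g. an EFS mount)."""
    with ftplib.FTP(host) as ftp, open(local_path, "wb") as f:
        ftp.login(user, password)
        # retrbinary writes chunk by chunk; nothing is held in memory
        ftp.retrbinary(f"RETR {remote_path}", f.write)
    return local_path

def batched_rows(path, batch_size=1000):
    """Yield rows from a large CSV in batches, so a multi-GB file on
    scratch space can be inserted into RDS without loading it all
    into memory at once."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip header row
        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                yield batch
                batch = []
        if batch:
            yield batch
```

Each batch would then go into a single multi-row INSERT against RDS, which keeps round trips down.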


Where do you need a distributed file system in this use case? Sounds like all you need is some local scratch space?


Lambda only gives you 512 MB of local “scratch space”. If it had to provision and deprovision gigs of space on each invocation, it would probably mean longer start-up and shutdown times.
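Whatever the limit, a function can check its scratch headroom up front with the stdlib's `shutil.disk_usage` and bail out early instead of failing mid-download. A minimal sketch; the function name and default path are illustrative:

```python
import shutil

def scratch_headroom(path="/tmp"):
    """Return the number of free bytes at the given scratch path."""
    return shutil.disk_usage(path).free
```

A Lambda handler could compare this against the expected payload size before starting a large fetch.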


Why do you need a distributed file system to solve that problem? It sounds like all you need is some scratch space?



