Thanks for watching! If you like this content we need your support to grow our channel. Please subscribe and share it with your friends. If you have any suggestions, please share with us too 🙂
Dear Manoj, a small request: can you do an SLS- or CDK-based how-to video on multipart uploads to S3 with pre-signed URLs and Lambda (Node.js or Python)?
Awesome teaching. Thanks for sharing!
Just what I wanted. Concise and informative! Thanks
Amazing presentation, keep it up Manoj. This is great
Wow, this is very helpful for my work!
Glad it was helpful!
Thanks for this great explanation.
Glad it was helpful!
Thanks a lot Manoj for a very nice video. You have always been a nice teacher :)
Hi Manoj, it is not clear how the big file is handled by the Lambda function. We bypass API Gateway and the synchronous Lambda invocation for the big file by using S3. But when the file is uploaded to S3, we send only the file's metadata to Lambda, not the big file itself. So how will Lambda work on the actual big file (parse the entire file, extract some information, and save it to a DB or send it to SQS or SNS) when it is kept in S3?
He still uses API Gateway, but only to return a signed S3 URL for the upload; API Gateway is not directly used for uploading the payload. After uploading to S3, if you still want to process that big file,
I will give you 3 options.
1. Don't use Lambda. Use a custom solution that processes the big file directly from S3.
2. If you have to use Lambda to process the big file, you may want to process that file chunk by chunk without reading the entire content into memory (see the sketch after this list). See if this serves your purpose: medium.com/swlh/processing-large-s3-files-with-aws-lambda-2c5840ae5c91
3. Use Lambda to submit an AWS Batch job. Use the Batch job to process the file.
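To make option 2 concrete, here is a minimal sketch (not the video's code) of a Python Lambda that streams the uploaded object from S3 chunk by chunk. The bucket and key come from the S3 event, and process_chunk is a placeholder for your own parsing / DB / SQS logic:

```python
import boto3

s3 = boto3.client("s3")

def process_chunk(chunk: bytes) -> None:
    # Placeholder: parse the chunk, write results to a DB, or push to SQS/SNS.
    pass

def handler(event, context):
    # The S3 event notification carries only metadata (bucket + key), not the file itself.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    obj = s3.get_object(Bucket=bucket, Key=key)

    # Stream the body 1 MB at a time so memory usage stays flat
    # regardless of how large the object is.
    for chunk in obj["Body"].iter_chunks(chunk_size=1024 * 1024):
        process_chunk(chunk)

    return {"statusCode": 200}
```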
To me it is not clear either. You cannot just ASSUME your payload is large if the problem you want to solve is handling large payloads. From what I inspected of your code, there is no proof showing that it is capable of handling large payloads. Could you prove that to us, please?
Excellent session. Just one suggestion: if you could add the DynamoDB interaction as depicted in the diagram, it would complete the structure.
Hi, one doubt: Lambda async invocation supports only a 256 KB payload, right? What would be a solution for more than 256 KB?
I use Lambda to process nearly 600 MB CSV files from S3 and write them back to another CSV. With a configuration of 3 GB memory, it takes 10 minutes to process a file. Is there any way to speed up the process?
Awesome video Manoj! Are you planning to make a similar video but using Cognito instead of pre-signed URLs?
How do you limit the size of the file uploaded by the user through the signed URL?
Hi, this architecture is amazing. I need help with a Lambda function that downloads a large file from S3 and sends it via a POST request (axios) to a cloud application. Here I am facing an issue with memory and execution time, which automatically increases my billing. Can you please help me with this?
Decent solution
What is the maximum file size that I can upload using this solution?
We can see that the author uses a single PUT operation to upload the file to S3.
According to the docs (docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html):
"Upload an object in a single operation using the AWS SDKs, REST API, or AWS CLI - With a single PUT operation, you can upload a single object up to 5 GB in size."
So the answer is 5 GB.
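For context, a pre-signed single-PUT upload (the case that 5 GB limit applies to) can be issued from a Lambda roughly as in the sketch below; the bucket name, key pattern, and expiry are assumptions for illustration, not the video's actual values:

```python
import json
import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # assumed bucket name, not from the video

def handler(event, context):
    key = f"uploads/{uuid.uuid4()}.csv"  # assumed key naming scheme
    url = s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=300,  # URL valid for 5 minutes
    )
    # The client uploads the file with a single HTTP PUT to this URL,
    # which caps the object size at 5 GB.
    return {
        "statusCode": 200,
        "body": json.dumps({"uploadUrl": url, "key": key}),
    }
```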
Generally, Lambda will time out after 15 minutes. While the executionLambda processes the large data it will also talk to the third-party service; what if that times out?
I guess we can process the data in another Lambda and send the data items to an SQS queue after the data is uploaded to S3; then the SQS messages will trigger the executionLambda one by one (or batch by batch). But in that case, how do we notify the client that the backend has processed all the data?
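If that pattern is used, the SQS-triggered executionLambda could look roughly like this minimal sketch; the message shape and the process_item helper are assumptions for illustration. (Notifying the client, for example by updating a status record the client polls, is a separate concern and not shown here.)

```python
import json

def process_item(item: dict) -> None:
    # Placeholder: write to a DB, call the third-party service, etc.
    pass

def handler(event, context):
    # With an SQS event source mapping, Lambda receives up to the configured
    # batch size of messages per invocation.
    for record in event["Records"]:
        item = json.loads(record["body"])  # assumed: each message body is one JSON item
        process_item(item)

    # Returning normally lets Lambda delete the whole batch from the queue.
    return {"statusCode": 200}
```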
Will the browser freeze in this solution?
But AWS Lambda has a runtime limit of 15 minutes; what if it is not able to process the huge payload within this time span?
Hi Bhisham, AWS Lambda will not be the best choice in such a scenario. Better to think about a containerized approach. Another approach would be to decompose the processing into multiple Lambdas. If you can provide a scenario, I will be able to give a better answer.
@EnlearAcademy Thanks for replying. Basically we want to perform ETL logic, i.e. read an Excel file from S3 having around 50k records, process these records, and pass this data on to another set of Lambdas in batches.
For the container-based approach, can we use Fargate for this?
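One rough sketch of that decomposition, assuming a dispatcher Lambda reads the Excel file from S3 and fans the rows out to SQS in batches for downstream worker Lambdas. The queue URL, the 25-row batch size, and the use of openpyxl (which would need to be packaged with the function) are all assumptions, not the author's design:

```python
import io
import json
import boto3
from openpyxl import load_workbook  # assumed Excel parser, bundled with the function

s3 = boto3.client("s3")
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/etl-batches"  # assumed queue

def send_batch(rows):
    # Each SQS message carries one batch of rows for a worker Lambda to process.
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(rows, default=str))

def handler(event, context):
    record = event["Records"][0]["s3"]
    obj = s3.get_object(Bucket=record["bucket"]["name"], Key=record["object"]["key"])

    # ~50k rows fits comfortably in memory; read-only mode keeps parsing cheap.
    workbook = load_workbook(io.BytesIO(obj["Body"].read()), read_only=True)
    sheet = workbook.active

    batch = []
    for row in sheet.iter_rows(min_row=2, values_only=True):  # skip the header row
        batch.append(list(row))
        if len(batch) == 25:  # assumed batch size
            send_batch(batch)
            batch = []
    if batch:
        send_batch(batch)
```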