This AWS Lambda issue was impossible to debug

  • Published: Jan 9, 2024
  • Got any suggestions for how we could have debugged this better? Leave a comment.
    My Courses
    📘 T3 Stack Tutorial: 1017897100294.gumroad.com/l/j...
    My Products
    📖 ProjectPlannerAI: projectplannerai.com
    🤖 IconGeneratorAI: icongeneratorai.com/
    Useful Links
    💬 Discord: / discord
    🔔 Newsletter: newsletter.webdevcody.com/
    📁 GitHub: github.com/webdevcody
    📺 Twitch: / webdevcody
    🤖 Website: webdevcody.com
    🐦 Twitter: / webdevcody

Comments • 177

  • @festusyuma1901
    @festusyuma1901 6 месяцев назад +31

    I tend to create dev environments on AWS so we can test on the exact same environment rather than running locally. That way we're able to eliminate a lot of the issues. I've spent the last year building different pipelines for different types of projects that enable my team to deploy their projects with one click, so they can actually test their code on the proper environment.
    The issue with the package is that it was probably declaring and using global variables. Global variables are not redeclared on warm starts, so they keep the same value as the previous run (see the sketch below).
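    A minimal sketch (not the library's actual code) of the warm-start behavior described above: anything declared at module scope is initialized once per container and then reused by every warm invocation.
    ```js
    // Module-scope state persists for the lifetime of the Lambda container,
    // so values cached here leak from one invocation into the next.
    let invocationCount = 0; // reset only on a cold start
    let cachedValue = null;  // e.g. a chromium path or browser handle a library might cache

    exports.handler = async () => {
      invocationCount += 1;
      if (cachedValue === null) {
        cachedValue = `initialized at ${new Date().toISOString()}`; // runs once per container
      }
      // On a warm start, invocationCount > 1 and cachedValue still holds the old value.
      return { invocationCount, cachedValue };
    };
    ```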

  • @marwanaouida
    @marwanaouida 6 месяцев назад +22

    I like to always use `npm ci` instead of `npm i` on the CD script to make sure the same version I used locally is used on the server. Many package maintainers ship breaking changes in minor releases.

  • @WebDevCody
    @WebDevCody  6 месяцев назад +6

    I left my space heater running on accident because it was cold in my office, sorry for the background noise for the first 5 minutes 🤣

    • @gamingbud926
      @gamingbud926 6 месяцев назад

      Didn't even notice :)

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      @@gamingbud926 ok good cause that audio noise really bothered me, but I decided to publish anyway 😂

  • @marinajordan8939
    @marinajordan8939 6 месяцев назад +1

    your debugging videos are always great 🔥 keep it up, it's awesome that you're making this information more accessible for everyone

  • @TizzyD
    @TizzyD 6 месяцев назад +15

    Memories... I was the architect for a District Attorney's Office in a major US city a little less than a decade ago. I led the creation of their discovery management system, which collected information provided by the PD's investigators (including docs and multimedia file types), ran it through a review and redaction process, then provided a portal for the defendant's indicated representation to download that content. Of course, full records management was implemented alongside digital fingerprinting, case-by-case security, workflow management and tracking, access logging, etc. It had about 500 active lawyers and paralegals managing the redaction and publishing workflow. So your technical discussion reminded me well of the project and the transformation that took place.

  • @eshw23
    @eshw23 5 месяцев назад

    Hey Cody, I want to say these AWS videos are so good. Cloud computing is a skill even entry-level job seekers like me are sometimes required to know. I learn a lot from you talking about your app. I usually just go the easy route and deploy my apps to Vercel, but watching you talk about this is very helpful. Keep doing it!

  • @rembautimes8808
    @rembautimes8808 6 месяцев назад

    This is a very insightful video. So important to emulate production in test - thanks for helping to reinforce this point.

  • @h4nto772
    @h4nto772 6 месяцев назад +3

    man, second time almost missing the video because of the new thumbnails 😂
    they're fire tho, and the content is on top as always 🙏🏼

  • @sfulibarri
    @sfulibarri 6 месяцев назад +4

    At work we got in the habit of generating a UUID on each cold start, holding it at module scope, and attaching it to the context object that gets written out with every logger call. Makes it really easy to tell if errors are instance-specific (sketch below).
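    A minimal sketch of that pattern, assuming Node 18+ (crypto.randomUUID is built in) and plain JSON logging to stdout:
    ```js
    const { randomUUID } = require("node:crypto");

    const instanceId = randomUUID(); // generated once per cold start, shared by all warm invocations

    function log(level, message, extra = {}) {
      // Every log line carries the instance id, so errors can be grouped per container.
      console.log(JSON.stringify({ level, message, instanceId, ...extra }));
    }

    exports.handler = async (event) => {
      log("info", "invocation started");
      // ... do the actual work ...
      log("info", "invocation finished");
      return { statusCode: 200 };
    };
    ```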

  • @rafavieceli
    @rafavieceli 6 месяцев назад +23

    I always try to avoid puppeteer because of its weight; it makes it easier to transform HTML into PDF, but it has a price. About version upgrades, I am the guy who does that and it's hard to do. I think every project should have a logbook for each dependency, explaining why we need it and its known issues. At the end of the day every dependency is technical debt; some cost a little and others a lot.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +10

      Yeah we keep a running document about what dependencies cause major issues and what type of errors they throw on update. It helps know when to avoid updating certain packages

  • @manchuratt8900
    @manchuratt8900 6 месяцев назад +2

    Very important point. I am anal about ensuring there is never any dynamic versioning for dependencies. I ran into an issue due to dynamic versioning once, and it was such a headache to deal with that I decided to forgo dynamic versioning entirely, as there are often cases where a patch version leads to breaking changes. Since then I haven't faced an issue relating to dependency versioning. It is critical that once you finalize your cargo, nothing is allowed to change unless you do it intentionally and with consideration for possible side effects.

  • @renanlisboa123
    @renanlisboa123 6 месяцев назад

    Your videos have great value. Great job! 👏

  • @reynerloza1630
    @reynerloza1630 6 месяцев назад

    Great video! You're right about containers. A couple of months ago I had to work with that sparticuz package with lambdas, but locally I had to use the regular version of puppeteer. It was hard to debug! Do you use CDK for the project? And how do you manage your stages? One branch per stage? Do you create ephemeral environments for features? Thanks

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      Terraform for everything but I’d love to be able to refactor to sst

  • @Dalamain
    @Dalamain 6 месяцев назад +1

    hey cody great video - at 11:21 what was the magical keyboard combo you used to highlight the carets like that... I'm aware you can select multiple lines vertically but only in a straight line instead of character matching per line? Thanks

  • @IronJmo
    @IronJmo 5 месяцев назад +24

    I save countless hours by not using AWS

    • @socialtraffichq5067
      @socialtraffichq5067 5 месяцев назад

      What do you use

    • @leandroalbero
      @leandroalbero 5 месяцев назад

      Works great until you have to scale into millions of users 😅

    • @IronJmo
      @IronJmo 5 месяцев назад

      @@socialtraffichq5067 I've tried the big 3 and I prefer Azure.

    • @jalako8592
      @jalako8592 5 месяцев назад +6

      @@leandroalbero yeah, that's the main argument I hear regarding AWS. Maybe you and everybody else really do need to spontaneously scale between thousands and millions of users dynamically. But for 95% of the companies and projects I know that use AWS, they still get the best results with adequately sized dedicated servers. It's also way cheaper than AWS.

    • @leandroalbero
      @leandroalbero 5 месяцев назад

      @@jalako8592 Completely valid response, but in my case, this happened

  • @OetziOfficial
    @OetziOfficial 6 месяцев назад

    I had issues creating pdfs from (altered) docx files on lambda, so after 4 days of reading and looking at all sorts of third-party libraries and tooling, I switched to a container (Docker) that has all the needed binaries and tools and that I could also use locally. Then I had to fix a few bugs due to aws-lambda and it worked. Took me several days to hunt that bug down, but it was worth it. Now I don't have to hunt it down anymore.
    Oh and yes, I tend to get rid of the `^` symbol on almost every package. Then I check if there are updates for any of them and update only those where I see no/little difference in prod. Overall a great overview of the aws-lambda issue, and good that you guys figured out how to solve it / how to maintain the same environment for prod / dev / test / staging. It's always good to know that the code running on dev is similar to (or even better: the same as) prod.

  • @giuppidev
    @giuppidev 6 месяцев назад +2

    I hope to bring on my channel the same clarity, you're always INSANELY clear! 🚀

  • @Bl1nkSt3r
    @Bl1nkSt3r 5 месяцев назад

    Hey @Cody, quick question: where can I find documentation for what you mentioned at 12:36? If you can point me to the aws documentation it would be great! I'm curious whether you mean the container has to be the aws base image or whether it can be another image from Docker? Thanks!!

    • @WebDevCody
      @WebDevCody  5 месяцев назад +1

      You can build an image and upload to ecr, then point the lambda to use that image

  • @verte2193
    @verte2193 6 месяцев назад +3

    Chromium on a lambda layer is a tough topic. We were thinking about a container lambda, because it would be possible to run chromium on AL2023, and if I'm not mistaken lambda is already using that image under the hood.

    • @TizzyD
      @TizzyD 6 месяцев назад

      I'd have set up a separate Puppeteer server or dockerized instance that is effectively proxied by the lambda services.

  • @hawkmne
    @hawkmne 6 месяцев назад +3

    Couldn't you have used a docker container only locally so you can run sparticuz and stress test the app? My point being there's no need to deploy the container to Lambda.

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      yeah now that you point it out maybe that could have been a solution to help debug 🧠

  • @samupert
    @samupert 6 месяцев назад

    Hi Cody, very clear explanation of the thought process! I was wondering if you did introduce testing for this type of issue in the cicd pipeline. I think that provisioning the function with the execution concurrency limited to 1 would give you the possibility to test the lambda in the warm state. What do you think about it?

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      that's a cool idea, I didn't think about that, but no, we didn't add any additional tests. I think we'd need some type of load testing to re-simulate the issue. We plan to just refactor to use containers when we get time so that testing locally would be the same as prod

    • @samupert
      @samupert 6 месяцев назад

      @@WebDevCody Sounds to be a good plan, thank you for the message Cody 🤙

  • @latiotech
    @latiotech 5 месяцев назад

    I'm curious if you've looked into an ECS migration to address some of the lambda shortcomings?

  • @lysendertrades
    @lysendertrades 6 месяцев назад

    In our case, since we are using Docker, we have to match the alpine version that has a specific chrome version that is compatible with a specific puppeteer version. Good thing everything can be reproduced locally :D, sort of

  • @siddhanttripathi5224
    @siddhanttripathi5224 6 месяцев назад

    So, how much time did it actually take you guys to debug this issue and resolve it? I did find it interesting that you told us which methods didn't work, but I wanna know what actually worked (if possible).

    • @WebDevCody
      @WebDevCody  6 месяцев назад +2

      I think at some point we ended up just downgrading everything to the previous commits until it started passing; then we knew it was a version update issue. This took maybe a few days of full-time debugging unfortunately

  • @viralgupta7636
    @viralgupta7636 6 месяцев назад

    hey cody, wouldn't stress/load testing have caught this issue in the test environment?

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      Yes, I think so. That is a gap in our testing for sure. It's funny because we use AWS so that we don't have to worry about scaling, since lambda, DynamoDB, and s3 auto scale easily, so you wouldn't think a load test would be very necessary, but I guess in this case it would have helped.

  • @Rizhiy13
    @Rizhiy13 5 месяцев назад

    5:45 Well here is one problem, what is the reason for using different dependencies in dev vs prod? Also, Docker or pinning versions?

  • @roku_on_it
    @roku_on_it 6 месяцев назад

    This was a really good and simple explanation.

  • @cas818028
    @cas818028 6 месяцев назад +4

    Hey Cody, got some feedback for you. I am not sure serverless functions are the best solution for PDF generation. Serverless is great at short, low-cost executions, maybe a db lookup or API call. There are a ton of limitations on Lambdas that you are well aware of, and you pay for the cost of execution. Also, PDF generation has a ton of variables in just that process; to me that is more of a long-running process.
    I probably would have designed the system in a more async fashion: send a request to generate a PDF to an SQS queue, with a long-running Lightsail instance (poor man's version of EC2) polling the queue and processing each message to generate a PDF. On an always-running Lightsail instance you set the hardware specs to whatever you need, based on cost + usage, and you can install whatever binaries you need, big or small. It's also easier to match this for local dev. You can horizontally scale this solution out by adding more Lightsail instances to poll the queue.
    You would also have the advantage that if the process fails, the message ends up in a DLQ, which you can set up monitors for, or a redrive/retry policy. This would make your system more resilient. In the case that the VPS goes down, all your messages would be backed up in the queue for up to 14 days. This gives you a level of fault tolerance: if you have an issue with your VPS, it can go down and messages will just back up into the queue. Once you bring the box back online they should start processing as normal and there would not be any data/processing lost.
    If you need a way to alert the user that a PDF is "ready to download", you can combine this approach with AWS AppSync, using subscriptions to push an event back to the front end (this is where you and I disagree on long polling), or you could long poll s3 or a db to know when it's ready. If you want to dive in further, let me know and I can jump in discord and we can chat.
    With regard to distributed tracing, we use GlitchTip, which is an open source version of Sentry. We self-host it on another Lightsail instance within our platform (this is to reduce/control costs, as Sentry cloud is expensive). Sentry/GlitchTip also has alerting built in, so it makes it a little easier to hunt down these messages. You can also use X-Ray + CloudWatch querying to try to drill into the logs, but this has always been a painful endeavor in my experience.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +5

      we do decouple our PDF generation process from our entire api by using SQS. We connect a lambda as a consumer of that SQS resource and process the pdf events (see the sketch below). The use of SQS allows for retries, dead letter queues, and throttling if needed.
      I agree with your feedback, but in regard to EC2 or Lightsail (dedicated vm or container), the concern is that the PDF processing uses 3rd party libraries which can sometimes use a lot of memory. Lambda allows us to create each PDF in isolation, compared to your dedicated VM suggestion where a couple of bad rogue PDF events could potentially eat all the memory on the machine. Lambda prevents the need to "bring the box back online". If someone tries to generate a huge pdf by accident, only their process is affected, not the entire box.
      I'm not sure I understand your feedback for using a dedicated machine over lambda. A dedicated machine, imo, always involves more engineering costs: it requires a consistent monthly fee to keep it running, it requires more monitoring, and it has higher risk when something goes wrong, which could affect multiple users at the same time. The last thing we want is 1 failed PDF corrupting or killing the processing of 20 other users.
      The only argument I could see for using a dedicated VM is faster processing times. Lambda can be provisioned with up to 10 GB of memory, which also gives it a decent amount of CPU. Scaling is instant compared to any other solution using docker, where the autoscaling group needs time to provision a new box, or needs scaling rules to scale down during nighttime hours. Lambda handles all of this.
      These lambdas do not take much time to generate a PDF. Usually it's 1-4 seconds depending on the size of the PDF.
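      A rough sketch (not the actual project code) of the shape described here: an SQS-triggered Lambda that processes one PDF job per message and reports partial batch failures so only failed messages are retried or routed to the DLQ. The message shape and the worker function are assumptions.
      ```js
      // Hypothetical worker: in the real system this would render with puppeteer and upload to S3.
      async function renderAndStorePdf(job) {
        console.log("rendering pdf for", job.documentId);
      }

      exports.handler = async (event) => {
        const failures = [];

        for (const record of event.Records) {
          try {
            const job = JSON.parse(record.body); // hypothetical shape, e.g. { documentId, templateData }
            await renderAndStorePdf(job);
          } catch (err) {
            console.error("pdf job failed", { messageId: record.messageId, err });
            failures.push({ itemIdentifier: record.messageId });
          }
        }

        // Requires ReportBatchItemFailures on the event source mapping; failed messages
        // are retried and eventually land in the DLQ after maxReceiveCount attempts.
        return { batchItemFailures: failures };
      };
      ```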

    • @cas818028
      @cas818028 6 месяцев назад +3

      @@WebDevCody A lot of fair arguments there. I am curious what your AWS costs are per month though? Having fixed boxes gives you a way to control those costs, versus variable costs based on lambda executions, duration, memory size, etc.
      With regard to a memory leak potentially corrupting an entire box/environment, that is a fair point. My challenge to that is you could start with a basic setup of 2 boxes polling the queue, so there is a bit of redundancy there along with fixed/controlled costs (I work for a startup so this is hyper critical). If a message is not processed successfully it would just back up into the queue and the queue's retry mechanism would kick in, or it would go to the DLQ. Both provide a means to recover. Since you have 2 boxes polling the one queue, there is a likelihood that one of those boxes will end up processing the message successfully. You would implement processes to dispose of problematic libraries over time to minimize any issues. But yes, memory leak/corruption could be a real thing. You can also use worker threads or child processes in node to create a bit of process isolation, but it would still share the box's memory.
      The lambda does give you a bit of a sandbox, but you have to deal with other trade-offs of cost, execution duration limit, memory size (like you mentioned) and others. Pros vs cons to both, I suppose. As much as I am concerned about high system availability, I am also very concerned about runaway cloud bills, and this can easily happen if you're not careful with serverless. I default to serverless for everything btw, unless there are too many risks of processes chewing up a ton of resources and being long running. Then I switch to Lightsail instances.

    • @parlor3115
      @parlor3115 6 месяцев назад

      Saving this for the future 😁

  • @adomicarts
    @adomicarts 6 месяцев назад

    I ran into this same bug with firebase cloud functions. To run puppeteer I had to set up chromium, and after setting that up it exceeded the cloud function size limits.

  • @ryanway9928
    @ryanway9928 5 месяцев назад

    I have a few questions about some things mentioned:
    1. What is the motivation for having someone update all the packages weekly? I would kind of expect things to break often if library versions are changed often. Is this used to stay ahead of security concerns?
    2. You mentioned that it was hard to pinpoint which library bump was causing the issue. But shouldn't the logs have indicated that it was some sparticuz call that was failing?
    3. How was it actually determined that sparticuz was the issue? You mention that it was caching within the lambda runtime with that particular library, but what test finally determined that? Or maybe there was a bug reported against the new version of sparticuz.
    4. Lastly, I think if your prod runs in aws and you are not able to develop against an aws account, you would run into tons of issues like this, where you can't reproduce bugs occurring in live environments, or something developed locally doesn't deploy properly to your testing environment, etc.

  • @roguesherlock
    @roguesherlock 5 месяцев назад

    You explained the architecture and the bug so elegantly!
    1. it's weird that it started breaking for a minor version. Does it follow a semver policy? I assume you also reported the bug upstream?
    2. Do you have a checklist you go through when you start getting errors? Say you get an email that you've got 1k errors today, what next steps do you take? I get that essentially we want to a) put the server into a known state, and b) figure out what went wrong. But there are so many steps in between: check logs, check error traces, check db status, check package updates, check git commits, ask the team, etc.

  • @dareolumide8287
    @dareolumide8287 6 месяцев назад +1

    I had the exact same issue with Lambda and puppeteer around new year's eve. My solution was to have the frontend render the PDF instead.

  • @danddro
    @danddro 6 месяцев назад +2

    I am using the same setup, and every few thousand HTML-to-PDF conversions I get a puppeteer "error creating the browser process", but fortunately only once, so I've just added a retry (like the sketch below) since it's too difficult to track down.
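    A minimal sketch of that kind of retry wrapper, assuming puppeteer and a simple linear backoff (the real error string and launch options depend on the setup):
    ```js
    const puppeteer = require("puppeteer");

    async function launchWithRetry(attempts = 3) {
      let lastError;
      for (let i = 0; i < attempts; i++) {
        try {
          return await puppeteer.launch({ headless: true });
        } catch (err) {
          lastError = err; // e.g. the occasional "error creating the browser process"
          await new Promise((resolve) => setTimeout(resolve, 500 * (i + 1))); // simple backoff
        }
      }
      throw lastError;
    }

    // usage: const browser = await launchWithRetry();
    ```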

    • @elmalleable
      @elmalleable 6 месяцев назад

      modern problems, modern solutions

  • @jackweaver5840
    @jackweaver5840 6 месяцев назад

    Hey, I have a small question since you are familiar with AWS. I have an app hosted directly on an EC2 instance; what exactly do I need to do in order to update that same instance with the new application code after adding new features? Is connecting to it and running git pull master enough? I would appreciate it if you could elaborate on this matter, thanks a lot.

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      Yes, a manual process (or simple automated script) like that will usually work fine. AWS also has Elastic Beanstalk, which automates this whole process. I personally would use that if you really want an ec2 instance running your stuff

    • @jackweaver5840
      @jackweaver5840 6 месяцев назад

      @@WebDevCody I was curious how to update the new code inside the ec2 directly, I prefer working with ecs since it is more manageable imo.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      @@jackweaver5840 I'd just write a script that you can run after you ssh into the machine. It should probably stop your service, run npm ci, then run pm2 reload. I'd use pm2 to keep your services always running.

  • @SeibertSwirl
    @SeibertSwirl 6 месяцев назад

    Good job bub❤

  • @camilocoelho7097
    @camilocoelho7097 6 месяцев назад

    Nice video. I also use a lot of Lambda. To run my tests locally I use AWS SAM. It spins a container with the same image that will be used on the Lambda, regardless of your own OS. Did you try that?

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      I haven't tried that, our system is 5 years old so we need to re-evaluate our approach to some things at this point. there are many new tools that make developing with lambda much easier, such as SST, etc.

    • @camilocoelho7097
      @camilocoelho7097 6 месяцев назад

      @@WebDevCody I understand. But, give AWS SAM a shot. I also use its `sam local generate-event`

    • @YehiaAbdelmohsen
      @YehiaAbdelmohsen 6 месяцев назад

      @@WebDevCody Was about to comment this. I strongly recommend you consider AWS SAM. One of the systems I'm working on generates word documents from markdown strings and I use a package called pandoc for that. So I just install it in the Docker image and I'm good. Removes the headache of lambda layers, makes the code more manageable, and way easier to just test locally with predefined events.

  • @TheWhoIsTom
    @TheWhoIsTom 6 месяцев назад

    Wow, awesome content. I'm a beginner, but at some point I will remember this if I run into issues.

  • @jimmygore8214
    @jimmygore8214 6 месяцев назад +2

    This is unrelated to the actual bug, but I often find that using headless browsers comes with a lot of risks. It seems like the state of the webpage is super unpredictable.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +3

      possibly, but we have a suite of tests which generate the pdfs and do an image to image comparison to make sure we didn't accidentally break anything

  • @MichaelMourao
    @MichaelMourao 5 месяцев назад

    You mentioned that you saw errors in the production logs. Do these errors not contain information about the original input, i.e. the pdf itself, or the associated user data + metadata, that caused that particular invocation and all the ones using the same warm lambda afterwards to fail? It's rare for bugs in code (even third-party code) to sometimes error and sometimes run correctly when you run them with the same input on a "cold" lambda. So it sounds to me like you need good telemetry in order to find the original call that broke the rest, and then hopefully also have access to use that pdf in your test/staging environments in order to reproduce.
    Also, AWS provides docker images and dockerfiles for all their lambda runtimes, and you should be able to mock everything up locally using the same stuff that runs on production, even if you are on a Mac. But to my previous point, if you don't have some sort of information on the input that started it all, it's basically like looking for a needle in a haystack, regardless of whether it's locally or in the cloud. Even if things got "fixed" when changing the version of the third-party library, there's no guarantee if you don't know the root cause.

  • @yaaaayeet745
    @yaaaayeet745 6 месяцев назад +1

    thumbnail is fire 🔥

  • @rsfllw
    @rsfllw 6 месяцев назад +1

    08:25 An unknown underlying cached resource breaking otherwise working code sounds a bit off to me. I haven't used sparticuz so I can't guess whether it's an issue in the npm package or the client code, but there's a definite idempotency concern here (i.e. it should always work and give the same result no matter how many times you call it).

    • @rsfllw
      @rsfllw 6 месяцев назад

      ok so we're talking about a bug in the new version of the sparticuz npm package? For a business-critical pdf generation service this seems like a team issue as much as a library issue (to be fair, the node ecosystem is a piece of shit and introduces breaking changes with glee)

  • @okerror1451
    @okerror1451 5 месяцев назад

    Is serverless actually worth it ? That is my question

  • @ryanquinn1257
    @ryanquinn1257 6 месяцев назад

    Cool new thumbnail theme!

  • @andrewjknott
    @andrewjknott 5 месяцев назад

    Having all staging environments be local and not include all packages used in prod is a recipe for bugs. Have a temporary pre-prod staging environment that is very close to production -- in this case run on AWS with all of the same packages and a copy of prod data.

  • @aloiscrr
    @aloiscrr 6 месяцев назад

    I have a question. Why don't you use browserless? Is it a requirement to have Puppeteer tightly coupled to the codebase?

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      What library are you recommending I use to generate a pdf without puppeteer?

    • @aloiscrr
      @aloiscrr 6 месяцев назад

      @@WebDevCody It's called Browserless. It's open source and allows you to use a Puppeteer or Playwright instance as an API

    • @aloiscrr
      @aloiscrr 6 месяцев назад

      It's called browserless, think of it as a wrapper of puppeteer or playwright that exposes an API @@WebDevCody

  • @g.c955
    @g.c955 6 месяцев назад

    Sounds like that pdf function shouldn't be using lambda, and instead should have its own instance and use puppeteer?

  • @pralay1991
    @pralay1991 6 месяцев назад

    Is the issue not because the code promoted to PROD is not the code you are working with?

  • @wes7bg
    @wes7bg 6 месяцев назад +1

    could someone suggest other channels like this one, I'm hooked and I watched all of his content..

  • @FidelisOjeah
    @FidelisOjeah 6 месяцев назад +1

    My 2 cents here:
    You can set up your stack so that each engineer can deploy their own stack into AWS (instead of running locally).
    The idea behind this is: think about effects that cannot exist locally (like your writes to S3, events from SQS etc).
    To save resources, you could have shared stacks (your S3 etc only deployed to shared stack)
    You do have `--save-exact` when running `npm install` too (you covered this).

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      I do think that's an option, but idk how I feel about depending on real aws services just to add features locally. Right now our entire application can be run with a locally running s3 service, elastic search instance, dynamodb database, etc. This allows us to easily run an entire suite of tests without depending on any external api requests, and developers can contribute with zero aws knowledge or access. Obviously there are downsides to our approach (this bug in the video), but spinning up an entire isolated environment for each developer would require a lot of cost, onboarding, and time waiting (our project takes maybe 50 minutes to deploy from scratch using terraform, waiting on clusters to provision, etc). Not saying your idea isn't worth trying, I'm just having a hard time imagining how well it would work I guess.

    • @FidelisOjeah
      @FidelisOjeah 6 месяцев назад

      I'd say: give it a try and see.
      fwiw, each developer does not need to actually deploy everything; more expensive services would live in a shared dev stack and get referenced from the individual dev stacks

    • @magicsmoke0
      @magicsmoke0 5 месяцев назад

      We do this in my team. It does cost more, but we had so many bugs on live sites that we really needed to eliminate variables and have confidence in our code. IaC + dev stamps work really well

    • @jeffreyt999
      @jeffreyt999 5 месяцев назад

      @@WebDevCody Our staging/test env(s) are deployed to aws the same way in prod. So any QA or dev can use this shared space for tests or bug reproductions.
      Also, I am curious about the 250mb limit. I get that one can limit the size for cost purposes, but it's not very expensive to bump up to double that size.
      Surprised at the 50-minute deployment durations. We use cdk for aws deployments and it takes less than half that time to deploy.

  • @mokusatsu8899
    @mokusatsu8899 6 месяцев назад

    I thought the AWS lambda memory limit was 10,240MB, not 250?

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      250 MB is the deployment size limit. If your code and binaries exceed 250 MB of disk space you can't deploy without using a container

  • @chebrubin
    @chebrubin 5 месяцев назад

    Why would you use node instead of Java for secure document generation?

    • @WebDevCody
      @WebDevCody  5 месяцев назад

      because java sucks

    • @chebrubin
      @chebrubin 5 месяцев назад

      @WebDevCody lol, using an npm library to generate legal documents. Dangerous. Java dependencies and this type of work belong on the server side. Keep the lambda, drop the js.

    • @WebDevCody
      @WebDevCody  5 месяцев назад

      @@chebrubin dangerous because of what? It's not useful to claim that java is more secure than node without any evidence. I recall a log4j issue a while back that was a huge problem in the java ecosystem

    • @chebrubin
      @chebrubin 5 месяцев назад

      @WebDevCody let's talk about this publicly, no comments hidden. In the legal world they will be much happier with a NIST- and FIPS-certifiable JVM from IBM. No NodeJS express runtime can say that. It baffles me why a standard vanilla requirement like legal document PDF generation would not be done on Java 17 and call it a day. No mysterious node runtime breakdowns. An npm dependency's exception bringing down the whole runtime is a fail.
      We can debate this all day. Talking about some old Java 1.4 library vulnerability is misleading. No one is recommending building lambdas with Java and tons of dependencies. To the contrary, the modern resilient JVM can probably generate a PDF without 3rd party libraries, which is what your legal customers want. They want strong industry-standard key 🔑 stores and JVMs certified by the government.

  • @_Aarius_
    @_Aarius_ 6 месяцев назад

    I had to do some pdf generation from DOCX files at work, and I used a docker image that ran libreoffice headless to do it all, because otherwise that + some other binaries we needed never would've fit within lambda's normal limit.
    lambda docker is great tbh

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      yeah I'm glad they added support for containers in lambda; game changer

  • @neociber24
    @neociber24 6 месяцев назад

    Didn't know Lambdas can now use docker images; good to know, but for sure you will use a lot of MB of disk

  • @deepak1660
    @deepak1660 6 месяцев назад

    Hey Cody, your video couldn't have come at a better time. I'm trying to find a decent solution for a problem I'm facing. I allow the user to upload images; the images get uploaded directly to s3 from the client (using presigned urls). I also display a list of uploaded images to the user in the client. Now my problem came when I created a lambda function that generates a low-quality and a high-quality version of the image uploaded by the user. The image I used to show the client was the original image, but now I want to show the low-quality (thumbnail) version, and that takes ~4-6 seconds to generate using the lambda function. How can I know when the lambda function completes its execution? The only solution I can see is polling, but imo that's a poor solution because I don't know how long to poll for (the duration varies depending on whether the lambda function has a cold start or a warm start). Would love a suggestion from you, thanks!

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      aws has websocket support using api gateway; that's what we do at work. Basically what you could do is: your client connects to your websocket endpoint, then your client uploads the file to the bucket (with ws connection id metadata), then your lambda processes that bucket object to scale it, then you get the connection id from the original image object and send an event to the user who originally uploaded the object (see the sketch below).
      but, that's a lot more complicated than just doing polling. You could just keep polling until your scaled images finish. I'm assuming your scaled images are on a public s3 bucket, so you could just keep doing a fetch request every 2-3 seconds until you get back a 200 status code, then display it and stop polling.
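      A minimal sketch of that last push step, assuming AWS SDK v3 and an API Gateway WebSocket API (the endpoint env var and payload shape are assumptions, not the actual project code):
      ```js
      const {
        ApiGatewayManagementApiClient,
        PostToConnectionCommand,
      } = require("@aws-sdk/client-apigatewaymanagementapi");

      const client = new ApiGatewayManagementApiClient({
        endpoint: process.env.WEBSOCKET_ENDPOINT, // https://{api-id}.execute-api.{region}.amazonaws.com/{stage}
      });

      // Called by the image-scaling lambda once the thumbnail has been written to S3.
      async function notifyClient(connectionId, payload) {
        await client.send(
          new PostToConnectionCommand({
            ConnectionId: connectionId,
            Data: Buffer.from(JSON.stringify(payload)), // e.g. { type: "thumbnail-ready", key: "..." }
          })
        );
      }

      module.exports = { notifyClient };
      ```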

    • @deepak1660
      @deepak1660 6 месяцев назад

      Thanks for the solution! Yeah definitely doable but I think for now I'll just go for polling as implementing ws is relatively time consuming.
      The scaled images are on a private s3 bucket behind a cdn and I fetch them using a cloudfront signed url, so I can poll using the signed url every 2-3 seconds until I get a 200. Thanks a lot, greatly appreciate your response and love your content! 🙏

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      @@deepak1660 this might be a fun video content idea. If I get time I may implement this using websockets

    • @deepak1660
      @deepak1660 6 месяцев назад

      @@WebDevCody I would love that! 😄Btw, according to your solution, how would the websocket endpoint know when the lambda function completes? My lambda function triggers on upload to my primary bucket, the lambda then generates and puts the new images on a secondary bucket. The endpoint and the lambda function are disconnected so not sure how the endpoint can keep track of when the function completes

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      @@deepak1660 the lambda invokes the aws websocket sdk to send the event to the connectionId

  • @itsanishjain
    @itsanishjain 5 месяцев назад

    Why are you using Puppeteer to create pdfs? Why not some external API like pdfShift? I heard they have an npm package too. Curious to know because I am also working on a canvas-pdf related project.

    • @WebDevCody
      @WebDevCody  5 месяцев назад +1

      Puppeteer lets us write our pdfs using jsx and css, which is the same tech our entire project uses. It makes it much easier to manage our pdfs

  • @phaneendhraajaythota1025
    @phaneendhraajaythota1025 6 месяцев назад

    good insight..

  • @justintefteller2780
    @justintefteller2780 6 месяцев назад

    Why don't you just use the lambda to control turning on/off of an ec2 instance that runs the full puppeteer package which does pdf generation with s3 storage?

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      Adding an ec2 instance feels like extra complexity and maintenance to me

    • @justintefteller2780
      @justintefteller2780 6 месяцев назад

      I completely agree. My interest is in keeping the environments similar. Having different packages across test/staging and prod is a red flag to me. I think you could simplify it to something like what GCP has: use a pubsub message to control a Cloud Run instance that has the full Puppeteer, pdf generation, and s3 storage api call capabilities (I'm more familiar with GCP than AWS, so translate accordingly). This will give you the same package across all three environments, including local, all inside a docker image that isn't environment specific (as opposed to prod running in a lambda that can only use x version of y package).

  • @chisomprince5345
    @chisomprince5345 6 месяцев назад

    Google cloud run is my goto for pdf generation

  • @iiNoorbd
    @iiNoorbd 6 месяцев назад +1

    Why are you using puppeteer to generate PDFs, bruh...

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      What do you use in your projects? We create the templates of all our PDFs using React, then we convert those to html to generate the PDFs (sketch below). This approach makes managing the PDFs much easier than having to learn some PDF-specific language, especially since our entire application uses React + JSX.
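      A minimal sketch of that React -> HTML -> PDF flow, assuming react, react-dom, and puppeteer are installed; InvoiceTemplate is a hypothetical component, and React.createElement is used so the sketch runs without a JSX build step.
      ```js
      const React = require("react");
      const { renderToStaticMarkup } = require("react-dom/server");
      const puppeteer = require("puppeteer");
      const { InvoiceTemplate } = require("./templates/invoice"); // hypothetical JSX template

      async function renderPdf(data) {
        // Render the React template to a static HTML string.
        const html = `<!DOCTYPE html>${renderToStaticMarkup(React.createElement(InvoiceTemplate, data))}`;

        const browser = await puppeteer.launch({ headless: true });
        try {
          const page = await browser.newPage();
          await page.setContent(html, { waitUntil: "networkidle0" });
          return await page.pdf({ format: "A4", printBackground: true }); // Buffer with the PDF bytes
        } finally {
          await browser.close();
        }
      }

      module.exports = { renderPdf };
      ```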

    • @iiNoorbd
      @iiNoorbd 6 месяцев назад +1

      @@WebDevCody I am using Maroto; it's built with Go and pretty barebones, but it gets the job done.
      I guess for your use case of creating templates via React + JSX (making fancy pdfs), it would require a browser to render and extract.
      The only suggestion I could make is to build or use something less bloated than puppeteer, for example you could try something like go-rod which is more lightweight and will result in shorter execution times, shorter cold starts, etc. but then again, it really depends how much this service is valued within the entire context of your business.
      Also maybe you want to consider ECS with fargate instead of lambda

  • @shubhamb6131
    @shubhamb6131 6 месяцев назад +1

    Good catch😦🫡👍🏻🙌🏻

  • @genechristiansomoza4931
    @genechristiansomoza4931 6 месяцев назад

    Managing dependency is such a nightmare

  • @nuttygold5952
    @nuttygold5952 6 месяцев назад +1

    Would have been easier if it was a monolith (aka not drinking the lambda coolaid)

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      what happens when one pdf process eats up all the memory on the machine? I think you're oversimplifying the issue

    • @nuttygold5952
      @nuttygold5952 6 месяцев назад +1

      @@WebDevCody just because something is a monolith doesn't mean you can't delegate to another machine, which can have timeouts and other constraints.
      Don't know what to tell you, the world is a much bigger place than node or lambdas

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      @@nuttygold5952 the world is also bigger than thinking that putting everything inside a single go server that does everything is the ultimate solution.
      Real issues like resource contention, monitoring, scaling, orchestration, versioning, and load balancing come into play when you want to run a single monolith. I think lambda has really good use cases, and using it isn't necessarily drinking the Kool-Aid.

  • @gilneyn.mathias1134
    @gilneyn.mathias1134 6 месяцев назад

    "minor versions shouldn't break things, but who knows"
    Yeah... I also just suffered with a vue 3 minor version update that broke another package, luckily I caught the problem while testing the update 😂

  • @nikako1889
    @nikako1889 6 месяцев назад

    Can you talk about internationalization in nextjs ? next-intl

  • @MurderByProxy
    @MurderByProxy 6 месяцев назад

    What do you mean when you say binaries? aren't you using JS/TS?

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      Node has the ability to call lower level languages, and often those require compiled binaries to exist on the system

  • @seiuwatches
    @seiuwatches 6 месяцев назад

    Good work on the thumbnail! I prefer this over something that Theo is doing, way too clickbaity.

  • @CaimAstraea
    @CaimAstraea 6 месяцев назад +1

    lambdas cause more issues than they solve. just have an always on server to run requests .. it's like some entity gaslighted and overpromised us into this lambda ecosystem

  • @uzair004
    @uzair004 6 месяцев назад

    Don't you use something that emulates lambda-like environment on local? i.e lambda-local, serverless offline

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      we used serverless offline at one point, but it became too slow since we have over 200 endpoints, so now we just run an express node server locally, and we deploy it as a mono lambda when running in aws (see the sketch below).
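      A rough sketch of that "express locally, mono-lambda in AWS" setup, assuming the serverless-http package is used as the adapter (an assumption, not the project's actual code):
      ```js
      const express = require("express");
      const serverless = require("serverless-http");

      const app = express();
      app.get("/health", (req, res) => res.json({ ok: true }));
      // ...the other ~200 routes register on the same app...

      if (require.main === module) {
        app.listen(3000); // local development: a plain node server
      }

      module.exports.handler = serverless(app); // AWS: one Lambda fronting the whole API
      ```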

  • @eigbit
    @eigbit 6 месяцев назад

    don't you think weekly package.json updates is a bit much? I can understand monthly, but feels like updating package.json especially w/ dependencies that only have 1 or 2 maintainers is a recipe for disaster

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      yeah maybe once a month might be a good middle ground

    • @elmalleable
      @elmalleable 6 месяцев назад

      right, manually is uncomfortable for me, if i ain't broke or having some security issues, out of sight, out of mind

  • @MurderByProxy
    @MurderByProxy 6 месяцев назад

    Holy shit look at all those dependencies... no wonder why shit is breaking lol

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      It’s pretty common to have a lot of dependencies in larger projects; this isn’t even my work code I’m sharing

  • @meni181818
    @meni181818 6 месяцев назад

    the main thing is that the local/stage env should be identical to prod

  • @raident29
    @raident29 6 месяцев назад

    this is an underlying problem with js packages, they're not stable.

  • @Drewyurr
    @Drewyurr 6 месяцев назад

    Last slack to my boss literally 10 min ago: "Can you send over a couple of the urls for the failing pdf gen’s when you get a chance"... lol

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      🤣 if someone wants to make a successful SaaS business, provide a simple and reliable PDF generation endpoint that takes in HTML

  • @MarkJKellett
    @MarkJKellett 6 месяцев назад

    Renovate is a good tool to auto-update deps

  • @yassinesafraoui
    @yassinesafraoui 6 месяцев назад

    Why don't you use a VPS instead of lambda, since you're running into all these limitations? I'm not saying this because it's cheaper; in that case you simply wouldn't have the 250mb limit. Is it because of the limited bandwidth users would get, since the VPS won't be close to them on average while lambdas always are? Or is it that a VPS is harder to manage? In cases where you're really limited by the lambda's capacity, that feels to me like enough justification to afford the overhead of managing a VPS. Again, I know you already talked about why you don't always use VPSes just because they're cheaper; I'm asking what made you still avoid them even in this case. Thanks

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      We try to use lambda for everything so that we don’t have to do any of the management ourselves of the machines. At the very least, I’d settle for a container running on lightsail or some other type of auto scaling container host to achieve our needs, but if we can stay within the walls of lambda I personally think it’s less overhead. Maybe I’m just lambda red pilled; that’s a possibility

    • @yassinesafraoui
      @yassinesafraoui 6 месяцев назад

      @@WebDevCody I understand, thank you

  • @someguyO2W
    @someguyO2W 5 месяцев назад

    But it works on my machine!

  • @karlostj4683
    @karlostj4683 6 месяцев назад

    Here's the failure of open-source in one quote: "...although it's like maintained by like one person..."

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      Absolutely, all these packages are abandonware

  • @avwie132
    @avwie132 5 месяцев назад

    TLDR: staging is useless if it isn’t the same as prod

  • @SubZero101010
    @SubZero101010 6 месяцев назад

    May I ask how one can earn money by creating PDF documents from websites? The use case is not entirely clear to me. Since the question is likely related to your job, you don't have to disclose this, of course.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      we don't make money from generating PDFs, we make money by working with a client who needs a system for their judicial system; it just happens their workflow includes a lot of PDF related processing.

  • @over1498
    @over1498 6 месяцев назад

    Lol AWS was a mistake. 2024 is the year of the great “damn this is expensive to maintain, back to on prem!” migration

  • @aburt79
    @aburt79 5 месяцев назад

    One person a week going through and updating dependencies? Renovate bot solves that for us

    • @WebDevCody
      @WebDevCody  5 месяцев назад

      Only works if you fully trust update won’t break your app, which sometimes they do

    • @aburt79
      @aburt79 5 месяцев назад

      Yeah, that's going to happen even with one developer you are dedicating each week. Start small, have it just do minor patches or something. I dunno, it just seems insane that one developer is spending their time on that.

    • @WebDevCody
      @WebDevCody  5 месяцев назад

      @@aburt79 im not sure I understand your argument. If you let a tool automatically update your dependencies, do you just merge them into your branch and pray it doesn’t break anything or do you actually manually verify the changes?

    • @aburt79
      @aburt79 5 месяцев назад

      We have tests, we don't pray

    • @aburt79
      @aburt79 5 месяцев назад

      My point is if you're dedicating a developer to this, you're focusing on the wrong thing

  • @daithi007
    @daithi007 5 месяцев назад

    Test one thing, release another. Derp.

    • @WebDevCody
      @WebDevCody  5 месяцев назад

      We released the exact same code to three identical environments (dev, test, stg) and tested before prod. Not sure what else you would expect us to do. The bug didn’t show up until a specific situation we didn’t have automated tests to verify

  • @SegevKlein
    @SegevKlein 6 месяцев назад

    you should've used aws sam

  • @vyrwu
    @vyrwu 6 месяцев назад

    you can easily reach 250Mbs - yeah right, using Node xD

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      🤣 I will say you can deploy a single bundled node api in a couple of MB. Puppeteer really tacks on the heavy binaries.

  • @hocky-ham324-zg8zc
    @hocky-ham324-zg8zc 5 месяцев назад

    The company that uses the free software maintained by 1 guy to generate revenue is complaining that the free software maintained by 1 guy is maintained by 1 guy… make some PRs then?

  • @karlostj4683
    @karlostj4683 6 месяцев назад

    So in your test and staging environments, you never used all the components your app uses in the production environment. And I'm going to guess that you likewise never scaled your testing to hundreds or thousands of "test users". So the lessons you hopefully learned were: (1) Always test all the components of your software "system" and (2) Find a way to test at User Scale.

    • @WebDevCody
      @WebDevCody  6 месяцев назад +1

      In test and staging we use the same as prod, but yes I think testing with user scale was the key and might warrant a load test in the future

    • @karlostj4683
      @karlostj4683 6 месяцев назад

      @@WebDevCody It's not easy to do. It's not like AI can create AIusers. Although maybe that's something a Ph.D candidate could do as a thesis...

  • @micha6204
    @micha6204 6 месяцев назад +1

    clickbait title

    • @WebDevCody
      @WebDevCody  6 месяцев назад

      hate the game not the player; I talked about a bug with AWS lambda in this video, so I'm not sure what makes it click bait.

    • @discoRyne
      @discoRyne 6 месяцев назад

      replybait comment