Two Pieces of Code That Nearly Got Me Fired From Amazon (Principal Engineer)

  • Published: Jan 8, 2025

Comments

  • @ALifeEngineered
    @ALifeEngineered 21 days ago +7

    Use coupon code engineered at nordpass.com/engineered to get a free 3-month trial of NordPass Business, no credit card required.
    🚀 Get promoted in 2025 by taking my FREE 5-Day Promotion Accelerator Challenge - geni.us/9P7CAM
    💥 Continue the conversation on my Discord server with like-minded ambitious tech professionals. #accountability is *chef's kiss* and #wins is motivating - discord.gg/HFVMbQgRJJ
    📈Transform your tech career with my free weekly newsletter - alifeengineered.substack.com/

  • @benfung9571
    @benfung9571 20 days ago +25

    Actually, this is so encouraging: executing the equivalent of an rm -rf command and still being able to reach principal level.

    • @OP3Beats
      @OP3Beats 14 days ago

      I would go so far as to say that he didn’t get promoted in spite of this but it probably helped his career.
      1. Highlighted the importance of the system he created.
      2. He figured out a way to recover and save face while under immense pressure.
      3. Did some cross-org coordination that shows leadership qualities.
      4. Everyone probably knew who he was after that and getting the spotlight helps you get promoted.

  • @lurnt5763
    @lurnt5763 20 days ago +12

    that passive aggressive interaction with s3... definitely came from experience

  • @aben62
    @aben62 20 days ago +6

    Folks, this is how Steve would answer in an interview. It is really valuable for two things: which failure you pick, and how you describe it in a clear manner. Thanks, Steve!

  • @Phanboy
    @Phanboy 20 days ago +22

    That interaction with services team is 💯. Make no assumptions, treat everyone like AI 😂

  • @StretchyDeath
    @StretchyDeath 20 days ago +5

    I love a great SEV story. Thanks for sharing!

  • @drew_echo
    @drew_echo 20 days ago +16

    The learning from the mistakes part is important. Once we had a principal level engineer with a prod breakage rate that was measurably 10x higher than anyone else in the company. The company took the blameless culture too far & each time the answer was what can we do to prevent this from happening rather than addressing the elephant in the room. We ended up spending significant resources babyproofing everything for one engineer rather than surgically operating on the root cause. "Lightsaber night is cancelled. Thanks Todd!": If you aren't willing to act on gross recklessness, then the organization will build layer on top of layer of bureaucracy that punishes everyone. We extended procedural due diligence by two weeks or more to release changes for the entire organization to blamelessly prevent one engineer from breaking prod nonstop.

  • @Rammcesh
    @Rammcesh 18 days ago +1

    This is very valuable. Thank you for sharing!

  • @tamasbalint1597
    @tamasbalint1597 20 days ago +1

    Thank you, Steve for openly sharing these experiences. I enjoyed the video. Well crafted.

  • @placeholder-k9n
    @placeholder-k9n 20 days ago +4

    I had a feeling right when you mentioned the chunking logic that this was going to be a case of a script gone rogue due to a character. Everyone loves Little Bobby Tables, after all :)
    Seriously though - I've only seen SEV2 in my time so far. I can't imagine being at the center of a SEV1.

  • @steelplexfyro
    @steelplexfyro 20 days ago +2

    Great storytelling, explanations, and video!

  • @NeoPsMj
    @NeoPsMj 15 days ago +1

    Now this is what we call great content 🙌

  • @BitCloud047
    @BitCloud047 20 days ago +1

    Awesome video as always man!

  • @jhors7777
    @jhors7777 20 days ago +1

    Great video and channel Steve. Thanks so much. You have a wonderful gift for communication.

  • @randxalthor
    @randxalthor 20 days ago +1

    Great insights! Thanks for sharing your experiences so that we can all avoid making the same mistakes.

  • @souravpoddar6739
    @souravpoddar6739 14 days ago +1

    Me watching this right before I go on-call tomorrow, having dealt with a customer-impacting issue last Friday :D.
    PS: I am part of the Media Ingestion and Processing team in Prime Video.

  • @infini.tesimo
    @infini.tesimo 20 days ago +2

    Subbed for the thumbnail meme, stayed for the knowledge.

  • @halloyves
    @halloyves 19 days ago +1

    Thanks for sharing! Yes, today's world offers a lot more possibilities to prevent such issues. Our infrastructure, for example, is fully event-driven, and even when something breaks, we still have the dead letter queue. Great time to be alive! :)

  • @fairnut6418
    @fairnut6418 20 days ago +1

    Great video!

  • @jiwa-f8s
    @jiwa-f8s 20 days ago +1

    Great video!!

  • @bstancel12
    @bstancel12 20 days ago +1

    Your description of the interaction between you and the S3 department sounds just a tad better than every interaction with AWS Business Support.

  • @salernod2812
    @salernod2812 19 days ago +1

    The swiss cheese analogy is widely used in aviation to explain that accidents are never the result of a single error.

  • @IkraamDev
    @IkraamDev 19 days ago +2

    3 years as a software engineer and the worst thing I have done was a CSS styling bug that hid an add-to-cart button on mobile viewports.

  • @eglobalsystems2554
    @eglobalsystems2554 18 days ago

    The 2nd disaster shows why SDETs are an important part of the application.

  • @kane_lives
    @kane_lives 19 days ago

    I really don't understand how the 2nd issue made it to the production environment. Boundary-value analysis is testing 101, virtually any testing book covers it circa chapter 1.

  • @skjoldgames
    @skjoldgames 12 days ago

    That first example must have left the deepest pit in your stomach! I would have been fighting back tears personally.

  • @anjunzhouJack
    @anjunzhouJack 20 days ago +1

    awesome vid. as a mid-level engineer, what's in the video resonates with me. and hopefully i'll not be at the end of a sev1.

  • @nikhiljain1113
    @nikhiljain1113 20 days ago +2

    By any chance, did you work with Ethan Evans? He shared a similar story in another podcast.

  • @silv3rArrow
    @silv3rArrow 16 days ago

    Would it not have been possible to provide X supported workflows to the distribution companies so the script could include all the possible combinations?

  • @factorfitness3713
    @factorfitness3713 15 days ago +1

    When you cause a SEV1 and need to do a COE, you'd better CYA.

  • @kcnl2522
    @kcnl2522 20 days ago +1

    9:33 you did not hesitate even a little bit before entering that quantity? 😂

  • @AndrewSunada
    @AndrewSunada 15 days ago

    That seems like a documentation problem on the S3 side.

  • @H3110W0rd-j
    @H3110W0rd-j 18 days ago

    How can teams justify the importance of having a testing environment? The work isn't visible to our customers. Test engineers are often treated as second-class citizens in terms of career paths, salary and visibility.

  • @l.1416
    @l.1416 14 days ago

    But the case of a file bigger than 5 GB, and this delete code path, never got tested?

  • @nhienle5137
    @nhienle5137 20 days ago +1

    So could you please explain what you did to solve the last incident? I just want to understand what you guys did to fix it.