Why Is My App SLOw? Defining Reliability in Platform Engineering • Jez Humble • GOTO 2023

  • Published: 2 Aug 2024
  • This presentation was recorded at GOTO Aarhus 2023. #GOTOcon #GOTOaar
    gotoaarhus.com
    Jez Humble - SRE at Google Cloud & Lecturer at UC Berkeley @JezHumble
    RESOURCES
    continuousdelivery.com
    github.com/jezhumble
    / jez-humble
    / jezhumble
    sre.google/resources
    ABSTRACT
    Platform engineering is all fun and games until platform customers start complaining about their apps running slowly. Is it the app code or the platform?
    This talk looks at how Google’s Serverless SRE team detects platform-level latency regressions before users do, measures the impact of regressions, and tracks performance over time. We’ll discuss the limitations of SLOs in this context and how to take a statistical approach that gives a customer-centric picture of the performance of our platform instead. [...]
    TIMECODES
    00:00 Intro
    02:08 Serverless platform is amazing
    05:56 "My app is slow"
    08:14 The platform is slow
    09:29 Total (end-to-end) latency distribution
    10:54 Request delivery latency
    12:37 Goal
    13:39 Reliability in practice
    17:09 Applying to the model
    18:39 Stationarity
    19:31 2-Sigma Technique
    24:30 Mechanics
    26:41 Overload score
    28:08 Impact analysis
    29:03 FAQ
    31:56 Backtesting
    33:21 Limitations
    35:13 Other applications
    35:20 Streamlined diagnosis
    37:06 Approximate cohort A/B testing
    37:33 Conclusions
    40:02 Outro
    Download slides and read the full abstract here:
    gotoaarhus.com/2023/sessions/...
    RECOMMENDED BOOKS
    Nicole Forsgren, Jez Humble & Gene Kim • Accelerate • amzn.to/442Rep0
    Kim, Humble, Debois, Willis & Forsgren • The DevOps Handbook • amzn.to/47oAf3l
    Jez Humble & David Farley • Continuous Delivery • amzn.to/452ZRky
    Jez Humble, Joanne Molesky & Barry O'Reilly • Lean Enterprise • amzn.to/47pcOXD
    / gotocon
    / goto-
    / gotoconferences
    #SLO #SRE #ChaosEngineering #Serverless #PlatformEngineering #2Sigma #GoogleCloud #JezHumble #DevOps #Accelerate
  • Science

Comments • 2

  • @allanwind295 11 months ago +5

    You make it sound like the 2-sigma deviation is something Google came up with, but this control theory concept is from the 1950s (i.e. en.wikipedia.org/wiki/Western_Electric_rules). There is value here, and I enjoyed the presentation.

    • @JezHumble 11 months ago +3

      Thanks for providing a reference to the Western Electric rules and for your kind words about the presentation. I did not mean to imply that Google invented the idea of 2 sigma; that, of course, is just basic statistics (and I refer to its use in other domains). Rather, what I was describing is the application of this technique to platform reliability (and it's certainly possible other people have applied it here too).
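
For readers unfamiliar with the idea discussed in this thread, here is a minimal sketch of a 2-sigma check in Python. It is only an illustration of the general statistical idea and is not the method presented in the talk. The function names, the latency samples, and the choice of a fixed baseline window are all assumptions made for the example. The second function follows the Western Electric rule that flags two out of three consecutive points beyond 2 sigma on the same side of the mean.

```python
from statistics import mean, stdev


def two_sigma_outliers(baseline, recent, k=2.0):
    """Return values in `recent` more than k sample standard deviations
    above the mean of `baseline` (hypothetical helper for illustration)."""
    threshold = mean(baseline) + k * stdev(baseline)
    return [x for x in recent if x > threshold]


def two_of_three_beyond_two_sigma(baseline, series):
    """Return indices i where at least two of the three points ending at i
    lie beyond 2 sigma on the same side of the baseline mean."""
    mu, sigma = mean(baseline), stdev(baseline)
    upper, lower = mu + 2 * sigma, mu - 2 * sigma
    hits = []
    for i in range(2, len(series)):
        window = series[i - 2:i + 1]
        above = sum(1 for x in window if x > upper)
        below = sum(1 for x in window if x < lower)
        if above >= 2 or below >= 2:
            hits.append(i)
    return hits


# Made-up latency samples in milliseconds (mean 100, sample stdev 2).
baseline_ms = [100, 102, 98, 101, 99, 100, 103, 97]

print(two_sigma_outliers(baseline_ms, [101, 99, 120, 100]))  # [120]
print(two_of_three_beyond_two_sigma(baseline_ms, [101, 105, 99, 106, 100]))  # [3]
```

A single point above the threshold can be noise, which is why the second check requires two of three consecutive points before it flags anything. In practice the baseline would also need to be reasonably stationary for either check to be meaningful.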