CrowdStrike Outage Explained by Keith Barker CCIE

  • Published: Sep 6, 2024
  • CBT Nuggets trainer Keith Barker explains the recent global CrowdStrike IT outage.
    Everyone can benefit from Microsoft training: training.cbt.g...
    Not a CBT Nuggets subscriber? Sign up for a FREE trial today: training.cbt.g...
    -----------------
    Connect with CBT Nuggets for the latest in IT training:
    • LinkedIn - / cbt-nuggets
    • X - / cbtnuggets
    • Instagram - / cbtnuggets
    • Facebook - / cbtnuggets
    #crowdstrike #microsoft #cybersecurity #keithbarkerccie #ittraining #itcertifications #itprofessional #adept #cbtnuggets

Comments • 101

  • @JackMyers-br2vi
    @JackMyers-br2vi 1 month ago +111

    This global internet outage is insane! All airlines were grounded and I was stuck at the airport, and even banks, media, and offices from the U.S. to Australia were hit. How can CrowdStrike have such a monopoly that it could help restore such a massive amount of tech?

    • @MattMiller211
      @MattMiller211 1 month ago +1

      It's pretty concerning. If they can fix this, what other control do they have over our infrastructure? Or are we truly in the Matrix?

    • @MelissaHobbs-qm8wi
      @MelissaHobbs-qm8wi 1 month ago

      Right? It makes you think about the stability of our systems. But hey, I barely spend time online. When I checked my portfolio with Desiree Ruth Hoffman, we were still in the greens. That’s been the case for 16 months straight!

    • @JoyceMuller-xv6kh
      @JoyceMuller-xv6kh 1 month ago

      Wow, really? I've seen the name Desiree Ruth Hoffman before but can't figure out where.

    • @MattMiller211
      @MattMiller211 1 month ago

      Probably from her forecast on Nvidia before the pump. But how are you in the greens with all the fluctuations due to the election and everything else? Can you share her strategy?

    • @MelissaHobbs-qm8wi
      @MelissaHobbs-qm8wi 1 month ago

      Honestly, just schedule a call with her. She has vast knowledge in finance and really knows how to navigate these times. I handed over my portfolio to her so I can focus on my family. These days, things just get scarier and scarier.

  • @adamweah8037
    @adamweah8037 1 month ago +28

    An amazing explanation of the CrowdStrike incident, thank you! Off topic: I recently sold my condo for $400k and I want to invest the money in the stock market. However, it appears the market is at an all-time high. Should I invest elsewhere or wait for a market correction?

    • @smithdavis1362
      @smithdavis1362 1 month ago

      The stock market is risky, but staying on the sidelines is riskier. Missing the next bull run will be far more costly to your long-term wealth than getting in at the "right price". Consult a financial advisor if you're unsure how to proceed.

    • @ryanthompson8256
      @ryanthompson8256 1 month ago

      You're right. A few neighbours in the Bel-Air area and I work with an advisor who prefers that we DCA across other prospective sectors instead of making a lump-sum purchase. Following this, my portfolio grew 37% in the last quarter.

    • @floydchusset3143
      @floydchusset3143 1 month ago

      Mind if I look up your advisor, please? I've worked in real estate for over 25 years and have neglected a major stock portfolio. This served me well when I was flipping and renting houses; however, I need a different plan now.

    • @ryanthompson8256
      @ryanthompson8256 1 month ago +1

      I've stuck with the popular “Laura Grace Abels” for about five years now, and her performance has been consistently impressive. She’s quite well known in her field; look her up.

    • @JasonDinero
      @JasonDinero 1 month ago

      Thank you for putting this out, it has rekindled the fire for my goal... I was able to spot Laura after inputting her full name on the web. She seems highly professional, with over a decade of experience.

  • @pppam
    @pppam 1 month ago +17

    Keith, stay with us, we want more interventions from you. ❤

  • @NealKlein
    @NealKlein 1 month ago +13

    To see this video show up from my favorite CBT Nuggets instructor was beyond wonderful. Last week sucked. This video offsets that a lot.

  • @ianmzatimboola9199
    @ianmzatimboola9199 1 month ago +12

    Well explained! It’s clear now. Thanks a bunch, Keith!

    • @kittykat9825
      @kittykat9825 3 days ago

      He’s being paid to lie to you. Why are you thanking him?

  • @engattiaali
    @engattiaali 1 month ago +5

    You were, are, and always will be the best at explaining complex topics efficiently and straight to the point. I like the video so much and am thrilled to watch you again, Keith.

  • @jeraldbottcher1588
    @jeraldbottcher1588 1 month ago +7

    This boggles my mind as an IT professional. I was part of a team that deployed patches and software for years. This included OS deployment, patch deployment, and software deployment, the whole thing, on both workstations and servers. We tested our patches extensively before pushing them out to the entire population of the environment. This first included a sandbox environment, then a select user/system environment, and then we would stage our patches out over several hours so that if something happened we could back out before catastrophe struck. And honestly, sometimes we would find problems with the patches, and we would be able to immediately stop, suspend, and even back out.
    Yes, we would use third-party vendor solutions to help with this, and any time we changed ANYTHING we would follow our testing procedures and matrix as normal business. We would never shirk our procedure of testing first, then deploying. To me this is a total failure of IT governance and a failure to maintain standards. (IT governance is setting and maintaining standards and policies for the IT infrastructure.)

    • @user-rr3fo6hy9q
      @user-rr3fo6hy9q 1 month ago

      I work as a Network Engineer now, but it helps to have come from the help desk because you feel their pain when you have to run around fixing issues like this. When I was on the help desk I always preached to our team: TEST TEST TEST before deploying. There are so many resources you can use to test something before deploying it that there is no excuse for this to occur.
      For example, I just used GNS3 to test a new firewall (pfSense) before deploying it. Like you said, there should be several layers of testing. Maybe start virtual with something like GNS3, then a physical sandbox environment, then one user, then one department, and then, once you’re confident it’s safe, do a global deployment. Just like Keith said, this issue could have been prevented with proper Quality Assurance, which includes proper testing. TEST TEST TEST!
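
The layered testing described in this thread (virtual lab, physical sandbox, one user, one department, then everyone, with the option to stop and back out) can be sketched roughly as below. This is only an illustration in Python, not anyone's actual deployment tooling: the ring names come from the comments above, and the `deploy` and `rollback` callbacks are hypothetical stand-ins for whatever a real patch system would do.

```python
from typing import Callable, Sequence

# Hypothetical ring names, taken from the stages described in the comments.
RINGS: Sequence[str] = (
    "virtual lab",
    "physical sandbox",
    "one user",
    "one department",
    "global",
)


def staged_rollout(
    deploy: Callable[[str], bool],
    rollback: Callable[[str], None],
    rings: Sequence[str] = RINGS,
) -> list[str]:
    """Deploy ring by ring; on the first failure, back out every ring touched.

    deploy(ring) returns True when the ring is healthy after the update.
    Returns the rings still running the update when the function exits.
    """
    done: list[str] = []
    for ring in rings:
        if deploy(ring):
            done.append(ring)
            continue
        # Health check failed: undo the failing ring, then earlier rings
        # in reverse order, and stop before any wider ring is touched.
        for touched in [ring, *reversed(done)]:
            rollback(touched)
        return []
    return done


if __name__ == "__main__":
    rolled_back: list[str] = []
    # Simulated health check: the update breaks at the "one department" ring.
    result = staged_rollout(
        deploy=lambda ring: ring != "one department",
        rollback=rolled_back.append,
    )
    print("still deployed:", result)
    print("rolled back:", rolled_back)
```

The point of the design is that the "global" ring is never reached unless every smaller ring has already survived the update.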

  • @user-rr3fo6hy9q
    @user-rr3fo6hy9q 1 month ago +1

    It’s really nice to have a resource like Keith to explain what happened. Reading the news is like getting 1% of what actually happened.

  • @jasgarcha4783
    @jasgarcha4783 1 month ago +4

    Keith, great video. Would they have not tested the file prior to deployment into the production environment? Sandbox possibly? Keep up the great work❤.

  • @AbdulAziz-by1wj
    @AbdulAziz-by1wj 1 month ago +1

    CBT Nuggets was my first love and still is. Thank you for being part of my list since 2012.

  • @dbwillt1
    @dbwillt1 1 month ago +3

    You would think that one of the top cybersecurity companies out there, with as large a footprint as they have, would know better and prioritize QA on the code, sandbox it in a test environment, and gradually deploy critical updates before deploying them on a large scale. Especially when it involves ring 0. This process would involve change management and be a part of their written standards and procedures. Makes you wonder if this was done intentionally by an insider threat or was just plain incompetence and neglect. Thanks for the content, Keith Barker.

  • @searchbug
    @searchbug 21 days ago

    Thanks for this explanation Keith! The issue stemmed from a mismatch in the input parameters used in the update, which caused widespread crashes across millions of devices. Since businesses and offices are the most affected, companies should implement rigorous verification, such as an email verification API, to protect company assets.
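
The "mismatch in the input parameters" mentioned above can be pictured with a small sketch. As I understand CrowdStrike's published root cause analysis, the new content referred to a 21st input field while the sensor supplied only 20 values, so the lookup went out of bounds. The Python below is not CrowdStrike's code; `read_field` and `ContentMismatch` are hypothetical names, and the snippet only illustrates the kind of bounds check that turns such a mismatch into a rejected update rather than a crash.

```python
from typing import Sequence


class ContentMismatch(ValueError):
    """Raised when a rule refers to an input the caller did not supply."""


def read_field(inputs: Sequence[str], index: int) -> str:
    """Return inputs[index], refusing any index outside the supplied range."""
    if not 0 <= index < len(inputs):
        raise ContentMismatch(
            f"rule wants field {index}, but only {len(inputs)} were supplied"
        )
    return inputs[index]


if __name__ == "__main__":
    supplied = [f"value{i}" for i in range(20)]  # 20 inputs, indexes 0..19
    print(read_field(supplied, 19))
    try:
        read_field(supplied, 20)  # a 21st field that was never supplied
    except ContentMismatch as err:
        print("rejected:", err)
```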

  • @fgrion
    @fgrion 1 month ago +2

    I don't want to sound malicious, but in my opinion there are 2 possible factors behind this. First, many companies around the world hire interns during the summer. This wasn't a major update, it was a routine one, so they gave some "power" to the interns to run it. The other thing could be this push for "diversity", which means you don't hire the best candidate but the one who matches certain criteria, so maybe the one in charge wasn't the most competent but just ticked some boxes. It's a bit difficult to find out what happened because that is internal company information.

    • @pisofff277
      @pisofff277 1 month ago

      This was predicted, so I'd say it was intentional.

  • @jhlmata
    @jhlmata 1 month ago +2

    Hi Keith. Do you think CrowdStrike customers should also test these updates before they are automatically deployed? Especially for critical operations (airlines, hospitals, etc.), I wouldn't want any updates to go out to my organization without me testing them first. Thanks

    • @user-rr3fo6hy9q
      @user-rr3fo6hy9q 1 month ago

      Kind of similar to having your own onsite WSUS server. My last data center engineer had one, which meant no Windows Updates were applied to our PCs without first being filtered through him and analyzed to make sure they were safe.

  • @TheUrbanGeekinTECH
    @TheUrbanGeekinTECH 1 month ago +3

    Thanks for sharing Keith!

  • @freebk161
    @freebk161 1 month ago

    Yes Keith. You are right. Better QA, plus implementing a self-correcting mechanism to prevent the issue from occurring again and avoid the BSOD.
    One more simple explanation from you on such a complex issue!! ---Thanks again!

  • @user-xr5sf1zo8h
    @user-xr5sf1zo8h 1 month ago

    As a technical writing prof, I applaud your presentation. It's a good example of "show, don't tell." The castle analogy was very effective. It helps frame the issue for a non-technical person. I will share the video with my family.
    I'd like to see a video on the QA part you mentioned. How should CrowdStrike have done the QA? What's the normal QA procedure? Are smoke and/or regression tests normally done?

  • @NeekRusher
    @NeekRusher 1 month ago +1

    Here is what happened at the organization I work for. I am a part of the desktop support group. Mixed messages were sent to our clients. Some people said, "NO, I am going to leave the computer alone." Others got caught up and did either a system restore or a computer reset. Some clients ended up putting their desktops back to factory settings. Others did regain access but lost applications critical to their job duties. Management then retracted what they had sent out to the clients, warning them not to do anything and to let IT support handle the problem. Keith, it was such an easy fix. Before Microsoft sent out their fix tool, my team and I were using Hiren's BootCD. I was banging out at least 10 to 15 computers in under an hour. No problems. No loss of data. No loss of apps. DO NOT DO A SYSTEM RESTORE OR RESET. You will delay your uptime by doing either of the two. Just sharing my experience with this. Oh, and if you do a system restore, you will see a link on your desktop that says "Removed Apps". Thanks for your knowledge on this topic, Keith.

    • @user-rr3fo6hy9q
      @user-rr3fo6hy9q 1 month ago +1

      Wow, Hiren's BootCD. I haven’t heard that name in years; I used to have one when I was in school. It’s good to know this is still a useful tool, I will look into it again. Good job keeping your cool and thinking critically to resolve the issue. I’m sure it saved a lot of headaches. Sometimes it’s best to simply isolate an affected computer from the rest of the network and take the wait-and-see approach with issues like this. Nice work!

    • @NeekRusher
      @NeekRusher 1 month ago

      @user-rr3fo6hy9q Thank you. Hiren's is still the best Swiss Army tool to have.

  • @leniotsiou
    @leniotsiou 1 month ago

    Thank you! Once again Keith nailed the explanation!

  • @falgunpatil2372
    @falgunpatil2372 1 month ago +2

    Does Falcon bump its minor version when they make any updates? There has to be a daily *automated* sanity run against all Windows flavors with the latest Falcon updates *before* they are pushed out to end users. Feels like I'm giving suggestions to a high school kid who is just venturing into programming! 😄😄😄

  • @jlam3927
    @jlam3927 1 month ago +1

    Remoting into every computer would be a nightmare. Having to touch each one to apply the fix is incomprehensible, only to be surpassed by a longer fix or having to rebuild the computer.

  • @katyh2650
    @katyh2650 21 days ago

    Hey Keith, can you explain what a channel file is and what an input field is, as talked about in CrowdStrike's RCA? Thanks.

  • @fredsmith2277
    @fredsmith2277 1 month ago

    Ring zero is the foundation of the building, which you can't see; ring 1 is the house you can see. If a window falls out you still have a functional house; if the foundations collapse, the whole house collapses!!!

  • @prebsi8603
    @prebsi8603 1 month ago

    Thanks so much, Keith.
    FINALLY, someone in the comments is asking: Why was this not tested at CrowdStrike customers before the update was deployed to the client computers?
    Is this because there are many updates daily, so there is no time to do proper Change Management, and is it really that important to have bleeding-edge updates all the time?

  • @RigTV.
    @RigTV. 1 month ago

    What kind of legal exposure are they facing? Are they liable?

  • @thomasclark631
    @thomasclark631 1 month ago +1

    It’s hard to believe that CrowdStrike clients allow software updates to be downloaded directly to their information systems without any gatekeeping.

    • @Medicranger
      @Medicranger 1 month ago

      To protect against zero day attacks. They outsource their security to CrowdStrike.

  • @damonabets3779
    @damonabets3779 1 month ago

    This was an amazing explanation of the CrowdStrike incident, thank you! I love the visual aids!

  • @alycewheeler972
    @alycewheeler972 1 month ago

    Fantastic work, Keith. Thank you! This was a great explanation of what happened. Great Work!

  • @richardasabilla3827
    @richardasabilla3827 1 month ago +1

    Thanks very much, you have just gotten yourself a new subscriber.

  • @Southpaw07
    @Southpaw07 1 month ago

    No question this is a CrowdStrike issue, but the MS WHQL certification process needs improvement too.

  • @nicka8774
    @nicka8774 1 month ago

    Great explanation Keith! You are the best!

  • @welpen2006
    @welpen2006 1 month ago

    Mostly the big three US Airlines were affected, so it was a local problem. ❤❤

  • @pieterpretorius1014
    @pieterpretorius1014 12 days ago

    MS really needs to put a padlock on the NT kernel to prevent these kinds of system crashes. I think the area in which the kernel sits should really be only for the kernel and nothing else. Messing with the core code of the OS is a really bad idea and only means more trouble for the I.T. people who have to maintain the systems and servers that have to run basically non-stop.

  • @harveypaxton1232
    @harveypaxton1232 1 month ago

    This comes back to the IT managers at each company?

  • @majiddehbi9186
    @majiddehbi9186 1 month ago

    Thanks, Keith. You gave a simple explanation as always. Thanks for your simplicity.

  • @charlesromney
    @charlesromney 1 month ago

    So the Falcon software overrides all the local policies and update controls? What am I missing here?

  • @RonaldBartels
    @RonaldBartels 1 month ago

    What unit testing was done? Was it automated? Did the test environment thunk as well?

  • @aigoaizporietis9275
    @aigoaizporietis9275 1 month ago

    This still does not really make sense.
    Once the Falcon software changes (detail 3), it is not the same software that went through WHQL.
    As soon as “detail 3” was changed, it became “detail 3.1 version”, which means that the whole software is technically different.
    Why was this particular version with this “detail 3.1 version” not tested?

    • @NYYstateofmind
      @NYYstateofmind 1 month ago

      The actual Falcon process was signed. Falcon loads whatever signatures it needs for threat detection from the content updates it receives, probably nightly, from the server.
      The Falcon process was never modified. The libraries the process uses and the actual Falcon process are decoupled.

  • @ian230187
    @ian230187 1 month ago

    Very well explained ❤

  • @flaxengoose
    @flaxengoose 1 month ago

    Great video, cheers Keith

  • @kyrpichko
    @kyrpichko 1 month ago

    Very nice video. But there is still this little question left unanswered. How was the Falcon driver allowed to load a faulty .sys file? What does a faulty .sys file even mean?
    It cannot be down to human error. That would mean that the entire world hangs on the afternoon shift of one guy at CS.
    In order to load a .sys file into the kernel ring, surely it has to pass multiple hash checks, cryptographic checks, etc.
    What checks does CS perform? At this point, how can it or any other AV company that uses boot drivers be trusted from now on?
    Not only because of negligence but also because of intentional attacks. If it's that easy to load a faulty file into the Falcon driver, then what is there to stop a bad actor from doing the same?

    • @Medicranger
      @Medicranger 1 month ago

      Easy. The .sys driver that has gone through the validation and certification is actually csagent.sys. That driver works flawlessly. However it uses definition files that are separate. It literally is the same as putting diesel fuel in a perfectly good gasoline engine.
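
To stay with the engine analogy: a loader can at least refuse fuel it does not recognize. The Python sketch below shows the general idea of validating a separately delivered definition file before interpreting it. Everything about the file layout here is invented for illustration (the `DEMO` marker, the 8-byte header, the `load_definitions` name); it is not CrowdStrike's channel file format, and real validation would need to go well beyond a header check.

```python
import struct

MAGIC = b"DEMO"  # hypothetical 4-byte marker, not CrowdStrike's format
HEADER = struct.Struct("<4sI")  # marker, then little-endian payload length


def load_definitions(blob: bytes) -> bytes:
    """Return the payload if the blob is well formed, else raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file too short to hold a header")
    magic, length = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("unexpected marker bytes")
    payload = blob[HEADER.size:]
    if len(payload) != length:
        raise ValueError("declared length does not match payload")
    return payload


if __name__ == "__main__":
    good = HEADER.pack(MAGIC, 3) + b"abc"
    print(load_definitions(good))
    try:
        load_definitions(bytes(16))  # a file of nothing but zero bytes
    except ValueError as err:
        print("rejected:", err)
```

A rejected file leaves the previously loaded definitions in place, which is a much cheaper failure than a crash in kernel mode.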

  • @KDOERAK
    @KDOERAK 1 month ago

    a great explanation👍

  • @RalphSmith-cj5he
    @RalphSmith-cj5he 1 month ago

    The Windows 8, 9, 10 x series have Always been crap, trash even. Managers opted for a one point of failure with no contingency. I do my DTPublishing still on Win7 workstations. Am currently watching this Video on one of my Dells.

  • @TheJpmuzz
    @TheJpmuzz 1 month ago

    I really think Keith Barker needs to cover the Google BGP outage from last year.

  • @dustcore
    @dustcore 1 month ago

    More videos Keith 👏🏽👏🏽👏🏽

  • @ibizenco
    @ibizenco 12 days ago

    Or how too much power with one company is a bad thing.

  • @bernardsimsic9334
    @bernardsimsic9334 1 month ago

    Funny to find out that the CEO, when he was at another company as their CEO, was the cause of another national computer blackout due to a software FU there also!!!

  • @Jim-zs5bi
    @Jim-zs5bi 19 days ago

    Don’t give access to the kernel.

  • @jesonladra3882
    @jesonladra3882 1 month ago

    What if my computer went to a blue screen and I already turned it off? Can my computer still be fixed?

    • @jesonladra3882
      @jesonladra3882 1 month ago

      Because when I turn it on, the system unit powers on but there is no output on my monitor.

  • @megmucklebones7538
    @megmucklebones7538 1 month ago

    Great explanation - someone really, REALLY screwed up here. How on earth did this get past QA? Additionally, most updates should, in a perfect world, be applied to a pre-patch test environment within any critical business before being pushed out. I bet that didn't happen or couldn't happen.

  • @alexmarchant4277
    @alexmarchant4277 1 month ago

    Nice. A real CBT Nut :)

  • @tbone0785
    @tbone0785 1 month ago

    Keith you missed the opportunity to use a moat in your castle diagram and analogy 😂. Fire breathing dragon too

  • @luzvimindaalvarez-ve3to
    @luzvimindaalvarez-ve3to 2 days ago

    Not an incorrect file.... I just didn't go through with the signature.....

  • @PowerOfOne-u4h
    @PowerOfOne-u4h 1 month ago

    At least the screen is a nice blue colour....... there's that.

  • @Medicranger
    @Medicranger 1 month ago

    Actually it’s called a stop screen.

  • @Freek3143
    @Freek3143 28 days ago

    This is surface level BS.... The mistake they made was a mistake only the greenest of rookie programmers would ever make, and they likely had to rewrite unit test code in order to even push this out to customers

  • @fredsmith2277
    @fredsmith2277 1 month ago

    This is a long and complicated explanation??? They basically uploaded an update file full of zeros, and the executing program, down at the most basic level, crashed because it could not make sense of the file full of empty zeros. This locked up the whole operating system, and the error would repeat itself even if you rebooted: the system would reload the error file 0000291 automatically and crash again every time. So the solution was to boot into safe mode, which loads only the absolute bare minimum of files needed to run basic systems and avoids loading the offending file again, then navigate to the CrowdStrike folder, delete the update file 0000291, and reboot. This fixed it because the offending file was gone and not reloaded, so the system would run the same as before the offending file was downloaded, with no loss of data besides any unsaved work that was running on the system before it crashed!!!
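
The manual fix described above comes down to "find the 291 channel file in the CrowdStrike driver folder and delete it". The widely published workaround named the folder `C:\Windows\System32\drivers\CrowdStrike` and the pattern `C-00000291*.sys`. The Python below is only a sketch of that step, not a recovery tool: in practice this was done by hand from Safe Mode or a boot disk, where Python is usually not available, and the `remove_channel_file` helper with its `dry_run` flag is a hypothetical name introduced here.

```python
from pathlib import Path

# Folder and file pattern from the widely published manual workaround.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def remove_channel_file(
    directory: Path = DEFAULT_DIR, dry_run: bool = True
) -> list[Path]:
    """List files matching PATTERN in directory; delete them unless dry_run.

    Returns the matching paths either way, so a caller can review them first.
    """
    matches = sorted(directory.glob(PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: only report what would be deleted.
    for path in remove_channel_file():
        print("would delete:", path)
```

Defaulting to a dry run is a deliberate choice: other channel files in the same folder are left alone, and nothing is removed until the match list has been checked.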

  • @EmperorShang
    @EmperorShang 1 month ago

    "Haha, hospital and emergency services no worky because CrowdStrikey updatey. Oh well" - US Govt

  • @oOrbitZz
    @oOrbitZz 1 month ago

    sneaky falcon ....

  • @Redwolfxx
    @Redwolfxx 1 month ago

    Heh, Blue Falcon

  • @CrazyWhiteBoomer
    @CrazyWhiteBoomer 1 month ago

    This is the FIX in case Kamala Tanks in November...

  • @Epicinver
    @Epicinver 1 month ago

    WTF, I got one during this video

  • @kittykat9825
    @kittykat9825 3 days ago

    You have all bots commenting!! Also, Why are you running cover for them? You know there was no excuse for this!!

  • @anon837
    @anon837 1 month ago

    Y2K4

  • @dkpick
    @dkpick 1 month ago

    ...whoops! should have used a GOSUB instead of a GOTO. Sorry... 😉

  • @anonymoususer6786
    @anonymoususer6786 1 month ago

    CrowdStrike needs to go bankrupt and pay out