How regexes got catastrophic

Поделиться
HTML-код
  • Опубликовано: 31 дек 2024

Комментарии • 189

  • @focando-lol
    @focando-lol Месяц назад +220

    i understand nothing but lowkey enjoying what is happening

    • @Musikvidedo
      @Musikvidedo Месяц назад +10

      if you keep at it you'll understand most of it pretty soon

    • @NithinJune
      @NithinJune Месяц назад +6

      kay is so pretty how am i supposed to focus 😵‍💫

    • @JordanManfrey
      @JordanManfrey Месяц назад +4

      That’s most developers experience using RegEx in a nutshell

    • @v0id_d3m0n
      @v0id_d3m0n 20 дней назад

      It makes more sense if you've watched the first two videos :)

  • @yaboi1525
    @yaboi1525 Месяц назад +136

    Please don't stop uploading. I found this channel by chance and it has to be one of the best things happening this year

    • @moosehunter248
      @moosehunter248 Месяц назад +3

      Sounds like you should consider sponsoring the channel so it doesn’t happen 😊

  • @LeeDanielCrocker
    @LeeDanielCrocker 24 дня назад +62

    FYI, Rust's standard regex library uses an O(mn) algorithm without backreferences.

    • @GregWhiteley
      @GregWhiteley 16 дней назад +7

      Puzzled by this. Rust has _no_ standard regex library, and rust-lang discussion groups trumpets this as a feature, not a bug. Or has something changed.

    • @albertweber1617
      @albertweber1617 16 дней назад +6

      The de-facto default regex crate "regex" does not implement look-around or backreferences

    • @LeeDanielCrocker
      @LeeDanielCrocker 16 дней назад +3

      @@GregWhiteley The regex crate is not de jure standard, but de facto the one everyone uses.

  • @Julienraptor01
    @Julienraptor01 Месяц назад +126

    I now understand why i'm good at regex where most people fail
    I never use backreferences
    Like idk why but they never felt right logically
    Even when i made crazy things like a youtube URL parser to clean those in regex, i've found ways to just do it without backreference when i could have used some
    And it's kinda cause when i'm building the regex, i'm running it in my mind and backreference just makes that impossible
    Like tracking what it does become too complex
    So big thanks for this vid, very informative and great !

    • @Eunakria
      @Eunakria Месяц назад +17

      for most CFGs I've always found it easiest to just build a tokenizer with regex and a parser with other techniques
      leaving the abstract computer science world and entering practical software engineering world - it is possible to express any CFG as just that, a parser, but the tokenizer/parser distinction often yields faster, more readable code with advisable properties for most practical applications of parsers (e.g. DSLs such as SQL or serialization formats such as JSON)

    • @JordanManfrey
      @JordanManfrey Месяц назад +1

      I’ve never used them because when I learned them in school they would bring up stuff like that as being dangerous bullshit

    • @Eunakria
      @Eunakria Месяц назад

      @@JordanManfrey I've had teachers and profs like this before. totally disconnected from the real world. they treat regexes like the devil, while pretending that memory safety errors in languages like C/C++ are a non-issue.
      regexes are an incredibly powerful tool if you know how and where to use them. definitely, part of "how and where" is knowing when you as a programmer, or your team, may be too error-prone to use the tool in confidence. I definitely would avoid using regex, or at least seriously consider and verify my use of it, in any high-availability or high-reliability context; but it's not so bad, all things said and done

    • @AySz88
      @AySz88 Месяц назад +5

      ​​@@EunakriaBizarrely enough, I've more often encountered new learners going the other direction - they "in practice" want to abuse regex features, because it's built in, and looks powerful, so it must be fast. (Think "I built a calculator in Minecraft" style shenanigans, and Parker's usual stuff for that matter.) And parsing and tokenizing is more of an abstraction for them, especially if they never got a sense for how to use the toolkit of functional programming.

    • @Noname-67
      @Noname-67 27 дней назад +1

      Maybe your brain is simulating a finite automaton when doing regex.

  • @wojciechostrowicz
    @wojciechostrowicz 26 дней назад +20

    YT algorithms just decided that my constant work with regex deserves this video. Thank you algorithm.
    It was very pleasant to watch.

  • @rkvkydqf
    @rkvkydqf 23 дня назад +4

    I can't believe this video has only 31K views. All this work the amazing visualizations, the quality of the explanations, the lined exercises in the description. I truly hope all this work would get rewarded some day. Thank you so much.

  • @JasonWalter
    @JasonWalter 18 дней назад +8

    Reminded of what JWZ said: "Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems."

  • @AstonishedByTheLackOfCake
    @AstonishedByTheLackOfCake 25 дней назад +17

    I've always had a sneaking suspicion that regexes were too versatile and did too many things to actually be performant, turns out I was right
    though I did not expect back references to break the entire concept of a language being regular
    will probably use Google's regex engine in the future since I barely ever use backreferences anyway

  • @9_-_-_-_-_swo
    @9_-_-_-_-_swo Месяц назад +43

    i got a (.*?) tattoo when i barely knew what regexes were lol. thank you so much for this series, it has done wonders for my ability to live with that decision (and is also some of the best comp sci content I've ever seen on this platform

    • @NickiRusin
      @NickiRusin Месяц назад +1

      that's a really good idea for a tattoo

    • @TVDaJa
      @TVDaJa Месяц назад

      Someone please explain for my lil brain

    • @SimonBuchanNz
      @SimonBuchanNz Месяц назад +18

      ​@@TVDaJaaccepts anything, but makes everything after it backtrack as much as possible if it fails. An incredibly efficient way to be as inefficient as possible!

    • @9_-_-_-_-_swo
      @9_-_-_-_-_swo 28 дней назад +2

      @@SimonBuchanNz u can just put it between stuff

    • @SimonBuchanNz
      @SimonBuchanNz 28 дней назад

      @@9_-_-_-_-_swo yeah, it's fine for input you know is going to be a reasonable size.

  • @Niki1A_
    @Niki1A_ Месяц назад +39

    Why not use the lockstep algorithm, when the regex has no backreference? It would be easy to just store a boolean with the automaton, that indicates whether it has backreferences, and pick the algorithm accordingly. This would limit catastrophic cases to regexes that actually use backreferences, which could be taught as something to be avoided.

    • @Rock6Sixes
      @Rock6Sixes Месяц назад +9

      wanted to comment this, but looked through the comments first. Seems very obvious, but I can't think of the reason why. Maybe the automata structures get reused when possible and you can't know where to use which engine.

    • @rngwrldngnr
      @rngwrldngnr Месяц назад +10

      It's possible some systems do, these days. The problematic cases all have backreferences, so it wouldn't actually help in those edge cases.

    • @Ziraya0
      @Ziraya0 27 дней назад +1

      The only reason I can imagine is that with and without backreferences would be two literally different algorithms and it's going to be hard to make the algorithms otherwise have exactly the same outputs for the same patterns & inputs. I've never used a backreference before, I can't really conceptualize why I would, but in part that's because I already know that regex can't parse html or email addresses, so I've never tried to do a thing you would use them for with regex; the obvious answer to me is basically what you said, but instead of a flag, the language has two literally distinct regex implementations that you're responsible for picking between, and writing regex that produces the result you want for the one you're using. One is fast, one has this weird trick.
      I do a lot of manual regexing stuff in text editors, spreadsheets and microsoft's PowerRenamer, so I'm already used to tailoring patterns and methodology to the environment I'm in. PowerRenamer in particular has an alternate engine it can use that's supposedly faster, but they're not 100% congruent in features and behaviors. In particular, PowerRenamer's default, a wrapping of .Net regex does something wrong with lookbacks so if you want lookbacks to work, you have to use the other one.

    • @PizzaRollExpert
      @PizzaRollExpert 25 дней назад +3

      This might arguably be the best solution, but it's still far from ideal. Adding a backreference would completely change the performance characteristics of the regex, which is pretty unintuitive. It's better to have lockstep by default and force the programmer to use some special syntax for regexes with backtracking to make it clear that they have different performance characteristics. You might also want to use the ones with backtracking in some special cases anyway, since they can have a better best-case performance on certain input strings.

    • @ThomasWilde
      @ThomasWilde 25 дней назад

      They can and do. That boolean flags the difference between O(n) and O(2^n). That's the problem.

  • @JamesChurchill
    @JamesChurchill Месяц назад +8

    I use a regex to finite state machine code generation tool occasionally for tricky problems, and it's always violently obvious when I've accidentally added nondeterminism to my input regex - the state machine that comes out of the other side blows up spectacularly.

  • @sardonyx001
    @sardonyx001 14 часов назад

    I studied this 4 years ago at uni and totally forgot this is how we reason about regexes! Super helpful video thank you!

  • @erik_with_a_kay
    @erik_with_a_kay 20 дней назад +5

    Where [regex] blood makes [regex] blood unclean! First-time watcher and this was great.

  • @somdudewillson
    @somdudewillson Месяц назад +8

    My personal position is that once one has spent a few hours trying to convince various HTML parsing libraries to _only_ parse the input string and stop reformatting it, parsing with a regex starts looking pretty good.

  • @faldarith
    @faldarith 20 дней назад +3

    Thank you for sharing the stackoverflow answer, I’ve never felt so seen

  • @dr.strangelove5622
    @dr.strangelove5622 Месяц назад +19

    This is amazing!! I read about Thompson's algorithm last week when I was studying non-deterministic automata and the fact that regex engines in most modern and/or popular programming languages are slower than it and suffer from exponential blowup for longer expressions (if I remember correctly). The visualizations of algos was great and helpful in understanding them. Thumbs up for that!!
    All this increases my respect for these giants: programming all these using ed, the standard editor, on a teletype connected to a computer which was much slower than our present day handheld gadgets.

  • @4l6ag3
    @4l6ag3 19 дней назад

    What a great overview of this! Great refresher of things I haven't though much about since college, and explained more concisely than any of my professors managed to.

  • @nolanmccarthy3335
    @nolanmccarthy3335 20 дней назад +2

    I'm really drunk and I don't know what you're talking about but I'm enjoying it

    • @mmmhorsesteaks
      @mmmhorsesteaks 18 дней назад +1

      Best way to enjoy regexes, really.

  • @Tumbolisu
    @Tumbolisu Месяц назад +6

    I first learned about the concept of regular expressions at university. (yeah, my interests are not in parsing text, so i literally never saw them until then.) I was taught about the whole hierarchy, with context-free grammars and all that. a year later, i am working on a python project made by other students across the years, and it uses regular expressions all over the place. i sit down and properly learn about them, immediately notice that almost all of the code can be improved (such as some regexes potentially matching the wrong things), and of course, notice that the entire concept of matching groups and back references just... doesn't make sense. it's called a regular expression, but it actually isn't? the fact that every regex engine is forced to use the backtracking algorithm Precisely Because modern every-day regex isn't a regular language is just the cherry on top.
    this video mentions that people want to use regex to match html tags... just write a parser!? html is dead simple, it's the easiest thing on the planet to parse. i have manually written parsers for more complex things for fun.
    besides, i never liked the way regex strings look. they are just this ugly mess of punctuation and backslashes. i know some people who exclusively let ChatGPT write regex strings for them because they trust an LLM more with that mess than themselves.

  • @yash1152
    @yash1152 Месяц назад +5

    0:40 lemme guess, backtracking is that least efficient method most Rgex engines use to support the backreference matching, with exception of RE2 (in c++) made by google's engineer with doesnt support that feature.
    Fun fact, backref matching is not a regex capability in first place.

  • @stephenJpollei
    @stephenJpollei Месяц назад +1

    This was a fantastic video and I did the first exercise and found strings that caused issues that made them go over 5 seconds. Most of the strings were well under 80 characters and I think the max was around 300. I likely wasn't efficient with the 300 one and could have made a shorter one.
    I think that this really made me understand the issue much better, but I will have to look into some more.
    Thank you, for making such an excellent set of resources.

  • @bntegor
    @bntegor Месяц назад +6

    10:25 this transition cracked me up haha great video!

  • @websiteuser7926
    @websiteuser7926 Месяц назад +2

    great video, i read that russ cox page years ago and it was fascinating. its surprising that so many languages still use slow regex algorithms

  • @fugoogle_was_already_taken
    @fugoogle_was_already_taken Месяц назад +30

    Chomsky hirearchy is not a guideline. If you adhere to proper definitions, it is provable. My formal languages teacher would tear you apart for this 😂

  • @ouroya
    @ouroya 28 дней назад +1

    watching this video is making me very glad i've spent the past few months reimplementing my friend's 32-line sed across multiple hundred json files into a program that actually parses and understands json properly. who would've guessed that this works infinitely better and is infinitely more capable?

  • @PatFarrellKTM
    @PatFarrellKTM 19 дней назад +1

    I want to meet Kay Lack Turing Award Winner. I'll wait a bit. So much fun watching early 'go' videos about how Russ Cox, Rob Pike, Ken Thompson and Rob Griessimer used their decades of experience. Plus lots more refugees from Bell Labs....

  • @checkmate080
    @checkmate080 20 дней назад +1

    video got me to start using this option in C#:
    [GeneratedRegex("[^a-zA-Z0-9]", RegexOptions.NonBacktracking)]
    from microsoft docs:
    Enable matching using an approach that avoids backtracking and guarantees linear-time processing in the length of the input.

  • @chrispy2117
    @chrispy2117 20 дней назад +1

    Do i know enough to understand this video? No, absolutely not. Did your voice and explanations make this a really calming video? Yes, 100%. So despite the fact I'm probably not your target audience, I'm absolutely dropping a sub 😄

  • @Jalae
    @Jalae Месяц назад +6

    seems like plan9 is the branch of reality where ken thompson and ross cox fixed their regexps. I was confused why it was always greedy and didn't have back references.

  • @MichaelLouard
    @MichaelLouard Месяц назад +1

    I can only validate you by experience and not the full level of theory you know, but I’ve had to refactor SEVERAL notebooks containing regex once I learned about what you’re talking about (albeit not as well as you’ve explained it here)
    please take this as a sign you are not just teaching people who know nothing, but also people who know some, but need to be better (people like me!)
    Thanks!

  • @NithinJune
    @NithinJune Месяц назад +10

    yikes i think i missed some prerequisites

  • @drelephanttube
    @drelephanttube Месяц назад +17

    Haven't even watched this yet but a new Kay Lack video is automatic thumbs up.

  • @JackSlinger10
    @JackSlinger10 Месяц назад +2

    Really interesting stuff, thank you for sharing.

  • @BlueIsLeet
    @BlueIsLeet Месяц назад +20

    Use the KISS approach when creating your regexes and you'll be fine. For anything extremely complicated, use a parser library.

    • @mage3690
      @mage3690 28 дней назад +5

      Or just parse. It's not that hard, really.

    • @AnimeGIFfy
      @AnimeGIFfy 15 дней назад

      Using regex is like the opposite of KISS. Parsing is as simple as you can get

    • @bensalemi7783
      @bensalemi7783 8 дней назад

      Some things are problems that need solutions. Some things are solutions that are looking for a problem.
      RegExes are problems trying to convince you they are a solution.

  • @fugoogle_was_already_taken
    @fugoogle_was_already_taken Месяц назад +4

    Why are nondetermenistic automata not compiled to deterministic ones? Wouldnt it solve the algorithmic complexity?

    • @imacds
      @imacds Месяц назад +4

      I haven't done the math so I am not sure about this, but my guess would be that the resulting deterministic automata would be the same as traversing with backtracking through the nondeterministic one. The memory consumption of a deterministic automata can also grow quite quickly, so it would likely let you write OutOfMemory dos attacks.

    • @Vaaaaadim
      @Vaaaaadim Месяц назад +10

      Given an NFA you can make a DFA that is equivalent, however the number of states in the DFA may be exponentially large.

    • @schwingedeshaehers
      @schwingedeshaehers Месяц назад

      how can it grow quickly? you just need to store the state and the rest string​@@imacds

    • @JordanManfrey
      @JordanManfrey Месяц назад

      @@Vaaaaadimyeah this, it just sets off the “I’m creating a monster” alarm bells so you don’t do it

    • @moarjank
      @moarjank 19 дней назад

      It's mathematically impossible, as far as we're aware

  • @blacklistnr1
    @blacklistnr1 9 дней назад

    regex defaults always seemed weird to me, the usual case is N serial chunks of: start parsing when x, stop parsing when y, check that what is captured along the way follows some rules.
    I feel like this idea could be expressed so much better if regex didn't try to be so smart, it would also make the runtime faster while avoiding the troublesome edge cases.
    I'd be really interested to find out if there's a legitimate production use case for the smarter regexes (even the 'a repeated pattern from start to end' one you mentioned seems far-fetched to me)

  • @gronki1
    @gronki1 Месяц назад +7

    I love your channel! Just recently wrote my first parser in Fortran, language everyone tells me that I should not waste my life on, but it was possible!

    • @boombeet678
      @boombeet678 Месяц назад

      Actually a really good idea by the way, many people would recommend more modern alternatives like Python or Julia in terms of languages (even though they have enough differences not to necessarily be considered straight up replacements) for it but considering the time in which it came out, I would say FORTRAN is more akin to Lisp than COBOL in terms of it's use cases. Alike Haskell, probably wonderful to look at not necessarily for it being a swiss army knife (sometimes even getting outclassed in terms of specialization when it comes to which language is "the right tool for the job") but for what it does theoretically and technically, as a construct that holds up throughout time in that fashion.
      I'm wishing I also get to using it sooner or later alike Python and Julia, although the latter two from what I've gathered are probably going to come first for me. Doesn't change the fact that it's a language to take notes from nonetheless.

  • @MaxG628
    @MaxG628 11 дней назад

    Setting aside the maintenance of two code paths, it seems like a regex library should be able to detect if back references are used, and if not, use the more efficient algorithm. Are there any of that do that?

  • @brendand3765
    @brendand3765 16 дней назад

    Is there a way to analyze regexp to use the backtracking algorithm only when back references are used ? It doesn't to seem to be used that often.

  • @Sandromatic
    @Sandromatic 18 дней назад

    It seems like a good idea to do lockstep by default and then go to backtracking if a back reference is included in the regex? Most of the time you don't use back references, but if you do it's very useful, but obviously then causes the inability to use the efficient algorithm. Since regexes are usually compiled ahead of time, the algorithm can be chosen ahead of time depending on if back references are included. Just include a warning in your documentation that backrefs can cause major slowdown.

  • @FROZENbender
    @FROZENbender День назад

    yeeah this is why I avoid regexes in most languages. as soon as they break the limits of their chomsky hierarchy I can't seem to wrap my head around them. they turn into this brutally terse unreadable embedded language that I find impossible to maintain.
    great explanations in this video. I was under the assumption that regex parsing already has exponential time complexity but learning that that is only actually tied to the "bonus" features that are the very reason I avoid regex was enlightening!

  • @nik90001
    @nik90001 25 дней назад

    I think it's pretty common in search stuff to evaluate regexes as finite automata. This lets them intersect the automata with the search index to iterate all marching strings. That's nice because you can OR together the list of matching documents for all strings and get the list of matching documents for the whole regex.
    Also interesting, there is another sort of attack. Usually these algorithms want a DFA but regex make an NFA. Converting between the two *can* use a ton of memory. I accidentally crashed search on Wikipedia using it. That was exciting.

  • @yoctometric
    @yoctometric Месяц назад +2

    Thank you so much for producing this kind of content! Seconding those who enjoyed the transition to the html srackoverflow post. You explain things very well

  • @peterc-s6423
    @peterc-s6423 20 дней назад

    1. use a regex that matches if a regex causes catastrophic backtracking
    2. if it doesnt, match to see if it uses backreferences
    3. if it uses backreferences, use backtracking
    4. if it doesnt use backreferences, use lockstep

  • @brandonyong301
    @brandonyong301 Месяц назад

    Cool gem of a channel

  • @hugmynutus
    @hugmynutus Месяц назад +2

    14:30 appreciate this story. I never realized it was Stream Ed, which is cool as I use sed a lot.

  • @JohnSmall314
    @JohnSmall314 Месяц назад

    Has anyone done the quantum computation version of this?

  • @benarcher372
    @benarcher372 Месяц назад +1

    Great video! I liked very much "... from somebody called Rob Pike..." 🙂I'm sure he doesn't mind!

  • @Leadbraw
    @Leadbraw 29 дней назад

    keep up the great work

  • @HerbieBancock
    @HerbieBancock 13 дней назад +1

    "I object to binary searches because I'm non-binary" -Pasty Zoomer

  • @Lucas-pj9ns
    @Lucas-pj9ns 27 дней назад

    thanks for the interesting video! i used to hate regex cause it looked like arcane magic but the theory is so interesting. I've been trying to figure out how to implement capturing groups and I'm a bit stuck. I have some ideas about storing more state and I've optimized it so that it only takes O(capture groups^2) more memory but I don't know if that's good or not. I skimmed Russ Cox's papers you linked and I can't seem to find the part about implementing capture groups. Could you point me to which section talks about capture groups or some other link about that?

    • @neoeno4242
      @neoeno4242  27 дней назад

      Hi Lucas! This is the one in which he talks about it: swtch.com/~rsc/regexp/regexp2.html - roughly the idea is to treat each active state as a ‘thread’ and then maintain a list of all the unique threads active. You don’t have to go the whole way implementing a virtual machine to get something to work but it’s an interesting exercise. If that doesn’t help though drop me an email at hello@0de5.net and I can share some code.

  • @Flourish38
    @Flourish38 18 дней назад

    Being unable to negate generation in the same way you can negate recognition is reminiscent of how SAT is in NP, but UNSAT is in EXP because there’s no way to read a sufficient proof in non-exponential time.

  • @rogo7330
    @rogo7330 22 дня назад +1

    I just avoid regexes as first thing when I parse something. If you ever thought to parse XML *or* HTML (they are quite not the same and I hate it) with regexes, consider to read specs on them, realise that people who wrote all that crap are just sociopaths and hate you, and go back to just splitting file and writing your automata. At least in that case you will find a bug right away or find it more easielly, almost always you will know what parts of the spec you are ignored, and in best case it will be much simpler and faster since you can say «nagh, I know I parsing this xml from that program, I just gonna be sure to preserve information that program produces and just skip parsing of the XML all-together».

  • @sommanker
    @sommanker 22 дня назад

    Ehat about lookaheads and lookbehinds?

  • @bjorgemeulemeester1398
    @bjorgemeulemeester1398 Месяц назад +3

    Stumbled here somehow. Quality of production and depth of content are absolutely unmatched.

  • @hanpham5790
    @hanpham5790 Месяц назад

    I love your videos and contents so much Kay!! :^)

  • @maxmustermann5590
    @maxmustermann5590 18 дней назад +2

    I love you kay

  • @Lantalia
    @Lantalia 26 дней назад

    The presence of backreferences are a signpost that you have moved beyond the lexer and should be writing a parser

  • @neilsvedberg
    @neilsvedberg 24 дня назад

    I think raku's approach to more verbose regexes is a step in the right direction.

  • @NillKitty
    @NillKitty Месяц назад +1

    I love your hair and your accent. But to echo others, how do you do your animations?

  • @dungeonrobot
    @dungeonrobot Месяц назад

    Great video! Not sure how it’s only at 1k views.
    Very disappointed to discover after this video that the game engine I was using only supports PCRE2

  • @bensalemi7783
    @bensalemi7783 8 дней назад

    If I wanted to right unreadable code, I'd enter an obfuscated C contest.
    I don't care what the theory is. I don't care how efficient you can make a regEx implementation. If you want to make things as efficient as possible you can always write in assembler, but people don't do that unless they absolutely have to (or for fun) and the reasons are obvious. For those same reasons, try to avoid regEx and realize they are from another time and solved a different problem than the ones we need to solve today.

  • @fugoogle_was_already_taken
    @fugoogle_was_already_taken Месяц назад

    Did not know about redos, thx

  • @zweitekonto9654
    @zweitekonto9654 Месяц назад +5

    what's a regular language tho. I don't think you ever got into the definitions in the series.

    • @drdca8263
      @drdca8263 26 дней назад

      A regular language is one that can be recognized by a finite state automaton

  • @Il_Dilettante
    @Il_Dilettante Месяц назад +1

    I'm a feynman bro except instead of feynman it's regex
    Thank God for RegEx

  • @i_am_lambda
    @i_am_lambda Месяц назад

    Why not scan the regex for back references and use lockstep if none exist?

    • @rngwrldngnr
      @rngwrldngnr Месяц назад +2

      Some tools probably do, but the specific website-taking-down cases were all backreferences related, so it wouldn't help with the really icky edge cases.

  • @rich_in_paradise
    @rich_in_paradise 13 дней назад

    Maybe because I learnt compilers (Dragon book, yay!) before I programmed in a language (or used an OS) that had regexes, I always thought people coverted their NFAs into DFAs before recognition.

  • @02052645
    @02052645 Месяц назад +1

    OK, but I need to know: can you parse HTML with a regex?

    • @stephenJpollei
      @stephenJpollei Месяц назад +3

      I think at best you can do a poor-man's tokenizer somewhat sanely using regexps. The whole grammar includes full recursion etc so no you can't properly parse arbitrary html just using a regexp. You can scrape stuff from html that follows known pattern or template but that can be extremely fragile when even minor changes to the format is done.

  • @ms-fk6eb
    @ms-fk6eb Месяц назад +18

    I hate this, the kind of "yes, but actually no" type stuff that, like the SO answer, consumes all that is pure and beautiful

    • @neoeno4242
      @neoeno4242  Месяц назад +3

      would you mind saying a little more about this @ms-fk6eb?

    • @logickedmazimoon6001
      @logickedmazimoon6001 Месяц назад +3

      nothing is black and white, nothing is known for sure, our concept of objective is how repeatable something is in one or more cases but there will always be one or more cases that repeatable thing is not... repeatable. When we say "yes" in response to a "does this happen?" actually means, "More than likely yes"

    • @Criz454
      @Criz454 Месяц назад +2

      @@logickedmazimoon6001 things literally are known for sure but ok

    • @logickedmazimoon6001
      @logickedmazimoon6001 Месяц назад +1

      @@Criz454 In what sense?

    • @wumi2419
      @wumi2419 18 дней назад +1

      ​@@Criz454no. Everything has it's limits of applicability. Any physical model breaks apart or is too complicated to be computed. Sometimes both.

  • @jammin023
    @jammin023 Месяц назад +1

    You keep saying "in big-O notation..." but then giving the time complexity's *name* rather than its big-O notation... e.g. quadratic time would be O(n²)

  • @viniciusrolandcrisci272
    @viniciusrolandcrisci272 Месяц назад +6

    how do you create your animations?

  • @studogYT
    @studogYT 17 дней назад +2

    Perl predates Linux by 3 or 4 years.

  • @Klosterhasi
    @Klosterhasi 20 дней назад

    10:25
    no way its actually been there the whole time holy shit

  • @mysteryman7877
    @mysteryman7877 25 дней назад

    Wait, do you issue regex licenses?

  • @ricardojunior4334
    @ricardojunior4334 10 дней назад

    "Someone called Rob Pike"

  • @v0id_d3m0n
    @v0id_d3m0n 20 дней назад

    That stackoverflow answer is hilarious. Thanks for sharing xD

  • @bellaF
    @bellaF 3 дня назад

    i agree

  • @caniggiasyabil470
    @caniggiasyabil470 26 дней назад +2

    bro you are GORGEOUS omg

  • @yxyk-fr
    @yxyk-fr Месяц назад +1

    0:35 yeah let's hack !!!

  • @omegahaxors9-11
    @omegahaxors9-11 Месяц назад

    I still don't know what a regex is.

    • @williamdrum9899
      @williamdrum9899 13 дней назад

      Regular Expression. It's a computer program that lets you tell if a piece if text fits a common format, such as an email address, a phone number, etc. Theoretically it works with any kind of text pattern. The downside is that it looks nonsensical to a human reader

  • @gasun1274
    @gasun1274 19 дней назад +1

    or do you just like looking like a medieval english lord

  • @gasun1274
    @gasun1274 19 дней назад +1

    r u enby

  • @stupidteous
    @stupidteous Месяц назад

    u look like finlay christie

  • @Tobiky
    @Tobiky 29 дней назад

    banger

  • @randomchannel-px6ho
    @randomchannel-px6ho Месяц назад

    How they got catastrophic xD

  • @leeschumacher8285
    @leeschumacher8285 Месяц назад

    G … N … U ?!?

  • @Sub0x-x40
    @Sub0x-x40 Месяц назад

    thank the Gods for gpt doing all my regex when I need it

  • @killerwrox
    @killerwrox 25 дней назад

    Why do you know so much about regex?

  • @SolaceCafe
    @SolaceCafe Месяц назад

    This video, while I would love to learn about regex efficiency as I use regex a lot, kinda lost me in talking a lot about different Algs and backtracking and such and not really relating it back to regex.
    I basically just went "ok, so just don't use lookahead or lookbehind probably?"
    Edit: I just noticed that every graph you show past like, halfway into the video, is showing what regex expression it is. Now I feel dumb.

  • @ducemano
    @ducemano Месяц назад +2

    Love me some diversity in the mathmetics field

  • @randomchannel-px6ho
    @randomchannel-px6ho Месяц назад

    Yo draw a line between 1 and 5 and 3 and 7 and flip 1:45 90 degrees clockwise. If you recognize it, wtf

    • @spaghettiking653
      @spaghettiking653 Месяц назад

      What, it looks like a cube?

    • @JadeVanadiumResearch
      @JadeVanadiumResearch Месяц назад

      Is this Tunic?????????????????

    • @volbla
      @volbla Месяц назад +1

      I'm guessing the tree of life. It turns out judaism was a finite state machine all along!

    • @JadeVanadiumResearch
      @JadeVanadiumResearch 7 дней назад +1

      @@volbla Yeah I guess that's more likely than Tunic... Good game though.

  • @tommihommi1
    @tommihommi1 25 дней назад +1

    most things done with regexes should instead be done with plain, readable, testable, debuggable code.

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq Месяц назад

    Ugh. Can’t do it.

  • @RifazNahiyanFWS
    @RifazNahiyanFWS 25 дней назад

    I *have* to ask a question that's just on my mind. Are you a trans? It's just your voice and how you look I cannot add up.

    • @humanfactorsio
      @humanfactorsio 23 дня назад +1

      Why does it matter at all? Stop transvestigating her. She's a woman, you don't need to know more than that. It's just you satiating your curiosity about the privacy of her life. This is the problem with society, you say we shove it down your throats - but you bring it up every opportunity you can and treat it like a dumb detective act. Let people have privacy.

    • @RifazNahiyanFWS
      @RifazNahiyanFWS 23 дня назад +1

      @@humanfactorsio honestly if you can't answer that question but also are showing your appearance, you're not doing privacy right.
      I think it's a natural question to ask when you can look at her but then after hearing her you can't make the match.

    • @AnimeGIFfy
      @AnimeGIFfy 15 дней назад +1

      You lookin for a date or something lmao

  • @ccriztoff
    @ccriztoff Месяц назад +2

    what a handsome young lady

    • @jvo1464
      @jvo1464 28 дней назад

      😂

    • @humanfactorsio
      @humanfactorsio 23 дня назад

      Why do you need to an an absolute bigot on here? Show some respect FFS.

  • @dadisuperman3472
    @dadisuperman3472 Месяц назад +6

    What are you???
    Man or woman???

    • @flyingzeppo
      @flyingzeppo Месяц назад +39

      Why does it matter? You're focusing on the wrong thing in this video.

    • @dinkc64
      @dinkc64 Месяц назад +5

      Isn't it obvious? 😞

    • @alysphacker
      @alysphacker Месяц назад +1

      @@flyingzeppo he wants to know obviously if he asked

    • @robinforbes9197
      @robinforbes9197 Месяц назад +15

      M8 we're just trying to live. Can you leave her alone?

    • @alphago9397
      @alphago9397 Месяц назад +2

      I honestly would have preferred the video a lot more if it were just narrated.

  • @ЕвгенийРезвей
    @ЕвгенийРезвей Месяц назад

    зачем он вырядился как баба?

    • @JordanManfrey
      @JordanManfrey Месяц назад +6

      Because they wont be jailed for it unlike in your country

  •  Месяц назад

    If You're so smart, just write better and be the hero. Huh? I guess not a chance, right? The complexity and unefficiency probably has its purpose. Well. But You just needed views, right?

    • @rngwrldngnr
      @rngwrldngnr Месяц назад +12

      The video literally explains the use of backtracking, and how removing it could lead to safer, faster code, admitting that doing so would involve removing back references and their expressive power. Besides, it's comparing the performance between two algorithms, both of which were written by the same person, so it's hardly a callout.

    • @JordanManfrey
      @JordanManfrey Месяц назад +1

      Let me guess, self-taught

  • @humanfactorsio
    @humanfactorsio 23 дня назад

    You are genuinely one of the greatest video creators I've seen in so long (also you are goals in general girl 🥹💕). I have a massive holiday watchload now. Thank you so so so so much for your hard work