this regex identifies prime numbers (reaction)

Поделиться
HTML-код
  • Опубликовано: 23 июн 2023
  • Check out the post here: www.noulakaz.net/2007/03/18/a...
    To play with this regular expression yourself, try this: regex101.com/r/N0qvOm/1
    My Links
    shaky.sh
    shaky.sh/tools
    / andrew8088
  • НаукаНаука

Комментарии • 346

  • @ccgarciab
    @ccgarciab Год назад +1424

    I think you should clarify that it works with numbers in unary. It took me until the examples to realize this, and it's the key to understand how matching is used as a stand in for division.

    • @wlockuz4467
      @wlockuz4467 Год назад +77

      Yeah that was a big oversight. I only realized what was going on when he go to the code.

    • @scottcallaway9586
      @scottcallaway9586 Год назад +105

      Yeah became super obvious to me when he had four 1's on the screen and said that was "4". Should be clarified in the title otherwise it's clickbait.

    • @Raiment57
      @Raiment57 Год назад +48

      Yes! Just wasted several minutes waiting to see how on Earth this would work. Annoying.

    • @Primeagen
      @Primeagen Год назад +13

      Ohh! You are right, group matching is doing what division would do

    • @forsyth5793
      @forsyth5793 Год назад +2

      Ohhhhhhhh I wish I’d read through the comments before watching

  • @markzuckerbread1865
    @markzuckerbread1865 Год назад +175

    Sieve of Eratosthenes? Trial division? Just use Regex bro 🗿

    • @GuruEvi
      @GuruEvi Год назад +14

      This seems to me to be the sieve of erastothenes method (mark every multiple of 2 and subsequent matches of the regexp as not prime until n), just using RegExp as the language (which is in itself as a language Turing complete)

    • @markzuckerbread1865
      @markzuckerbread1865 Год назад +3

      @@GuruEvi interesting, because honestly I didn't understand how it worked :))

    • @TerrexoDesign
      @TerrexoDesign Год назад +9

      @@GuruEvi this is actually just dividing by all numbers...

    • @ccgarciab
      @ccgarciab Год назад +7

      @@GuruEvi standard regex is not Turing complete. Do they reach completeness with Pearl extensions and so on? Sounds implausible to me
      Also, the algorithm used in the video is just trial division.

    • @ilangated
      @ilangated Год назад

      @@GuruEvi regex aint turing complete

  • @megaing1322
    @megaing1322 Год назад +260

    I think it should be mentioned that this "regex" are not actually regular expressions in that Computer Science sense: They don't define a regular language. This is because of the backreference which brings it up to a context free grammar, and then actually a repeated backreference which brings it up to a context sensitive language.
    Using this as an example as to why you should always be careful with your regex is slightly weird for that reason: If you are using backreferences, especially repeated backreferences, you probably shouldn't be using a regex engine for that.

    • @seiryn3082
      @seiryn3082 Год назад +10

      Yeah, regex + back reference are powerful enough to solve NP-completes problems which is way more powerful than they should be

    • @megaing1322
      @megaing1322 Год назад +1

      @@seiryn3082 Really? How do those get around the polynomial space constraint? Do you have a link for that? I do believe that they cover context sensitive and even LBAs (although I am not sure about that), but isn't NP > PSPACE?

    • @seiryn3082
      @seiryn3082 Год назад

      ​@@megaing1322 You can find an article if you search for "perl regular expression NPC-3SAT"

    • @antonf.9278
      @antonf.9278 Год назад +3

      ​@@megaing1322NP ⊂ PSpace
      A simple PSpace algorithm for solving any NP problem is to do a depth first search over the possible solutions to the NP problem. The space needed will only ever be polynomial as the solution will only be polynomial and the only other state needed is the current depth.

    • @mad_vegan
      @mad_vegan Год назад

      I didn't know any of that. Do you recommend any video or article explaining the different types of "languages" you're talking about? And why should we avoid using regex for that kind of stuff?

  • @henrycgs
    @henrycgs Год назад +414

    Having once proven mathematically for a computer theory assignment that it was literally impossible for regular expressions to find prime numbers, this immediately caught my eye. Luckily, it doesnt break maths: this has much more memory of the string than a formal regex should have, as it remembers the number of ones in the first match.

    • @marv4054
      @marv4054 Год назад +4

      can you share a link of that proof? seems like an interesting read

    • @IStMl
      @IStMl Год назад +61

      @@marv4054 you prove the language of prime numbers is not regular

    • @ApplepieFTW
      @ApplepieFTW Год назад +15

      I don't exactly know how this would have much more memory of the string than a formal regex (what is a "formal" regex) should have

    • @user-qm4ev6jb7d
      @user-qm4ev6jb7d Год назад +38

      I think this is at least a context-sensitive grammar, if not worse than that. I mean, the operation "repeat the previously captured string" is literally *sensitive* to the *context.* But maybe it doesn't even fit the CSG class.

    • @henrycgs
      @henrycgs Год назад +53

      @@marv4054 I don't have one right now, but it's not too difficult.
      First, you need to understand what a regular language is. A "language" is any set of strings. A "regular language" is a language in which all of its strings can be "generated/recognized" by simple state machines. (omitting a lot of things for simplicity)
      Formal regular expressions are a way to describe those languages/state machines. What we commonly refer to as "regex" is a specific syntax/implementation based on the theoretical background of these formal expressions.
      One can, after some thought, prove a few things about regular languages. One of them is that if you have an infinite language (one with infinitely many unique strings), there must be a point at which they're all just repeating one identical middle bit forever. In other words, they can't get more complicated than a starting bit, some repetition, and an ending bit. More importantly, for any number after some point, there exists a string in this language with this middle bit repeated this number of times.
      Well, think of the language of the prime numbers. It could be {"2","3","5"...} or {"11","111","11111"...}, it doesn't matter how you represent it. It would be very convenient if there was a point after which every single prime number was just an identical beginning, end, and a repeating middle bit, right? Imagine how easy it would be to generate prime numbers! just add another repetition and you got the next. Sadly, it's very obvious that this isn't the case, and there are many ways to prove this.
      Therefore, it's impossible for a regular expression to identify prime numbers, because that would mean that the language of prime numbers is regular, and that would mean that the primes follow a very obviously wrong dumb pattern.
      Does the regex in this video show that I'm wrong? No, it just shows that regexes aren't REALLY regular.

  • @szhzs6121
    @szhzs6121 Год назад +102

    this is a great example of a problem which you should NOT solve using regular expressions. not incidentally, that's true for 99% of problems out there.

    • @SlimThrull
      @SlimThrull Год назад +30

      "If you have a problem and you used regex to solve it, you now have two problems."

    • @rudyNok
      @rudyNok Год назад

      @@SlimThrull * now

    • @rafa_br34
      @rafa_br34 11 месяцев назад +1

      Yeah I rather use something "normal" other than allocate a 4gb string to test a number like 4.294.967.295(not to mention it would still need to compute the regex)

    • @Skarpo89
      @Skarpo89 11 месяцев назад

      Sadly

  • @AScribblingTurtle
    @AScribblingTurtle Год назад +44

    Didn't know you could use references to capture groups with in the expression itself. 😳 Interesting to see, how it interacts with the lazy/not greedy quantifier. Thanks for sharing this. 🙏

  • @aelolul
    @aelolul Год назад +23

    I am reminded by the book _Gödel, Escher, Bach_ which uses a technique similar to this to discuss the nature of mathematical proofs using nothing but typographic symbol manipulation. If you haven't heard of it, I highly recommend the book.

    • @Slgjgnz
      @Slgjgnz 11 месяцев назад +2

      That was the first book that kinda broke my mind in an interesting way. I need to find it again.

    • @fllthdcrb
      @fllthdcrb 11 месяцев назад +1

      @@Slgjgnz If it helps, the full title is _Gödel, Escher, Bach: An Eternal Golden Braid,_ and the author is Douglas Hofstadter.

  • @22dunix3
    @22dunix3 Год назад +17

    There is also tool to visualize regex, called regexper. Visually more understandable diagrams of what is going on if youre analyzing regex and not writing it yourself

  • @blockmath_2048
    @blockmath_2048 11 месяцев назад +2

    TL/DW:
    The regex matches a composite number of "1"s. It does this by looking for A and B, where A and B are >= 2, A*B = N, and constrains them by A being (11+?) and B being \1+

  • @israelssantanna
    @israelssantanna Год назад +8

    Absolutely fantastic! Please do more regex videos 😊

  • @NostraDavid2
    @NostraDavid2 Год назад +10

    I took a minor where I learned to parse text via Haskell (learning Functional Programming along the way), and learning about context-free languages, regular languages, NFA, DFA, State Machines and how they relate to Regular Expressions really blew my mind. I can heartily recommend anyone take a related course.

    • @NostraDavid2
      @NostraDavid2 Год назад

      No videos for 6.005, but at least there's a book and website.

  • @pwhqngl0evzeg7z37
    @pwhqngl0evzeg7z37 Год назад +53

    This is pretty sick, and surprisingly intuitive. I didn't know about the +? quantifier, but after that it all made sense.

    • @davidh.4944
      @davidh.4944 Год назад +9

      Support for it depends on the flavor of regex you are using, but most of the more advanced variations have some form of non-greedy syntax.

    • @isomeme
      @isomeme Год назад +4

      I wish more people knew about the non-greedy quantifier. It makes some kinds of matches much easier to implement, and many work-alike matches far more efficient.

    • @xcoder1122
      @xcoder1122 11 месяцев назад

      Actually there are three official REGEX standards defined by POSIX which itself is a IEEE standard: BRE (Basic Regular Expressions), ERE (Extended Regular Expressions) and SRE (Simple Regular Expressions). But SRE is deprecated, so just forget that you've ever heard it. UNIX `grep` usually works with BRE, unless you call it as `grep -E`, then it works with ERE (`egrep` is a shortcut for `grep -E` BTW).
      Then came PERL and PERL created its own flavor or REGEX. Most REGEX features you see today that were not part of the POSIX standard are in fact PERL extensions. This includes non-greedy (aka lazy) matching for quantifiers (*, +, ?). Most implementations of other programming languages copied the PERL implementation usually with just a bit less features (they are all subsets of the PERL 5 implementation from 1994). Some added a few new features on their own (e.g. Python), which usually were later backported to PERL. All of them are backward compatible to ERE.
      Problem with lazy matching is that it is way harder to implement than greedy matching and also requires way more resources, that's why some implementation leave it out on purpose. Even modern `sed` does not support it. Note that POSIX only requires `sed` to support BRE and already supporting ERE (with `sed -E`) is a non-POSIX extension. `grep` on some systems does support `grep -P`, where P stands for PERL and in that mode it will actually support PERL extensions, including lazy matching.

  • @psriniv1
    @psriniv1 11 месяцев назад +1

    I completely missed that we were talking about binary representations of digits here until over halfway through. Suddenly the whole thing makes so much more sense lol

    • @ddichny
      @ddichny 11 месяцев назад +1

      Not binary, unary.

  • @pepkin88
    @pepkin88 Год назад +24

    In this case of this regexp, the greediness of the quantifier doesn't matter for the purpose of finding a match, but matters for performance.
    For example, take 12 ones.
    On regex101 you'll see that the original version goes 11 steps to find a match. But the version with a greedy quantifier `(11+)` will find a match after 23 steps. (it also matched 6 ones in the group 1, instead of 2 ones)
    But in case of prime numbers, it doesn't matter, because no matter from where it starts looking, it has to go through all of the combinations. That's why the string of 11 ones will return no matches after 35 steps, regardless if we use `+` or `+?`.
    Also, in your JS, you could use something, I think, more suiting, instead of padStart :)
    "1".repeat(num)

  • @tom-kz9pb
    @tom-kz9pb Год назад +2

    It is easy to write a logically correct script that can identify prime numbers. What is extremely difficult is to get the script to run fast enough to be useful, and/or not to blow away available memory, once that the numbers get very big.

  • @antoinecaron8915
    @antoinecaron8915 Год назад +2

    It reminded me of a similar conversion to roman numeral process where the input numbers were a expressed as a series of "I" and you're map/replace groups in sequence.. "IIIII" would become "V".. then "VV" would become X.. etc.

  • @catcatcatcatcatcatcatcatcatca
    @catcatcatcatcatcatcatcatcatca Год назад +37

    This is also a good example how regular expressions can be unintuitively expensive computationally. I’m sure you could use similar iterative process but for all n-length substrings of characters just to ultimately match every possible string with only alphanumeric characters.
    Someone should make that an npm package.

    • @TheBiggreenpig
      @TheBiggreenpig Год назад +5

      Yeah, and hopefully many lazy people will use it as a real prime finder.

    • @akam9919
      @akam9919 Год назад

      @@TheBiggreenpig you love chaos...don't you.

    • @darrennew8211
      @darrennew8211 Год назад +3

      Regular expressions aren't unintuitively expensive computationally. A regex can be, but a regex is much more complex than a regular expression. Regular expressions can be evaluated in linear time.

    • @halimkbas2883
      @halimkbas2883 7 месяцев назад

      I checked 13163, which is a prime. It exploded. Hahhaha

  • @jpleveille
    @jpleveille Год назад +13

    Tried with PHP and it will only detect primes below 49152 (48k); past that limit the subject is either too long or the capture loop exceeds some internal limit (it won't crash, it will just stop detecting primes)

    • @StefanReich
      @StefanReich Год назад +2

      Damn it. I was hoping to find the biggest prime ever detected using this method

    • @SlimThrull
      @SlimThrull Год назад

      Out of curiosity, did it state how many steps it took for the last prime below 49152?

    • @jpleveille
      @jpleveille Год назад

      ​@@SlimThrull I don't know about number of steps, it should be easy to calculate. Starting with 49146 it returns an error "JIT stack limit exhausted" (PHP 8.0.29); to say the regex is inefficient is an understatement. I didn't try disabling JIT but read somewhere the JIT stack size for PCRE in PHP may vary between 32k and 192k (although PCRE lib recommends between 512k and 1M so :shrug:)

  • @jonathansaraco
    @jonathansaraco 11 месяцев назад

    This regex is essentially a sieve of Eratosthenes. Pretty clever.

  • @FrankHarwald
    @FrankHarwald Год назад +3

    Could you maybe put somewhere in the description that this one is supposed to match an unary number of "1"s & NOT decimal/binary numbers? Thanks

  • @chazits
    @chazits Год назад +1

    I solved a problem with regex , now I have two problems.

  • @xHomu
    @xHomu Год назад

    Donno why RUclips recommended your video, but you're absolutely right, this is super cool!

  • @gameplaydosabao
    @gameplaydosabao Год назад +3

    It is in fact fetching the entire string trying to find multiple of 2, then multiple of 3, and so on until it runs the entire string trying to mach 111… with exact repetition until the end. This means that the complexity of such method is exponential minus 1. In other words, is a brute force method with exponential growth. Don’t try to do it with larger numbers. 😂

  • @tensorflaw
    @tensorflaw Год назад +2

    I look forward to seeing this incorporated into a widely used public repo on npm

  • @ThankYouESM
    @ThankYouESM Год назад +1

    I long ago had in mind to create a Reverse Regex algorithm in which we create test examples alongside expected results in which an A.I. would try to provide a minimalist prediction of what the Regex should be if a pattern is found.

  • @WinterHoax
    @WinterHoax Год назад

    If i do like a 100 digit number will say whether it is prime in like no time like less than 1 ms?

  • @Schnolle
    @Schnolle Год назад +27

    After figuring out the unary representation of numbers, it is immediately obvious that:
    1. This is an implementation of the Sieve of Eratosthenes
    2. It can be simplified by removing the first half, as 1 is NOT a prime
    3. Regular expressions rule!

    • @isomeme
      @isomeme Год назад +14

      You do need the first half, since the regex as a whole returns true for "not prime". Without the first half, 0 and 1 would return false, indicating that they are prime.

    • @TheJamesM
      @TheJamesM Год назад +9

      1. No, this is a purely naïve search for divisors. It first tests for 1, then very painstakingly checks whether any integer from 2 to n-1 divides cleanly into n. It essentially does so by iterative addition (by way of repeated backreference).
      2. The pattern tests for divisibility, so the primes are the non-matches. Hence the need for the special case for 1.
      3. Absolutely correct, though they should be wielded judiciously - certain combinations of quantifiers can require the engine to walk back and forth through the string recursively before it can definitively declare that no match is possible.

    • @samtux762
      @samtux762 Год назад +1

      111 = 37 * 3 is not a prime. But the regex sais it is.

    • @isomeme
      @isomeme Год назад +3

      @@samtux762 , the strings aren't decimal numbers. It's a string of '1' characters, and what the regex determines is whether the length of that string is prime (returning false for primes, true for non-primes). The use of '1' as the repeating character was arbitrary, and I think it was a bad choice. I'd prefer something like 'x' to make it clear that the string doesn't represent a number.

    • @felixmerz6229
      @felixmerz6229 Год назад

      @@samtux762 1 = 1, 11 = 2, 111 = 3, etc. This isn't a decimal representation, it's a *unary representation*

  • @ajinrenfire
    @ajinrenfire Год назад

    hey 👋, what is the font used for commenting the code?

  • @IantoCannon
    @IantoCannon Год назад

    I tried this as a search in vim and I got "E65: Illegal back reference". Is this only in perl? What type of value does it return?

  • @AnthonyFammartino
    @AnthonyFammartino Год назад +1

    You should reupload and explain in the first minute of the video that this works in unary. It was not apparent until you mentioned that "1111" was the value "4". Either way, interesting video and find so thank you for the breakdown.

  • @askegg
    @askegg Год назад

    I understood next to none of this, but feel smarter for having watched it.

  • @lainverse
    @lainverse Год назад +1

    I guess it's cool you can do this with regular expressions.
    However, this is a perfect example of how to misuse a regular expression. Not only this is a horrendously slow and memory inefficient way to determine is number a prime. Not only this is an example of using regex for tasks they are not intended to do. But in billions range it start to fail with "RangeError: Maximum call stack size exceeded" since RegExp engine have its own limits and end up going too far with all these back-references. Furthermore, at larger numbers padStart/padEnd start to fail too and even wouldn't they fail you'll hit memory limit of your PC just a bit later down the line.
    > Some people, when confronted with a problem, think "I know, I'll use regular expressions."
    > Now they have two problems.

  • @MultiUltimater
    @MultiUltimater Год назад +1

    For those of you still wondering how this works, the regex expects a unary representation of a positive integer. The real magic is \1+ which matches the first group again one or more times which emulates trying to count by 2's, then by 3's, etc.
    For example 9 is not prime since it is divisible by 3. And the regex would return true indicating it's not prime.
    Why? When the regex gets up to counting by 3's, and group 1 would look like this: (111), the \1+ would match group 1 an additional one or more times. When the \1+ is repeated twice, we have the whole expansion (111)111111 and thus the regex returns a match, indicating 9 is not prime.
    In practicality, you wouldn't use a regex for this as it's very expensive and would hit a memory limit much sooner than a non-regex approach. Similar logic could be implemented in JavaScript like so which doesn't rely on a regex:
    function isPrime(n)
    {
    if(n < 2 || Math.floor(n) !== n) {
    return false;
    }
    for(let i = 2; i < n; i++) {
    if(n % i === 0) {
    return false;
    }
    }
    return true;
    }

  • @jacksims8018
    @jacksims8018 Год назад +7

    Definitely needed the explanation that this is neither base 10 nor binary, but a string of 1's n long where n is the number you're checking! Otherwise awesome video :)

  • @tanchienhao
    @tanchienhao Год назад

    Can somebody with theoretical cs knowledge explain what regex can and cannot do (type 2 grammar or sth) from a theoretical standpoint?

  • @nyxcode2818
    @nyxcode2818 Год назад +8

    dont wanna be too pedantic, but these are not regular expressions. with the back-reference, the resulting grammar is context-free iirc.

    • @andrew-burgess
      @andrew-burgess  Год назад

      Woah, never heard of context-free grammars before, looked them up briefly just now and will explorer them more later. Thanks for pointing this out!

    • @nyxcode2818
      @nyxcode2818 Год назад +1

      @@andrew-burgess it's a cool rabbit hole, enjoy. It's the cornerstone of theoretical computer science. Formal languages are categorized in the chomsky hierachy, and "regular" languages are the most basic ones. I suspect RegEx started out as a regular language, but got progressively more powerfull with features like backtracking.

    • @NostraDavid2
      @NostraDavid2 Год назад +1

      ​@@andrew-burgessregex part of bigger whole, like regular languages. Very cool.
      Fuck RUclips for removing the longer comment I wrote.

    • @AlFredo-sx2yy
      @AlFredo-sx2yy Год назад +2

      ​@@NostraDavid2​ Your comment must have been too offensive for youtube's liking. How dare you talk about regex in depth?

    • @megaing1322
      @megaing1322 Год назад

      Not context free, but context sensitive because the backreference is repeated.

  • @nikolaymatveychuk6145
    @nikolaymatveychuk6145 Год назад

    what is the reason to use a regular expression here? :) It might work much slower than a common division test. MUUUUUCH slower :)

  • @reginaldfrank656
    @reginaldfrank656 Год назад +1

    Bro the pipe is not a Boolean or, it's a set union. Each regular expiration defines a language which is a set of strings. The pipe combines two separate expressions into one whose language is the union of the two expressions.

  • @ManSubhu
    @ManSubhu 11 месяцев назад

    Taking a prime and seeing if it dvides into every possible lower prime starting from 2 seems a very inefficient way of finding primes, what am I missing?

  • @kennethbeal
    @kennethbeal Год назад

    Thank you!

  • @integerdivision
    @integerdivision Год назад +2

    Quick definition of primes: A number with exactly two divisors. That excludes one since it only has one divisor. No caveats necessary.

    • @samazon52
      @samazon52 Год назад +2

      Exactly two positive divisors?

    • @SlimThrull
      @SlimThrull Год назад +1

      @@samazon52 Exactly two positive integer divisors?

    • @spicynoodle7419
      @spicynoodle7419 Год назад

      Prime numbers are integers that are divisible only by 1 and themselves. So for 1 to be prime, it has to be divisible by 1 and itself. In language it sounds correct that it should be prime because the letter 1 and the word "itself" are different but "itself" refers to 1.
      This means that 1 has to be divisible by 1 and 1, that's only one divisor repeated twice. So it doesn't really have 2 divisors. If we consider 1 to be prime then that would mean that 1 also has infinite divisors.

    • @felipevasconcelos6736
      @felipevasconcelos6736 Год назад +2

      @@spicynoodle7419 until the early 20th century, the most common definition of prime numbers did include 1. That made the Fundamental Theorem of Arithmetic more complex, since you had to specify every integer greater than 1 could be represented as an unique product of *non-unitary* primes.

  • @kanuos
    @kanuos Год назад +2

    Which font and neovim theme are you using? Your editor looks amazing!

    • @andrew-burgess
      @andrew-burgess  Год назад +2

      Catppuccin theme with Mono Lisa font!

    • @israelssantanna
      @israelssantanna Год назад

      @@andrew-burgess I love catppuccin (all variants), but latelly I'm using a modified version of latte with bg color of noctis lux. Have you used any noctis theme yet?

  • @freedom_aint_free
    @freedom_aint_free Год назад +1

    I'd classify this as Clickbait.

  • @Bird2920
    @Bird2920 Год назад

    Regex is amazing, and I love finding interesting ways to use it. PS You look and sound like you could be Paul Rudd's brother.

  • @GameCyborgCh
    @GameCyborgCh Год назад

    regex is black magic

  • @tinyBigGAMES
    @tinyBigGAMES 11 месяцев назад

    NICE!

  • @ddichny
    @ddichny 11 месяцев назад

    Might help to clarify that this regex doesn't "identify primes" (i.e. it does not match primes), it matches strings that have a composite (non-prime) length. From that you can "identify primes" by what it *doesn't* match, sure, but it's confusing to leave the impression that it matches primes when the point is that it matches things that *aren't* prime.

  • @semitangent
    @semitangent Год назад +1

    Awesome, it took me this video to realize that the +? modifier does the "counting up" for the trial division, sadly until the number being tested instead of, as already mentioned, its square root.

    • @SurenEnfiajyan
      @SurenEnfiajyan Год назад

      The expression will work without '?' as well , but it will be probably slower since it will start to check from the longest repeating strings.

    • @semitangent
      @semitangent Год назад

      @@SurenEnfiajyan Interesting, I certainly did not realize this. I will have to do some timing tests to see where the difference lies.

  • @Wolf-if1bt
    @Wolf-if1bt Год назад

    Maximum ingenuity for minimum efficiency

  • @TheSkepticSkwerl
    @TheSkepticSkwerl Год назад

    I guess the problem here, is performance and how much memory it uses

  • @Mr.Not_Sure
    @Mr.Not_Sure Год назад

    Expected to see regex on decimal representation of number.

  • @SurenEnfiajyan
    @SurenEnfiajyan Год назад +1

    ^1{0,2}$|^(1{2,})\1+$ this will work too, but will probably be slower for larger numbers.

  • @henrym5034
    @henrym5034 Год назад

    The site is now under maintenance.
    Look what have you all done.

  • @matthewj5333
    @matthewj5333 Год назад

    Looks like the sieve of Eratosthenesbut written using regex!

  • @gliaMe
    @gliaMe 11 месяцев назад

    Is it rejex or regex? It's like jif vs. gif to me.

  • @skye0_
    @skye0_ Год назад

    Shouldn't "11" also not be detected, considering 2 is a prime number?

  • @anafro_ru
    @anafro_ru Год назад +5

    Wow
    The question is: is regex faster than classical prime number function? Like looping thought i = 2 to i = sqrt(n) (afaik)

    • @siborgium9022
      @siborgium9022 Год назад +15

      No. It uses backreferences, which are extremely slow. It can't use integers and in general efficient number representation, can't use proper math, and it uses much more memory to store the NFA itself and matches. It has to be compiled and fed a specific number representation.
      Looping, on the other hand, is simple and efficient, not to mention numerous algorithmic optimizations to the process.

    • @anafro_ru
      @anafro_ru Год назад +1

      @@siborgium9022 thank you so much for this descriptive explanation! I really appreciate it ☺

    • @andrew-burgess
      @andrew-burgess  Год назад +6

      For sure. For me, this is mainly about learning a creative way to combine a lazy quantified and a back reference. Definitely not discovering new primes with this 😂

    • @anafro_ru
      @anafro_ru Год назад +2

      @@andrew-burgess haha, I would benchmark this just for fun 🤣

    • @wadecodez
      @wadecodez Год назад

      Yes and no. This is like asking if a function is faster than a lambda. Regex is like shorthand for programming. You can easily write sluggish regular expressions just like you can easily write sluggish loops. It all comes down to your understanding of time complexity.

  • @meargon9186
    @meargon9186 Год назад

    Class: we have only two wildcards: * and +. Easy.
    Exam: This is the regex for detecting prime numbers. Now make a turing machine for accepting primes.

  • @Linuxdirk
    @Linuxdirk Год назад

    So we no longer need complex calculations but instead just slap random strings of ones against this regex?

  • @IARRCSim
    @IARRCSim Год назад

    Does that expression really work for an infinite range of primes or is it correct only for prime numbers less than 200?

    • @neilthomas2549
      @neilthomas2549 Год назад

      On a machine with infinite memory, with a regex engine that can use that memory, yes, it works for an infinite number of primes. A hard limit is the memory on the machine. A hard limit, unless you can write a better regex engine, is the amount of memory it can actually use. An soft limit, that becomes harder real quick is the time that this horribly inefficient algorithm takes to execute on longer strings.

    • @IARRCSim
      @IARRCSim Год назад

      @@neilthomas2549 That's really interesting. That means a finite state machine is enough to decide if an integer is prime. I just tested up to 5000 and it only failed for the input 1. a regex working on a unary number would be O(n) for integer value n. What is the lower bound for the primality test problem? I wonder if the lower bound proves that a regex for binary would be completely impossible since a regex testing with binary would solve the problem in O(log n).

    • @calvindang7291
      @calvindang7291 11 месяцев назад

      @@IARRCSim A backreference is not a valid operation in a regular expression, and so the regex shown in the video cannot be turned into a deterministic finite automaton.

    • @IARRCSim
      @IARRCSim 11 месяцев назад

      @@calvindang7291 is a pushdown automaton powerful enough to represent this? Processing the unary with this backreferenced expression would still take O(length of the unary), right?

    • @calvindang7291
      @calvindang7291 11 месяцев назад

      ​@@IARRCSim A PDA is not powerful enough, either. (For instance, the language {WW | W is a word} matched by (.*)\1 is not context free.)

  •  Год назад

    Cool video! Note: instead of using "".padStart(num, "1") or "".padEnd(num, "1"), you can juste use "1".repeat(num) in JavaScript.

  • @orychowaw
    @orychowaw Год назад

    Indeed, a lot of problems become much simpler in unary notation. But as somebody already mentioned, it is a cumbersome notation...

  • @Jkauppa
    @Jkauppa Год назад

    easily amused

  • @youtubepooppismo5284
    @youtubepooppismo5284 Год назад

    Now match a string that is "1" repeated p times where p is a prime.
    Level: pumping lemma impossible

  • @cmyk8964
    @cmyk8964 Год назад +1

    Oh, that’s clever. Sounds unperformative as hell, but that’s clever.

  • @MadsterV
    @MadsterV Год назад

    I wonder about the performance of this....

  • @InverseTachyonPulse
    @InverseTachyonPulse Год назад

    The dark side of the Force is a path to many abilities some consider... unnatural 👀

  • @mahadevovnl
    @mahadevovnl Год назад +15

    For those, like myself, who have no idea why any of this matters: Prime numbers are the hidden gems of computer science. Used in cryptography, they're the key to a lock only you have - think RSA encryption. Prime-based hashing also neatens data storage, minimising collisions and speeding up access. Finally, prime numbers grease the wheels of complex algorithms, making everything run that bit smoother. All in all, they're an essential cog in the code machine.
    Super interesting stuff. Though in my 22+ years of being a software engineer I've never directly used it, nor could I explain what a prime number is if you asked next week, this kind of information is priceless for people who are very deep into the actual science part of computers.

    • @andrew-burgess
      @andrew-burgess  Год назад +4

      Thanks for adding this! Totally missed explaining the what/why of prime numbers! Appreciate you sharing this for everyone else 💚

    • @MortvmMM
      @MortvmMM Год назад +1

      just wait until quantum computers become a thing and make all hashing algorithms and bitcoin obsolete

    • @ccgarciab
      @ccgarciab Год назад +3

      Prime numbers are important in mathematics because they can be though of as the atoms of the natural numbers: a unique product of primes makes up any other number. That property makes them omnipresent in math, particularly in number theory.
      Cryptography makes use of number theory and therefore of primes. But primes are important for more than cryptography.

  • @black_platypus
    @black_platypus Год назад

    "RejEx"?
    Aah yes, good old rejular expressions 😜

    • @darrennew8211
      @darrennew8211 Год назад

      But they're not regular expressions. They're regex's, which aren't regular expressions.

    • @black_platypus
      @black_platypus Год назад

      @@darrennew8211 Which stands for...? 😅
      (I know modern RegEx implementations exceed formal regular language, but the term still comes from _Regular Expression,_ which they are often also still referred to as, anywhere outside formal language theory)

  • @trevorkerney
    @trevorkerney Год назад +1

    Say you frequent Hacker News without saying you frequent Hacker News

    • @andrew-burgess
      @andrew-burgess  Год назад

      Lol, was it on HN this week? I saw it on mastodon 🙃

    • @NostraDavid2
      @NostraDavid2 Год назад

      HN has a tendency to spill into other social sites (reddit, mastodon, lemmy, Twitter, etc.)

  • @zecuse
    @zecuse 11 месяцев назад

    1:37 Haven't finished yet, but holy crap, it looks like it's using the regex engine to try and find 2 factors of the number! If it can't, then it doesn't match and is therefore prime!
    Edit: Yup. An elegant looking prime detector if only overly brute force-ish. Only knowing if a number is a factor or not until you get to the end of a long string of 1s is very... unique.

    • @andrew-burgess
      @andrew-burgess  11 месяцев назад

      "Very... unique"
      You win most charitable comment in the video 😄

    • @zecuse
      @zecuse 11 месяцев назад

      @@andrew-burgess It reminds me of Willans formula which counts its way up to the nth prime. As you work through that, you'll generate a list of 1s until you reach the nth prime which will only produce 0s afterwards.

  • @paulimriss
    @paulimriss 11 месяцев назад

    this is awesome but implementing to a useful way takes too much memory too fast, just think of it taking 1 byte large to represent just a unit of a number then if the number is n it will use n bytes to store it

  • @wChris_
    @wChris_ Год назад +1

    This Unfortunately only works for unary numbers and it in fact does not match primes, but rather composite numbers. But to top it all of its not even regular! (this is because of the back reference!)

    • @erikkonstas
      @erikkonstas Год назад

      Not all regexes define regular languages, even though the same term is used...

    • @lucass8119
      @lucass8119 Год назад

      @@erikkonstas A regular expression SHOULD recognize only a regular language, in the strict sense. But, as usual, library implementations have nice to have features. Which... makes them not really regular expressions. But it doesn't matter, because they're more powerful and everyone likes it.

  • @framegrace1
    @framegrace1 11 месяцев назад

    This comment section is a perfect example of the internet nowadays.
    Someone post a curiosity and everyone talks and criticises very seriously about whatever obviety they happen to know about the subject.
    Guys, no one will use this for anything. No one is taking it seriously. relax and praise the cleverness of the method.

  • @PanduPoluan
    @PanduPoluan Год назад +1

    The number of backtracking is outright crazy, though.
    There are tools that will show you how much the regex engine performs backtracking to determine match/unmatch. I forgot what they are, though. I think there's at least a free one.
    Why is this important? Several years ago, I think CloudFlare or Akamai made a mistake with their regex resulting in their CDN failing to deliver stuff with 504 error as the number of backtracking for every match attempt ballooned to like in the millions of billions...

  • @KojiKazama
    @KojiKazama Год назад +1

    I always thought it was pronounced reg ex. "reg" for regular. I see now it is actually quite common to say rejex.

    • @darrennew8211
      @darrennew8211 Год назад

      Probably better to say rejex, since what a regex matches isn't regular. ;-)

  • @Yotanido
    @Yotanido Год назад

    This is pretty awesome. I have to say, though, this is stretching the definition of "regex". If you use back references, it's not exactly regular, is it.

    • @darrennew8211
      @darrennew8211 Год назад

      Apparently, the convention now is to call it a "regex" if it's not regular, and a "RE" or "regular expression" if it is.

  • @Escviitash
    @Escviitash Год назад

    The title of this video sound to me like you can enter a number, e.g 252, to see if it is prime.
    According to the explanation there will be no match if the entered number is prime.
    252 will also create no match which by the same logic should therefore be prime.
    In fact any number that are not a composite number of 1 will be prime by that logic.
    A little challenge. Determine whether the number 1000000007 is prime using only this function.
    It will only require a string of 1s that are 240 times longer than the total number of letter in the combined works of Shakespeare

  • @SlimThrull
    @SlimThrull Год назад

    Clever for sure. But the amount of processing power needed to figure out large numbers is absurdly high.

  • @TheGreatAtario
    @TheGreatAtario Год назад

    Rejex?

  • @SimonM90
    @SimonM90 Год назад

    ehm theres really no need to show people how to use this in javascript since it's very beautiful but also probably the most inefficient way to check for prime numbers, ever.

  • @josephs3973
    @josephs3973 Год назад

    It's a pretty cool trick and a creative use of regex but when it became clear how it works it instantly felt like a bad idea.

  • @John_Fx
    @John_Fx 11 месяцев назад

    i have a problem.
    use a regex!
    now I have two problems.

  • @martijn-vels
    @martijn-vels Год назад

    The time complexity of this regex tho...... 😅

  • @davidellsworth4203
    @davidellsworth4203 Год назад +3

    My mind was blown too, when I found out about this in 2014. I was so fascinated that I went on to extend it to many more number sets and functions, most of which I posted on code golf StackExchange as username Deadcode. If you Google "Match correct statements of multiplication" (with quotes) it'll bring up a summary. I've also recently written a .NET regex that matches prime numbers in decimal instead of unary.

  • @gFamWeb
    @gFamWeb Год назад

    I prefer regexr over regex 101, but both look cool!

  • @Tijme
    @Tijme 11 месяцев назад

    function isPrime(num) {
    return /^1?$|^(11+?)\1+$/.test("1".repeat(num))
    }

  • @MMarcuzzo
    @MMarcuzzo 11 месяцев назад

    Wow.

  • @The1RandomFool
    @The1RandomFool Год назад

    It's interesting, but man is it horribly inefficient and impractical for speed and memory usage. I tested it in Python against Sympy's isprime function. I also tried Rust, but \1 is not supported in Rust's regular expression.

    • @bertilow
      @bertilow Год назад

      Not supported?? So Rust is not a usable language then. I'm shocked.

  • @MikkoRantalainen
    @MikkoRantalainen Год назад

    7:49 I fail to understand what do you mean by "11111 is a prime number" when 11111 can be factored to 41 times 271.

    • @darrennew8211
      @darrennew8211 Год назад +1

      It's usually useful to read some of the comments that were already there before asking something like this, as they may have already answered your question several times over.

    • @Patrick_Bard
      @Patrick_Bard 11 месяцев назад +1

      Yeah, he fails to explain early that he is not referring to the number "11111" but the number with that amount of digits, in that cause 5. Also know as unary system.

  • @phizc
    @phizc Год назад

    Using "1" was an incredibly annoying character. I spent 90% of the video thinking it was "1" = one, "11" = eleven and so on. I even tried to make it fit with binary 😮‍💨.
    I know it was in the article, so not your fault.

  • @robslaney3729
    @robslaney3729 Год назад +4

    If you have a problem that can be solved using Regular Expressions, now you have 2 problems.

  • @samtux762
    @samtux762 Год назад +1

    > 111 is a prime
    Nope. 3*37 =111

  • @mepipe7705
    @mepipe7705 Год назад

    i feel clickbaited, because the regex does not identify prime numbers in general but only a prime count of digits...

  • @agamkohli3888
    @agamkohli3888 Год назад

    This is not a regular expression since you use the backslash

  • @TheDmviper
    @TheDmviper Год назад +3

    I think it's important to mention that regular expressions have some very specific rules that "regex" break.
    A regular expression always matches an entire string.
    A regular expression (RE) is:
    1) a single character or the empty string
    2) the concatenation of two REs
    3) the alternation of two REs (one or the other)
    4) 0 or more copies of an RE (known as kleene star)
    If you can't build the RE from these primitives it's not a RE but something else.
    The benefit is that real REs always run in O(n) time, which is why we care to make the distinction at all

  • @fflecker
    @fflecker Год назад

    I don't understand how you could test if 4,6,8 ... are a prime number. If I would write such code nobody would give me a job anymore.
    There are either easier ways to test prime numbers.

  • @simonwillover4175
    @simonwillover4175 Год назад

    /^(11+)(11+)+$/ is simpler, but a litttle bit slower due to the greediness.

  • @jeffvenancius
    @jeffvenancius Год назад

    Really cool, but I'm imagining it doesn't scale really well with big numbers?

    • @jeffvenancius
      @jeffvenancius Год назад

      Oh.... But if you need to convert in unary you could also do the conversion from binary(?) With some tweaks, of course

  • @michaelbauers8800
    @michaelbauers8800 Год назад

    head...hurts, lol. But very cool