Regular Expressions (Regex): All the Basics

  • Published: 24 Nov 2024
  • I go over how to get a lot out of just the fundamentals of regular expressions (regexes). We cover all the basics, but there is an even bigger world out there of possibilities I might cover in coming videos.
    WEBSITE: lukesmith.xyz 🌐❓🔎
    DONATE: lukesmith.xyz/... 💰😎👌💯
    OR affiliate links to things I use:
    www.epik.com/?... Get a cheap and reliable domain name with Epik.
    www.vultr.com/... Get a VPS and host a website or server for anything else.
    brave.com/luk005 Get the Brave browser.
    lbry.tv/$/invi... View my videos on LBRY.
    www.coinbase.c... Get crypto-rich on Coinbase.

Comments • 219

  • @robertoszek
    @robertoszek 4 years ago +273

    I feel like Luke's head is always cut off in his webcam view because his humongous chad brain wouldn't fit even if he went full screen.

    • @sch8836
      @sch8836 4 years ago +11

      It's a chest cam

    • @lovely-shrubbery8578
      @lovely-shrubbery8578 4 years ago +7

      @@sch8836 that's hot

    • @yoyojuninho6130
      @yoyojuninho6130 4 years ago +2

      He does that for the same reason he keeps wearing the sunglasses over his head.

    • @hershmysson
      @hershmysson 3 years ago

      @@yoyojuninho6130 he feels naked without them

    • @Jooohn64
      @Jooohn64 2 years ago

      Vegapunk ???

  • @greyman1104
    @greyman1104 4 years ago +169

    "Crucifixion, well that's a nice thing as well." - Luke Smith, 2020

    • @internetfriendsimulation9156
      @internetfriendsimulation9156 4 years ago +8

      Luke confirmed for Longinus. Why else would he be so fluent in biblical languages?

    • @horndog2224
      @horndog2224 4 years ago +4

      @@internetfriendsimulation9156 hes redpilled

  •  4 years ago +156

    2:06 grep stands for "global regular expression print". It is an ed command: g/re/p. Ed is the standard text editor.

    • @Jack-hd3ov
      @Jack-hd3ov 4 years ago +3

      was about to say this, +1

    • @musicamonarchy3062
      @musicamonarchy3062 4 years ago +18

      Ed is the standard text editor.

    • @francescominnocci
      @francescominnocci 4 years ago +2

      @@Jack-hd3ov same

    • @vN2w3Z59BM
      @vN2w3Z59BM 4 years ago +1

      ruclips.net/video/NTfOnGZUZDk/видео.html that's just the backronym

    • @reralt
      @reralt 4 years ago +4

      Brian Kernighan himself said it came from g/re/p.
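
For anyone curious what g/re/p boils down to: globally, for every line that matches the regular expression, print it. Here is a rough Python sketch of that idea. The function name `g_re_p` and the sample lines are made up for illustration; this is not how ed or grep is implemented.

```python
import re


def g_re_p(pattern, lines):
    """Return every line that contains a match for pattern, in the spirit of ed's g/re/p."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Hypothetical sample input.
sample = ["ed is the standard text editor", "vim", "sed and awk"]
matches = g_re_p(r"ed", sample)
for line in matches:
    print(line)
```

The match is a substring search, so "sed and awk" is selected too, just as grep would select it.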

  • @pmester228
    @pmester228 4 years ago +85

    Finally, I needed a regex tutorial.

    • @sk8sbest
      @sk8sbest 4 years ago +3

      Sure u did

  • @mastercontrol5000
    @mastercontrol5000 4 years ago +23

    16:30 You can also use [A-z] to match any upper or lower case letter, because uppercase comes before lowercase in the ASCII table. That syntax actually means "match any character in the ASCII table between these two characters", so it also matches the six punctuation characters that sit between Z and a: [ \ ] ^ _ `
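
A quick way to see what a code-point range like [A-z] really covers is to list the characters between Z and a. The sketch below uses Python's `re` module, which compares code points; grep may behave differently depending on locale, so treat this as an illustration only.

```python
import re

# Characters whose code points fall strictly between 'Z' and 'a'.
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]
print(between)

# [A-z] accepts each of these punctuation characters as well as the letters.
extra = [ch for ch in between if re.fullmatch(r"[A-z]", ch)]
print(extra == between)

# The explicit form [A-Za-z] does not.
print(re.fullmatch(r"[A-Za-z]", "_"))
```

If only letters are wanted, spelling out both ranges is the safer choice.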

  • @bahathir_
    @bahathir_ 4 years ago +61

    I use grep to cheat at a text-based game called 'hangman'.
    I use GNU grep's "-w" option for whole-word matching.
    Example:
    $ grep -w 'v.ir.s' /path/to/dictionary/file
    Thank you.
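
The same hangman trick can be sketched in Python, assuming a word list with one word per entry. The helper name `hangman_candidates` and the word list are invented for the example; `fullmatch` plays roughly the role that `-w` plays on a one-word-per-line dictionary file.

```python
import re


def hangman_candidates(pattern, words):
    """Return the words that match pattern from start to end."""
    regex = re.compile(pattern)
    return [w for w in words if regex.fullmatch(w)]


# Hypothetical word list; each '.' in the pattern stands for one unknown letter.
words = ["virus", "vairus", "visirs", "antivirus", "vxirys"]
candidates = hangman_candidates(r"v.ir.s", words)
print(candidates)
```

Note that the pattern has six positions, so the five-letter "virus" is not a candidate.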

  • @simozonelayer
    @simozonelayer 4 years ago +37

    "I want a period" - Luke Smith, 2020

  • @migtrewornan8085
    @migtrewornan8085 4 years ago +86

    "\+" is nothing to do with the shell - it's because grep uses Basic Regular Expressions which doesn't include "+" as a metacharacter. If you use egrep (which uses Extended Regular Expressions) you won't need the "\".

    • @minhajsixbyte
      @minhajsixbyte 4 years ago +1

      @@juxuanu egrep is a thing. search it.

    • @minhajsixbyte
      @minhajsixbyte 4 years ago +7

      @@juxuanu No, sorry, misunderstanding, i thought you said "as far as i know its grep -E, not egrep"

    • @sk8sbest
      @sk8sbest 4 years ago

      @@minhajsixbyte egrep is deprecated

    • @minhajsixbyte
      @minhajsixbyte 4 years ago +1

      @@sk8sbest oh. i know very little about these things actually.
      but why is this deprecated btw?

    • @minhajsixbyte
      @minhajsixbyte 4 years ago +1

      @@sk8sbest oh thanks I have searched and got the answer!

  • @Griffin_door
    @Griffin_door 4 years ago +21

    Your channel has taught me more useful knowledge than college did

    • @drwoot
      @drwoot 4 months ago

      You must have gone to a terrible college.

  • @jonathanwarner1844
    @jonathanwarner1844 4 years ago +9

    The problem for me with regular expressions is the learning curve with using them efficiently, and since I only need to use them infrequently, I never get familiar enough with their use, to use them to their best advantage. If I were using them all the time, I would not have to keep starting from scratch, learning how to use them.

  • @ruhnet
    @ruhnet 4 years ago +7

    Great video! BTW “asdf./“ is actually a valid URL since all domain names technically end in a period after the TLD. Most software infers the period if it isn’t specified, which is usually the case. Specifying the period at the end is actually the more correct format. So for example “google.com./search” is valid and should work in any software that accepts a URL.
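
The trailing-dot form can be checked without touching the network. The sketch below only parses the URL with Python's standard `urllib.parse`; the `https://` scheme is added here as an assumption, since the comment quotes the URL without one.

```python
from urllib.parse import urlsplit

# Parse a URL whose host is written as a fully qualified name with the root dot.
parts = urlsplit("https://google.com./search")
print(parts.hostname)
print(parts.path)

# Stripping the root dot gives the form most people type.
host_without_root_dot = parts.hostname.rstrip(".")
print(host_without_root_dot)
```

So a regex meant to recognise URLs may want to allow an optional period right after the TLD.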

  • @cj00785
    @cj00785 4 years ago +3

    I swear I haven't really understood regex until now. Thanks for sharing your knowledge

  • @hammerheadcorvette4
    @hammerheadcorvette4 4 years ago +3

    Luke could do a 10 video series on Regex and still not scratch the surface. I encourage this content !!!

  • @mansourq6512
    @mansourq6512 4 years ago +2

    Words can’t describe how much I’m grateful for what you did

  • @davidbanhos7308
    @davidbanhos7308 4 years ago +6

    15 years working as a developer, at last someone made regular expression easy! I finally understood! Thanks Luke!

  • @kirk0831
    @kirk0831 4 years ago +7

    I really like the way you explain Linux stuff!!

  • @eporeon
    @eporeon 4 years ago +2

    luke kept his knee in the face cam the entire time what a power move

  • @TheZMDX
    @TheZMDX 4 years ago +2

    To be honest this is one of the best videos on your channel Luke. Short and informative, thank you.

  • @magetaaaaaa
    @magetaaaaaa 4 years ago +1

    Regular expressions is one of those things I need once in a while, wind up spending a bunch of time creating something that looks like I hit my head on the keyboard, then forget how I did it months later.
    It can be simple, but it can also start to get long and tiresome if the requirement is more complicated, like matching any valid IP address
    "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

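The long IP pattern above repeats one octet sub-pattern four times, so it can be assembled from a single piece. This is a sketch using Python's `re`; the names `OCTET`, `IPV4` and `is_ipv4` are made up here, and `fullmatch` is used because the unanchored pattern would also accept an address buried inside a longer string.

```python
import re

# One octet, copied from the pattern in the comment above.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Four octets separated by literal dots.
IPV4 = re.compile(rf"{OCTET}(\.{OCTET}){{3}}")


def is_ipv4(text):
    """Return True if the whole string is a dotted quad with octets in 0-255."""
    return IPV4.fullmatch(text) is not None


print(is_ipv4("192.168.0.1"))
print(is_ipv4("256.1.1.1"))
```

Building the pattern from a named piece does not make the regex shorter for the engine, but it is easier to read months later.
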
  • @NikoxD93
    @NikoxD93 4 years ago +17

    7:32 listen carefully when he says "spool", he makes the perfect old Minecraft fall damage sound! Wizardry

    • @yessfan
      @yessfan 3 months ago

      What the frick guys

  • @wepranaga
    @wepranaga 4 years ago +2

    thank you so much, this is so useful and in depth. it gets you basically 90% of the way there
    I literally sat through the whole thing effortlessly. great video

  • @pward17
    @pward17 4 years ago

    No bloat on these pups. I like that.

  • @JurajOravecSGOrava
    @JurajOravecSGOrava 4 years ago +1

    Luke shows us magic.

  • @FuzanToko
    @FuzanToko 4 years ago

    Thanks for sharing

  • @parthhanda6828
    @parthhanda6828 4 years ago +1

    I was recently thinking about starting to learn regex and this was really helpful as an introduction. Thank you.

  • @CrawlingChaos666
    @CrawlingChaos666 1 year ago +1

    The Word of Luke have Power...

  • @RishabhRD
    @RishabhRD 4 years ago +17

    Vim diesel

  • @andreyseliverstov3134
    @andreyseliverstov3134 4 years ago +3

    Suggestion for the future video: full-text instant search in a local 2 TB archive of textbooks and articles (PDF + DJVU). Using regex, of course.

  • @sayanghosh6996
    @sayanghosh6996 4 years ago

    I was literally looking up several articles about regex today! This is perfect timing.

  • @GeneralHazerd
    @GeneralHazerd 4 years ago

    @Luke Smith keep up the good work 👍

  • @apoorv9492
    @apoorv9492 4 years ago +1

    thank you for video, i recently have been trying out DWM with dwmblocks, and wanting to write scripts for the statusbar, this was very helpful. im finding shell scripting very interesting and fun. i sure want to learn more. hope there's a follow up advanced video.

  • @kelvinmuli5420
    @kelvinmuli5420 4 years ago

    Love regex so much..

  • @rochelouis2494
    @rochelouis2494 1 year ago

    Super, super, thanks for sharing this video with us. Very illustrative.

  • @hemanthakumar.h.n.4382
    @hemanthakumar.h.n.4382 2 years ago

    Very useful :) thank you!

  • @humanbeing_
    @humanbeing_ 3 years ago

    Thank you for this Luke! This is a HUGE help for me 👍

  • @haythamadnan3465
    @haythamadnan3465 1 year ago

    Thank you Luke for this wonderful video.

  • @donihalim3239
    @donihalim3239 4 years ago

    Thank you Luke!

  • @PROGamersf36
    @PROGamersf36 3 years ago

    Thank you ! It is a great lesson!!!

  • @RiteshKumar-lh1xn
    @RiteshKumar-lh1xn 4 years ago

    Thank you very much for this. Please make advanced video on this too if you get time.

  • @FyahBurn95
    @FyahBurn95 4 years ago +1

    Good introduction with good examples! To add something useful: don't you find it weird that you need to escape '+' but not '*'? After all, '*' is a (famous) shell expansion character, but '+' is not. Also, if the shell were the cause, it would be solved by the double quotes you use, which prevent most expansion except the dollar sign (for variables), or by switching to single quotes, which don't allow any shell processing. What is actually happening is that grep uses basic regular expressions by default, and the plus sign comes from extended regular expressions, which grep understands only if it is escaped. To use extended regexps without escaping their metacharacters, try egrep or grep -E.

    • @ganainm01
      @ganainm01 4 years ago

      The "+" was not part of the original set of special characters; in fact, "a\+" (or "a+" with EREs) is just an alternative way to write "aa*".
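
The equivalence of "one or more" and "one, then zero or more" is easy to check by hand. The sketch below uses Python's `re`, where `+` is a metacharacter without a backslash (as in extended regular expressions); the sample strings are arbitrary.

```python
import re

# Arbitrary sample strings, including the empty string.
samples = ["", "a", "aaa", "b", "baab"]

# Every match found by a+ and by aa* in each sample.
plus = [re.findall(r"a+", s) for s in samples]
star = [re.findall(r"aa*", s) for s in samples]

print(plus)
print(plus == star)
```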

  • @leafknot
    @leafknot 4 years ago

    Just in time, I needed this for writing my first script, thanks Luke!

  • @jcc1199
    @jcc1199 4 years ago +1

    now that we've found all instances of a certain thing in a text file, can you make a video on deleting, moving, replacing, etc. -- putting to use the output we've gotten here?
    thanks a lot. chad-like teaching content as usual

  • @Pabloparsil
    @Pabloparsil 4 years ago

    I think newbies out there would also like to know that \n and \t also have special meanings: end of line and tab respectively. I'm not sure if that works with grep but it does work with other tools like python.
    I use them all the time
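
Here is what those two escapes look like in Python's `re`, splitting a small tab-separated text into rows and fields. The sample text is invented for the example.

```python
import re

# Two tab-separated rows, each ended by a newline.
text = "name\tvalue\nfoo\t42\n"

# Split on newlines first, then split every non-empty line on tabs.
rows = [re.split(r"\t", line) for line in re.split(r"\n", text) if line]
print(rows)
```

The raw strings pass the backslash through to the regex engine, which then interprets `\n` and `\t` itself.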

  • @mightsnipe
    @mightsnipe 4 years ago

    Thanks Luke, super helpful! Cheers

  • @simozonelayer
    @simozonelayer 4 years ago

    Really helpful.

  • @abrokenlink_
    @abrokenlink_ 4 years ago +6

    Yo I have no idea what I’m watching but it sounds cool.

  • @JasonDeBoltGoogle
    @JasonDeBoltGoogle 4 years ago +1

    Luke, have you considered creating videos on cloud? You appear to be a systems thinker and I’d bet that you can teach cloud technologies pretty well.

  • @xtdycxtfuv9353
    @xtdycxtfuv9353 4 years ago

    this was a useful video. ty luke.
    I have to admit the oomer convention made me laugh and was very meta

  • @patrickprucha5522
    @patrickprucha5522 1 year ago

    Thanks Luke. Good video!

  • @deagle-zlt
    @deagle-zlt 4 years ago

    Thank you! You explained everything well. Good video.

  • @TheGaridi2
    @TheGaridi2 4 years ago

    I was implementing search functionality in my web app and regexp is essential. Thanks for this crash course.

  • @evannoynaert
    @evannoynaert 4 years ago +2

    Is there a reason you were using double quotes instead of single quotes? I would be more inclined to use single quotes with grep and egrep to avoid accidental expansion problems. Generally I only use doubles when I know that I really want expansion. Here is an example to demonstrate the difference. Add two lines to your rt file: The first is the sentence "Navigate to your $HOME directory." Then add the absolute address of your home directory to the file. You could do this with the command "echo $HOME >> rt"
    Now the results of grep "$HOME" rt and grep '$HOME' rt give very different results.
    Also, I tend to use either egrep or grep -E instead of just plain grep. This is in part because I cut my teeth on Regular Expressions in Perl, and egrep is closer to the Perl that I learned first. It is also considerably more powerful.

  • @antadhg
    @antadhg 10 months ago

    instead of [0-9] for all digits you can use \d, similarly for any non-digits you can use \D
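
A small sketch of `\d` and `\D` in Python's `re`; the sample text is made up. One caveat worth knowing: for Python 3 strings `\d` also matches non-ASCII digits unless `re.ASCII` is passed, so it is not always identical to [0-9]. As far as I know, plain grep does not understand `\d` at all, while GNU grep accepts it with -P.

```python
import re

text = "Call 555-0199 before 17:30"

# Runs of digits, and the text with every digit removed.
digits = re.findall(r"\d+", text)
non_digits = re.sub(r"\d", "", text)
print(digits)
print(non_digits)

# U+0663 is the Arabic-Indic digit three: matched by \d, but not under re.ASCII.
unicode_digit = re.fullmatch(r"\d", "\u0663") is not None
ascii_only = re.fullmatch(r"\d", "\u0663", re.ASCII) is not None
print(unicode_digit, ascii_only)
```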

  • @veryown8084
    @veryown8084 4 years ago

    5:04 yeah it's from the regular expressions of formal languages (more specifically regular languages, which are equivalent to L3 languages, those that can be generated from a right-linear grammar), where x* means {x}*, so basically (x^0, x^1, x^2, ...). x^0 is epsilon or lambda (also known as the empty word, a word of 0 letters), and x^1=x, x^2=xx, x^3=xxx and so on.

    • @veryown8084
      @veryown8084 4 years ago

      5:28 {x}^+ is basically the same as {x}^* but without the epsilon/lambda, so it's (x^1, x^2, x^3...). As you can see, this has something to do with math, more specifically with monoids. More info here: www.ncbi.nlm.nih.gov/pmc/articles/PMC3367686/ en.wikipedia.org/wiki/Regular_grammar
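
The difference between the star and the plus comes down to the empty word, which can be checked directly. This sketch enumerates x^0 through x^3 and keeps the ones each pattern accepts in full; the helper name `language` and the cut-off of 3 are arbitrary choices for the example.

```python
import re


def language(pattern, max_len=3):
    """List the strings x^0 .. x^max_len that the pattern accepts in full."""
    return ["x" * n for n in range(max_len + 1) if re.fullmatch(pattern, "x" * n)]


star = language(r"x*")
plus = language(r"x+")
print(star)
print(plus)
```

Only the star version contains the empty string, which is the epsilon mentioned above.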

  • @osamaadil231
    @osamaadil231 4 years ago +14

    Will it help me find a soulmate?

  • @Viken43
    @Viken43 4 years ago

    Luke Smith has saved Linux overnight, invented time travel, an alter ego and global warming.... and of course LARBS ;-)

  • @DavidJBurbridge
    @DavidJBurbridge 4 years ago

    Me thinking yesterday: I should get around to learning regex for better grep/sed/awk etc
    Then you put this up. Cheers uncle Luke

  • @reverseila4363
    @reverseila4363 4 years ago

    Tnx for regx!
    Could you make a video on git and show us how you use it on your daily basis?

  • @zacharycarbon4312
    @zacharycarbon4312 4 years ago

    I've yet to see a more digestible explanation of regex. And I have seen many.

  • @BarraIhsan
    @BarraIhsan 3 years ago

    Or you can use \w to match a word character (a letter, digit, or underscore) and \W to match anything else,
    and \d to match a digit and \D to match a non-digit.
    CMIIW
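
To see what `\w` and `\W` actually pick up, here is a short Python sketch with a made-up sample string. Note that the digit and the underscore in "user_1" count as word characters.

```python
import re

text = "user_1: hello, world!"

# Runs of word characters, and the runs of everything else between them.
words = re.findall(r"\w+", text)
separators = re.findall(r"\W+", text)
print(words)
print(separators)
```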

  • @NostraDavid2
    @NostraDavid2 4 years ago

    What I miss in this video: What to *actually* use regex for.
    Here's what I've used regex for:
    Refactoring code - I had some functions in JS that I wanted to turn into lambdas.
    Replacing HTML in several files - The files were partially identical and I wanted to replace something in the header elements.
    Find non-ascii letters - I copied Haskell code from a book and it didn't compile, because the book used unicode quotes for the comments and GHC broke on those quotes.
    Find the nth comma in a CSV file - I wanted to remove everything after the 3rd comma or something like that, because I didn't need that data - the file was too large to open in Excel.
    At the end of the day regex is a tool to serve a purpose. Don't learn it for the heck of it. Learn it because you can use it to solve problems.

    • @der0keks
      @der0keks 3 years ago

      wouldn't cut be better for the CSV thing? good tip for finding non-ascii though, that will come in handy.

    • @NostraDavid2
      @NostraDavid2 3 years ago +1

      ​@@der0keks Huh, didn't know the cut command was a thing! I'm more of a Windows guy, so I'm woefully behind on my terminal knowledge.
      Happy to hear someone has found something useful! The regex was [^\x00-\x7F] BTW, which is basically searching for any NOT ASCII char.
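
Two of the uses from this thread, sketched in Python. `find_non_ascii` uses the [^\x00-\x7F] pattern quoted above; `keep_first_fields` is a naive stand-in for the CSV case that keeps the first n comma-separated fields and ignores quoting. Both function names are invented for the example, and for real CSV work `cut` or a CSV parser is likely the better tool.

```python
import re

# Any character outside the ASCII range, as quoted in the comment above.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def find_non_ascii(line):
    """Return (index, character) pairs for every non-ASCII character in line."""
    return [(m.start(), m.group()) for m in NON_ASCII.finditer(line)]


def keep_first_fields(line, n=3):
    """Keep the first n comma-separated fields (naive: ignores quoted commas)."""
    return re.sub(rf"^((?:[^,]*,){{{n - 1}}}[^,]*).*$", r"\1", line)


print(find_non_ascii("-- \u201cquoted\u201d comment"))
print(keep_first_fields("a,b,c,d,e"))
```

A line with fewer than n fields is returned unchanged, because the pattern does not match it.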

  • @kyu9649
    @kyu9649 4 years ago

    Good video. Maybe make a part 2 where you go over grouping and such, as it is also quite important :P

  • @oleksiynehlyadyuk8123
    @oleksiynehlyadyuk8123 4 years ago

    wow, now that's a regex tutorial.. can't wait for deeper examples

  • @arissk_
    @arissk_ 4 years ago

    Nice work. I like your videos on general command line tools. Can I suggest a presentation on sed or awk? I know these tools may require longer videos but I'm sure you can manage it.

  • @mersno
    @mersno 4 years ago

    I consider myself decent at bash but your videos always provide value, thanks boomer

  • @escravovoluntario6698
    @escravovoluntario6698 4 years ago

    Regex is awesome.

  • @DigitalMetal
    @DigitalMetal 4 years ago +2

    "[a-Z]" should work the same as "[A-Za-z]".
    Although there might be a difference I'm unaware of.

    • @FyahBurn95
      @FyahBurn95 4 years ago

      It matches by the actual code of the characters, and there are characters between z and A, so it would match all the letters plus other stuff.

    • @elandje
      @elandje 4 years ago +1

      It will probably behave in an unexpected way, because the upper case letters come BEFORE the lower case letters in ASCII...

    • @FyahBurn95
      @FyahBurn95 4 years ago +1

      @@elandje I was going to answer this, but if you try grep with [a-Z] it works, but if you do it with [A-z] it says "invalid range end". What??

    • @elandje
      @elandje 4 years ago

      @@FyahBurn95 It is like I said, 'Z' comes before 'a' therefore it is invalid. [A-z] works, but includes a few non-letters, like [, \, ], ^, _ and `. Search online for ASCII table.

    • @FyahBurn95
      @FyahBurn95 4 years ago

      @@elandje Have you tried it? [A-z] does not work with grep for me, but [a-Z] does. I know what the ASCII table is and the fact that A-Z comes before a-z, and also that UTF-8 is an extension of ASCII, which is what matters unless you actually work with ASCII files.
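
The disagreement in this thread may come down to how each tool orders characters. As a point of comparison only, Python's `re` orders ranges strictly by code point and rejects a range whose end comes before its start. grep's handling of ranges is reportedly locale-dependent, which could explain the differing results above, but that is an assumption and is not verified here.

```python
import re

# Code points of the range endpoints discussed in this thread.
print(ord("A"), ord("Z"), ord("a"), ord("z"))

# In Python, 'a' (97) to 'Z' (90) is a reversed range and is rejected.
try:
    re.compile(r"[a-Z]")
    reversed_range_ok = True
except re.error as exc:
    reversed_range_ok = False
    print("rejected:", exc)
```

Writing [A-Za-z] avoids the question entirely.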

  • @hp8246
    @hp8246 4 years ago

    Great video. Thanks for sharing these fundamentals of regular expressions. Do these basics work on Vim? Thanks again.

  • @zoop2174
    @zoop2174 4 years ago

    If you decide to make a more advanced tutorial definitely include lookahead and lookbehind as I use them all the time.

  • @albarshini490
    @albarshini490 4 years ago

    I was going to Email you to ask you to do this video, Thank you Luke very cool.

  • @ryukthegodofdeath8063
    @ryukthegodofdeath8063 4 years ago

    Awesome. Next awk?

  • @RaivoDoc
    @RaivoDoc 4 years ago

    This will actually land me a better job.. wow.

  • @horndog2224
    @horndog2224 4 years ago +2

    This actually is just in time for my sys admin class at uni over the summer, thanks luke!

  • @tacitus_
    @tacitus_ 4 years ago +1

    Hey Luke, sorry if this is too personal, but I noticed that you've started to display a lisp. Did you recently get Invisalign braces (or has the social isolation lowered your pronunciation level)? I had braces put on in my mid 20s and they made me a little lispy too

  • @peterjansen4826
    @peterjansen4826 4 years ago +14

    boomer, zoomer, doomer, coomer. I can't keep up anymore, I must be getting old. Apparently I am a boomer now according to the zoomers, even though I never was a boomer before.

    • @morzinbo
      @morzinbo 3 years ago +1

      boomer is a mindset as well as a generation of people that destroyed the USA

  • @phiasch
    @phiasch 4 years ago +1

    How are you quickly saving the 'note' file? I know ZZ is similar to :wq, but what's the similar command to :w? Where would I find the docs to read about commands like ZZ?

  • @BCDeshiG
    @BCDeshiG 4 years ago +2

    9:20 Finally, I can call someone every single oomer at once

  • @xtnctr
    @xtnctr 4 years ago +4

    I opted out: wife, car, children, etc over: vim, regex and linux. Never been happier.

  • @pichass9337
    @pichass9337 4 years ago

    You read my fucking mind

  • @diegoestrada35
    @diegoestrada35 4 years ago

    I found this really useful

  • @ianpan0102
    @ianpan0102 4 years ago +1

    I clicked on this video because from the thumbnail I thought Luke was naked (his T-shirt colour).

  • @steveyuhas9278
    @steveyuhas9278 1 year ago +1

    Tip for anyone like me out there:
    Watch all these videos at 0.75x.
    😉

  • @iLiokardo
    @iLiokardo 4 years ago

    to match any letter, lower case or upper, i think you can do [a-Z]. much better.

  • @auronkardek
    @auronkardek 4 years ago

    FINALLY

  • @thefantasicm_2407
    @thefantasicm_2407 4 years ago +1

    Will you be able to design a regular expression which matches exactly the words a^n.b^n (a random number of 'a' followed by the same number of 'b') ? ;-)

    • @magetaaaaaa
      @magetaaaaaa 4 years ago

      So the criteria is that it must match instances of ab where the number of a's and b's are equal? So ab or aabb would match but abb or aaaaabb would not match?

    • @thefantasicm_2407
      @thefantasicm_2407 4 years ago

      @@magetaaaaaa yes this is the criteria :-)

    • @magetaaaaaa
      @magetaaaaaa 4 years ago

      @@thefantasicm_2407 Hmmmm, I'm not a regex guru by any means but I feel like there would almost have to be something out of regex to do the analysis? I feel like trying to come up with a solution now.

    • @thefantasicm_2407
      @thefantasicm_2407 4 years ago

      @@magetaaaaaa This is a tricky question : the answer is that it is impossible to recognize this language with regular expressions, sorry :-). This is related to the Kleene theorem. Look for Pumping lemma (regular languages) to understand why.

    • @magetaaaaaa
      @magetaaaaaa 4 years ago

      @@thefantasicm_2407 Hmmmm, maybe something like this would do the trick with Python.
      import re
      with open("textfile.txt", "r") as file:
          for line in file:
              a = line.count('a')
              b = line.count('b')
              # the regex checks the shape (a's then b's); the counting is done outside regex
              if re.match("^a+b+$", line):
                  if a == b:
                      print(line, end="")

  • @spartan1o5
    @spartan1o5 4 years ago

    doing gods work! any cool projects like the corona project you had before?

  • @hofrreeze
    @hofrreeze 4 years ago

    I never expected that GG Allin would lecture me about regex someday.

  • @christbaumer
    @christbaumer 4 years ago

    15:30 TLDs actually got more complicated than that: en.wikipedia.org/wiki/Internationalized_country_code_top-level_domain

  • @musicamonarchy3062
    @musicamonarchy3062 4 years ago

    Ed is the standard text editor

  • @axpanos
    @axpanos 4 years ago

    When is the jagex tutorial coming?

  • @mrkvaccc
    @mrkvaccc 4 years ago +1

    Luke before: Searches for Jesus
    Luke now: "Crucifixion, well that's a nice thing as well"

  • @tato-chip7612
    @tato-chip7612 4 years ago +2

    luke how did your setup handle multiple languages for your linguist work?

    • @cocoapuffs5299
      @cocoapuffs5299 4 years ago

      i'm also interested in this, as i use arabic and spanish.

  • @biehdc
    @biehdc 4 years ago

    Finally a regex to human translator

  • @siborgium9022
    @siborgium9022 4 years ago

    [A-Za-z] can be reduced to just [A-z]

    • @saeedbaig4249
      @saeedbaig4249 4 years ago

      That gives me a "grep: Invalid range end" error.
      It worked though when I did [a-Z]

  • @normanpedersen5454
    @normanpedersen5454 4 years ago +4

    My life: This is a learning exercise. Who cares.

  • @DeepakRajan
    @DeepakRajan 4 years ago

    Please keep making videos

  • @elclippo4182
    @elclippo4182 4 years ago

    17:35 How to validate an email address using a regular expression?
    stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression
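
For a sense of why that question gets so much attention, here is a deliberately loose check in Python. It is not the pattern from the linked answer and it does not validate addresses properly; it only tests for the rough shape "something@something.something". The names `LOOSE_EMAIL` and `looks_like_email` are made up for the example.

```python
import re

# Loose shape check: no whitespace, exactly one @, and a dot somewhere after it.
LOOSE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(text):
    """Return True if text has the rough shape of an email address."""
    return LOOSE_EMAIL.fullmatch(text) is not None


print(looks_like_email("luke@example.com"))
print(looks_like_email("two@@example.com"))
```

Anything stricter quickly grows into the kind of pattern discussed at the link.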

  • @mgetommy
    @mgetommy 4 years ago

    epic swag

  • @victorprokop2240
    @victorprokop2240 4 years ago +2

    hey luke How can I integrate uganda.txt to my neovim i reaaally like it

  • @cpubug
    @cpubug 7 months ago

    thnx