R Tutorial | Regular Expressions in R

  • Published: Jan 25, 2025
  • Today we talked about regular expressions: what they are and how they are useful. We used the stringr package to do this, but the ideas are the same in base R and are similar in other languages including Python, JavaScript, etc. Leave your questions in the comments!
    All code available here: github.com/col...

Comments • 63

  • @zhiyilan3606 1 year ago

    This is absolutely the best intro to Regular Expressions in R video on YouTube

    • @colinquirkDS 1 year ago

      Thanks for the kind words, glad it was helpful!

  • @ricaulcastellon9615 3 years ago +1

    I can only say you are the best. Even an old man like myself (61) can now understand the regex basics. I am an old engineer trying to catch up with the new generation.

    • @colinquirkDS 3 years ago +1

      Thanks for the kind comment, glad you enjoyed the video!

  • @ryanlargo7603 1 year ago

    I have a midterm today and you explained the * symbol really well and understandably. Thank you!!

  • @talorix 3 years ago

    where regular expressions really shine ✨😂 bright like a 💎
    Joke aside, great video 🙌

  • @rmwatson1372 3 years ago +2

    Thank you - that was outstanding. Best explanation of regex I have seen. The way you "built it up" from basically nothing to the phone extractor was a great explanation.

  • @lavkushsingh3252 3 years ago +1

    This is the first video I came across which actually explained the regular expressions the way I wanted to learn, thanks a ton buddy! lifesaver video ^_^

    • @colinquirkDS 3 years ago +1

      Glad to hear! Thanks for the nice comment

  • @gaurisaran6610 3 years ago +2

    This is a wonderfully explained tutorial. Well paced and well explained. Thank you!

  • @ahmed007Jaber 2 years ago

    Very well done, Colin. Comprehensive and fun tutorial

  • @iangreener3137 3 years ago +1

    Thanks Colin. It's great to see the regex gradually being built up. Really helpful

  • @chapisql 4 years ago +2

    Fantastic video, Colin! I literally had no clue about regex one hour ago, now I sort of have a basic understanding on how to use these tools. It would be awesome to do a follow up video applying these tools in a more complex setting. Thanks for your work!

    • @colinquirkDS 4 years ago +1

      Thanks for watching! Hoping to start these back up again soon

  • 3 years ago +1

    Thanks! This is really useful and well explained.

  • @eddie141987 3 years ago +1

    Nice video, Colin. You explain things very clearly. Keep up the great work

  • @abdullaahmed5536 3 years ago

    Thanks, GREAT tutorial on regular expressions.

  • @arvinflores5316 3 years ago

    Thank you!!!!! I used this vid as a supplement while reading the stringr chapter in the "R for Data Science" (tidyverse) book, and it really helped me

    • @colinquirkDS 3 years ago

      It's a great book! Highly recommend it for anyone learning R.

  • @bhabishyaneupane2073 4 years ago

    I think we need more of this! Absolutely distilled it down to the basics. Thank you so much! Again, we need more of these tutorials haha :)

    • @colinquirkDS 4 years ago +2

      I would love to do more when I have a bit more time. Thanks for the support!

  • @NoahChubb 3 years ago

    This is fantastic. The only question I have after this is taking it further to the sentence level of strings. So, applying str_match_all() on a sentence to extract strings that contain part of a string, and limit the extraction to the word level in a text mining approach. A demonstration of this would be useful. I plan on using tokens() to make this simpler for me and the data I'm currently working with, but I'd enjoy a follow up. Great video, following your channel for more.

    • @colinquirkDS 3 years ago +1

      Thanks, not a bad idea. Hopefully when I have some more time I could do a scraping demo, that would be very interesting!

  • @2A9D8F 4 years ago +2

    Thank you for existing

  • @robertreid64 3 years ago

    Many thanks, Colin. Excellent tutorial

  • @transportation-talk 4 years ago +1

    Thanks for this great tutorial. Please keep doing this. First 20 minutes in, and I really like that you talked about a potential error. Also, one question: how are you jumping to numbers in the strings in RStudio (looking for the keyboard shortcut you're using)?

    • @colinquirkDS 4 years ago +2

      Thanks! I wish I was that good at keyboard shortcuts, for some reason OBS doesn't want to capture my cursor. It's pretty confusing, so I'll try to fix it for next stream.

  • @SamirNeg 2 years ago

    Great tutorial, thanks a lot!

  • @yunes7305 3 years ago

    Great tutorial. You are gifted 😄

  • @oussamakad4988 3 years ago

    Thank you for a great tutorial

  • @kylebrennan44 2 years ago +1

    Thanks! What about if you had characters instead of numbers?

    • @colinquirkDS 2 years ago +1

      I highly recommend going through the tutorials here if you still have some confusion:
      regexone.com/

    • @kylebrennan44 2 years ago

      @colinquirkDS Hey Colin, this is a shot in the dark. I have been trying to extract the following pattern into a separate column for the attribute major. There are some repeating major strings, and I can't figure out how to set up the regex to also extract the characters along with the curly braces { }. For some reason my pattern also pulls minor. Any enlightenment would be much appreciated.
      test %>%
      mutate(major = str_extract_all(test$lith, "[major].*[{](\\D[a-z]*)[}]") %>%
      map_chr(toString))
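One likely reason the pattern above also pulls minor: square brackets define a character class, so "[major]" matches a single character out of m, a, j, o, r rather than the word "major". A minimal base R sketch of that idea follows. The sample string is hypothetical, since the real contents of test$lith are not shown in the thread, and the lowercase-only token inside the braces is an assumption.

```r
# Hypothetical sample; the actual format of test$lith is not shown in the thread.
lith <- "major{sand} minor{clay} major{silt}"

# "[major]" is a character class: ONE character out of m, a, j, o, r.
# That is why a pattern starting with it can begin matching inside "minor".
class_matches_minor <- grepl("[major]", "minor")

# Match the literal word "major" followed by a braced lowercase token.
with_braces <- regmatches(lith, gregexpr("major\\{[a-z]+\\}", lith))[[1]]

# Keep only the text inside the braces, using PCRE lookarounds (perl = TRUE).
inside_only <- regmatches(
  lith,
  gregexpr("(?<=major\\{)[a-z]+(?=\\})", lith, perl = TRUE)
)[[1]]

# Collapse repeated matches into one string, as toString does in the question.
major <- toString(inside_only)
```

The same patterns should carry over to stringr's str_extract_all(), which returns a list with one character vector per input string.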

  • @geraldssenoga6522 4 years ago +1

    This has been so helpful, thanks a lot

  • @SpiritualMeP 3 years ago +1

    Amazing!

  • @ericaltf4 3 years ago

    Amazing. Get this man some more views.

  • @SergioAGottretRios 2 years ago +2

    Thank you very much Colin, great tutorial.
    Just one question: after I successfully found some strings with the regex expressions, how could I extend the expression to include the following 3 or 4 words?
    I've got the expression LEY\\sN°\\s(\\d{3,4})\\sDE\\s(\\d[1,2])\\sDE\\s(\\w{4,11})\\sde\\s(\\d{4}), which matches LEY N° 2371 DE 22 DE MAYO DE 2002, but then a name follows (which consists of 3 or 4 words).
    Thanks in advance for your time, keep helping people

    • @colinquirkDS 2 years ago +1

      If you check out regex101.com or any other similar site, you can play around more deeply, but something like this might work for you?
      ^(\w+\s){2,3}\w+$
      Read as "find at least one word character followed by a space 2 or 3 times, and then find at least one more word character"
      You will have to work this into your full regex of course but that is the first thing that comes to mind. Good luck!
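As a rough illustration of how the suggested pattern behaves, here is a small base R sketch. The sample names are hypothetical, and the second part (dropping the leading ^ so the pattern can follow a date) is only one possible way to work it into a longer regex, not the commenter's final solution.

```r
# Suggested pattern from the reply, with backslashes doubled for an R string.
name_pattern <- "^(\\w+\\s){2,3}\\w+$"

# Hypothetical inputs with 2, 3, 4, and 5 words.
candidates <- c("TWO WORDS", "ONE TWO THREE", "ONE TWO THREE FOUR",
                "ONE TWO THREE FOUR FIVE")

# Only the 3-word and 4-word strings satisfy the pattern.
is_name <- grepl(name_pattern, candidates, perl = TRUE)

# Embedding it after a date-like prefix: drop the leading "^" and wrap the
# name in a capture group. The text after the year is a made-up example.
text <- "DE 22 DE MAYO DE 2002 CODIGO TRIBUTARIO BOLIVIANO"
name <- sub(".*DE\\s\\d{4}\\s((\\w+\\s){2,3}\\w+)$", "\\1", text, perl = TRUE)
```

Keeping the trailing $ assumes the name runs to the end of the string; if more text follows the name, that anchor would need to be replaced with whatever delimits it.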

    • @SergioAGottretRios 2 years ago +1

      @colinquirkDS Thanks a lot! I will try it later, but so far it seems like what I need!

  • @geoforce1436 3 years ago

    Really helpful, thanks mate!

  • @arielleking6316 1 year ago

    Question: Great video! I need to extract "Math & Science" from a column. When I try, I get an "unused argument" error. There are words in front of and behind "Math & Science". I tried ",*Math & Sciene.*" but I received an error for that too.

    • @colinquirkDS 1 year ago

      Can you put your entire line of code in a comment?

    • @arielleking6316 1 year ago

      @colinquirkDS str_match(kw_06$testdiv, ".*Mathematical & Physical.*")
      I was able to extract the values; now I need to add a separator after this pattern to split the column.

  • @mutukumakau9200 2 years ago

    Thanks, Colin. How would one go about splitting the following after 2 decimal places? For example, 18.00-1.10 would split into 18.00 and -1.10; another example, 400.000.00, would split into 400.00 and 00.00.

    • @colinquirkDS 2 years ago

      Something like this should work for you:
      (.*?\..{2})(.*)
      Play around with it in a regex tester, but you can read it as "for the first group, match anything up until the first decimal, then get the next two characters. For the second group, get everything else."
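A quick way to try the suggested pattern on the two examples from the question is sketched below in base R. The variable names are illustrative only, and perl = TRUE is used so the lazy quantifier .*? is handled by the PCRE engine.

```r
# The two example values from the question.
values <- c("18.00-1.10", "400.000.00")

# Suggested pattern, anchored, with backslashes doubled for an R string.
split_pattern <- "^(.*?\\..{2})(.*)$"

# Group 1: everything up to the first decimal point plus two more characters.
first_part <- sub(split_pattern, "\\1", values, perl = TRUE)

# Group 2: whatever is left over.
second_part <- sub(split_pattern, "\\2", values, perl = TRUE)
```

For 400.000.00 the remainder comes out as 0.00, because the first group consumes only two characters after the first decimal point.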

  • @sacumut 4 years ago +1

    very helpful. thank you so much

  • @manoharnookala4212 4 years ago +1

    Can you please share the code on GitHub and put the link in the description?

    • @colinquirkDS 4 years ago

      Done!
      github.com/colinquirk/LivestreamCode/blob/master/2020-08-12/stringr.Rmd

  • @forrestoakley4882 2 years ago

    Thank you so much!!

  • @hen3vz 1 year ago

    Thanks for this

  • @siamaksiamak5583 3 years ago

    Excellent tutorial, you should be a teacher

  • @cairebrittobarletta 3 years ago

    awesome!

  • @krushnachChandra 9 months ago +1

    new sub

  • @abdulrahmanalghamdi3595 2 years ago

    I LOVE YOU BRO

  • @alphonceassenga9247 3 years ago

    Thank you

    • @alphonceassenga9247 3 years ago

      Hi Colin, if you don't mind, please share your email address; I need to contact you. My email is aassenga@ihi.or.tz. Thanks

  • @haraldurkarlsson1147 1 year ago

    You have to be careful naming your variables. letters already exists in base R as the built-in vector of lowercase letters of the English alphabet.
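To make the point about letters concrete, here is a minimal base R sketch of how a user-defined variable can mask the built-in constant. The replacement values are arbitrary examples and are not taken from the video.

```r
# Built-in constant: the 26 lowercase letters.
builtin_first_three <- letters[1:3]

# Assigning to the same name creates a new variable that masks the built-in
# one wherever this assignment is visible.
letters <- c("x", "y")
masked_length <- length(letters)

# The original is still reachable through the base namespace.
still_available <- base::letters[1:3]

# Removing the user-defined variable makes the built-in visible again.
rm(letters)
restored_length <- length(letters)
```

Choosing a different name, such as my_letters, avoids the masking altogether.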

  • @iamericfletcher4506 4 years ago +1

    Terrific content. Hope you don't mind if I add your channel to my Awesome R Learning Resources list on GitHub? If you'd like to contribute any resources of your own, please open a pull request! We would love to have your input.
    github.com/iamericfletcher/r-learning-resources

    • @colinquirkDS 4 years ago

      Glad you like it! Please do share it around!