Yes! Please, definitely make a second part. I teach in the Humanities (college literature and creative writing classes), and I'm actively searching for tools I can use for creative experiments with texts.
Check out Microsft Power Automate AI Builder
@@NickWindham hey is the second part out yet?
hey, I'm not getting the expected output at 1:57:26. It's showing KeyError: 0. Can you help me with that?
Where do you teach?
Bangers one after another. This channel is a treasure.
Clicked this video by accident but got hypnotised by the shirt and now I'm learning Python.
import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'A severe storm hit the beach. It started to rain.')

# A few ways to print the tokens, per sentence and for the whole doc:
for sent in doc.sents:
    print([sent[i] for i in range(len(sent))])
for sent in doc.sents:
    print([word for word in sent])
print([doc[i] for i in range(len(doc))])

# Check if the first word in the second sentence of the text is a pronoun
for i, sent in enumerate(doc.sents):
    if i == 1 and sent[0].pos_ == 'PRON':
        print('The second sentence begins with a pronoun.')

# Check how many sentences in the text end with a verb
# (sent[-1] is the closing period, so sent[-2] is the last word)
counter = 0
for sent in doc.sents:
    if sent[-2].pos_ == 'VERB':
        counter += 1
print(f'{counter} sentence(s) in the document end with a verb.')
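A quick note if you adapt that last loop: sent[-2] assumes every sentence ends with exactly one punctuation token. A slightly more defensive sketch (my own variation, not from the video):

# Count sentences whose last non-punctuation token is a verb.
counter = 0
for sent in doc.sents:
    words = [t for t in sent if not t.is_punct]
    if words and words[-1].pos_ == 'VERB':
        counter += 1
print(f'{counter} sentence(s) end with a verb.')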
50 minutes in and it is already the best practical explanation of how spaCy works.
I’ve been watching Dr. Mattingly’s other videos and they’re great.
Best helpline for those who really want to learn NLP with ease and for free, can't wait for part 2
I was searching for Spacy tutorials yesterday, and FCC uploaded it, thank you 💝. Interested in part 2.
Hi, can you tell me where I can find the repository for the data?
I’ve come back to this video several times. The ONLY tutorial I’ve seen which walks through the whole process. The Python Tutorials for Digital Humanities videos are also great. I am focused on biomedical text, but text is text when you are trying to get started.
I can't believe such good content is for free, thank you.
45:04
*Named Entity Recognition:*
A named entity is a real object that you can refer to by a proper name. It can be a person, organization, location, or other entity. Named entities are important in NLP because they reveal the place or organization the user is talking about.

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'I have flown to Islamabad. Now I am flying to Lahore.')
for token in doc:
    if token.ent_type != 0:  # if ent_type is not 0, the token is part of a named entity
        print(token.text, token.ent_type_)
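For comparison, spaCy also exposes the recognized entities directly on the Doc via doc.ents, which is usually more convenient than testing each token. A minimal sketch using the same doc:

# Iterate over the recognized entity spans directly.
for ent in doc.ents:
    print(ent.text, ent.label_)  # likely Islamabad GPE, Lahore GPE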
Super interesting to go more in depth. In other words, a bit of a history lesson. 💪💪👍
just finished 1/3 and I have to say, very good introduction. thanks a lot for sharing
Thank you Dr William for taking me through such a wonderful journey on NLP - it was my first time learning in this area of Python application, and I found it quite useful and am excited to do some more. Looking forward to having your part 2 soon!
Where can I find the textbook?
This video lesson was great. Looking forward to seeing the second part.
Thank you so much. The best course on spaCy I have found. Please make Part Two! We are waiting for it!
I'm definitely interested in the ML aspects of spaCy) Thank you very much for the video!
so much value! thanks for making this material available for free. Incredible value
Definitely interested in part2 of this course
Great video. Please make the second part ASAP. Keep up the good work.
Thanks for this incredible class and textbook, it was very helpful. Greetings from Brazil
Thanks!
We can extract noun chunks by iterating over the nouns in the sentence and finding the syntactic children of each noun to form a chunk.

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'The quick brown fox jumps over the lazy dog.')

# Regular method:
# for chunk in doc.noun_chunks:
#     print(chunk)

# Manual method:
for token in doc:
    if token.pos_ == 'NOUN':
        chunk = ''
        for w in token.children:
            if w.pos_ == 'DET' or w.pos_ == 'ADJ':
                chunk += w.text + ' '
        chunk += token.text
        print(chunk)
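One caveat if you build on the manual method: it only collects determiners and adjectives, so chunks involving compound nouns (e.g. 'sand storm') or possessives would come out incomplete, while doc.noun_chunks uses the full dependency parse. For this particular sentence, though, both methods should print 'The quick brown fox' and 'the lazy dog'.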
I must say, in my hopping around of learning Vulkan, graphics pipelines, and the endless nights of random coding do's and don'ts, understanding this here has become surprisingly easy. But then again it is beginner level lol, you just do a good job explaining here.
Also a historian looking for ways to extract info from old documents. Very much looking forward to the second part.
I am trying to have it out in early January.
@@python-programming really looking forward to the next video. One topic I have not seen you address is the question of tools for Annotation. When working in specialized language domains, extra training of models is a key step. As a newcomer, I have not yet found a process compatible with Spacy which is reasonably efficient. Prodigy?
Excellent!!! The best of the best!!!! Please do the second part showing how to train the model.
This video is fantastic! I would really appreciate part 2
Thank you! interested in part 2.
Thanks
You're a wizard, W.J.B. Mattingly! Sincerely yours, a stan
this is absurd, opened yt for NLP videos and it was uploaded 1 sec ago.
Happened to me for a deep learning course.
@@avnishpanwar9502 this channel is a boon
Destiny
I started building a personal NLP agent and this was immediately recommended
Wild
This is a great NLP tutorial. I have checked out a few others but this one here takes the cake. Thanks for the excellent resource!
Thank you for the explanations. They are very clear and relevant. Excellent video.
Thank you very much for making this video. I want to create my own corpus to analyze data. But as a newbie to Python, I found it really hard to start without a clear direction. Looking forward to Part 2!
Very much interested in the machine learning aspect of SpaCy. Thank you, this course was informative and handy.
Awesome content there, Dr. William. I was really hyped during the series, and you've described every aspect of spaCy perfectly. Now I'm interested in the ML aspect of spaCy, and it'd be great if you came out with a video on it.
Thank you 🙏 , Interested in part 2
Awesome Video! Can't wait for part 2
crossing my fingers 🤞🤞🤞
Thanks! Great course, and I love how easy and smooth the explanation is. Moreover, I like how each step is explained before diving into it, which really makes things easier for us to follow. Thanks a lot. BTW, I've spent some time looking for the GitHub account and repo related to this video; here it is if anyone needs it to begin following along. ENJOY...
where?
Everyone seemed to be asking for part 2, but this coverage is good enough - so good that I don't think it needs a part 2; otherwise a large part of it would be repetition. I will keep exploring deeper based on this video itself.
There are also a lot of other resources available (and free), if u have time to go through: course.spacy.io/en/
@@tthtlc thank you so much! do you have other resources like that?
Please make the second video about machine learning! this was so helpful
Thank you Dr. William. Looking forward for a part two.
Definitely important to dig into the .similarity() output before using it in one's own work. One of its flaws is that it cares too much about the number of words in the spans being compared. For example:
print(nlp2("fries").similarity(nlp2("burgers"))) = .65
print(nlp2("fries").similarity(nlp2("hamburgers"))) = .58
print(nlp2("fries").similarity(nlp2("ham burgers"))) = .70
print(nlp2("french fries").similarity(nlp2("hamburgers"))) = .46
print(nlp2("french fries").similarity(nlp2("ham burgers"))) = .64
Also, I find that the small model correctly identifies West Chestertenfieldville as a GPE without modification, and that nlp.add_pipe("entity_ruler") does not add it to the end of the pipeline description we see via nlp.analyze_pipes(). Rather, it seems the elements of that description are in alphabetical order, and every nested sub-element is also in alphabetical order. I suspect this does not say anything about the true order of the pipeline.
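If anyone wants to poke at this themselves, here is a minimal sketch, assuming a vectors-equipped pipeline such as en_core_web_md loaded as nlp2 (exact scores depend on the model version):

import spacy

nlp2 = spacy.load('en_core_web_md')  # the medium model ships with word vectors

pairs = [('fries', 'burgers'), ('fries', 'hamburgers'), ('french fries', 'hamburgers')]
for a, b in pairs:
    # Doc.similarity compares the averages of the token vectors,
    # which is why the token count of each span shifts the score.
    print(a, '<->', b, round(nlp2(a).similarity(nlp2(b)), 2))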
Please create Part 2!!!!! Part one was 🔥🔥🔥🔥🔥🔥🔥
Waiting for the second part ! This tutorial is perfect , thank you so much !
Love the work you are doing. Many thanks from India
46:00
# Assumes nlp and doc are already defined, as earlier in the video.

# Print tokens and their part-of-speech tags
print("Tokens and their POS tags:")
for token in doc:
    print(f"{token.text}: {token.pos_}")

# Print sentences
print('\nSentences:')
for sent in doc.sents:
    print(sent)

# Print named entities
print("\nNamed Entities:")
for ent in doc.ents:
    print(f"{ent.text} ({ent.label_})")
Superb, Waiting for part 2 with thanks🙏👍
Outstanding overview of Spacy, can't wait for part 2! Thank you so much.
is part 2 out?
Thanks for your awesome introduction :). Would love to have your next course on using spaCy for ML.
very simple and easy to understand thank you for this
Great Tutorial. Learnt a lot about SpaCy fundamentals.
Really fascinating and accessible. Thank you.
This video is very useful for me. Thanks for always bringing us great videos. Mad respect from me
This is a really awesome teaching video. I am highly interested in the part two video.
Very very helpful stuff! 31 minutes in the video and I'm already using spacy for my own analyses! Thank you so much!
Yes please for a part 2 on Machine Learning with Spacy!
Where’s part 2!!! If there’s time in part 2, I would definitely be interested to know how to train ML to help with research and literature reviews as an example
i found this to be an excellent tutorial - very clear, great examples and thorough. thank you for sharing this and i look forward to seeing you continue with another covering machine learning in spacy.
The Doc object’s **doc.sents** property lets us separate a text into its individual sentences:

doc = nlp(u'A severe sand storm hit the Gobi desert. It started to rain.')
for sent in doc.sents:
    print([sent[i] for i in range(len(sent))])
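With a standard English pipeline this should print each sentence as a list of its tokens, i.e. something like [A, severe, sand, storm, hit, the, Gobi, desert, .] followed by [It, started, to, rain, .].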
what version of spaCy is he using?
Excellent tutorial. Straight into the subject. Hats off to you !!
Great course.
It would be nice if you uploaded the video in higher resolution.
Great work! Really a good video to learn using spaCy.
Let's do the second part of it 🙂
Hi, and thank you very much for your tutorial. I really enjoyed it and am looking forward to the second part.
Thanks for the depth with this library sir
This is a super awesome tutorial. Just what I need. Thanks!
Such a nice video, 2nd part please!!
Where can I access the textbook? Can someone let me know! Would really appreciate it!
Eagerly waiting for the second part.
Enjoyed this video, waiting for part 2.
Thank you so much! Such a wonderful video.
(1:35:44) Matcher
03/22/2023 2:21:22
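For anyone skimming the comments for that section, here is a tiny reminder sketch of the Matcher, assuming en_core_web_sm (the pattern and text are my own illustration, not from the video):

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
# Match a proper noun immediately followed by a verb.
matcher.add('PROPN_VERB', [[{'POS': 'PROPN'}, {'POS': 'VERB'}]])

doc = nlp('Martha visited the shop. Paul bought nothing.')
for match_id, start, end in matcher(doc):
    print(doc[start:end].text)  # e.g. 'Martha visited', 'Paul bought'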
eagerly waiting for the second part..please upload it soon...
That was a very nice explanation and an awesome tutorial. Waiting for the machine learning part.
Great tutorial, please make a second part!
This is very helpful, thank you!
Looking forward to the machine learning aspects of spaCy.
I would like the ML version too. So looking forward to seeing that
Can't wait for Part 2
Waiting for 2nd part sir 👌🙏
i woke up and im here
Thanks a lot, and please make the second part.
This tutorial is so freaking inspiring to me. NLP is so exciting and I'd love to integrate it with machine learning!!!!
I'd be 100% down to watch a tutorial with part 2!!!!!
Thanks! Good to know. I think I will start planning it this week.
@@python-programming Hi hi, any updates on part 2? I hope everything's ok :)
@@andrijor indeed! I am still working on it. Between the textbook and the video it takes a while to make. I am hoping to have it ready in early January.
@@python-programming Looking forward to it! 😊
Thank you for such amazing content. This was so easy to follow and understand. Please do a 2nd part to the tutorial!! 🙏
Thank you so much. It is very helpful.
Your tutorials and your YouTube channel are great. Thanks so much for sharing your knowledge online. So helpful and well made.
Thanks so much Dr. Mattingly. Where can we find the machine learning related video?
Please do part 2, I badly need this tutorial
Thanks for letting me know. I'll try to get it ready within the next few weeks.
@@python-programming thanks man. You are a life saver.
great video, we are looking forward to the second part please
this content is amazing
This is very helpful. I am very new to Machine Learning and NLP. I am in a situation where I have thousands of documents which don't always have correct spellings. I have to analyze these documents to look for trends related to parts failure, especially failures that have resulted in death or injury. Ideally, I'd like to learn from the data in a way that can flag future failures before there is a death or injury. Can spaCy help with this?
Thanks for the wonderful videos
Thank you for such a wonderful tutorial! Trying to follow the coding at 1:57:20, but where can we get the JSON data file?
this video is so engaging...
super interesting, already subscribed to both of the channels (y)
Thank you ❤️
just perfect! Thank you very much :)
Lol. spaCy's most recent build as of Feb 2, 2022 does properly identify West Chestertenfieldville as a GPE. ruclips.net/video/dIUTsFT2MeQ/видео.html
EDIT: Just finished the course - a phenomenal piece of work. Thanks so much for doing this. I can tell you put in an incredible amount of time and effort, and you provide it so graciously for free. It is so much clearer than spaCy's documentation. They have a new spaCy 101, so I'll give that a go now to cement this all in the noggin. And yes, eagerly awaiting part 2.
Thank you for the great video!
When I run the most_similar method, copying the code in your notebook, I end up receiving a completely different set of words, some unrelated to the word and some in other languages. Example: country gave me ['country-0,467', 'nationâ\x80\x99s', 'countries-', 'continente', 'Carnations', 'pastille', 'бесплатно', 'Argents', 'Tywysogion', 'Teeters']
Can somebody help me understand why this is happening?
Same here. Curious but...maybe the transformers/models (not sure which) are retrained thus giving us a different set of words? Hopefully someone can answer this!
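The noisy neighbors are likely coming from the vector table itself: the md/lg models build their vectors from large web corpora, so the table contains lowercased, truncated, and non-English keys, and a nearest-neighbor query will happily return them. A minimal sketch of one way to query neighbors, assuming en_core_web_md (exact hits will vary by model version):

import numpy as np
import spacy

nlp = spacy.load('en_core_web_md')
query = nlp.vocab['country'].vector

# Vectors.most_similar returns (keys, best_rows, scores).
keys, _, scores = nlp.vocab.vectors.most_similar(np.asarray([query]), n=10)
print([nlp.vocab.strings[int(k)] for k in keys[0]])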