9:51
"This is kinda ugly"
"The computer loves this"
"The human hates it"
😂😂😂😂😂😂😂
Those brackets at 5:33 are called angular brackets. They are also called chevrons by many people in industry.
putting "r" infront regex text doesn't mean that it is a regexp it means consider the text as raw
Chakradhar Kasturi Do you know of a situation where it matters? I've never been in a situation that I know of where it mattered whether I used it or not.
sentdex It doesn't really matter :). You said "r" is for regexp; in Python we prefix "r" to mark text as raw. It matters when users get confused, thinking that text prefixed with r is a regexp. :)
+sentdex The issue with using a normal string to write regexes that contain a \ is that you end up having to write \\ for every \. So the string literals "stuff\\things" and r"stuff\things"
produce the same string. This is especially useful if you want to write a regular expression that matches backslashes and other special characters.
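For example (plain Python, nothing NLTK-specific):

s1 = "stuff\\things"   # normal string: the backslash must be escaped
s2 = r"stuff\things"   # raw string: the backslash is taken literally
print(s1 == s2)        # True
print(len(s2))         # 12: one backslash, not two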
DT stands for determiner, just in case anyone is wondering. Determiner words include definite articles (e.g. the), indefinite articles (e.g. a or an), demonstratives (this, that, these, those), quantifiers (a little, too much), and numbers (one, two, three).
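For example, a quick way to see DT in action (a minimal sketch; it assumes the NLTK tokenizer and tagger data have been downloaded):

import nltk
print(nltk.pos_tag(nltk.word_tokenize("The cat ate an apple")))
# [('The', 'DT'), ('cat', 'NN'), ('ate', 'VBD'), ('an', 'DT'), ('apple', 'NN')]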
Hello Harrison. Your chunkGram looks like this: chunkGram = r"""Chunk: {<RB.?>*<VB.?>*<NNP>+<NN>?}""". Is there any reason for the order of the elements in your chunkGram, i.e. why do we start off with adverb, then verb, then proper nouns, then normal nouns? Does the chunk have to follow that order?
do these <> symbols mean grouping here? (in chunkGram 8:17)
This is amazing. Why are you so good at everything, sentdex?
Awesome tutorial! This is exactly what I need. I do have one question, though. I'm trying to learn NLTK to do linguistic analysis on an Asian tribal language. I want to search different POS patterns to verify various proposed rules of grammar. I think that I can do this with chunking, but obviously the English language tools in NLTK are not going to help me much here and I'm going to need additional POS tags. What would be the best way for me to define my own word library, POS list, and NLTK grammar rules?
I love you man. You just saved my research!!!!!!
Why is it just impossible to use an original regular expression (not an NLTK regular expression) on the chunked or tagged list from NLTK?
I've only ever known < or > as angled brackets
I reference them from math: < is less than, > is greater than.
Operational conditionals. The technical term :)
relational operators
Thank you for pronouncing regex properly and not "rejex" as some people do.
many thanx - im still watching - but so far very impressed ! excellent work ! :D
My program doesn't print anything? Why?
it says "unexpected EOF while parsing"
same problem
you must be missing the except statement
@@omarrazi4826 This fixed it for me
The above code is showing an error: 'RegexpParser' object is not callable.
It's 2021 and this video playlist is still useful.
Hi
Can we get the diagram (chunked.draw()) using user input?
Coz I am not getting the output if I give input from the keyboard.
Please answer
Hey! So if I am getting this right, chunking is like breaking the sentence into meaningful segments?
At 9:26, can you tell why it printed out the non-chunks as well?
violinov Preservation of data. You may be interested only in the chunks, but you may also be interested in what wasn't chunked. You're left with choosing whether you care or not about the non-chunks, rather than NLTK making that decision for you.
sentdex How can I print only the chunked ones? From the code I expected that, but instead it printed everything. Is there a function for that? Thanks for the reply btw, good videos man ;)
violinov Good question. I used to just split them using traditional string splitting, but you can actually do it correctly. Chunked is an nltk tree, with subtrees. So, we can reference all of these subtrees by doing chunked.subtrees. Then we want to filter these for specific labels. In our case, we're calling our label "Chunk" (see the regular expression we write).
So, we can do something like:
for subtree in chunked.subtrees(filter=lambda t: t.label() == 'Chunk'):
    print(subtree)
Full code would look something like:
import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer
train_text = state_union.raw("2005-GWBush.txt")
sample_text = state_union.raw("2006-GWBush.txt")
custom_sent_tokenizer = PunktSentenceTokenizer(train_text)
tokenized = custom_sent_tokenizer.tokenize(sample_text)
def process_content():
    try:
        for i in tokenized:
            words = nltk.word_tokenize(i)
            tagged = nltk.pos_tag(words)
            chunkGram = r"""Chunk: {<RB.?>*<VB.?>*<NNP>+<NN>?}"""
            chunkParser = nltk.RegexpParser(chunkGram)
            chunked = chunkParser.parse(tagged)
            print(chunked)
            for subtree in chunked.subtrees(filter=lambda t: t.label() == 'Chunk'):
                print(subtree)
            chunked.draw()
    except Exception as e:
        print(str(e))

process_content()
I'll also update this to this section's code on pythonprogramming.net here: pythonprogramming.net/chunking-nltk-tutorial/
Hope that helps!
sentdex It does! Thank you very much
Great, that helps a lot! However, I keep getting an error: 'Tree' object has no attribute 'label'.
Thanks for this series! Very helpful. I am new to Python and to NLP and NLTK :) so pardon any basic questions...
I have a need where I need to check a customer's email (or, say, convert a phone call to text, which I will do using other software) and then from that find out 2 things:
1. Sentiment analysis
2. Was the email or call related to, say, a topic called XYZ (I do not say... related to some category)
I have a list of 2 files:
1. a training model with phrases which tell me if the customer is happy or sad
2. a training model with phrases that show if the call or email is related to topic XYZ
Now, from the last 4 videos in this series, what I understand is that:
1. I first do processing of my text using NLTK (from chapter 1 to 14)
2. I provide #1 as input to scikit-learn and do the analysis
Please let me know if I understood this right...
Also... if my training models (say, the list of phrases which tell me the customer is happy) have stop words, should I rather remove the stop words from the training model in the first place?
Say "I'm going to cancel" means the customer is not happy; should it rather be "cancel" in the training model, or should it be the phrase "I'm going to cancel"?
Again... still going through all the videos, and apologies for jumping the gun :)
Thanks a lot for your valuable time!!
I can't really understand it... and can't really practically use it either. In your script, at 5:15 in this video, you wrote it as chunkGram = r"""{<RB.?>*<VB.?>*<NNP>?}""", but I can never find information about "<RB.?>", this thing in that sentence. I think it's something that especially exists in the NLTK module, but what is it? Since only very few people use NLTK, it's not very easy to get full information about it. Well... appreciated that you made the tuts anyway. Bless you.
Why am I getting this error?
rule.apply(chunkstr)
AttributeError: 'str' object has no attribute 'apply'
try to >> import string, maybe (?)
Dear Harrison Sir,
Is there any specific order of precedence while looking for the POS inside <> using regex?
Can you please explain to me in easy terms what chunking is all about?
I'm a little bit confused by PunktSentenceTokenizer. It would be nice if you could explain it a bit more elaborately, like where and why it is used.
hey,
I greatly appreciate what you have done. I want to know how chunking (and the other previous stuff) can be used to understand a text (e.g. a question) and to retrieve information from a paragraph!! Just the basic steps. Thanks!!
+Roshintha Mediyawa Chunking is mainly used to pull apart grammatical structure. You chunk by grammar, which is useful for knowing what the verbs, nouns, adjectives, etc. are. Knowing what each is helps you figure out meaning. A noun followed by a verb usually means the noun did the verb, for example.
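A minimal sketch of that idea (the NP label and pattern here are just illustrative, not the video's exact grammar):

import nltk
tagged = nltk.pos_tag(nltk.word_tokenize("The little dog barked loudly"))
grammar = r"""NP: {<DT>?<JJ>*<NN>}"""   # determiner + adjectives + noun
print(nltk.RegexpParser(grammar).parse(tagged))
# (S (NP The/DT little/JJ dog/NN) barked/VBD loudly/RB)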
Oh, that's completely not gonna work with Russian :) You know: adjective noun verb, adjective verb noun, noun adjective verb, adjective adjective adjective... Crazy stuff!
5:31 They are called angled brackets
After the nltk.download() command, how did you make a comment on the next line (9:59) without the downloader or the command prompt popping back up?
Nomad Soul Sorry I was referring to the first video
Nomad Soul We're not using the interactive interpreter, and then, before running the whole script, I remove the nltk.download()
< > are angle brackets or chevrons if you wanna get fancy
How can we use chunking to extract only noun phrases after tokenizing the entire sentence into words and adding POS tags, once it's converted to a list of tuples?
I need to print words which contain the letter 'z' and no digits. How can I do that?
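One way to do that with a plain list comprehension (a sketch; the words list here is just example data):

words = ["zebra", "z3", "apple", "buzz"]
matches = [w for w in words if 'z' in w and not any(c.isdigit() for c in w)]
print(matches)   # ['zebra', 'buzz']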
Hi, thanks for your useful videos. Can you please tell which tools you are using?
Hi,
I have a question regarding NER. NLTK can recognize multiple named entity types, like person, location, etc. My question is: what is the full list of named entity types that NLTK can recognize?
Thanks in advance.
Is it true that the '?' in <RB.?> and <VB.?> is optional? 'RB.' already matches adverbs of any form, right?
Thank you so much, these tutorials are super helpful. I have a question and I am not sure where to go with it: I am using NLP to analyze language and match it to specific graphics. Users would enter text, and the text would then display graphics according to both the actual words and the syntax. I am creating my own algorithms that match the verbal content to a specific graphics-animation algo, which would be executed through javascript.
I am essentially creating my own language classification that is related to visual elements. I am a relative beginner, and my question is essentially: how does one structure the architecture so that NLTK can analyze the user text input, return the variable to which it belongs, and then go into the javascript to connect with the graphics algo (because the linguistic classification will determine the order of the graphics display)?
I started with NLTK and Python but then ended up using a simple library for now (NLP Compromise), because I am not doing statistical analysis as of yet. I just want to be able to start connecting words and expressions to the graphics, which is easier with JS at this early stage since I am using the browser to display everything.
It's just not clear to me how I can connect Python/NLTK to the browser to use it live, and if it's at all possible to use it "live".
0:12 what is chunking....
it's a brand taken over by La Choy, famous for their chicken chow mein w/ crispy noodles.
How come the three quotes in your chunkGram don't comment the text out? Sorry if that's a dumb question. New to Python. Thank you for all you do, appreciate the knowledge!!
Actually, in Python triple quotes are not a multiline comment. They are used for strings that span multiple lines. However, when such a string is not assigned to a variable, it is discarded right after being evaluated, so it serves as a comment for us in that respect. I think only # is purely for commenting.
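For example:

s = """a string
spanning two lines"""   # triple quotes make a multi-line string, not a comment
print(s)

"""
This bare string is evaluated and immediately discarded,
so in practice it reads like a comment.
"""
# Only lines starting with # are true comments.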
How to do grammar check using nltk in python? Can you please help
Hi Sentdex, I was wondering if there is a way to use nltk and pos-tagging to write a program to test if a word is of a particular tag? For example:
word = "bicycle"
if word has the tag NN:
print(True)
else:
print(False)
Thanks in advance.
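One way to write that check (a minimal sketch; note that pos_tag expects a list of tokens, and tagging a single word with no context can be unreliable):

import nltk
word = "bicycle"
tag = nltk.pos_tag([word])[0][1]   # tag of the first (and only) token
print(tag == 'NN')                 # True if the tagger assigns NN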
Does chunkParser.parse() return an array or a list type?
Because I was thinking of getting a part of a word from the chunked words.
Can someone explain the use of ? in the <RB.?> thing??
Is there a way to show which words are often associated together? For example, I would like to get a list of the words that occur most often when a sentence contains the words "good book".
Stefan van der Leeden Hard to do globally on anything, but generally the reason people might look for lists of words together would really be that at least one of the words is a noun. So chunking noun phrases would be the choice; then you'd just build a big list of the noun phrases, then use Counter (from collections import Counter) to see if you have any insights.
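A rough sketch of that approach (the NP pattern and sample text are just illustrative):

import nltk
from collections import Counter

parser = nltk.RegexpParser(r"""NP: {<DT>?<JJ>*<NN.?>}""")
phrases = []
for sent in nltk.sent_tokenize("I read a good book. It was a great book."):
    tagged = nltk.pos_tag(nltk.word_tokenize(sent))
    for subtree in parser.parse(tagged).subtrees(filter=lambda t: t.label() == 'NP'):
        phrases.append(" ".join(word for word, tag in subtree.leaves()))
print(Counter(phrases).most_common())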
sentdex Thanks, that gave me some new ideas! Exciting stuff
Hey... I have created a chunk based on the following grammar: grammar = "rohit: {?+?}". I got chunked output. But now I want to access the chunk by the name I have given it (in this case, rohit). Could you please tell me how I can do it?
Where can I find medical corpora for training data? Thanks
5:27 in this case you call them angle brackets bro
What should I do if I want to chunk nouns, verbs, adverbs, and so on?
Should we create different chunkGrams for each such classifier, or use some syntax like:
chunkGram_Noun = r"""Chunk_Noun: {||.} Chunk_Verb: {||||||.}"""
Both ways give some error. Please tell me a better approach, with an example.
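For what it's worth, nltk.RegexpParser does accept a grammar with several labeled rules, applied as successive stages; a minimal sketch (the labels and patterns are just examples):

import nltk
tagged = nltk.pos_tag(nltk.word_tokenize("The dog quickly chased cats"))
grammar = r"""
  Noun: {<NN.?>+}
  Verb: {<VB.?>+}
"""
chunked = nltk.RegexpParser(grammar).parse(tagged)
for subtree in chunked.subtrees(filter=lambda t: t.label() in ('Noun', 'Verb')):
    print(subtree.label(), subtree.leaves())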
< and > are called chevrons
I am getting an error saying IDLE's subprocess didn't make any connections.
Can you link me to your website's list for the modifiers?
Hi, I have a question and maybe it will not be answered :'/ but how can I do "chunking" in Spanish? I have this problem for a project.
Hi Harrison. After watching three videos I have one question: can we make a prediction of the next words given a group of words (e.g. list_words = "Alright, so I'm back in high school, I'm standing in the middle of the cafeteria")? Can we predict the next words through NLP, or do we have to use something more? Could you please share your thoughts; it really means a lot for what I am doing.
Thanks,
Srikar
Thank you so much, this video is very helpful! :)
Hey Harrison, I still don't completely understand the full value of chunking. Can you explain a way which it can be applied?
Sagar Samtani Jack is an idiot and Jill is not.
Chunking will allow you to associate "idiot" with Jack, as opposed to having to take a wild guess as to who "idiot" belongs to. With chunking, we group pieces of grammar together to solve problems like this.
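For instance, a minimal sketch (the Claim label and pattern are hypothetical):

import nltk
tagged = nltk.pos_tag(nltk.word_tokenize("Jack is an idiot and Jill is not"))
grammar = r"""Claim: {<NNP><VBZ><DT>?<NN>?}"""   # proper noun + verb + optional noun phrase
print(nltk.RegexpParser(grammar).parse(tagged))
# "idiot" ends up in the same chunk as Jack, not Jill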
I'm having an error, "name 'nltk' is not defined", even though I've imported nltk. Please help.
could you please explain pos and chunking
Why use this: {<RB.?>*<VB.?>*<NNP>+<NN>?} ?? You wrote it, but I didn't properly understand why.
Hi, I am using Jupyter Notebook, and I am not getting any output like a tree structure after running the chunking and chinking code.
run %matplotlib inline at the start of the notebook.
Just in case someone is still following this thread, I would love to know how to get the tree diagram to appear using PyCharm. I have tried adding the line suggested below with no success. Suggestions?
I have a question, when I tokenize a sentence using the following code :-
from nltk.tokenize import sent_tokenize, word_tokenize
raw_data="1. This is a sentence. "
tokenized_sentences=sent_tokenize( raw_data)
print (tokenized_sentences)
________
Result:
>>> 1.
>>>This is a sentence.
-------------------
How can I get only the complete sentence, including the paragraph numbering at the beginning (without splitting off the number)? The result should be:
>>>1. This is a sentence
Thanks
I don't understand why the chunks are not starting with adverbs, as they should.
? in regex syntax means 0 or 1, so there doesn't have to be one. Here is the Python regex set of operations: docs.python.org/3/library/re.html
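For example:

import re
print(re.findall(r"colou?r", "color colour"))   # ['color', 'colour']: the 'u' may be absent or present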
How does this statement work:
filter=lambda t: t.label() == 'Chunk'
Lambda is a one-liner function; here it returns True only for subtrees whose label is "Chunk", and filter is a parameter of subtrees().
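Spelled out, the lambda is just shorthand for a named function (assuming chunked is the tree from the code above):

def is_chunk(t):
    return t.label() == 'Chunk'   # True only for subtrees labeled 'Chunk'

for subtree in chunked.subtrees(filter=is_chunk):
    print(subtree)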
The image dialog keeps coming up even after being cancelled many times... wondering why this is happening?? Can anyone please help here?
Hi there, I tried running the code and I am getting an "unexpected EOF while parsing" before I even get to the chunked.draw() stage. May I please have suggestions on how to solve this? Thank you.
@Martin K gave this response:
'In case you were still wondering; I removed the try: statement and added the function call process_content() at the bottom (ofc outside the loop).
This solved my EOF error.'
I am getting an error of "module 'nltk' has no attribute 'RegexParser'". Can anyone please help me resolve this? I could not find much on the internet about it.
You missed a 'p'. It should be 'RegexpParser'.
2005-GWBush.txt: where can I find these files?
So where can we find '2006-GWBush.txt', and how can I use my own text file? Shall I put it in some directory, like c:\example.txt?
Mohammed Abujayyab it is a part of the NLTK corpora that we installed in the beginning. You can find it in %appdata%. We talk more about this in part 9: pythonprogramming.net/nltk-corpus-corpora-tutorial/
Thanks a lot
Is it possible to use POS tagging in Spanish?
Yes, it is possible. Mention the language. That's all.
Eg:
import nltk

spanish_text = 'Insert your text here'

def poscontent():
    try:
        for i in nltk.sent_tokenize(spanish_text, language='spanish'):
            word = nltk.word_tokenize(i)
            postag = nltk.pos_tag(word)
            print(postag)
    except Exception as e:
        print(str(e))

This should work :)
My chunked.draw() is taking forever. I am following the tutorial and my computer is not slow. Any help?
note: I do have matplotlib and wrote import matplotlib
I faced the same issue. Probably because the file is really big. I tried tokenized[:10] as the sample and it worked
Also, that chart appears in a small pop-up window (for the chunk charts) while the cell shows as still running. Until you close all the pop-up windows that appear for each line, the cell will appear to keep running.
I call them (<>) angle brackets
I always called them "greater than" and "less than" signs.
whY " unexpected EOF " showing? plz help sir
In case you were still wondering; I removed the try: statement and added the function call process_content() at the bottom (ofc outside the loop).
This solved my EOF error.
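In other words, the fixed skeleton looks something like this (either drop the try entirely, as here, or keep a matching except; tokenized comes from the tutorial code above):

def process_content():
    for i in tokenized:
        words = nltk.word_tokenize(i)
        tagged = nltk.pos_tag(words)
        print(tagged)

process_content()   # the call sits at module level, outside the function body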
Martin K thank you Martin K, I had the same issue and that fix sorted me out
Me too, good job.
Hi Sentdex or anyone. I am having a recurring problem with the stemming and chunking code from the video, and I am copying it exactly. I am getting URL errors (URLError) only when I run the for loops for part-of-speech tagging and chunking. All other code from previous videos works fine, such as stop words and stemming. This is puzzling because there are no URLs in the code. I have installed on 3 different computers in different locations. I've even tried 2.7 and 3.5 using Canopy and Anaconda. The code only works on one computer using Canopy. Has anyone had this problem? I don't have access to a unix shell, otherwise I would try that too. This must be a bug in Python.
stackoverflow.com/questions/35827859/python-nltk-pos-tag-throws-urlerror
Hey man, do you have these examples typed out somewhere?
Yep, check out this series specifically here: pythonprogramming.net/tokenizing-words-sentences-nltk-tutorial/
They're called angular brackets. :)
Where can I find the list of parts of speech?
pythonprogramming.net/part-of-speech-tagging-nltk-tutorial/
the arrow brackets are called chevrons. > is a chevron
A chevron is the hill-cap symbol, not the side angle brackets. ^ this is a chevron; < and > are angle brackets.
whispers "you son of a bitch" XD
started here -> Gone For some other video -> came back -> Gone For some other video -> came back-> Gone For some other video->came back-> Understood
Hey, I have been following your Twitter sentiment analysis videos. I have learnt a lot from them. Is there any way I can use this with a GUI??
As in, write a tweet inside a comment box and it would show me the sentiment!! Help! Thanks
Apple releases a phone, it comes with a new colored case and costs 100 dollars more ;-)
Where do I find the text file for this tutorial?
Here you go sir: pythonprogramming.net/chunking-nltk-tutorial/
Thanks Harrison, but I still couldn't find the 2005-GWBush.txt and 2006-GWBush.txt either, not even on the part-of-speech page.
@@leonardo08 github.com/teropa/nlp/tree/master/resources/corpora/state_union
Imagine Harrison is your TA
Just so you know, there are like five videos put up in the last month that just copy this video in its entirety under different accounts.
< less than, > greater than
Ateelol I dunno, I think they are alligators.
hehe yes, you have to be careful when you use them since some of them are alligators. Thanks for the video.
sentdex hahaha lol
chunkagram chunkagram call us up and win some ram
woow
A $100 phone case? Yup, sounds like Apple.
I know I'm late to the party, but erhhher... "Greater/less than" is a common way to describe the sideways "V"... WAKAK (>x_x)> Have you kept count of the comments with the same reply? lol
They're angle brackets O.O
Mr. Harrison. My script was based on your tuts about 99%, and still I can't fucking use them practically, and it's just so very extremely frustrating. What did I do wrong?
pastebin.com/Xdrf3q9b
This is the whole code, and it gives me errors constantly. All I wanna do is just extract certain kinds of nouns or others. I applied a regular expression, but just simply putting words in, to see if it's filtering the right way, and it just doesn't. It keeps returning errors anyway.
I will explain more specifically what I aimed to do:
(1) extracting certain kinds of noun from the sentence, and storing them in a usable variable. So it shouldn't return any stupid form like (Python : NNP) but like (NNP = 'Python, Autohotkey, Javascript')
Literally the most basic of basic things, isn't it? BUT I CAN'T FIND ANY THREADS OR TUTS ABOUT THIS ANYWHERE. Harrison, please help me. This simplest kind of subject is soaking up my time and it's just freaking ridiculous. All I wanna see is one simple usage example and I can not find it fucking anywhere.
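For what it's worth, a minimal sketch of pulling the NNP words out of a tagged sentence into a plain list (the sample sentence is just illustrative):

import nltk
tagged = nltk.pos_tag(nltk.word_tokenize("Python, Autohotkey and Javascript are languages"))
nnp_words = [word for word, tag in tagged if tag.startswith('NNP')]
print(nnp_words)   # something like ['Python', 'Autohotkey', 'Javascript']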
dirty dollar! hahaha!!!
you speak poorly. think b4 speak
Instead of making too many funny sounds/sentences, if you could concentrate on teaching, that would do more good. Sometimes your way of doing such things is annoying.
It's called being normal and giving a personal touch. No one gets inspired listening to monotonic lectures from bland teachers. That's the biggest problem with teaching.