Yes please make more videos on skweak and even weak supervised/unsupervised training. I’ve been going through this issue where I have data but it would take hours to label it which isn’t very fun
This will be a fantastic series. Especially because it addresses big chunks of an overall workflow. I am finding it quite hard to integrate different ‘pieces’ from different sources. Also, your viewers are likely to have specialized needs, so annotation / training is a big hurdle. Understanding this package in detail would be so helpful in many domains. Another area that would be useful to develop would be clustering on multiple dimensions, including scoring, to help sort texts that are highly structured.
Great video! Exactly what I think I need. Would be really great if you can continue this series and teach a beginner like me. Would appreciate any whole cycle guidance from start to production if possible 🙏🙏🙏
Interesting vid, as always. What would you say are the main advantages of this approach over "naive" string matching? Also, I didn't really understand the role of the HMM in this process.
Hi, when trying to run the 3rd cell I get the error 'NoneType' object has no attribute 'read'. The error refers to the line "text = archive_file.extractfile(archive_member).read()....
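(A likely cause, sketched below as a guess since the actual cell isn't shown: `tarfile.TarFile.extractfile()` returns `None` for members that are not regular files, such as directories, which produces exactly this `'NoneType' object has no attribute 'read'` error. A guard like this hypothetical helper avoids it:)

```python
import tarfile

def read_members(archive_path):
    """Read the text of every regular file in a tar archive,
    skipping directories and other members for which
    extractfile() returns None."""
    texts = []
    with tarfile.open(archive_path) as archive_file:
        for archive_member in archive_file.getmembers():
            if not archive_member.isfile():
                continue  # directories, links, etc.
            fh = archive_file.extractfile(archive_member)
            if fh is None:
                continue  # extractfile() can still return None for special members
            texts.append(fh.read().decode("utf-8", errors="replace"))
    return texts
```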
Hey! Could you explain your hmm_docs cell, where you loop over each text in texts? I thought we would set doc = nlp(new_docs[i]) rather than setting the same doc = nlp(doc) 195 times. Could you correct me if I’m wrong please :)
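(To illustrate the point being raised: each loop iteration should process a *different* input text, not re-run the pipeline on the same doc. Sketched below with a stand-in for spaCy's `nlp()` call so it runs on its own; `texts` and `hmm_docs` are the names from the comment, the stand-in is hypothetical:)

```python
def nlp(text):
    """Stand-in for a spaCy pipeline call; just wraps the text."""
    return f"Doc({text})"

texts = ["first text", "second text", "third text"]

# Intended behavior: one processed doc per input text,
# rather than calling nlp() on the same doc every iteration.
hmm_docs = [nlp(text) for text in texts]
```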
A perfect video as usual! I see every video you post. Thank you!
Thanks!!!
you are a life saver. I was just staring at a corpus trying to figure out how to take the first few passes at labeling it. ❤
Thank you so much for this video. You are actually saving my professional life right now with these videos!!
Which paper are you referring to at 3:30 ?
Notebook: github.com/wjbmattingly/skweak/blob/main/ner_demo.ipynb