Hi Kasper, thanks for the video. Is it possible to have the sentiment analysis for French?
Hi @Callingwind, you're welcome. You could use a French dictionary. I know there are translated versions of the NRC dictionary (though I don't know whether they are any good), and I saw that there is a French adaptation of the NRC called "FEEL" (French Expanded Emotion Lexicon). The main thing to watch out for is that you handle accented characters properly. More generally, if you want to do NLP with French, it's good to read up on character encoding (if you haven't already). Simply put, you'll either want both your texts and your dictionary to be in Unicode (UTF-8) with the special characters intact, or transliterate both to ASCII, e.g.:
iconv("Hôtel-Résidence", to='ASCII//TRANSLIT')
As a side note, I just want to point out that if you're looking into sentiment analysis for professional or academic applications, you might want to consider a machine learning approach instead (transformers like BERT are killing it). The field of NLP has been moving fast, and dictionary-based sentiment analysis is on the way out.
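To make the accent point above concrete, here is a minimal sketch in base R of normalizing the texts and the dictionary in the same way before matching. The helper `strip_accents` and the two-word dictionary are hypothetical and only for illustration; this is not the FEEL or NRC lexicon. It uses `chartr` with an explicit character map rather than `iconv(..., to = 'ASCII//TRANSLIT')`, because the result of iconv transliteration can differ between platforms.

```r
# Hypothetical helper (not from any package): map common French accented
# letters to ASCII, then lower-case. Characters are written as \u escapes so
# the script does not depend on the encoding of the source file.
strip_accents <- function(x) {
  accented <- paste0(
    "\u00e0\u00e2\u00e4\u00e7\u00e9\u00e8\u00ea\u00eb\u00ee\u00ef\u00f4\u00f6\u00f9\u00fb\u00fc",
    "\u00c0\u00c2\u00c4\u00c7\u00c9\u00c8\u00ca\u00cb\u00ce\u00cf\u00d4\u00d6\u00d9\u00db\u00dc"
  )
  plain <- "aaaceeeeiioouuuAAACEEEEIIOOUUU"
  tolower(chartr(accented, plain, enc2utf8(x)))
}

# Toy dictionary with made-up scores ("désastre" and "succès").
dict <- data.frame(
  word  = c("d\u00e9sastre", "succ\u00e8s"),
  score = c(-1, 1),
  stringsAsFactors = FALSE
)
dict$key <- strip_accents(dict$word)

# The text contains the word once with an accent and once in unaccented capitals.
text   <- "Un d\u00e9sastre total, un vrai DESASTRE."
tokens <- strsplit(strip_accents(text), "[^a-z]+")[[1]]

# Look up each token in the normalized dictionary and sum the scores.
sentiment <- sum(dict$score[match(tokens, dict$key)], na.rm = TRUE)
print(sentiment)
```

Because both sides go through the same function, "désastre" and "DESASTRE" hit the same dictionary entry. The map above does not cover ligatures such as "œ"; if adding a dependency is acceptable, `stringi::stri_trans_general(x, "Latin-ASCII")` is a more complete alternative.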
@kasperwelbers Super interesting answer, thanks a lot. I am looking to build a recession index based on newspapers in my country. Any other tips?
@callingwind5071 If by recession index you mean measuring whether/how newspapers mention recession, or more generally bad economic tidings, then I would recommend not using a general sentiment dictionary/model. A general sentiment analysis only tells you whether a text is positive or negative, but not really what it is positive or negative about. While you could argue that sentiment in the context of business news is a decent proxy for recession, how good a proxy it is depends on how you define and operationalize recession. If you are very specifically interested in recession, you might even just use a small dictionary or search query to find explicit references to recession.
You might then want to consider using Boolean queries instead of simple dictionary terms. Then you can also specify that words need to occur within a certain distance of each other (e.g., "crash" and "market"). This is possible with corpustools: cran.r-project.org/web/packages/corpustools/vignettes/corpustools.html#search_features
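To illustrate what a distance condition adds over a plain dictionary term, here is a rough base R sketch. It is not the corpustools API (see the vignette linked above for the actual query syntax); `within_distance` is a hypothetical helper, and the headlines are invented examples.

```r
# Hypothetical helper (not part of corpustools): TRUE if `word1` and `word2`
# both occur in `text` and at least one pair of occurrences is no more than
# `window` tokens apart.
within_distance <- function(text, word1, word2, window = 10) {
  tokens <- strsplit(tolower(text), "[^[:alnum:]]+")[[1]]
  pos1 <- which(tokens == word1)
  pos2 <- which(tokens == word2)
  if (length(pos1) == 0 || length(pos2) == 0) return(FALSE)
  # Smallest distance between any occurrence of word1 and any of word2.
  min(abs(outer(pos1, pos2, "-"))) <= window
}

# Invented example headlines.
headlines <- c(
  "Stock market crash wipes out a year of gains",
  "Market opens higher; analysts say talk of a crash in housing is overblown",
  "Local team wins the cup"
)

hits <- vapply(headlines, within_distance, logical(1),
               word1 = "crash", word2 = "market", window = 3,
               USE.NAMES = FALSE)
print(hits)

# One crude index: the share of headlines that match the query.
print(mean(hits))
```

A plain dictionary lookup for "crash" would also flag the second headline, which is about housing. Requiring "market" within three tokens keeps only the first. How wide to make the window is a judgment call that depends on the corpus.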