NLP for Under-resourced African Languages - David Ìfẹ́olúwa Adélaní - Talking Language AI #6

  • Published: Aug 6, 2024
  • There are over 7,100 languages spoken around the world, yet the vast majority of language models support only English.
    This makes it incredibly challenging to build products and projects that rely on multilingual language understanding. In this talk, David addresses the challenges faced in NLP research and development for African languages, which are spoken by over a billion people.
    David will also share his findings on human-annotated named entity recognition (NER) datasets and the development of multilingual pre-trained language models (PLMs) for 20 widely spoken African languages through multilingual adaptive fine-tuning (MAFT).
    ==
    Check out David’s work here: dadelani.github.io/
    Follow him here: / davlanade
    How to join Masakhane: www.masakhane.io/#h.p_ANKd5Nj...
    ==
    Join the Cohere Discord: / discord
    Discussion thread for this episode (feel free to ask questions):
    / discord
    ==
    0:00 Introducing David Adelani
    2:42 Progress of Language Technology in English
    4:57 When Language Technology is Needed Urgently
    5:52 Why Research on Other Languages
    7:31 Not Many Languages Benefit from NLP
    8:39 What are Under-resourced Languages
    13:00 NLP for Under-resourced African Languages
    15:37 Challenges for NLP in African Languages
    18:50 Developing labelled datasets for African Languages
    21:31 About The Masakhane Research Community
    22:35 MasakhaNER - Named Entity Recognition
    31:11 Improving Pre-trained Language Models: Language Adaptive Fine-tuning (LAFT)
    40:48 Multilingual Adaptive Fine-Tuning (MAFT)
    52:12 Conclusion
    54:00 Masakhane and how people can get involved
    58:45 Applying these techniques to low-resourced Asian languages
  • Science
