L15.1: Different Methods for Working With Text Data

  • Published: Dec 23, 2024

Comments • 6

  • @blackswann9555
    @blackswann9555 23 days ago

    Excellent tutorial!

  • @abubakarali6399
    @abubakarali6399 3 years ago

    Is the possible word percentage a probability? Is it the same as embeddings?

    • @SebastianRaschka
      @SebastianRaschka  3 years ago

      Yeah, you probably meant probability. This is different from embeddings, though. Here, you can just think of it as a multi-class problem where each word represents a class. As in any multi-class setting, you get a score or probability for each class, and you usually pick the highest one as the prediction. An embedding is a different concept, although you could consider the values before they go into the softmax function as an embedding, so it is somewhat related.
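
      A minimal sketch of the multi-class view described above, using a
      made-up five-word vocabulary and arbitrary PyTorch logits (not code
      from the lecture):

          import torch
          import torch.nn.functional as F

          # Toy vocabulary; in a real language model this would be much larger.
          vocab = ["the", "cat", "sat", "on", "mat"]

          # Pretend these are the model's raw output scores (logits) for the next word.
          logits = torch.tensor([1.2, 0.3, 2.5, -0.8, 0.1])

          # Softmax turns the scores into probabilities over the vocabulary,
          # one "class" per word, exactly as in any multi-class classifier.
          probs = F.softmax(logits, dim=0)

          # The predicted word is the class with the highest probability.
          print(vocab[torch.argmax(probs).item()])  # -> "sat"

          # The pre-softmax values (the logits) are the part that is loosely
          # related to an embedding, as noted in the reply above.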

  • @kachrooabhishek
    @kachrooabhishek 2 years ago

    Hi Sebastian, I have one quick question.
    For multi-class text classification of short texts (max 3-4 words), what is the best possible approach?
    I'm fairly sure embeddings alone aren't meant for that, and I'm not entirely sure about NER.
    Waiting for your valuable feedback.

    • @SebastianRaschka
      @SebastianRaschka  2 years ago

      You could try a bag-of-words and n-gram-based approach: github.com/rasbt/machine-learning-book/blob/main/ch08/ch08.ipynb
      Or, as a baseline, a rule-based / dictionary-based approach might also be a good idea: github.com/rasbt/stat453-deep-learning-ss21/blob/main/L15/0_rule-based-baseline.ipynb
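
      For reference, a minimal sketch of the bag-of-words/n-gram idea with
      scikit-learn, using a tiny made-up dataset (the linked notebooks cover
      it properly):

          from sklearn.feature_extraction.text import CountVectorizer
          from sklearn.linear_model import LogisticRegression
          from sklearn.pipeline import make_pipeline

          # Tiny invented dataset of short texts (3-4 words each).
          texts = ["great battery life", "screen too dim",
                   "fast shipping overall", "package arrived broken"]
          labels = ["product", "product", "delivery", "delivery"]

          # Count unigrams and bigrams, then fit a linear classifier on top.
          clf = make_pipeline(
              CountVectorizer(ngram_range=(1, 2)),
              LogisticRegression(),
          )
          clf.fit(texts, labels)

          print(clf.predict(["battery died quickly"]))  # e.g. ['product']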

  • @abubakarali6399
    @abubakarali6399 3 years ago +2

    nice.