LayoutLM: Pre-training of Text and Layout for Document Image Understanding (Paper Summary)

  • Published: 12 Sep 2024

Comments • 17

  • @TechVizTheDataScienceGuy  2 years ago +2

    Watch more paper summaries at ruclips.net/video/ykClwtoLER8/видео.html

    • @sudhirpol1895 1 year ago

      The content is really good, but one thing: the Hugging Face implementation does not use OCR output for the fine-tuning task. During pre-training it is not a multimodal model, but during fine-tuning it should be called a multimodal model, right?

  • @marinamaher8211 1 year ago +2

    Great, thanks for this clear explanation.
    If you do V2 & V3, it will be awesome.

  • @TheMarComplex 1 year ago +1

    This was pretty interesting, love to know about the V1 architecture as well!

  • @neeleshshukla242 2 years ago +1

    Nice summary. BTW, which editor are you using? It looks like a good way to annotate online and add notes.

    • @TechVizTheDataScienceGuy  2 years ago

      Hey Neelesh, thanks for the appreciation. I use the GoodNotes editor for annotations. You can find the link to it in the description of any video.

  • @AjitKumarMCS 1 year ago

    Nice summary. Please make a video on LayoutLMv2 also.

  • @mariussame9357 1 year ago

    Hi! Thanks for the video! I have a question. I'm working on different use cases, and most of the time the goal is to extract information, so I found this model really interesting. The problem is that I'm French, so the texts I want to extract information from are in French, and I assume this model was pre-trained on English documents. Do you think I can still fine-tune the model on my French documents, or do you have any recommendations?

  • @yosefasefaw4207 2 years ago

    thanks a lot! you are amazing

  • @TheMarComplex 1 year ago

    Thanks!

  • @user-lj7bw2db1l 10 months ago

    Do one for V3; it's a bit different.

  • @shloimielevitsky5983 10 months ago

    Great video. Can you do a version 2 vs. version 3 comparison?

    • @shloimielevitsky5983 7 months ago

      Have you done a video on either of those models? What about the LiLT model?

  • @yashumahajan7 1 year ago

    Please create a video on LayoutLMv2.

  • @arnavdman 2 years ago +2

    This was pretty interesting, love to know about the V1 architecture as well!