The content is really good, but one thing: in the Hugging Face implementation, they have not used OCR output for the fine-tuning task. During pre-training it is not a multimodal model, but during fine-tuning it should be called a multimodal model, right?
Hi! Thanks for the video! I want to ask you a question. I'm working on different use cases, and most of the time the goal is to extract information, so I found this model really interesting. The problem is that I'm French, so the texts I want to extract information from are in French, and I assume this model was pretrained on English documents. Do you think I can still fine-tune the model on my French documents, or do you have any recommendation?
Watch more paper summaries at ruclips.net/video/ykClwtoLER8/видео.html
Great, thanks for this clear explanation.
If you do V2 & V3, it will be awesome.
This was pretty interesting, love to know about the V1 architecture as well!
Nice summary. Please make a video on LayoutLMv2 also.
thanks a lot! you are amazing
You’re welcome ☺️
Nice summary. By the way, which editor are you using? It looks like a good way to annotate and add notes online.
Hey Neelesh, thanks for appreciating. I use GoodNotes editor for annotations. You can check the link for the same in the description of any video.
Please do one for V3; it's a bit different.
Great video. Can you do a version 2 vs. version 3 comparison?
Have you covered one of those models? What about the LiLT model?
Please create a video on LayoutLMv2.
Sure. Thanks!