Hello, I would like to know if we can prepare the data with a tool other than UBIAI. I couldn't sign in to the platform.
Hello Sir, it was an excellent tutorial, but I am unable to install unilm/layoutlm, as Colab has Python 3.11 by default. I downgraded Python to 3.7 and tried running again; now transformers is not installed properly. Please help me.
There is an issue with downloading dependencies when we install layoutlm from unilm.
Is there any way we can get the output of the LayoutLM model as key-value pairs, where the question is the key and the answer is its value?
Yes, you can take the predicted BIOES tags and write a simple script to merge them into a single value according to the tags.
Hi, did you manage to do it?
Can you give any directions to achieve this, please?
@italykiduniya Since you already know your keys, I suggest you just build your own hash map in code. For example, in Python:

    output_dict = {}
    for key in list_of_keys:
        val = LayoutLM(doc, query)[index_of_value]
        output_dict[key] = val

Please use this as an inspiration and not the exact code.
I want to use my own dataset, which is already annotated in IOB format as a JSON file. Is it possible to use it as-is, or do I need to convert it to a JSON format similar to the FUNSD dataset? If so, is there any (free) converter available to transform the annotation format?
Hi @Karndeep Singh, I am using my own dataset. Let's say I want to annotate 3 labels: "DATE", "INVOICE NUMBER" and "OTHER". Will it be possible to automatically annotate all the remaining tokens with the "OTHER" label using LayoutLMv3?
Another nice tutorial about LayoutLM. Btw, the annotation process is quite a tedious task... Could you also make a tutorial on how to cluster relevant keywords in a document? Thanks👍
Sure
@karndeepsingh Great, I'm also looking forward to that tutorial. Subscribed😁
I have documents in PDF format. All the documents are different. I want to extract some information from them. What should I do? Each PDF contains 5-10 pages.
You need to pass each page, or first find the relevant page, and then pass it to the trained LayoutLM model.
@karndeepsingh Thank you for the reply. Do you know of any other solution, like named entity recognition?
@muhammedfaisalpj4810 Yes, you can use NER to extract relevant information from the given context. Make sure the quality of the PDF or image is good before OCR, otherwise it may distort the NER results.
@karndeepsingh Hi, I want to extract some information from a PDF and then convert it to Excel. I want to train a model to recognise key-value pairs in my documents, but how?
Hey @muhammedfaisalpj4810, did you work on it? Can we connect?
@Karndeep Singh Will we be able to tag data in FUNSD format using Label Studio? (UBIAI is an online tool, so we cannot send our data outside the network.)
Thank you so much for the wonderful explanation!
My only doubt is how we can get the output in key-value pairs.
Great tutorial! How do you know which "answer" belongs to which "question"? Is that somehow part of the model output or does it merely "tag" each token? Thanks!
I think the way you would do it is to group them by line/y-position, and pair up question/answer tags at the same y-level (or close enough). The model doesn't really know which one belongs to which.
Hi Karan,
how do we extract the key-value pairs as JSON at the end?
LayoutLM outputs a BIOES tag for each word. To build a key-value pair, just merge the BIOES-tagged words into the final value. You only need to write a few lines of code to accumulate LayoutLM's output into the required format.
Hi, did you manage to do it?
Can you give any directions to achieve this, please?
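Those few lines might look something like this (a sketch; the BIOES tag strings and the (word, tag) input format are assumptions about your pipeline's output, not LayoutLM's exact API):

```python
def merge_bioes(tagged_words):
    """Merge per-word BIOES tags into (label, text) entities.

    tagged_words: list of (word, tag) pairs, e.g. ("Invoice", "B-QUESTION").
    B = begin, I = inside, O = outside, E = end, S = single-word entity.
    """
    entities = []
    current_label, current_words = None, []
    for word, tag in tagged_words:
        if tag == "O":
            current_label, current_words = None, []
            continue
        prefix, label = tag.split("-", 1)
        if prefix in ("B", "S"):
            current_label, current_words = label, [word]
        elif label == current_label:   # I or E continuing the open entity
            current_words.append(word)
        else:                          # malformed sequence: drop the open entity
            current_label, current_words = None, []
            continue
        if prefix in ("E", "S"):
            entities.append((current_label, " ".join(current_words)))
            current_label, current_words = None, []
    return entities
```

For example, `merge_bioes([("Invoice", "B-QUESTION"), ("No", "E-QUESTION"), ("INV-001", "S-ANSWER")])` gives `[("QUESTION", "Invoice No"), ("ANSWER", "INV-001")]`; pairing each QUESTION with the following ANSWER then yields the key-value dict.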
Amazing video. Thanks a lot, Kamaldeep!
Hi Karndeep, for custom invoice parameter extraction, should I annotate the labels in question-answer style, like an invoice-number question and an invoice-number answer? Or just annotate the answer part only?
Hello,
Thank you for the precise tutorial. I'm facing an issue while running the 1st cell in Google Colab; below is the error:
Collecting lxml==4.5.1 (from layoutlm==0.0)
Downloading lxml-4.5.1.tar.gz (4.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 16.0 MB/s eta 0:00:00
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Can you help me solve the above issue, Sir? Thank you.
Great tutorial! Could you also make a small video on a custom LayoutLM model, please?
How can I annotate data and train this model on my own form data?
I used the UBIAI tool on a 14-day trial to annotate the dataset. Alternatively, you can try Label Studio (I personally haven't tried it).
@karndeepsingh Thank you.
@karndeepsingh UBIAI is not allowing JPG annotation on the free trial. Any other option, please?
Hello, I have a doubt: how can I create a personal dataset? Are there any tools or other methods?
Hi Karn,
what is the output of this model? Position coordinates along with the keyword?
How will we extract the data as key-value pairs?
A BIOES tag for each word, along with its bounding-box information.
@karndeepsingh How do we extract that?
Hi, did you manage to do it?
Can you give any directions to achieve this, please?
Hi, I have an urgent task. Could you please suggest which approach I should use to extract data from invoice images and then classify them into the required fields? Thanks for your work.
It depends on the use case, its complexity, and also how many fields need to be extracted.
Hi, thanks for this video. I keep getting this error when I run the first block of code to clone unilm: error: subprocess-exited-with-error and error: metadata-generation-failed.
Excellent explanation
Hi, it seems like the latest Python version throws this error:
ModuleNotFoundError: No module named 'layoutlm'
I tried changing the Python version but it still doesn't work. Can you share updated code, or explain how I can fix this bug?
Thank you
I'm also facing the same error. Could you explain how I can fix this bug?
@banagarmahesh1747 Try restarting the runtime.
@NiketBahety Did that solve it?
@pratikpatel6967 Yes.
Can we use the pretrained model without any fine-tuning? How do we get results from that model?
Yes, you can use the pretrained model.
@karndeepsingh How do I extract only the question and answer in a key-value format?
E.g.: {"To" : "KA Sparrow",
Eg:{"To" : "KA Sparrow",
"From" : "DJ Landro",
"Subject" : "....."
}
You can annotate in question-answer format using the UBIAI tool and then train a DocQA pretrained model on it.
@karndeepsingh Can I use the FUNSD dataset to train a key-value extraction model?
Output of the model:{"To" : "Ka",
"H": "g",
}
Can you please suggest a way to do this? I want to do it on the FUNSD dataset.
Can you please suggest how to use LayoutLMv2 for passport data extraction?
For passports, maybe you can use object detection to locate the area of interest and then apply OCR on it.
@karndeepsingh I applied OCR; I tried it with EasyOCR. What I was doing: reader.readtext("path of image"), then taking the bounding-box coordinates and text of the widest bounding boxes in the output, then checking the text against the MRZ codes and converting it into a DataFrame for convenience. Now the problem is that sometimes an S is read as a 5, and sometimes a 0 is read as an O; I think this is an OCR issue. How can I resolve it? Please help me with this.
It's an issue with the OCR. Maybe you need to change some parameters of the OCR you are using, or consider using a different OCR.
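Besides tuning the OCR, a common post-processing trick for MRZ specifically (a sketch, not EasyOCR-specific) is to use the fact that each MRZ field has a known character class: in a purely numeric field an "O" must really be a "0" and an "S" is almost certainly a "5", and the reverse holds in name fields:

```python
# Characters that OCR engines commonly confuse, mapped per field type.
TO_DIGIT = str.maketrans({"O": "0", "S": "5", "I": "1", "B": "8"})
TO_ALPHA = str.maketrans({"0": "O", "5": "S", "1": "I", "8": "B"})

def fix_mrz_field(text, numeric):
    """Coerce an OCR'd MRZ field to its expected character class."""
    return text.translate(TO_DIGIT if numeric else TO_ALPHA)
```

The substitution table is an assumption; extend it with whatever confusions your OCR actually produces, and validate the corrected fields against the MRZ check digits.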
@karndeepsingh you are so talented. I am glad to have you here.
Can we use multiple PDF documents to extract information using LayoutLM?
Yes
Thanks @karndeepsingh.
Do we first have to take a screenshot of each page of the PDF and use the images one by one for processing and asking questions? Or is there an API available to upload the entire PDF directly and ask questions?
excellent. thanks
How do I train a custom model for key-value extraction?
Is the process the same for the pretrained LayoutLMv3 model?
Yes
Can you please let me know how we can extract that data as text from the document for further analysis? It's very urgent.
Thanks
The predicted output will be BIOES tags associated with labels. You have to merge these BIOES-tagged words together to form the text.
What algorithms are used in this model?
A combination of OCR, a Vision Transformer and RoBERTa.
How do I annotate a scanned PDF?
The inference time is high. How can we optimise the code?
You can try knowledge distillation or quantisation techniques.
@karndeepsingh Sure, I will make a note and try it.
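For anyone curious what the quantisation suggestion looks like in practice, here is a minimal sketch using PyTorch's dynamic quantisation (the tiny nn.Sequential is just a stand-in; assume you swap in your own fine-tuned model object):

```python
import torch
from torch import nn

# Stand-in model; replace with your fine-tuned LayoutLM model object.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))

# Dynamic quantisation converts the Linear weights to int8 once and
# quantises activations on the fly; it often shrinks the model and
# speeds up CPU inference, at a small accuracy cost.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
out = quantized(torch.randn(1, 16))
```

Whether this helps depends on your hardware; benchmark before and after, and check that the extraction accuracy is still acceptable.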
Can you please share the GitHub link for this code? It would be very helpful.
Hi Karn, can we use it for PDFs the way we train NER models?
Yes
@sunitbehera597 Hey, did you work on it? Can we connect?
Wow, really a great video.
Just out of curiosity, what are the specs of the hardware or service you are using? How long do you think it would take to train on 5000 annotated documents?
I used a GTX 1080, and it should take at least 2-3 hours to train.
Excellent
How do I implement this solution on a multi-page document?
You have to pass each page of a document into the model
Yes, but won't the model get confused, since a new page will have a different class at the same coordinates as the first page?
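That per-page loop might look like this (a sketch; process_page and renderer are hypothetical placeholders, and in practice renderer could be pdf2image.convert_from_path, which needs poppler installed):

```python
def extract_from_pdf(pdf_path, process_page, renderer):
    """Run a per-page extraction pipeline over a multi-page PDF.

    renderer:     turns a PDF path into a list of page images
                  (e.g. pdf2image.convert_from_path, which needs poppler).
    process_page: your single-page inference function, e.g. a wrapper
                  around a fine-tuned LayoutLM model.
    """
    results = []
    for page_number, image in enumerate(renderer(pdf_path), start=1):
        results.append({"page": page_number, "fields": process_page(image)})
    return results
```

Keeping the page number next to each page's fields lets you merge or disambiguate results afterwards, which helps when different pages carry different field classes.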
Nice explanation. Could you explain how to create your own dataset using OCR for LayoutLM?
You can use the UBIAI tool, or you can prepare the data on your own by extracting word-wise BIOES tags and the respective text bounding boxes.
@karndeepsingh Can you show me how to prepare it without any tools, please? It would be a great help for me.
@karndeepsingh Sir, could you make a video without the use of the UBIAI tools? It would be a great help for me. Thank you in advance.
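A rough no-tools starting point (a sketch, assuming you run Tesseract via pytesseract for the OCR step; to_examples is a hypothetical helper, and every label starts as "O" for you to correct by hand):

```python
def to_examples(ocr_data):
    """Turn pytesseract's image_to_data dict into taggable word records.

    Each record holds the word, its [x0, y0, x1, y1] box, and a label
    that starts as "O" for you to fix manually (or with simple rules).
    """
    examples = []
    for text, x, y, w, h in zip(
        ocr_data["text"], ocr_data["left"], ocr_data["top"],
        ocr_data["width"], ocr_data["height"],
    ):
        if text.strip():  # skip empty OCR cells
            examples.append(
                {"text": text, "box": [x, y, x + w, y + h], "label": "O"}
            )
    return examples

# Typical usage (requires pytesseract and the Tesseract binary):
# data = pytesseract.image_to_data(Image.open("form.png"),
#                                  output_type=pytesseract.Output.DICT)
# examples = to_examples(data)
```

From there you edit the labels into BIOES tags and serialise the records into whatever JSON layout your training script expects.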
Sir, is there any specific tool for data annotation for this? #urgenthelp
UBIAI
@karndeepsingh If possible, can you make a video on how to annotate text using UBIAI?
Sure
Can you make a video for LayoutLMv2, please?
How do I make this dataset? Any suggestions?
Use a paid tool like UBIAI.
Brother, please make a video on how to get a job as a fresher in data science.
Lol, extracting information in JSON format. I don't know how this works.