Get Data from PDFs and Send to EXCEL with Power Automate Desktop!

Christine Payton

Просмотров 36 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 2 июн 2024
Power Automate Desktop is the lesser-known cousin to Power Automate Cloud. As of 2023, it's free with the latest version of Windows - so you may already have it installed! You can use it with a free Microsoft account OR with an existing M365 subscription.
In this video, we go through how to extract and parse specific values and metadata from PDFs by looking for words "next to" our values and send them to an Excel sheet. This is a beginner-level video, though it helps to be familiar with the concept of regular expressions for an easy intro into robotic process automation (RPA).
This technique can be particularly useful for legal and finance groups, who have a tendency to get filled PDFs routinely and then need to organize them!
0:00 Demo
2:00 Configure the Excel template
2:49 Power Automate: Get files in folder
4:42 Loop through files and extract text
6:24 Parse text using regular expressions
9:00 Split text to isolate your values
10:44 Send to Excel
Наука

Комментарии • 111

@kidd32888 5 месяцев назад ⁺⁴
omg you saved my organization hundred of hours and this is not a hyperbole! THANK YOU SO MUCH!!
@bi-ome 5 месяцев назад ⁺¹
aw that's so great to hear!
@killiancolombo331 5 месяцев назад ⁺¹
That is pretty cool!
@PowerBeaver 7 месяцев назад ⁺²
Awesome, thanks for posting this. I started doing such kind of thing using vba, which was quite a journey😅. Didn't think PA Desktop could do the trick. 👍🏻
@bi-ome 7 месяцев назад
Yeah!! I didn't realize VBA could do this, so we both learned something. 😆
@jerrysaen Месяц назад
Thank you very much!
This helps me with my business
@akilalaksiri9172 26 дней назад
Thank you very much! You saved me :)
@iracypimenta440 3 месяца назад
First, thank you for your content. I might be doing something wrong, although I have followed yours instructions step-by-step, because I am only able to get even without the break just one iteration in my pdf file. Thanks again
@bi-ome 3 месяца назад
Did you add a breakpoint? The red circle? That will stop it after one loop, I think I added one in the video to show it... you can click the dot to turn it off. Also check what you're looping over for the actions and whether the thing that you're looping over contains all the files you want to loop on (you can check this by looking into the variable content after it runs).
@schmidtnolan 7 месяцев назад ⁺¹
Great information! I've modified this slightly to use 'crop text' to extract text between two headers. Would you have any suggestions for when the headers have slight variations? I'm thinking of running a subloop with each potential iteration based on the condition of if the text is found... the initial build will stink but should do the trick. Thoughts?
@bi-ome 7 месяцев назад
Cool! It depends on what your variations are… I’ve used the switch/case cards for this, basically to check if it contains the word I’m looking for before parsing (with one case for each variant, and slightly different regex for each).
@peterolah1165 7 месяцев назад ⁺²
great video Christine! I got a question: I got many PDFs with multiple values to search and 1-2 of these are missing. I got the index out of range error message (right after the write to excel command), I try the error handling option and an IF sub flow but didn't work.
@bi-ome 7 месяцев назад
Try putting the if-statement above the parse text step - so it will only run the parse text and split steps if the file contains the text you're looking for. That way you don't have to error handle it, because it won't error. :)
When you do this, make sure to have a step that resets any variable you were using to store values to blank at the end of the loop, so that doesn't insert the prior loop's value in Excel when the text doesn't exist.
@hvidsteen89 3 месяца назад ⁺¹
Thank you very much for a great tutorial. But do you have any idea, if I can get this to work with all the files combined into a single PDF instead of all my source material being split into seperate files?
@bi-ome 3 месяца назад ⁺¹
Probably, you can use a regular expression to collect all matches for your query - I had GPT write one to try, this looks for the word "balance" and gets anything after it so you can see if something similar would work: balance:\s*\$?(\d+[\.,]?\d*)
It should put all the matches in a list, then you can loop over those to do things after splitting out whatever you don't need.
BUT you might also try connecting to the file with Power Query (it can use PDFs as a source) and see what that gives you. The only reason I didn't use it in the example is because multiple files usually don't have the exact same structure between the files, but if yours is all in one file it might work.
Alternately there's a ton of new AI options in the cloud Power Automate for entity extraction that use AI Builder credits, and you likely have credits if you have Power Automate licenses to try.
@user-cl1ir5ui4i 2 месяца назад
This is some really powerful stuff, thank you. In the video you used some reglar expressions that you use, where can I find the complete list?
@bi-ome 2 месяца назад
Thanks! There's not a list of regular expressions per se - here's more info on them: coderpad.io/blog/development/the-complete-guide-to-regular-expressions-regex/
ChatGPT/copilot are very good at creating regular expressions, so I usually just start by giving them what I'm looking for and an example and it will usually work. Do make sure to tell it something along the lines of "just give me the regular expression, not Python, and do not use look-aheads or look-behinds" to get the "right" thing to plug in. :)
@gedelasaisravya2603 3 месяца назад
Thank you helped a lot. I also need to extract address from the pdf which can be of different number of lines in each file. How do I parse that and write to excel?
@bi-ome 3 месяца назад
Depends what the files look like, you could try using an action or regular expression that gets text between two values to get everything in between whatever comes before and after the address - that should get it no matter how many lines there are. Not sure how it needs to be broken out, but regular expressions are great at parsing the street address/zip/state/etc from a text block. You can use if-conditions to run different expressions according to various things (e.g. count the line break characters to get a line count and use that, or if it is a certain country parse it differently etc).
@dvwatts Месяц назад
Thank you for the video! Maybe I missed it but was there an option to deal with multiple choice or check box data types?
@bi-ome Месяц назад
That's a really good question. I didn't need to get checkboxes so I didn't think to try - I found a PDF with checkboxes to prod at and when I copy and paste a checked box out of the PDF there is no indication that the box exists from that or that it's checked, which makes me think the extract text isn't the way to go here.
I tried doing an internet search, too, and didn't find anything that looked like a functioning example of someone doing this, so you've stumped me lol. It's not supported in AI Builder either, apparently, which is usually a good fallback. Though maybe with OCR there it'd be able to "see" the checkmark?
@jomarismeneses3765 5 месяцев назад
Good video! How can I do if I have documents with different structures and information and I want to put them in the same Excel document but in different sheets of it? Thank you!
@bi-ome 5 месяцев назад
You can add either an "if" branch or "cases" that checks the PDF text to see if it contains some particular unique string for each circumstance before extracting the text (you'd use a different extraction expression for each branch).
There's actions for "get all Excel worksheets" that you could probably combine with "set active Excel worksheet" to change the worksheet before it gets the first free row to insert the data. You'd get-all outside the loop, then inside each conditional branch have it set the active sheet before it inserts.
I'm not sure if the action that swaps sheets gives itself sufficient time to complete or not before continuing, you might need to add a short pause between that and inserting (just something to watch for, I've not tried it yet - might be fine). :)
@CLTPHDdad 2 месяца назад
Hi Christine, Wonderful video! I am wondering whether you could offer any direction on an issue I am having. Basically, I have followed the steps in your video to extract PDF text and have viewed the structure of the text. Resultantly, I have identified a static, unchanging text string represented in all PDFs in the file folder (i.e., the report name) on line 35 and I need the dynamic information (client name, client ID, and date) from the subsequent line (i.e., line 36) to be individual variables that will be incorporated into the Excel file in columns A, B, and C, respectively. The format of the string is: "Full Name | ID Number | MM.DD.YYY" and there is some additional unnecessary text that follows the date. Any direction you might provide would be greatly appreciated!
@bi-ome 2 месяца назад
Sure, I use ChatGPT to produce the regex because that saves a lot of time. It provided this to get the full text of line 36: (?:.*
){35}(.*)
So if you take that with your parse text action and put it in a variable, then use the action that splits text (search for the word split in the action menu) and split on " | ", you will get each of the values you want in an array. You can reference those by position with %yourVarName[0]% for the first item, %yourVarName[1]% for the second, and so on. You can insert those in columns in Excel like we do in the example in your loop that loops over the docs. 😀
@KomailButt 4 месяца назад
Thank you so much for providing such useful information. I need suggestion. I have document library folder in SharePoint where PDFs comes. In each PDF there is a table which is spreaded over multiple pages. PDFs also include text but I am only interested in extracting those tables from the PDFs. Please suggest the possible easier way to do this task. I am not sure whether I should use AI builder model (online) or this deskstop solution.
@bi-ome 4 месяца назад
I feel like AI Builder is probably going to be easier for your case if you have the credits for it. The method in this video is better for extracting very particular information from the text using what amounts to trigger words to find the content. It has the advantage of being "free", but is not as flexible as a LLM would be.
@komailbutt2998 4 месяца назад
@bi-ome Thanks. I am looking for a free resource if possible in the first place, but let's see. By the way, how much credit does it require for doing this task, and do you know any other way to do this task for less money?
@bi-ome 4 месяца назад
@@komailbutt2998 AI builder credits come with most of the Power Platform license packages, so you might have enough floating around to work with. They’re a first-come-first-serve sort of thing at the tenant level.
@gioargentati7802 7 месяцев назад ⁺¹
Thank you for posting this informative video. Sometimes, when I try to read text from a PDF using ExtractPDFText, instead of getting readable text, I get strange characters. Do you have any suggestions on how to handle this sort of thing? Thank you in advance, Christine!
@bi-ome 7 месяцев назад ⁺¹
Are they scanned PDFs? Scanned PDFs can be very weird on the text end. I would try opening one that is having the issue and selecting the text inside the PDF and pasting it into a text editor just to see what it looks like - that'll tell you if the issue is the Power Automate action or the PDF itself.
@kbirstein 4 месяца назад
@@bi-ome I had the same issue. I just created a PDF with one line of text and I was able to select it in the PDF and copy it and paste it to Notepad and it appeared as text so I'm thinking this only works on special types of PDFs and it's impossible to know which ones it will work with . . .
@bi-ome 4 месяца назад
@@kbirstein Do you have an example of what the weird text looks like? It could be an encoding issue.
@craigjenkins6917 2 месяца назад
Great video. How would you extract the data from the text above rather than after the line. Thanks
@bi-ome 2 месяца назад
I had a conversation with ChatGPT about this, because I'm not a regex expert, it says that regex is designed around finding text after the target and not before. Is there static text before that you can use instead? The wrench is that Power Automate doesn't seem to accept "lookbehinds" in regex like some applications can, so you have to get creative.
@ryannedieu Месяц назад
Thank you so much for the tutorial. I know someone has asked about dealing with out of range blank values. Could you elaborate more? I have a fairly simple PDF data extraction with two fields. However, one of them can have a lot of blank values. Thank you.
@bi-ome Месяц назад
There's lots of ways to do the error handling - you could put the part that selects the value and adds to Excel steps in an error block and set it to continue at the end of the block if it hits an error (which will happen when it tries to do things to a null value). That would basically just have it continue on to the next PDF if it doesn't find a value - not sure if you need to record the blank or not.
You could also use the "if" conditional to check the isolated value (after you remove the word you're looking for) or the second position in the array, using a "is not blank" and put the update Excel actions there. Meaning it would only add them to the Excel file if they had a value. You could use "else" to set it to some specific text like "no value" probably too?
The error handling takes some trial and error to get working the way you want it to.
@ryannedieu Месяц назад
Thank you so much for the response. I’m able to get it work with your suggestions. Thank you!
@JosephHadaCleveland 13 дней назад
Is there any chance you could do a video, if it is possible, on how to feed in a list of URLs from excel and save each page as a pdf to a folder?
@bi-ome 13 дней назад
The only way I've gotten the web pages to PDF to work is to "print to PDF" via using "send keys" with hotkeys, with tab to tab through menus and enter to "click" print. Lol. It works okay that way but you need to have pauses in there to space out the send keys actions. You'd just loop over the Excel rows and launch a browser with the link as the address and close the browser at the end of the loop.
@user-cl1ir5ui4i 2 месяца назад
Christine, this tutorial gets me most of the way to what I need done. The only problem is that the email I get from my vendor contains a link that I need to follow to get a web based copy of the invoice. To your knowledge is there a way to have Power Automate activate the link so I can scrape the webpage for the information I need and then save the webpage as a PDF with the name that I got from the document? In advance, thanks
@bi-ome 2 месяца назад
I think this will be a very difficult one but technically possible... you might want to go with Power Automate Cloud for this instead of desktop, since you'd be triggering it when an email is received. There is a comment in this particular thread that has what looks like a viable solution to me, using the HTTP connector in a cloud flow: ideas.powerautomate.com/d365community/idea/94b667cd-89cc-4c61-8982-79b53c1a6019
Alternately, you can parse email text in the desktop version, maybe do regex to get the URLs ( \bhttps?://[^\s]+ ), then use that to launch a web browser with the URL (from a variable), and save them... Web-based PDFs are a particular pain because the "save" button for a PDF is not something PAD recognizes as a UI element typically, so what I've done as a workaround is use "send keys" to print it to PDF (control p). When you're saving it, you can use send keys again to send a particular text string to the filename field based on a variable (I use send keys with "tab" to tab down to that field... lol... because it can't seem to select it with the recorder). Increase the time between keys slightly or it will get "ahead of itself" while things load.
@user-cl1ir5ui4i 2 месяца назад
@@bi-ome Thank you. If I find a solution I will let you know.
@Steph-ho2gl 2 месяца назад
Super tutorial Christine! is it also possible to collect data from table for which response is below the key word, and not after? thank you!
@bi-ome 2 месяца назад ⁺¹
If you just want the one value, you can use your target text and
(.+) as the regular expression. That should get everything on the next line. If you are wanting to extract the WHOLE table, that depends on the structure of the PDF - if it's just one table, Power BI can sometimes grab it, but if you have multiple tables or a table that spans multiple pages and isn't always in the exact same location, there is entity extraction with AI in AI Builder or Syntex.
@Steph-ho2gl 2 месяца назад
@@bi-ome thank you for your response Christine ! After further investigation, i discovered that Power Automate desktop has not only a functionality to extract text as you exposed in your video, but also the functionality to extract tables from PDF.
Both functionalities are very complementary and very powerful. Thank you again !
@bi-ome 2 месяца назад
@@Steph-ho2gl Awesome! Want to link to the resource you used? I don't actually use PAD all that often, this video was based on a technique I picked up while scrambling to do a particular project, so I'm interested!
@Steph-ho2gl 2 месяца назад ⁺¹
Hi Christine, I used the following sequence :
1. Select on file using the action Message box - Select file >> Variable = “Selectedfile”
2. Extract table from PDF file [from SelectdFile into “ExtractedPDFTables” variable]
3. Message box where Power automate will list all the tables he identified in the PDF
4. Launch Excel (which I have prefilled with the header column I want to keep.
5. Insert “For each” to loop all the identified tables [Variable used = “ExtractedPDFTable”]
a. Then embed one “if” condition and I specified a specific content on one header column. In fact, my PDF file contains 42 tables, and only one is interesting for my purpose because it contains a specific key word in the CurrentItem.datatable.columnHeadersRow Contains Key word”.
i. Obtain the first free line in my excel >> It creates a variable “FirstFreeRowOnColumn”
ii. Write in excel [CurrentItem.datatable]when condition is met
b. End if
6. End the loop started in 5.
Hope it can help you!
@@bi-ome
@bi-ome 2 месяца назад
@@Steph-ho2gl Thank you so much for taking the time to share!
@taesunyoo4637 2 месяца назад
Hi Christine thank you for sharing wonderful video!. Quick question: does this work on Power Automate Cloud as well? sounds kinda silly question.
@bi-ome 2 месяца назад ⁺¹
Cloud doesn’t have regex per se, but you can extract from PDF with things like AI Builder to get what you’re after via prompt. The issue I have with it is that when it gets the file content from a PDF, it’s in base64 and has to be converted to be usable by AI… There’s various methods and workarounds for that, and alternatives you can use for more $$ like Syntex. The technique in this video is basically the budget DIY version of all that! 😅
@taesunyoo4637 2 месяца назад
@@bi-ome thank you for your prompt reply. ah I was searching around RUclips as well and like you mentioned I did see solution relying on AI builder to mark/highlight each data field from PDF file. Too bad that regular expression is not supported by cloud which is a bit shocking for me. thank you very much!!
@bi-ome 2 месяца назад
@@taesunyoo4637 You can use regex with Syntex/SP Premium! Just not Power Automate. Syntex is a more holistic version of all of this, it's premium priced but they put a UI around it and let you auto-extract the data and apply it to the item in the SP library.
@munbingkong5401 27 дней назад
Hello this has been a great help, may i know how do you copy and paste the extracted pdf into the notepad with numbers indicating each line for it ? Because when i did that , it only copy word for word and did not include any line numbers..
@bi-ome 26 дней назад
The numbers were just in the PDF file, they're part of the table text. If you want row numbers in the text editor though, you can use a different text editor - e.g. VSCode would have that.
@munbingkong5401 26 дней назад
@@bi-ome Hello thank you for replying me ! i have been doing this power automate for my corporate JDE software and out of 10 times maybe 3 times it will log in successfully , by chance , do you know what's the problem?
@anushkatiwari1506 5 месяцев назад ⁺¹
Thanks for the video, but I have a different request. I want to extract data from all the worksheets of a excel workbook using a loop and then print it in a pdf and save it. I want my user to first select the path of excel workbook and the path where the pdf should be saved. How can I do this?
@bi-ome 5 месяцев назад
Printing is tricky because often the UI browser often has trouble selecting menu options from the print menu when it runs. What I have done in the past is use "send keys" to do all the navigation in the print/save menu, using tab to navigate through options and "enter" to "click" selected buttons. I increase the delay between keys to significantly above the default, because if they run too quickly together they will trigger before the prior one is finished applying.
You can use the action that allows users to enter an input to collect the save path if it will be different each time, then use the path concatenated with the filename in the filename slot in the save menu - that will save the file to whatever path is entered (e.g. C:\Users\username\Documents\FolderName\filename.pdf). Since this isn't usually a selectable input field, I use "send keys" to send the filename/path to that box - the cursor will be there by default when the window opens, if it's not for whatever reason you can send more keys to get there. 😊
@tateeslick4695 11 дней назад
I have my table in the same format but when I view the detracted pdf text, it’s on a new line. Please show or explain what would be different when the heading I want is on a certain line and the text I want is on another line. In the pdf, they are next to each other.
@bi-ome 10 дней назад
It might be "getting" a line break character that you can't see until it's in Excel. You can try one of the methods here and see if one of these works: www.reddit.com/r/excel/comments/zolxmo/how_do_i_clear_this_empty_line_breaks_for_each/
@sebastiansolis5279 2 месяца назад
This is great. Is there any Way to send info to a share list?
@bi-ome 2 месяца назад
Yes, there are SharePoint actions in the desktop app, but they will trigger premium licensing. You might consider using the Excel file to feed a PowerShell script, too, that would also work!
@jl6523 2 месяца назад
She's good!
@Gaurav19882001 2 месяца назад
Thank You So Much for your lovely Support please make some more other tutorials on PDF to Exel mam Please it will be great help for All
@bi-ome 2 месяца назад
Is there something in particular you’re trying to do? The thing with Power Automate is it can do SO many things, it helps to know what people are looking for
@starkybaxter4867 Месяц назад
Hi, Is this also works on Leads data pdf sheet? I want them to extract to csv file
@bi-ome Месяц назад
I'm not sure what a Leads data pdf sheet is, but it will work on any selectable text data in PDFs. If you want it to go to CSV, you can save the Excel file as CSV type at the end of the flow. There's also an action to go directly to CSV from a data table variable, skipping Excel, but we're not using data table variables in this particular tutorial - either will work
@piratefish5856 3 месяца назад
As a new user doing my best to learn power automate, i thank you, your video has been greatly helpful. Your tutorial works great for me except for one detail. At the 12:30 mark in your video, where you are pulling the second position in the array and enter the [1], i cannot get this to work. None of my variables have % signs around them. i have been searching for how to turn on the % signs, or how to make it work without the % signs. How do i make this work?
@bi-ome 3 месяца назад
The % signs are how you reference variables - so you can type %your-variable-name% to use them in formulas. It's just how Power Automate knows you're making a reference vs just normal text - so you can type whatever you want in the box. They won't have percentages in the dropdown menu. 🙂
@piratefish5856 3 месяца назад
@@bi-ome That was one of the first things i tried. It gives me an error that reads "Parameter 'Value to write': Unexpected characters. Characters are used in the formula in an unexpected way." I don't know if i am on a newer version or if there is a setting for the % signs. In your video, all your variables have % signs around them. In my use of power automate, nothing has a % sign around it.
@bi-ome 3 месяца назад
@@piratefish5856 You are using PA *desktop*, right? Like from your start menu? Power Automate Cloud is very different and they are often confused--
@piratefish5856 3 месяца назад
@@bi-ome I am using the PA desktop app. Version 2.40.157.24023, which is the most recent version as of 01/29/2024. To try to be more specific, on your Write to Excel worksheet popup, in the Excel Instance block, on your video is shows "%ExcelInstance%" but for me i see "=ExcelInstance"; same thing for the Value to write block, your video shows "%InstitutionName%" by default, my screen shows "=InstitutionName"
@bi-ome 3 месяца назад
Hm your comments somehow hit the moderation filter, and when I approved them they disappeared... but you mentioned something about the variables not having % in them? I would go to the step that they are created in, e.g. the step that opens the Excel file, and check that they have percentages there. When I name a variable in a step, it auto-adds the percentage signs, but perhaps yours don't for whatever reason? Also they only show up when you're editing the step and click on the var name in the editor, if that's part of it.
@theretheheheheee 2 месяца назад
I have like a hundred pdf personal data, like ID card, which state name, dob, etc. Can I use this to transform it to excel rather than to typing it one by one ?
@bi-ome 2 месяца назад
Yes, as long as they follow a similar structure (e.g. state name is always after "State:") and the text is selectable in the PDF (not scanned). If you're working with scanned ID cards, you need to use OCR on the text first, which you could do with a call to an API service.
@narmididi6336 4 месяца назад
Hello. How do i extract from PDF address info which is not on the same line?
Thanks
@bi-ome 4 месяца назад
Regular expressions are really commonly used to parse addresses, my guess is Chat GPT or the internet has them readily available. I would try to get the address, then split on the line break and comma to separate out the street, city, and zip-
@davidsides5398 5 месяцев назад
I get an error that says Index '1' is out of range, when it tries to write the second line item. Any idea?
@bi-ome 5 месяцев назад
That usually means it’s blank, that there’s nothing in it. I would check to see what ends up in your variable when it runs, your expression may need adjustment-
@jomelreyes9133 3 месяца назад
Hi can you help, I cant proceed to Value to write field on Write to Excel worksheet action my flow variable is Invoice. In the latest power automate the default is =Invoice not %Invoice% and when I attemping to put [1] or =Invoice[1] it gives me an error "unexpected Chatacter"
@bi-ome 3 месяца назад ⁺¹
Do you have this feature turned on? "Power Fx enabled (Preview)."
I heard from another commenter that this setting was on and it was changing the structure of all of the variable references. You might toggle it off and see if that does it. This video was created before this feature was added.
@jomelreyes9133 3 месяца назад
@@bi-omeIt worked, because I couldn't figure out how to disable Power FX, what I did was I just created a new flow, and yes, it worked. Thank you for your help.
@user-ws4ih1ev1q 2 месяца назад
You just extracted the values once what should I do to run a loop and extract all the values from my data? Is there a way to do so?
@bi-ome 2 месяца назад
It's running in a loop on the files, so it'll extract each thing you configure it to once per document. and create a row in Excel for each. If your files are structured differently, you can provide more info about what you're trying to do and I can try to offer a suggestion. :)
@user-ws4ih1ev1q 2 месяца назад
@@bi-ome See in my pdf I just dont have one id or one name I have multiple and I want to extract all of them one by one in excel so is there a way to do it? Like run a loop until all the values are extracted and then stop?
@bi-ome 2 месяца назад
@@user-ws4ih1ev1q Yeah, there is a toggle in the "parse text" action where you can toggle whether it gets the first occurrence only or all. So if the names always occur after the text "Name: " you'd use Name: (.+) for your regex, toggle off "first occurrence only" and it should just put them all in an array. You can reference those based on array position, or for-each loop over the items in the array to do things with them.
That said, if your PDFs are complex, it might be a lot easier to use AI tools like Syntex (but more expensive).
@user-ws4ih1ev1q 2 месяца назад
@@bi-ome Actually I used a python script to extract the data, the normal Parse Text was not working, so I dont know how to reach to the solution of my problem.
@KutzKraftsNKreationz Месяц назад
I need it for my business but i zoned out after 3 mins. has nothing to do with the video. I am just not tech savvy. how can I get a one on one to see if i can use it for my business?
@bi-ome Месяц назад
If you have any particular questions you can ask here and I'll try to answer. Any automation tool will require a technical person, or at least someone who can make it through a video or two lol. I would recommend this tool for people who use Microsoft 365, thus are already invested in Microsoft stack products. I used it here because it's what was available to me at the time.
@cbacca2999 5 дней назад
What is the link to get the PDF file so I can follow along?
@bi-ome 4 дня назад
Sure, I went back and checked and its not super straightforward to download them from the FFIEC site directly, you have to put in the institution ID to get each file (it's been a while), but I found what looks like the files I used on my PC and uploaded them here: christine-payton.com/resources-files/
@user-di3rl6ww2d 2 месяца назад
Hi what if the pdf info is all image and unable extract the data
@bi-ome 2 месяца назад
You would need to use OCR on it in the flow to get the text. There’s various tools to do this, easiest to use is probably AI Builder in Power Automate Cloud-
@AsaelCosentino 26 дней назад
Getting the problem all the time: "file is open in another application".
Someone knows how to fix?
@bi-ome 26 дней назад
You can only have the Excel file open in one instance. So if you are testing your flow, you need to remember to close the file between runs so that it’s not already open when the flow tries to open it :)
@Eli22clem 2 месяца назад
Didn't work (edit) After mapping the shared folder my pdfs were on to a drive it did work! Now just have to get it to pull all instances in each pdf and not just the first.
@bi-ome 2 месяца назад ⁺¹
Yes, quite literally nothing you will ever make in Power Automate desktop will work the first time you run it - it requires troubleshooting to get things working. That's part of the process. :)
@shivarajaprasadambati9632 7 месяцев назад
I have pdf where data having in other languages
@bi-ome 7 месяцев назад
You can look for other language words with regular expressions too!
@MCAES01 3 месяца назад
you record your extra large screen and when we watch it on small screens or smartphones the words become microscopic
get a dislike
@123456789santia 5 месяцев назад
Do you work for google? cause your channel is full of knowledge that takes too much time to learn.
@bi-ome 5 месяцев назад
Lol, no, thanks though - I have been using most of these tools a really long time. PA Desktop is pretty new for me, but I picked up this particular trick out of desperation trying to complete a particular project while doing a lot of mental flailing around trying to learn the tool. Thought it might save someone else some stress to share. 🤣
@lizzyvallejo8298 4 месяца назад
Necesito extraer el Saldo Actual de un extracto bancario el dato se muestra así: Saldo Actual $ 1.234.567.890.23 y cuando hago el paso de analizar texto no me trae información, sabes que debo cambiar?
@bi-ome 4 месяца назад ⁺¹
Puede ser complicado cuando tienes que lidiar con comas y signos de dólar, porque el PDF es texto y no necesariamente reconoce que esos son parte del número. Yo intentaría algo así (usé GPT para obtener esta fórmula, es muy bueno para hacer expresiones regulares).
Current Balance \$\s*(\d{1,3}(,\d{3})*)
Explicación:
Current Balance \$: Coincide con el texto literal "Current Balance $". El signo de dólar se escapa porque $ es un carácter especial en regex.
\s*: Coincide con cero o más caracteres de espacio en blanco.
(\d{1,3}(,\d{3})*): Captura el número.
\d{1,3}: Coincide con 1 a 3 dígitos (para el primer grupo del número).
(,\d{3})*: Coincide con cero o más grupos de una coma seguida exactamente por tres dígitos.
@aycampos 3 месяца назад
@@bi-ome thank you so much for putting this video together, I have watched 3-4 times already and I almost finish my flow, the only thing is the same issue or similar as above, but my amounts do not have a dollar sign and are not next to the word but down below in a different row or underneath, and I have up to millions for the amount, could you please help me with the expression? Thank you
@bi-ome 3 месяца назад ⁺¹
@@aycampos Yeah, the expression TargetText:
(.+) should get whatever is on the line below the target text (replace target text with the key word you're looking for). I'd run it with that and see if you get something you can split parts off of. It depends on if you have other text in that second line or not. For all of these I am just using ChatGPT to help me write regular expressions, just give it an example of what your content looks like and ask it to write "regex without look-aheads or look-behinds". It's very good at regular expressions. 🙂
@aycampos 3 месяца назад
@@bi-ome oh wow, thank you for getting back to me so quick, I'll try both the expression and ChatGPT.

Следующие

Автовоспроизведение

Microsoft Power Automate | Add data to Excel, get data from Excel, Conditions and Send Email | Guide