I get this error: can't open file 'Show': [Errno 2] No such file or directory
Hey, thank you for your contribution! Does anyone know how I could use a whole folder of PDFs as the input? For example: let's say I have a folder of 50 PDFs and want an output folder of 50 converted TXTs. Can I do that with this code?
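One possible way to handle a whole folder, as a minimal sketch rather than the video's own code. It assumes the newer pypdf package (the successor of PyPDF2) is installed; the function names extract_pdf_text and convert_folder, and the folder names in the usage note, are made up for illustration.

```python
from pathlib import Path


def extract_pdf_text(pdf_path):
    """Return the text of every page of one PDF (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported here so the folder logic works without pypdf

    reader = PdfReader(str(pdf_path))
    # extract_text() can come back empty for scanned or image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def convert_folder(pdf_dir, txt_dir, extract=extract_pdf_text):
    """Write one .txt file into txt_dir for every .pdf file found in pdf_dir."""
    pdf_dir, txt_dir = Path(pdf_dir), Path(txt_dir)
    txt_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling convert_folder("pdfs", "txts") would then produce txts/report.txt for pdfs/report.pdf, and so on for each file. The extract parameter is only there so a different extractor (for example an OCR step for scanned files) can be swapped in.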
Thank you very much for your nice presentation. Which version of Python are you using? On the latest version, I am unable to install PyPDF2. Kindly guide me.
Hey man, thank you so much. I am using Python 3.6.2. Can you specify what error you are getting?
@@Iknowpython >>> pip install PyPDF2
SyntaxError: invalid syntax
It works for me, bro. Make sure you have no space before 'pip'.
I have one PDF that many people have failed to convert to Excel. Will you accept the challenge?
Bro, which compiler do you use?
Previously I tried to extract text from a PDF using PyPDF2 but it didn't work for me. The output on the console was blank, although the examples on the internet show output. So do I need to convert the PDF to text to get a result? Won't normal text extraction and printing to the console work?
Video is informative will definitely try this. Waiting for some more stuff. Keep it up buddy 👍👍
Thank you soo much
Thanks for the video, but 480p or 720p quality would be better for us.
Bro, the video resolution is 2016x1134. Try 720p, it will become very clear then.
The poor quality is on your end. Go to the video settings and you can control the quality from there.
Can we use Spyder, Jupyter Notebook, or PyCharm for this?
It doesn't matter what IDE or editor you use; the program remains the same everywhere.
@@Iknowpython awesome thanks man
Does anyone know how it is extracting and converting the data into text? It seems you are reading in binary and then the module command does its magic to provide the text version. I tried this on a readable pdf with values that were formatted within excel tables. I noticed that when I created the .txt file, a lot of the information was left out. I am attempting to have the program do a "copy and paste" of the pdf file into a text file, but I don't think this method does that. Does anyone else know a different method? Great video though as it worked! Just not for my specific case....
This method doesn't work.
The PDF text is coming out as a blank page. I tried with 4 different files. For a CamScanner file the output is "CamScanner" and nothing else, and for the rest of the PDFs it is a blank txt file. Can you help with this?
Bro, what advanced stuff should I know when I become a programmer?
What keywords should I know? (I hope you know what I'm asking.)
Please help me, bro.
Brother, there is nothing "advanced" in programming as such. You just go deeper into the concepts, but the basics remain the same. If you want my advice, polish your skills by working on more and more projects, and then experiment with concepts, like mixing them or using parts of different ones (e.g. combining face recognition in Python with an obstacle-avoiding robot car on Arduino; I am working on this project). I guess you are getting my point.
@@Iknowpython thanks man
Hello @iknowpython sir! I'm using this code in Pydroid 3 on my Android phone. It creates the text file but does not write all the text from the PDF into the .txt file. It only copies the heading from the PDF. The PDF contains only text, headers and footers. Sir, please help 🙏.
Thank you
Works perfectly, Thank you Sir !
Welcome man , i am happy it helped 😊😊
@@Iknowpython I used it on a multi-page PDF and it didn't work like the example you showed. Any ideas, bro?
In my converted text file all the words are run together. There are no spaces between words.
I think the txt file should be opened in w (write) mode, because overwriting is good for this.
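To make the difference between the two modes concrete, here is a small sketch (the helper name write_twice is made up): it opens the same temporary file twice, writes one line each time, and returns what is left in the file.

```python
import os
import tempfile


def write_twice(mode):
    """Open one file twice in the given mode, write a line each time, return the content."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        for line in ("first run\n", "second run\n"):
            with open(path, mode, encoding="utf-8") as f:
                f.write(line)
        with open(path, encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)


print(repr(write_twice("w")))  # 'second run\n'  (each open truncates the file)
print(repr(write_twice("a")))  # 'first run\nsecond run\n'  (each open appends)
```

So with "a", running the converter twice on the same PDF leaves the text in the .txt file twice, while "w" keeps only the latest run.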
what if there are images in the pdf?
Thank you so much
Why do you use r before the path name? Can you elaborate, please?
It marks a raw string, brother, so the backslashes in a Windows path are not treated as escape characters.
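A quick illustration of what the r prefix changes, using a made-up Windows path: without it, Python reads \n and \t inside the string as a newline and a tab.

```python
plain = "C:\new_folder\test.pdf"       # \n becomes a newline, \t becomes a tab
raw = r"C:\new_folder\test.pdf"        # backslashes are kept exactly as typed
escaped = "C:\\new_folder\\test.pdf"   # same path as raw, each backslash doubled

print(raw == escaped)   # True
print("\n" in plain)    # True: the path was silently corrupted
print(raw)              # C:\new_folder\test.pdf
```

Forward slashes ("C:/new_folder/test.pdf") are another option that Windows accepts and that needs no prefix.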
The converted txt does not contain any text, can anyone help me?
Thanku brother ❤️
The PdfFileReader class is deprecated.
Very interesting.
1. How can I convert a text dialogue, assigning one voice to each character of the conversation?
Use the pyaudio library (an audio I/O library) together with pyttsx3. You can switch between the installed voices with this code:
voices = engine.getProperty('voices')  # this is part of the code
engine.setProperty('voice', voices[0].id)
# instead of 0 you can use the index of any other installed voice
This is the way to do it.
We don't need to import pyaudio, but we get an error without it.
It's only converting one page.
Thanks bro
Welcome brother
Hey bro, what if there is more than one page?
What can I do then?
Pageobj=pdfreader.getPage(?????????)
@@saurabhchavan5451 You can use a for loop. This is part of the code (old PyPDF2 API, before 3.0), which reads every page of every PDF in a folder:
for filename in os.listdir(pdf_dir):  # needs: import os, PyPDF2
    with open(os.path.join(pdf_dir, filename), 'rb') as pdf:
        pdfReader = PyPDF2.PdfFileReader(pdf)
        text = ''
        for page in range(pdfReader.numPages):  # start at 0 so the first page is not skipped
            text += pdfReader.getPage(page).extractText()
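For the multi-page question, a rough sketch with the current API, assuming a reader object from pypdf (or PyPDF2 3.x), where reader.pages and page.extract_text() replace numPages, getPage() and extractText(). The function name extract_all_pages and the page markers are made up for illustration.

```python
def extract_all_pages(reader):
    """Collect the text of every page, starting with the first one."""
    parts = []
    for number, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""  # empty for pages without a text layer
        parts.append(f"--- page {number} ---\n{text}")
    return "\n".join(parts)


# Usage (assumes pypdf is installed and a file named 1.pdf exists):
#   from pypdf import PdfReader
#   print(extract_all_pages(PdfReader("1.pdf")))
```

The page markers are optional; they just make it easy to see in the .txt file whether a page came out blank.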
I just get a frustrating Unicode error.
Trying to fool us.
Nope, you are trying to be oversmart 😊😊
Sorry, I know you are a great programmer, but my internet was down and I was a bit angry, so I replied a bit badly. And remember, I have liked the video.
Update for 2023: many instructions are deprecated.
import PyPDF2
pdffileobj = open(r'D:\Python_spyder\1.pdf', 'rb')
pdfreader = PyPDF2.PdfReader(pdffileobj)
x = len(pdfreader.pages)
pageobj = pdfreader.pages[x-1]  # last page only
text = pageobj.extract_text()
file1 = open(r"D:\Python_spyder\1.txt", "a")
file1.write(text)
file1.close()
pdffileobj.close()
print("done")
Hello @iknowpython sir! I'm using this code in Pydroid 3 on my Android phone. It creates the text file but does not write all the text from the PDF into the .txt file. It only copies the heading from the PDF. The PDF contains text, headers and footers only. Sir, please help 🙏.
Thank you