HOW TO CONVERT .PDF TO .TXT USING PYTHON

I know python

34K views

Published: 9 Nov 2024

Comments • 48

@joseadriano8168 4 years ago ⁺²
can't open file 'Show': [Errno 2] No such file or directory. I get this error.
@anastasiatsoukala306 4 years ago ⁺¹
Hey, thank you for your contribution! Does anyone know how I could use a whole folder of PDFs as input? For example: let's say I have a folder of 50 PDFs and want an output folder of 50 converted TXTs. Can I do that with this code?
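One way to approach the folder question is sketched below. It is only a sketch: it assumes PyPDF2 3.x is installed, and the function names `extract_text_pypdf2` and `convert_folder` are made up for this illustration and are not from the video.

```python
from pathlib import Path


def extract_text_pypdf2(pdf_path):
    """Return the text of every page, using PyPDF2 3.x (assumed installed)."""
    import PyPDF2  # imported lazily so the rest of the sketch works without it

    reader = PyPDF2.PdfReader(str(pdf_path))
    # extract_text() can return an empty string for pages without a text layer
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def convert_folder(pdf_dir, txt_dir, extract=extract_text_pypdf2):
    """Write one .txt file into txt_dir for every .pdf file found in pdf_dir."""
    pdf_dir, txt_dir = Path(pdf_dir), Path(txt_dir)
    txt_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling `convert_folder("pdfs", "txts")` (both folder names are placeholders) would then produce one `.txt` per `.pdf`. The `extract` parameter is there so a different extraction library can be swapped in.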
@bilalsharif313 5 years ago
Thank you very much for your nice presentation. Which version of Python are you using? On the latest version, I am unable to install PyPDF2. Kindly guide me.
@Iknowpython 5 years ago ⁺²
Hey man, thank you so much. I am using Python 3.6.2. Can you specify what error you are getting?
@bilalsharif313 5 years ago
@Iknowpython >>> pip install PyPDF2
SyntaxError: invalid syntax
@Iknowpython 5 years ago
It works for me, bro. Make sure you have no space before 'pip'.
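The `>>>` prompt in the error above suggests the command was typed inside the Python interpreter. `pip` is a command-line tool, so `pip install PyPDF2` (or `python -m pip install PyPDF2`) is normally run from Command Prompt or a terminal instead. If installing from within Python is really wanted, a minimal sketch looks like this; the helper name `pip_install_command` is hypothetical:

```python
import sys


def pip_install_command(package):
    """Build the command that installs `package` for the running interpreter."""
    # sys.executable is the path of the Python that is currently running,
    # so "-m pip" installs into that same Python rather than another one.
    return [sys.executable, "-m", "pip", "install", package]


# To actually run it from a script (not executed here):
#   import subprocess
#   subprocess.check_call(pip_install_command("PyPDF2"))
print(" ".join(pip_install_command("PyPDF2")))
```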
@mayuragarwal8860 4 years ago
I have one PDF that many have failed to convert to Excel. Will you accept the challenge?
@bhanusrinivaskoppolu4814 5 years ago
Bro, which compiler do you use?
@MrVivekc 5 years ago
Previously I tried to extract text from a PDF using PyPDF2, but it didn't work for me: the output on the console was blank, although the examples on the internet show output. So do I need to convert the PDF to text to get a result? Won't normal text extraction and printing to the console work?
@MrVivekc 5 years ago ⁺¹
The video is informative, will definitely try this. Waiting for some more stuff. Keep it up, buddy 👍👍
@Iknowpython 5 years ago
Thank you so much
@heyderelesgerov9499 5 years ago ⁺¹
Thanks for the video, but 480p or 720p quality would be better for us.
@Iknowpython 5 years ago ⁺¹
Bro, the resolution of the video is 2016x1134. Try 720p, it will become very clear then.
@MrDogloverguy 4 years ago
The poor quality is on your end. Go to the video settings and you can control the quality from there.
@hamzamuhammadkhan 5 years ago
Can we use Spyder, Jupyter Notebook, or PyCharm for this?
@Iknowpython 5 years ago ⁺¹
It doesn't matter which IDE or editor you use; the program remains the same everywhere.
@hamzamuhammadkhan 5 years ago ⁺¹
@Iknowpython awesome, thanks man
@nicolaswirtz6952 4 years ago
Does anyone know how it is extracting and converting the data into text? It seems you are reading in binary and then the module command does its magic to provide the text version. I tried this on a readable pdf with values that were formatted within excel tables. I noticed that when I created the .txt file, a lot of the information was left out. I am attempting to have the program do a "copy and paste" of the pdf file into a text file, but I don't think this method does that. Does anyone else know a different method? Great video though as it worked! Just not for my specific case....
@manish36556 3 years ago
This method doesn't work.
The text of the PDF files comes out as a blank page. I tried with 4 different files. For a CamScanner file the output is "CamScanner" and nothing else; for the rest of the PDFs the txt file is blank. Can you help with this?
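A likely cause of blank output, though not confirmed for these particular files, is that the PDF is a scan: each page is an image with no text layer, so there is nothing for a text extractor to return (a scanner app's watermark may be the only real text). A small sketch for checking this, assuming a PyPDF2 3.x `PdfReader`; the helper name `pages_without_text` is made up for illustration:

```python
def pages_without_text(reader):
    """Return 1-based numbers of pages whose extracted text is empty."""
    empty = []
    for number, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""
        if not text.strip():
            empty.append(number)
    return empty


# Hypothetical usage with PyPDF2 3.x ("scan.pdf" is a placeholder name):
#   import PyPDF2
#   print(pages_without_text(PyPDF2.PdfReader("scan.pdf")))
```

If every page is reported as empty, the file probably needs OCR rather than plain text extraction.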
@alenwalker7362 5 years ago
Bro, what advanced stuff should I know to become a programmer?
What keywords should I know? (I hope you know what I'm asking)
Please help me, bro.
@Iknowpython 5 years ago ⁺¹
Brother, there is nothing "advanced" in programming as such; you just go deeper into the concepts, but the basics remain the same. If you want my advice, polish your skills by working on more and more projects, and then experiment with concepts by mixing them or using parts of different ones (e.g. combining face recognition in Python with an obstacle-avoiding Arduino robot car; I am working on this project). I guess you get my point.
@alenwalker7362 5 years ago
@Iknowpython thanks man
@hardcode1136 4 years ago
Hello @iknowpython sir! I'm using this code in Pydroid 3 on my Android phone. It creates the text file but does not write all the text from the PDF into the .txt file; it only copies the heading. The PDF contains only text, headers and footers. Sir, please help 🙏.
Thank you
@chakirfri 4 years ago
Works perfectly, thank you Sir!
@Iknowpython 4 years ago ⁺¹
Welcome man, I am happy it helped 😊😊
@chakirfri 4 years ago
@Iknowpython I used it on a multi-page PDF and it didn't work like the example you showed. Any ideas, bro?
@abhi2k68 3 years ago
In my converted text file all the words are stuck together. There are no spaces between words.
@abhishekthakkar8897 4 years ago
I think the txt file should be opened in w (write) mode, because overwriting is better for this.
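The difference between the two modes can be seen in a short sketch. It writes to a temporary file, and the text `"page text"` is just a stand-in for extracted PDF text:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Append mode: running the conversion twice leaves two copies of the text.
for _ in range(2):
    with open(path, "a", encoding="utf-8") as f:
        f.write("page text\n")
with open(path, encoding="utf-8") as f:
    appended = f.read()

# Write mode: the file is truncated first, so only the latest text remains.
for _ in range(2):
    with open(path, "w", encoding="utf-8") as f:
        f.write("page text\n")
with open(path, encoding="utf-8") as f:
    overwritten = f.read()
```

Append mode is still useful when text is written page by page inside a loop, as long as the file is emptied or recreated before each new run.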
@akashgeorge5433 4 years ago
What if there are images in the PDF?
@siddhigolatkar8558 3 years ago
Thank you so much
@gandharvkumar4538 5 years ago
Why do you use r before the path name? Can you elaborate, please?
@Iknowpython 5 years ago
It marks the string as a raw string, brother, so the backslashes in a Windows path are not treated as escape characters.
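A short illustration of what the `r` prefix changes; the path `C:\new\table.pdf` is a made-up example chosen because it contains `\n` and `\t`:

```python
# Without the r prefix, backslash sequences such as \n and \t are escapes.
plain = "C:\new\table.pdf"      # contains a newline and a tab character
raw = r"C:\new\table.pdf"       # every backslash is kept literally
escaped = "C:\\new\\table.pdf"  # same path as `raw`, with doubled backslashes

print(len(plain), len(raw))
print(raw == escaped)
```

Forward slashes (`"C:/new/table.pdf"`) are also accepted by Python's `open()` on Windows and avoid the issue entirely.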
@marcscherzer 4 years ago ⁺¹
The converted txt does not contain any text, can anyone help me?
@Banjara_boys_and_girl 4 years ago
Thank you brother ❤️
@kalh-tg9wb 9 months ago
The PdfFileReader class is deprecated.
@zephird.t.3038 4 years ago
Very interesting.
1. How can I convert a text dialogue, assigning one voice to each character of the conversation?
@pythonmacho9954 4 years ago ⁺¹
Use the pyttsx3 library (together with pyaudio, an audio I/O library). pyttsx3 lets you pick among the installed voices with this code:
voices = engine.getProperty('voices')  # this is part of the code
engine.setProperty('voice', voices[0].id)
# instead of 0 you can use the index of any other installed voice
That is the way to do it.
We don't need to import pyaudio, but we get an error without it.
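Building on the `setProperty('voice', ...)` call above, a rough sketch of one voice per character might look like the following. It assumes the dialogue is written as `Name: line`, that pyttsx3 is installed, and that at least one voice is available; the function names are invented for this example.

```python
def parse_dialogue(script):
    """Split 'Name: line' text into (speaker, line) pairs; other lines are skipped."""
    pairs = []
    for raw_line in script.splitlines():
        speaker, sep, line = raw_line.partition(":")
        if sep and speaker.strip() and line.strip():
            pairs.append((speaker.strip(), line.strip()))
    return pairs


def assign_voices(pairs, voice_count):
    """Map each speaker to a voice index, reusing voices if there are too few."""
    mapping = {}
    for speaker, _ in pairs:
        if speaker not in mapping:
            mapping[speaker] = len(mapping) % voice_count
    return mapping


def speak_dialogue(script):
    """Read the dialogue aloud with pyttsx3 (assumed installed, with voices available)."""
    import pyttsx3  # lazy import: only needed when audio is actually produced

    engine = pyttsx3.init()
    voices = engine.getProperty("voices")
    pairs = parse_dialogue(script)
    mapping = assign_voices(pairs, len(voices))
    for speaker, line in pairs:
        engine.setProperty("voice", voices[mapping[speaker]].id)
        engine.say(line)
    engine.runAndWait()
```

With only one installed voice, every character ends up with the same voice, so the result depends on which voices the operating system provides.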
@omarrazi4826 4 years ago ⁺¹
It's only converting one page.
@finociasubahani6035 5 years ago
Thanks bro
@Iknowpython 5 years ago
Welcome brother
@saurabhchavan5451 4 years ago
Hey bro, what can I do if there is more than one page?
Pageobj=pdfreader.getPage(?????????)
@pythonmacho9954 4 years ago
@saurabhchavan5451 you can use a for loop. This is a part of the code which helps to read multiple pages:
for filename in folder:
    pdf = open(join(pdf_dir, filename), 'rb')
    pdfReader = PyPDF2.PdfFileReader(pdf)
    for page in range(pdfReader.numPages):  # page indexes start at 0
        pageObj = pdfReader.getPage(page)
        pdfWriter.addPage(pageObj)
        text = pageObj.extractText()
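In PyPDF2 3.x the names `PdfFileReader`, `numPages`, `getPage` and `extractText` were replaced by `PdfReader`, `len(reader.pages)`, `reader.pages[i]` and `extract_text()`. Page indexes start at 0, so a loop that starts counting from 1 skips the first page. A minimal sketch of reading every page with the newer names follows; the helper name `text_from_reader` is hypothetical.

```python
def text_from_reader(reader, separator="\n"):
    """Join the text of every page of a PdfReader-like object, first page included."""
    parts = []
    for page in reader.pages:  # iterating avoids off-by-one page indexes
        parts.append(page.extract_text() or "")
    return separator.join(parts)


# Hypothetical usage with PyPDF2 3.x ("input.pdf" and "output.txt" are placeholders):
#   import PyPDF2
#   text = text_from_reader(PyPDF2.PdfReader("input.pdf"))
#   with open("output.txt", "w", encoding="utf-8") as out:
#       out.write(text)
```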
@petrockspiracy3120 3 years ago
I just get a frustrating Unicode error.
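The comment does not say which error it was. One common case, assumed here, is a `UnicodeEncodeError` raised while writing the extracted text, because the platform's default encoding (often cp1252 on Windows) cannot represent some characters. Passing an explicit encoding when opening the output file usually avoids that; the sample text below is invented.

```python
import os
import tempfile

text = "café → naïve • ﬁle"  # sample characters that often appear in PDF text

path = os.path.join(tempfile.mkdtemp(), "out.txt")

# An explicit encoding avoids depending on the platform default (often cp1252
# on Windows), which cannot represent every character.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```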
@yagavyagav763 4 years ago ⁺¹
trying to fool us
@Iknowpython 4 years ago
Nope, you are trying to be over smart 😊😊
@yagavyagav763 4 years ago
Sorry, I know you are a great programmer, but my internet was down and I was a bit angry, so I replied a bit badly. And remember, I have liked the video.
@XavierBustosC 1 year ago ⁺¹
Update for 2023: many of the instructions are deprecated.
import PyPDF2
pdffileobj=open(r'D:\Python_spyder\1.pdf','rb')
pdfreader=PyPDF2.PdfReader(pdffileobj)
x=len(pdfreader.pages)
pageobj=pdfreader.pages[x-1]  # last page only
text=pageobj.extract_text()
file1=open(r"D:\Python_spyder\1.txt","a")
file1.write(text)
file1.close()
print("done")
