You are my hero. Whenever I have a question and need it answered with solid examples: your channel always appears with a video answering my question. A treasure trove for any data practitioner. Thanks.
I tried it for an XLSB file of around 4 GB after converting it with Excel to CSV UTF-8, but it doesn't work. I don't know if Excel messed things up during conversion, but it seems there is too much data for pandas to open, even when trying to read it as bytes, with readlines, or with other libraries. I tried all the possible pd.read_XXXX methods in pandas after trying all kinds of formats, each with its specific engine passed as a pandas argument. I tried XLSB, CSV, ODS, XLS, XLSX, etc., and none of them could be read by pandas. Excel opened the file in around 5-10 minutes, but pandas couldn't read it even after 90 minutes (my laptop has an i9 12th gen + 1 TB NVMe + 32 GB RAM).
My large CSV file is in Portuguese, and I need it in English. Can you tell me how? I have tried Translator from the googletrans module, but in my case that led to errors, and only after a long wait.
Thank you so much, this was exactly what I needed for slicing up an overly large .csv file. I am glad to have come across your video.
Very helpful. Thank you! 👍
Excellent!!! Thank you!
Hey Soumil, thanks for this awesome code.
Thanks!! The zipfile module helps a lot!!
Will it work if the chunk remaining at the end is less than the chunk size we defined? And is the procedure the same for the xlsx file format?
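For the first part of that question: pandas handles a final partial chunk on its own, as a quick sketch (using a small in-memory CSV instead of a real large file) shows:

```python
import io

import pandas as pd

# 25 rows with chunksize=10 -> chunks of 10, 10, and a final partial chunk of 5
csv_data = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(25))
sizes = [len(chunk) for chunk in pd.read_csv(io.StringIO(csv_data), chunksize=10)]
print(sizes)  # the last chunk is simply smaller; no error is raised
```

`read_excel` does not accept `chunksize`, so xlsx would need a different approach.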
Awesome code, thanks Soumil.
Thank you
where should I put my input file?
You are just splitting the first 100 rows into one CSV. How can you shuffle and store 100 randomly selected rows in one CSV?
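One way to do that (a sketch, not from the video) is `DataFrame.sample`, which picks rows at random:

```python
import io

import pandas as pd

df = pd.DataFrame({"value": range(1000)})

# pick 100 rows at random; random_state is fixed here only for
# reproducibility - drop it to get a fresh shuffle each run
sample = df.sample(n=100, random_state=42)

# write them out without the row-number column
buf = io.StringIO()  # stand-in for a real output file path
sample.to_csv(buf, index=False)
print(len(sample))  # 100
```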
Can you create a video on scraping a web page and storing the data in S3 to use with Athena?
Where do I keep the sample file that is to be split into many?
Hiii sir
I have a 3 GB JSON file; how do I convert it to CSV?
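If the file happens to be newline-delimited JSON, it can be streamed in chunks so the whole 3 GB never has to sit in memory at once. A minimal sketch (field names and chunk size are made up for illustration):

```python
import io

import pandas as pd

# stand-in for a large newline-delimited JSON file
json_lines = "\n".join('{"id": %d, "name": "row%d"}' % (i, i) for i in range(5))

out = io.StringIO()  # stand-in for the output CSV file
for i, chunk in enumerate(
    pd.read_json(io.StringIO(json_lines), lines=True, chunksize=2)
):
    # write the header only with the first chunk, then append
    chunk.to_csv(out, index=False, header=(i == 0))

print(out.getvalue().splitlines()[0])  # header row
```

A single top-level JSON array would need a streaming parser (e.g. ijson) instead.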
what is this platform you are using?
Thank you !
Just open the file and use a generator for reading line by line
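A minimal sketch of that generator approach (the demo writes a tiny temporary file since no real input file is assumed):

```python
import os
import tempfile


def read_lines(path):
    """Yield one line at a time so memory use stays flat."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


# demo on a small temporary file
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a,b\n1,2\n3,4\n")
    path = tmp.name

rows = list(read_lines(path))
os.remove(path)
print(rows)
```

This avoids loading the whole file, but you lose pandas conveniences like dtype parsing.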
How could I avoid the row-number column (the index) appearing as the first column in the split files?
.to_csv(file_name, index=False)
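A quick sketch of that answer (column names are made up for illustration; a `StringIO` buffer stands in for the file):

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["ana", "bob"], "score": [90, 85]})

buf = io.StringIO()
df.to_csv(buf, index=False)  # index=False drops the row-number column
print(buf.getvalue())
```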
Noice … I have a 600 gb file .. I will try this and post the time taken.
Great method
How can we do that with an xlsx file?
When you make videos, please remember that not everyone is as much of a genius as you and can follow such a rapid pace. Go slow, dear.