@PWingert1966 My biggest ones were my computer science assembly class and my neurobiology class. I concatenated all the lecture slides for open note tests when Zoom classes were mandatory. ctrl + f is my hero lol
Funnily enough I tried to do the same thing with the Unreal Engine Github repository since it was just small enough to fit within GPT's upload limits, and GPT kinda sucks for Unreal when compared to other programs since a lot of the documentation is restricted from viewing prior to agreeing to Epic's developer agreements. Needless to say it didn't really work since GPT would just give up after a few attempts when trying to search through such a massive repository to find the answers to rather broad topics.
What would be a better way to store information? My company runs reports that scour our entire database (which is spaghetti code), which literally freezes production.
Doesn't your database solve that problem? And alternatively, making a program that reads and displays some serialized data, like FlatBuffers, Protocol Buffers, or even something like JSON if it's not that large
Attorneys deal with large PDFs regularly. When you request production of communications between people, you often get a PDF export of emails from 10s of relevant people, with years and years of entries. It's a nightmare having to sort through it to find evidence, so I always prefer to handle such large evidence in pst or msg or other native email files. I'm sure there are legal reference books that are millions of pages long out there. It's not uncommon for a legal reference series to span 10s of THICK books across multiple bookshelves.
So... a 200,000 page text adventure, cool. I gotta look into Undying Dusk. I have fond memories from many years ago of playing text adventures on my Commodore 64 (yes I'm that old)
If we're talking filesize, the biggest PDFs will be basically image files. I scanned some of my books and the filesizes are larger than most of the ones mentioned in this video despite having a fraction of the pages.
The largest I've seen and actually used was a report from a certification lab for a complex standard. The short version of the report was around 100 pages, and the detailed version was around 10 000 pages.
I was legit thinking you were going to go out and surprise us by instead going to page size. Pdfs support ridiculous page dimensions. In Acrobat 7.0, you can change the UserUnit size of a pdf to 75000 at most, giving you a page dimension size of 15 million by 15 million inches at most. That's enough to cover a sizable chunk (about half) of Germany
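The page-size arithmetic in the comment above can be sketched roughly as follows. The 75,000 UserUnit cap comes from the comment; the 14,400-unit-per-side page limit and the area of Germany are assumptions added here for illustration, so treat the result as a ballpark figure rather than a spec-exact one.

```python
# Rough check of the "15 million inches per side" claim.
# Assumptions (not stated in the comment): a 14,400-unit limit per page side
# and an approximate area for Germany of 357,000 square kilometres.

MAX_PAGE_UNITS = 14_400      # assumed per-side limit in user-space units
MAX_USER_UNIT = 75_000       # UserUnit cap quoted in the comment
UNITS_PER_INCH = 72          # default user-space unit is 1/72 inch
METRES_PER_INCH = 0.0254
GERMANY_AREA_KM2 = 357_000   # approximate, for comparison only

side_inches = MAX_PAGE_UNITS * MAX_USER_UNIT / UNITS_PER_INCH
side_km = side_inches * METRES_PER_INCH / 1000
area_km2 = side_km ** 2

print(side_inches)                             # 15000000.0
print(round(side_km, 1))                       # 381.0
print(round(area_km2))                         # 145161
print(round(area_km2 / GERMANY_AREA_KM2, 2))   # 0.41
```

Under these assumptions the largest page is a square about 381 km on a side, which works out to roughly 40% of Germany's area, in the same ballpark as the "about half" estimate.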
It's not really surprising to have a tiny PDF file generating an obscene amount of pages, because PDFs are Postscript, and Postscript is a Turing-complete programming language, so you basically have a loop that generates zillions of pages...
A lot of textbooks easily reach the order of the thousands of pages. I think 10000 is the upper limit before they start to split them into volumes rather than a single book.
They aren't public, so I can't share them, but I work with multi-thousand page PDFs at work every week. One project we have at work has surveys that get mailed out and I have to parse the PDF before printing. Those survey PDFs are anywhere between 3000 and 12000 pages depending on how many recipients we have that week.
I don’t know what’s going on with YouTube’s audio tracks feature. I’m a German in Germany and it always selects the Polish audio. Maybe it decided to do that because it’s geographically a close language (and German is not available), except these languages are not similar at all. I added English as 2nd language to my Google account but somehow it still decides to use any other language but the original English audio track. It’s really annoying and maybe I’m not the only one. I don’t think this feature is any good to reach a bigger audience the way it doesn’t quite work atm.
I'd love to see a series about the largest, biggest, etc file of all kinds of different formats. Why for example can't I make Photoshop files larger than 10kx8k (or whatever amount exactly) pixels? Why does Google Sheets only allow 5000 rows while Excel goes up to 1048576 rows? Is there a max size that databases can have? The full Human Genome Project is also downloadable. Not sure how large it is, but I believe you can download it as a txt file that is dozens of MB.
I'd imagine the longest theoretical useful PDF would be a full printout of Wikipedia in the length of 100,000,000s of pages. There's got to be a tool out there that can do that.
You could probably download all the articles on Wikipedia (the text only) and make a massive PDF out of it, if we're counting only useful documents. They have a single-file dump that currently decompresses to 86 GB (pages-articles-multistream.xml.bz2). Wikipedia in English has 6.7M articles, if you include all languages it's almost 60M.
This is a topic that only a nerd would enjoy. I love it. Just the most random thing to geek over but it's actually kind of insane. Just imagine if these were printed.
Coming in hot with the useful practical tutorials 😤🤡
wait what 9 hours ago? this was 4 mins ago
Hi thio!
@@thatonehenward4275 you can make a video available to channel members earlier than normal viewers
@@thatonehenward4275 people who join ThioJoe's membership can access his videos early, that's how 😂
Largest PDF I've used was the encyclopedia of industrial chemistry at around 28000 pages
Now all we need is someone to port Doom to PDF.
That's what I was thinking too.
impossible
Do it. Please 😂
PDFs are actually written in PostScript, which is Turing complete. Some variants aren't though. That's how people used to hack printers with PDFs
cant wait for bad apple on pdf
What's funny is that Adobe invented the PDF format yet Sumatra does better than their own flagship product
Adobe is like McAfee, they've devolved into malware. 😒
Adobe in a nutshell.
Also, in my experience, Dropbox viewer and GDrive viewer handle certain languages better than the official adobe pdf app
The thing about adobe is that it's massively bloated. It has a shit load of features, but most of them are useless to most of the users.
Many years ago I got tired of Adobe being so bloated and have been using SumatraPDF for years now :)
My largest PDF was 180GB due to me accidentally putting the entire photo gallery into it (863.727) Pages
I wonder how you did....
The size efficiency of PDFs is always astounding to me. A small book with full typesetting and everything can be smaller than a phone camera jpeg.
Text is pretty small. PDFs can be hugely bloated but text with basic formatting doesn't require much at all. The text itself and a little bit of metadata about formats and positions and pages.
Books do have a lot of the same words that can be repeated with a character or symbol, like using abbreviations but in code.
@@GreatMossWater even without huffman codes / similar compression methods text is really small, each character is in 1 of 256 states, whereas with an image every pixel has 3 colour values which are each in 1 of 256 states. This means that with no compression a single pixel of an image is equivalent to 3 characters.
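The comparison above can be checked with a quick sketch, assuming one byte per character (an 8-bit encoding) and uncompressed 24-bit RGB pixels. The page length and image dimensions are made-up illustration values, not figures from the thread.

```python
# Uncompressed size of text vs. an uncompressed RGB image.
# Assumptions: 1 byte per character, 3 bytes per pixel (one per colour channel).

BYTES_PER_CHAR = 1    # each character is in 1 of 256 states
BYTES_PER_PIXEL = 3   # red, green, blue: each in 1 of 256 states


def text_bytes(num_chars: int) -> int:
    """Raw size of num_chars characters with no compression."""
    return num_chars * BYTES_PER_CHAR


def image_bytes(width: int, height: int) -> int:
    """Raw size of a width x height RGB image with no compression."""
    return width * height * BYTES_PER_PIXEL


page_of_text = text_bytes(3_000)          # hypothetical ~3,000-character page
small_photo = image_bytes(1_000, 1_000)   # hypothetical 1-megapixel image

print(page_of_text)                 # 3000
print(small_photo)                  # 3000000
print(small_photo // page_of_text)  # 1000
```

With these illustrative numbers, a single modest image takes as much raw space as about a thousand pages of plain text.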
@hedgeearthridge6807
I used to print Excel files to PDF - my invoices were commonly about 50kb.
I then read the Adobe PDF v1.6 specification doc and worked out why these files were so large.
Wrote my own library in Javascript, removed Excel from the process entirely, reduced human-interaction to just data-entry (thus reducing the scope for human-error in the process) while speeding up the production of and reducing the final size of my pdfs.
My invoices are now around 2kb and are produced in about 1/4 of the time.
I think I enjoyed reading that spec document far too much! 😆
@@nathansnail Even ignoring that Unicode is a bit bigger, most people write with only a fraction of it, in repeating patterns we call words. PDFs also compress their contents with Flate (zip deflate) or the older LZW, so they usually don't take much space even when fonts are included in the file, since most fonts are vector graphics, which take up very little space, and many PDFs embed only a fraction of the font
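As a small illustration of why repeating word patterns compress so well with deflate, here is a sketch using Python's standard `zlib` module, which implements the same Deflate algorithm. The sample sentence and repeat count are arbitrary, and real prose will compress less dramatically than this artificially repetitive input.

```python
import zlib

# Hypothetical sample: heavily repeated words, the best case for Deflate.
sample = ("The quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

# Compress at the highest standard compression level.
compressed = zlib.compress(sample, level=9)

print(len(sample))      # size before compression, in bytes
print(len(compressed))  # size after compression, in bytes

# Decompression restores the original bytes exactly (lossless).
restored = zlib.decompress(compressed)
print(restored == sample)  # True
```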
I remember at my school library there were a bunch of choose your own adventure books called "Combat Heroes" which had book based maze adventures, and even allowed you to cross-reference pages and fight someone reading the opponent's book (White Warlord vs. Black Baron, or Scarlet Sorcerer vs. Emerald Enchanter). This PDF version is a smart recreation of the system.
that's so epic omg
A PDF game, now that's pretty unique!
People will try to make games on absolutely anything they can ^^
i've seen a ppt game too !
I've seen a game on desmos graphing calculator
they should turn it into a PowerPoint game next
Ever heard of Choose Your Own Adventure?
I've made PDFs for my work that were on the order of 200,000+ pages that i ended up having to break down the file because adobe bogged down and ran way too slow with that many pages. If you are wondering, they were conformity packages that contained the entirety of certs, inspections and testing for every single part that went on a plane for a small airplane manufacturer.
Cool!
There's a 1.6GB PDF file on our server. Discovery in a legal case. The best part is that the idiot who put it together attempted to redact things by drawing black boxes over things. Thank you kind person for showing us where the juicy stuff was, and allowing us to see it!
this is exactly the kind of stuff I like hearing about. it reminds me that the job I do isn't JUST expertise, or passion, or smarts, but the intersection of all of those things at once. the fact that someone's job can be "computers" in any capacity, and they're able to have such a misunderstanding of even the basic tech they're working with that they're able to *put a black box over some text without removing said text* and think "that oughta do it", it serves as a small, albeit powerful reminder for me, that even the most automatic, zero-thought professional impulses I have every day on the job are things that are absolutely worth being paid for. the best part is that experience just builds upon all that, and if I'm worth it as I am now, it's not going to get any less true.
rambling aside, long story short, I wish I could tell every self-taught IT kid out there the same. if you care about your work, and you're always seeking to expand your knowledge, and can accept when you're wrong and learn better, then you're already better than everyone your age who's just doing it because they think it's an easy salary.
I loved the tacit recommendation of Sumatra PDF reader. I'm always on the lookout for quick "lightweight" apps for everyday stuff. I'd love a video about lightweight apps
FoxIt had the chance to be the go-to Adobe alternative, but they blew it. 🤦 Sumatra ftw (I keep both versions 2 and 3; I hate when a program changes too much, hence also versions 1 and 2 of CDisplay/EX).
Yes, this. A video on useful lightweight, open source programs for every day stuff would be nice.
I recently started down this lite program rabbit hole again. Right now I'm testing out lite spreadsheet programs for when I need a little more than the calculator to quickly compare some things (mostly for video games), but not open a full-blown program like Excel/Google Sheets.
zathura with poppler on top
The biggest PDF I've ever seen was the Intel manual, almost 5k pages
That's what I was thinking
i used to use IrfanView as my picture viewer app, because the one that came with windows was slow even on super fast hardware... but right now i don't think there is much of a difference....
The full vulkan documentation is over 5k pages long too
When I was in the uni, it had under 4k pages if I remember correctly. But of course I didn't read it all, just searched for instructions I needed.
Biggest I've seen is the RCMP's(Canadian FBI) firearms reference table. 225MB and 105205 pages and growing.
It chokes out most PDF readers.
Yeah I would imagine some legal documents can be MASSIVE.
Crikey!!
Hey, I've seen this one on Libgen! Although they claim it's only 104,931 pages
@@TriglycerideBeware It's kinda weird LEO/Businesses have access to a webportal version that splits everything out into manageable indexed pages.
The .pdf is a public facing version of the full archive thats built once a week when changes are made to the main system.
@@antiKhaos I see, interesting. That explains the discrepancy
That interactive PDF RPG game is the coolest thing I've seen in a long time. Thanks.
I remember getting my rental history from movie gallery in PDF as an email I could download that was a little more than 25,000 pages then swapped from AOL to yahoo and lost it. It started out as a Mom & Pop then was bought by Movie Gallery, and I was on an account with my parents' and sibling's. The history went back to the late 70's early 80's with countless movies and games rented and bought from there.
8:08 If that Character Info actually works on every screen (and the game is not 100% scripted/predicted), then this not only doubles the pages.
There is HP, MP and Gold and you would need a page for every (at this stage in the game) possible combination of those three values.
that explains the ridiculous size
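To get a feel for how fast that multiplies, here is a back-of-the-envelope sketch. Every number in it is hypothetical; the thread does not give the game's real screen count or stat ranges.

```python
# Page-count estimate if every screen needed one page per stat combination.
# All ranges below are made up for illustration.


def pages_needed(screens: int, hp_values: int, mp_values: int, gold_values: int) -> int:
    """One page per (screen, HP, MP, Gold) combination."""
    return screens * hp_values * mp_values * gold_values


# e.g. 100 screens, HP 0-20 (21 values), MP 0-10 (11 values), Gold 0-99 (100 values)
print(pages_needed(100, 21, 11, 100))  # 2310000
```

Even these small made-up ranges already give millions of pages, which is why a game like this has to restrict which combinations can actually occur at each stage.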
The thickness of a book page is about 0.12 mm, so if you printed a PDF that contains 1 million pages double-sided as a single book, that book would be 60 meters thick. That's just insane
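The 60-meter figure works out if each sheet carries two pages. A minimal sketch of the arithmetic, taking the 0.12 mm thickness from the comment and assuming it refers to one sheet of paper:

```python
# Thickness of a printed book, given a page count and sheet thickness.
# Assumption: 0.12 mm is the thickness of one sheet, printed on both sides by default.

SHEET_THICKNESS_MM = 0.12


def book_thickness_m(pages: int, pages_per_sheet: int = 2) -> float:
    """Stack height in metres for the given number of printed pages."""
    sheets = pages / pages_per_sheet
    return sheets * SHEET_THICKNESS_MM / 1000


print(round(book_thickness_m(1_000_000), 2))     # 60.0  (double-sided)
print(round(book_thickness_m(1_000_000, 1), 2))  # 120.0 (single-sided)
```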
4:48 Old-school sysadmins would call this one a zip bomb, XML bomb, or similar, depending on the exact filetype or context used to generate this sort of recursive expansion of a file. It can cause plenty of havoc when your parser isn't designed to detect and short-circuit cases like this.
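One common way a parser can guard against this kind of expansion is to cap how much output it is willing to produce. The sketch below shows the idea for a plain zlib stream using Python's standard library; the function name `safe_decompress` and the 10 MiB default limit are illustrative choices, not taken from any particular PDF reader.

```python
import zlib


def safe_decompress(data: bytes, max_output: int = 10 * 1024 * 1024) -> bytes:
    """Inflate a zlib stream, refusing to produce more than max_output bytes."""
    decomp = zlib.decompressobj()
    # Ask for at most one byte over the limit, so an oversized stream is
    # detected without ever inflating the whole thing into memory.
    out = decomp.decompress(data, max_output + 1)
    if len(out) > max_output or decomp.unconsumed_tail:
        raise ValueError("decompressed size exceeds limit; possible decompression bomb")
    return out


# A tiny input that expands enormously is rejected under a small limit...
bomb = zlib.compress(b"a" * 100_000)
try:
    safe_decompress(bomb, max_output=1_000)
except ValueError as err:
    print("rejected:", err)

# ...while ordinary data passes through unchanged.
print(safe_decompress(zlib.compress(b"hello")))  # b'hello'
```

The same principle (bound the output, not just the input) applies to nested archives and recursive XML entities, although each format needs its own check.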
Years ago I moved to SumatraPDF because I was working with a 100,000 page PDF that always crashed acrobat after a couple of seconds. It was the technical reference manual for a system-on-a-chip.
My only gripe with Sumatra PDF is that it can't edit forms, which I found out is apparently a somewhat proprietary feature, but it will happily open massive textbooks etc.
Acrobat is as bad as McAfee at this point, installing all kinds of virus-like garbage.
The printer settings and preview are much better in Adobe Reader or nonexistent in Sumatra. For everything else that can be done with PDFs, Sumatra is probably the best choice.
I found a pdf of every character in Unicode that was 2,700 pages long and thought that was the longest document ever, but that’s nothing compared to these.
I had a legal PDF the other day that was just shy of 4 GB and several hundred thousand pages long. Strangely enough, the 32-bit version of Adobe kept crashing trying to open it. The 64-bit version worked fine; once opened, it was using over 24 GB of RAM.
Under normal circumstances, a 32 bit process can only access 4GB of RAM at most, so it isn't that strange that a 32 bit program would fail
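The 4 GB ceiling follows directly from the pointer width. A quick sketch of the arithmetic (the 24 GB figure is the memory use reported in the parent comment):

```python
# Maximum number of bytes addressable with a given pointer width.

GIB = 2 ** 30


def address_space_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can express."""
    return 2 ** pointer_bits


print(address_space_bytes(32) // GIB)  # 4
print(address_space_bytes(64) // GIB)  # 17179869184

# The document reportedly needed over 24 GB once opened, far beyond 32-bit reach.
needed = 24 * GIB
print(needed > address_space_bytes(32))  # True
```

In practice the usable space for a 32-bit process is often smaller than the full 4 GiB, since the operating system reserves part of the address range.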
The game-book is just a Choose-Your-Own-Adventure book in PDF form, which already exist, but this one is just bigger and more tedious to make because of the graphics and interface requiring a lot more pages. 🤷 (CYOA aren't the only ones like this, they were just the biggest and most famous; there was also the popular _Fighting Fantasy_ series, but also plenty of other game-books, including official D&D ones. I think I had one called DragonQuest-not the Anne McCaffrey book.)
I remember the Lone Wolf series written by Joe Dever. Still own "The Cauldron of Fear" in German.
Largest pdf that wasn't a gimmick for me was a 7200 page, 265 MB medical records pdf spanning 11 years. It was scans of scans of printed records lol. The OCR processing was a nightmare.
Once upon a time, early 1980s, I wrote a "one-line" BASIC program that was a Text Adventure game. The program had a one-line processing engine, and as many DATA statements as you like. Each DATA statement contain numerous fields and represented a room / situation.
No way, I was also thinking of the ARM CPU architecture reference manual! I needed to use it in one of my computer science classes.
Try TriCore :D Part 1 and part 2 technical reference manual, both with over 4500 pages.
That file singlehandedly made me buy another RAM stick
4:56 I've seen that PDF before. It actually said "Hello World" on all the pages, just most PDF viewers can't display that.
Fun fact. TI Sitara errata sheet is longer than their datasheet. Of course, both in PDF
Turns out they have a public GitHub with the source code, so they did indeed do this programmatically. Quite an achievement!
I wrote a script to write out the permutation of 10 for my math class for extra credit. I tried to put it all into a text file then to a pdf but i accidentally made 1 page per number. I literally couldn't open the file so i just kept it as a txt file instead
the PDF for my 73 Buick Centurion is pretty damn big... In total. I don't remember how big though.
Libgen Nonfiction has 5 English PDFs that are 100k+ pages. The largest one (by page count) is 144,218
That game PDF takes me back to the "Choose Your Own Adventure" books.
Back before I knew about the internet, that was the most surprising book I found in the school library.
The largest pdf i've ever seen is the control flow of a game represented as a directed graph (i thought it would be useful when i made it) the graph has dimensions of 1km by 1m. Entirely impractical but quite funny when my code finished processing and generated the graph.
I still remember the biggest PDF I encountered in the wild. It was a database of prime numbers with special properties. In total, it was more than 300,000 numbers over 19,148 pages. Why it was distributed as a PDF, I'll never know.
While not a large document in page count at under 650 pages, the largest PDF I have in my fast archive is NTRS 19730024039 "Study of Alternate Space Shuttle Concepts Volume II Part I Concept Analysis and Definition" from Lockheed for NASA at over 820 megabytes. Its companion (Volume II Part II) is another 649 pages at nearly 250 megabytes.
Time to print those PDFs!
My company has a tool which creates a PDF with GTIN barcode labels to stick on fashion items. One (tiny) page per item, for a big purchase order (i.e. what we are ordering from our suppliers, to later sell to customers). So when you order like 200 articles, and 1000 items for each, you easily get 200000 pages. For smaller orders, people do print them (on a label printer). (For the big orders we usually use some third-party printing service where we deliver the data in a different way.) I think we should have used the pagegroup trick to make these files less huge, as many pages are just duplicated, but we haven't yet.
The largest PDF file I've ever seen is 174GB. It was generated by a bug in a PDF printer which just kept printing out a single PDF file endlessly, with the user completely unaware. It would've been bigger, but 174GB was all that remained on the file server at the time. We were able to respond and solve the problem in 15-20 minutes, but oh my god I've never seen anything like it before
Well, not a pdf but, I generated a file that output to a text file 0 to 2,147,483,647 with "" for translation strings. Took forever to generate but, sadly, I cannot open it because the only program I could partially manage to open it was notepad++. (There is a 2 GB filesize limit on opening.) The file is "22.9 GB (24,658,692,654 bytes)" I saved it to my external drive just to say I have it. :P
Try vim. I use it at work to look at log files and stuff (it doesn't limit the filesize you can open and tends to perform better on very large files).
Notepad++ only has 2GB size limit on the 32-bit version. Download the 64-bit version and you should be able to open it
Can you, perhaps, add hundreds of massive pdfs to chatgpt and ask it to make smth with them?
That PDF game made me think of the old "Choose Your Own Adventure" and the "Which Way" books that I saw in the library when I was in elementary school....
Those were so enjoyable back then. I liked a bit of control.
It kinda got out of control in school text books.jeje.😊
@@v.prestorpnrcrtlcrt2096
I wonder if people still read them, and even write them??
I think the biggest PDF I’ve ever worked with is the 900 page DC employee salary list
Which of course now is somehow leaked?
...and half of them are still working from home playing video games when they should be working.
@@johanponken nope, it’s public information. Looking up “DC Public Body Employee Salaries” should bring up the page.
I always wondered how big PDFs can get and how many pages you can actually pack into them, considering the file size. Never knew that you could play games on them as well. Mind blown 😂
Now I'm curious to see how big other file formats can get. I remember a few years back looking for the largest PNG out there. I forgot what it was, though.
Theoretically PNGs can be 2147483647 x 2147483647 pixels in size, multiply that by 4 for an uncompressed 32bit image. It won't fit on any storage medium in the universe. However, most editors and viewers impose some limitations, 64K x 64K pixels should be sufficient.
@@Kobold666 challenge accepted. And just how big in size would 64k x 64k be? 👀
@@TheArtofKAS That would be less than 16GB.
@@Kobold666 just enough to fit on a random flash drive. Got it. Thank you my friend 👌🏾
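The PNG numbers in this thread can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 4 bytes per pixel (as in the comment above), no compression, and no file headers; whether "64K" means 65,535 or 65,536 is also an assumption, so both are shown.

```python
PNG_MAX_DIM = 2**31 - 1  # 2147483647, the per-side limit quoted in the thread


def raw_image_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Uncompressed pixel data size, ignoring headers and compression."""
    return width * height * bytes_per_pixel


def to_gib(num_bytes: int) -> float:
    """Convert bytes to binary gigabytes (GiB)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    for side in (65_535, 65_536, PNG_MAX_DIM):
        size = raw_image_bytes(side, side)
        print(f"{side} x {side}: {size} bytes ({to_gib(size):.2f} GiB)")
```

At 65,536 pixels per side the raw data is exactly 16 GiB, and at 65,535 it lands just under, which fits the "less than 16GB" reply.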
Now I want to port Myst to PDF...
When I was about 6, I remember using Word on a school computer and trying to see how big I could make a document by just [ctrl+A] [ctrl+c] [ctrl+v] over and over until I accidentally crashed the computer. I was so impressed because I was so curious and young, but anyway, this video reminded me, so I thought I'd share.
I wanna beat the PDF game!! How cool of an idea!
I knew as soon as i saw the thumbnail that this had to be an export from MSDN 😂😂😂
Wait until someone finds the 1.9 billion page Excel file
Man you need to see my course lectures!
I have a 10,0000-page PDF that is full of SCANNED pages. It's construction documents (blueprints and spec sheets for components etc) for a large building, and the file is just over a gigabyte. It's ridiculous. I cannot share it for various reasons.
That game reminds me of the choose your own adventure books that I read as a child. I never did get the good endings.
and i thought my 300 page lecture slide/notes pdf was huge
What subject?
@PWingert1966 My biggest ones were my computer science assembly class and my neurobiology class. I concatenated all the lecture slides for open note tests when Zoom classes were mandatory. ctrl + f is my hero lol
I love how you credit the stock images and the AI prompts.
"we've changed our terms and conditions, please just read this 78k pages pdf."
Funnily enough I tried to do the same thing with the Unreal Engine Github repository since it was just small enough to fit within GPT's upload limits, and GPT kinda sucks for Unreal when compared to other programs since a lot of the documentation is restricted from viewing prior to agreeing to Epic's developer agreements.
Needless to say it didn't really work since GPT would just give up after a few attempts when trying to search through such a massive repository to find the answers to rather broad topics.
I definitely will check out undying dusk, looks like a cool game 😎
What would be a better way to store information? My company runs reports that scour our entire database (which is spaghetti code) which literally freezes production.
Doesn't your database solve that problem?
And alternatively, making a program that reads and displays some serialized data, like FlatBuffers, Protocol Buffers, or even something like JSON if it's not that large
Still shorter than many companies' terms and conditions
You can just merge multiple PDFs together so as long as there is no defined limit, you can create PDFs as large as you like
I was recently preparing a PDF with ~32000 pages, including maps and images, with a resulting file size of 32 gigs. It was completely unusable.
just when i thought a pdf viewer couldn't possibly be a game engine
Attorneys deal with large PDFs regularly. When you request production of communications between people, you often get PDF exports of emails from 10s of relevant people for years and years of entries. It's a nightmare to have to sort through them to find evidence, so I always prefer to handle such large evidence in pst or msg or other native email files. I'm sure there are legal reference books that are millions of pages long out there. It's not uncommon for a legal reference series to span 10s of THICK books across multiple bookshelves.
But if they need to produce all these documents for say a NTG or NTP, wouldn't they be individual files, not one massive combined pdf?
My complete medical history is coming up on 1k pages soon 😅
So... a 200,000 page text adventure, cool. I gotta look into Undying Dusk. I have fond memories from many years ago of playing text adventures on my Commodore 64 (yes I'm that old)
That game looks cool!!
If we're talking filesize, the biggest PDFs will be basically image files. I scanned some of my books and the filesizes are larger than most of the ones mentioned in this video despite having a fraction of the pages.
The largest I've seen and actually used was a report from a certification lab for a complex standard. The short version of the report was around 100 pages, and the detailed version was around 10 000 pages.
I was legit thinking you were going to go out and surprise us by instead going to page size. Pdfs support ridiculous page dimensions.
In Acrobat 7.0, you can change the UserUnit size of a pdf to 75000 at most, giving you a page dimension size of 15 million by 15 million inches at most. That's enough to cover a sizable chunk (about half) of Germany
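The 15-million-inch figure follows from two numbers: the PDF page-size limit of 14,400 units per side and the default unit of 1/72 inch, which UserUnit multiplies. A quick sketch of that arithmetic is below; the figure for Germany's area is an approximate value assumed here for comparison, not something from the comment.

```python
MAX_PAGE_UNITS = 14_400          # maximum page side in user-space units
POINTS_PER_INCH = 72             # default user unit is 1/72 inch
KM_PER_INCH = 0.0254 / 1000      # 1 inch = 0.0254 m
GERMANY_AREA_KM2 = 357_000       # approximate, assumed for comparison only


def max_page_side_inches(user_unit: float) -> float:
    """Largest page side in inches for a given UserUnit multiplier."""
    return MAX_PAGE_UNITS * user_unit / POINTS_PER_INCH


if __name__ == "__main__":
    side_in = max_page_side_inches(75_000)
    side_km = side_in * KM_PER_INCH
    area_km2 = side_km ** 2
    print(f"side: {side_in:,.0f} inches = {side_km:,.1f} km")
    print(f"area: {area_km2:,.0f} km^2")
    print(f"share of Germany: {area_km2 / GERMANY_AREA_KM2:.0%}")
```

Under these assumptions the page is about 381 km on a side, which works out closer to two fifths of Germany than a full half, but the order of magnitude holds.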
It's not really surprising to have a tiny PDF file generating an obscene amount of pages, because PDFs are Postscript, and Postscript is a Turing-complete programming language, so you basically have a loop that generates zillions of pages...
Cool, a game in a PDF!
ThioJoe is the guy who is the best at staring into your soul
at work we have some pdf's with 140k pages, they literally take ages to load up and it crashes like 7/10 times
i found one that is like 700 pages only
I bet Boeing or Airbus must have some really huge pdfs. The complexity of a modern plane is unimaginable.
0:05 the 2.2 gd level editor guide
thats only 200 pages long
@@F6347_VR it’s the largest pdf I have ever seen
I’m interested in hearing about ChatGPT’s troubleshooting performance with the specialised dataset? Worth it?
Biggest pdf i've seen is around 5m pages. Containing printing work for an industrial printer. Works better than you would expect
Your average digital music theory book:
heh... that pdf game is like hyperstack from the mac.
Do a review of PDF software! What are people using to make these documents?
IrfanView→PDFreDirect -- old-school
5 thousand pages. It's my Cpp Textbook.
A lot of textbooks easily reach the order of the thousands of pages. I think 10000 is the upper limit before they start to split them into volumes rather than a single book.
This is the most ThioJoe video ever. Of course it's gotta be Microsoft that has a 78k page PDF lmfao.
They aren't public, so I can't share them, but I work with multi-thousand page PDFs at work every week. One project we have at work has surveys that get mailed out and I have to parse the PDF before printing. Those survey PDFs are anywhere between 3000 and 12000 pages depending on how many recipients we have that week.
Does it take more ram than Chrome?
what about current documentation (split up) if you were to put them together
I would guess the longest PDFs are stuff like government documents and reference books.
Brb, just gonna print the 78k PDF from my phone... using 3G mobile data 😬
I don’t know what’s going on with YouTube’s audio tracks feature. I’m a German in Germany and it always selects the Polish audio. Maybe it decided to do that because it’s geographically a close language (and German is not available), except these languages are not similar at all. I added English as 2nd language to my Google account but somehow it still decides to use any other language but the original English audio track. It’s really annoying and maybe I’m not the only one. I don’t think this feature is any good to reach a bigger audience the way it doesn’t quite work atm.
is the pdf game not a choose your own adventure book but with the enhancements that a pdf can bring
idk man...
Once, I got a PDF sized 100 MB from my material professor, no one can open it..
need a .zip of this
And that's just the EULA
I'd love to see a series about the largest, biggest, etc file of all kinds of different formats. Why for example can't I make Photoshop files larger than 10kx8k (or whatever amount exactly) pixels?
Why does Google Sheets only allow 5000 rows while Excel goes up to 1048576 rows? Is there a max size that databases can have?
The Full Human Genome Project is also downloadable. Not sure how large it is, but believe you can download it as a txt file that is dozens of MB.
All this needs paperless ngx
Largest I've experienced is my motherboard manual which was 147 pages
I'd imagine the longest theoretical useful PDF would be a full printout of Wikipedia in the length of 100,000,000s of pages. There's got to be a tool out there that can do that.
You could probably download all the articles on Wikipedia (the text only) and make a massive PDF out of it, if we're counting only useful documents. They have a single-file dump that currently decompresses to 86 GB (pages-articles-multistream.xml.bz2). Wikipedia in English has 6.7M articles, if you include all languages it's almost 60M.
This is a topic that only a nerd would enjoy. I love it. Just the most random thing to geek over but it's actually kind of insane. Just imagine if these were printed.
My server's error log is at approx. 1.2M pages, 55 lines per page. Yes, it's a PDF and weighs 1.4GB
One could possibly make a Turing machine that runs on PDF
Imagine a document about programming compiler titled "20,000 Pages Under The C"