Acropalypse Now - Computerphile
- Published: 27 Sep 2024
- Researchers stumbled upon a simple but worrying bug. Cropped images from Pixel phones contained a great deal of the original image in the cropped file. Drs Steve Bagley & Mike Pound explain.
Mike's sources:
/ itssimontime
/ david3141593
www.da.vidbuch...
Proof of concept: acropalypse.app/
Waiting for someone to spot that I fixed my typo on the text messages illustration but didn't fix it on the original -Sean
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottsco...
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
All you have to do is crop the image, take a screenshot of the cropped image, crop the screenshot, and then send the original cropped image because you couldn't remember which was which.
They called me mad when I used GIMP for cropping all my images!! Well now who's the crazy one?! Ahaha AHAHAHAHA
but are you sure GIMP isn't affected?
@@NoNameAtAll2 yup it is.
Source: trust me bro
😂😂😂
@@NoNameAtAll2 I've used gimp to crop screenshots on desktop and the filesizes genuinely do get a lot smaller once cropped
@@NoNameAtAll2 yes
There's a very similar issue in biology. The machinery that reads genes can miss its equivalent of an IEND, a stop codon, and keep on reading until it reaches another stop codon. Some viruses actually use this as a way of compressing their genome by having a protein whose full-length version and "cropped" version do different things.
I'm loving the editing on this one. The sequence at 3:20 is particularly cool
Yes, Sean did a good job.
Yep, as soon as I saw that I instantly understood the concept 👍
I was gonna comment this too. Very cool!
6:31 in addition to removing metadata to protect the user privacy, another reason to re-encode images that are uploaded is to ensure that they are actually images, as well as constraining image sizes, protecting website itself from harm.
Reencoding image files is a neat one-stop solution to ensure that they are (technologically) safe to consume.
Funny part is that about 20 years ago i had a website for images online, and I had a problem with people hiding rar files inside their pictures, so I implemented the re-encoding part that is needed.
So my service fixed this issue 20 years ago, to be fair it died maybe 2 years later but still!
@@gunnargu Yeah, in 2006 I had an issue with users uploading files that were not images and also started re-encoding on upload before saving. It was an event website for a one-time event, so it was only around a couple years, too!
The only gotcha is if the image reencoder itself has security vulnerabilities that an attacker could exploit by uploading a maliciously crafted file.
@@majorgnu well actually that's not that big of a deal except if that vulnerability allows somehow to alter how the reencoding of all future images will be done.
Great dynamic in this video with Steve and Mike! Good format
I don't know. Should have left him in the background.
It's almost as funny as when Microsoft thought it would be a good idea to save a Word document's edit history in its file. It was not.
At least for me, there's some instances when recovering the history would have been a fantastic idea. But I can see that it could be radically abused too.
It’s a great idea but there needs to be a way to save for sharing.
Or in pre-SP2 Windows XP there was a "feature" to send system pop-ups via the network. They looked weird, but it was enough to scare someone.
@@4.0.4 the Messenger service was new for Windows NT!
@@andrewahern3730 Yeah, it's called exporting to PDF. Word docs aren't technically supposed to be shared as final documents, they're more like source code or Photoshop project files.
Funny part is that about 20 years ago i had a website for images online, and I had a problem with people hiding rar files inside their pictures, so I implemented the re-encoding part that is needed.
So my service fixed this issue 20 years ago, to be fair it died maybe 2 years later but still!
I was big into steganography as a kid, we might have "interacted" in the past.
Ok but that text message conversation at 5:42 is actually kind of hilarious ngl
DIVA! lol.. it was a sweet bit of humour thrown in.
Going to Nottingham for a masters in CompSci in September. This channel has always been on my recommendation list. Will be great to learn from you guys in real life.
This is sort of reminiscent of the situation with people using social media-supported "stickers" to cover up information on pictures uploaded to those platforms. So you'd upload a picture to the site, and place a sticker over the bit you don't want to be visible, but it turns out those stickers were implemented as extra bits outside of the actual pixel data, so all the original pixel data is just there. Same with putting black boxes over text in a PDF editor or something. It really does play with the definition of "bug" in that something could totally be following the spec, but it's not reasonable to expect people to know those details of the spec and factor that in when doing normal things like editing a picture on their phone.
Similar with botched censorship of PDF files, where people try to draw black boxes but the original text is still under there. Some high profile information has leaked out this way....
@@AySz88 yeah I did have a redacted legal agreement sent to me where they'd just drawn black boxes everywhere...
I just deleted the boxes 🤣🤣🤣
@@ChrisLee-yr7tz what did they say?
Let's be serious, most of the time it's being used by OF egirls promoting their stuff.
@@keepyoursins Maybe he was not supposed to make that comment and he was being watched, and will never come back with an answer for us.
I really hope he is okay.
Props to the editor for the diagram at 3:25
Best video in a while. Those two rock together.
The banter from these two is gold.
I'd love to see more videos like this one where there are two experts discussing a topic in a conversational tone while presenting it to the camera. Probably depends on the people involved whether that kind of setup is going to work but at least in this case it worked great.
I suspect the switch of the crop/overwrite behaviour may have come either through accommodating SSDs or through no longer worrying about accommodating HDDs. In the case of truncate and replace, either it was beneficial for HDDs because it made it more likely to find contiguous blocks (for read/write performance) and the code was deleted/changed to be simpler, or it resulted in more blocks changing on SSDs and thus would wear down the drive faster and so was changed, but they forgot to truncate the data after the end.
Another possibility, unrelated to SSDs is that the library handling the image and/or cropping did so by mapping the file into memory, and simply manipulating it in-place there, forgetting to shrink the mapping at the end of the crop. This would save memory because at any time the OS can drop unchanged parts of the file from memory for free if it needs memory, and write the changed parts back to the file at the end or if it needs that memory too. Also, on load thanks to IEND, it would never need to page-in the data after the end of the image because it's never accessed so the large output file would not cause ram usage to increase.
I would still classify it as a bug, since the previous behaviour didn't result in extra large files and the change didn't replicate that.
11:35 The chunk headers in PNG contain the data size for that chunk. Each chunk is:
SIZE: 4 bytes
TYPE: 4 bytes
DATA: (SIZE) bytes
CHECKSUM: 4 bytes
So you'd just start at the beginning, check the first chunk, skip the data and the checksum, and you're now at chunk 2. Do the same for this and the remaining chunks. When you reach IEND, you can be sure it's the real one. Any data after this isn't part of the image.
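The walk described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; `make_chunk` and `iter_chunks` are made-up helper names, and the tiny 1x1 image is built by hand purely for the demo. It assumes the checksum is a CRC-32 over TYPE + DATA, which is how PNG defines it.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(ctype: bytes, body: bytes = b"") -> bytes:
    """Build one chunk: SIZE, TYPE, DATA, CHECKSUM (CRC-32 over TYPE + DATA)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


def iter_chunks(data: bytes):
    """Yield (type, size, crc_ok) per chunk, stopping after the real IEND."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    pos = len(PNG_SIGNATURE)
    while True:
        if pos + 8 > len(data):
            raise ValueError("ran out of data before IEND")
        size, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + size  # 4 size + 4 type + data + 4 checksum
        if end > len(data):
            raise ValueError("chunk extends past end of file")
        body = data[pos + 8:pos + 8 + size]
        (stored,) = struct.unpack(">I", data[end - 4:end])
        crc_ok = stored == (zlib.crc32(ctype + body) & 0xFFFFFFFF)
        yield ctype, size, crc_ok
        if ctype == b"IEND":
            return
        pos = end


# Hand-built 1x1 greyscale PNG with the text "IEND" hidden inside a tEXt chunk.
tiny_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Comment\x00fake IEND here")
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND")
)

for ctype, size, crc_ok in iter_chunks(tiny_png + b"leftover bytes"):
    print(ctype.decode("ascii"), size, "crc ok" if crc_ok else "crc BAD")
```

Because the walker jumps by SIZE instead of scanning for the letters, the "IEND" inside the tEXt data is never mistaken for the end marker.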
Two of my favourite Computerphile presenters presenting together. Yay! 🎉
Thank you!
Loved this tag-team combo. Do more!!! Also, brilliant editing by the video editor.
Screenshot-> Crop -> Screenshot-> Send
Absolutely SMASHING breakdown. Well done. You know what else? In the 90's, I remember being concerned over the fact that data headers would merely "ignore" old data and NOT zero it out. Sure, it was "faster" back in the day, but obviously... it leads to HUGE issues.
Null data should be overwritten at the end of each 'X' hour period. Or, some systems should zero-out unused bit space upon EVERY instance a file is saved. Again, yes, this is "slow". But obviously... the alternative consequences are far, far more detrimental.
🐲✨🐲✨🐲✨
I love the play on words. One of my favorite films and a great piece here to educate us. I just did a few tests and made sure my file sizes dropped way down when I cropped with the software(s) I use with any regularity.
when you see steve and mike in the thumbnail you know somethings rumbling
"The default changed" And this is exactly why I am always explicit with my parameters when making certain types of function calls.
Sounds like a feature to me. A joy to see.
The Carmageddon 2 reference is very much appreciated
On some phones, you can adjust the crop and even increase the portion shown after the fact.. In that sense it is a "feature", but then it should have to be damn guaranteed to not include any extra bits when you send the image!
Also a side note, I do not know how many times people have told me that I am paranoid only to have the truth to be revealed years later. 😅 I think we need a word for that!
You got "Alex Jones'ed"?
@@f.f.s.d.o.a.7294 well that would better fit the opposite of what he describes, wouldn't it?
Funny enough I've known about this bug for years, sometimes PNGs exported from photoshop will show masked data when used as a material map in 3ds max. It's distorted with blocks of colour and other artifacts in the "transparent" areas but enough to tell what was there. It's nice to finally have an answer for this bug. Cheers
That thumbnail is a masterpiece
gotta miss this duo, best duo in computerphile
That's so weird anyways. I didn't know any phones did that. For Samsung I know for example that it crops, makes an entirely new file called like "originalnamehereTEMP" to send that off through Share and loads of other phone just straight up save the full screenshot regardless and then you cropping it just makes another file.
3:28 The edit showing what is happening to the file is a great addition, thanks for that :)
Nice to see you both again
This same cropping/redacting issue was happening in PDF files. When you redact a PDF file, the file is not actually modified permanently, and the redaction is "reversible"
Funnily enough, when you mention the text document at 8:00, this is exactly how microsoft word worked (with I assume the same kind of end header) until 2003 when they realised what trouble having data you don't want inside a file could cause.
Now I'm sort of glad that my phone only does the "save a new copy" as opposed to overwrite
How is it possible that this is a new discovery? I’ve been dealing with this “bug” for years because the iPhone X takes screenshots that are larger than Discord’s file size limit and cropping them doesn’t do anything. There are literally forums full of people asking why this is the case because it’s annoying. How did that somehow never make its way back to the people who make these systems if this is such a big problem??
I remember about 20 years ago, I downloaded a cropped image from the internet. In Windows I set it to show thumbnails. When I looked at the thumbnail I noticed it was the original uncropped image
When the video starts and you see Steve and Mike sitting together, you know sh*t has happened.
I thought this was gonna be about the movie apocalypse now.
7:36 very cleverly timed ad break
"maybe stand in the back and be a bit blurred"
A+
I particularly love exploits where the software works as intended, but the intended behaviour still leads to a bad outcome. The best example I can think of is using the visited pseudo-class to make visited or non-visited links transparent, and using mix-blend-mode to build a boolean truth table in CSS that results in 2^n letters being displayed on top of each other, with exactly one letter not being transparent. After rotation, translation, and overlay, this looks exactly like a captcha. The browser correctly blocks the malicious JS from probing the pixels of the captcha, which would leak whether each of the n websites are in your browsing history. But you don't need malicious JS, since when a human sees a captcha, the human solves the captcha, performing the attack for us. The JavaScript safeguards work properly, but if the human is tricked into copying a random-looking code into a text box, we don't care about finding a JS exploit or even if JS is entirely disabled. A bug-free browser can't prevent social engineering tricks like this, and it can't be fixed without breaking CSS/HTML5 compliance.
Easy solution: Convert the image to BMP, crop, then save as whatever format you wish.
Upvote if you paused and read the messages that got uncropped? Lol.
guess we have a new winner of the "underhanded c contest"
Superb video. Immensely entertaining and very knowledge-dense.
The "text" files in MS Word format often have this same sort of issue. If you run the "strings" command on the file, you will see stuff that was not supposed to still be there.
I'm a bit paranoid, but I've always assumed that cropped photos had this problem.
So the golden rule is always specify every parameter, even if the value happens to be the current default.
Fascinating video! Superb explanation and great editing, too.
These animations are amazing
Apocalypse Now is one of the best anti war movies imho, good choice for the video title :)
It's not a bug in the PNG format that it allows additional information at the end of the file. Many file formats allow that, often to permit future and/or proprietary extensions of the format. The bug is in the phone software, which leaves information that was never intended to be left there. Saving new data over the old file and not truncating the result to the appropriate length is a sign of programmer incompetence.
What's funniest is that it doesn't even help the phone's flash memory. Even in the best case it means worse wear on the flash memory than if the file had been truncated.
I mostly cropped in the hope of saving bytes... what a fool. I believe most services run optimizers on all our uploads, otherwise they would lose precious disk space.
You'd think, privacy issues aside, you'd want to truncate the file just to save file size, if you were going to upload or text the file. Though maybe that's why it only changed recently. Image files, even at full resolution are not considered large... at least for a phone's screenshot I guess. I think my phone still downsamples files taken by the camera before sending over text message...
I remember coming across this issue on Windows photo apps, Android photo apps, and iPad cloud based photo apps. It was nothing critical but the size of image file will either not change or blow up after cropping. It didn’t happen with all photo apps at any given time, but this has been going on for 10 years I think in some of the apps. Anyway, I usually check file size after cropping, after noticing this bug years ago, but only when I am sending it by email or backing up externally. Word to PDF exporter and some free PDF writer apps have a similar problem. In most cases, writing to a new file and deleting the original fixes the issue.
On a Samsung phone you can go back to any cropped image in your gallery later and revert it to the original image - so it's storing the original data somewhere.
This is absolutely a bug. When I take a screenshot, or any photo on my phone for that matter, and I crop it, when I then send the photo, I expect no other additional data besides the bytes relevant to the final photo to be sent. This is an utter breach of privacy, and I'm very surprised to see tech giants such as Microsoft and Google allow customer data breaches like this to happen so easily. If you send a 16 pixel by 16 pixel PNG image, and it's 3 megabytes in file size, absolutely a data breach has occurred.
Having two pundits in the same video might not be the best idea.
GrapheneOS user here, I love bugs like this that screw over companies globally distributing spyware like Google.
This has been a known issue for ages.
Many years ago, I was on a dating site. Sometimes someone would send me a set of photos not in a .ZIP file, but as images in a Word document. Very often the person had used Word's image crop facilities, which do not remove any of the original image data, and you can simply go into each image's properties and uncrop back to the original.
And that's how you sussed out the fatties.
The bug reminds me of the old photoshop exif thumbnail problems in the early 2000s. That whole mess drove home how important crossing a file re-encode barrier is for anything requiring redactions.
This type of problem was actually known very early in the png life cycle... I remember a method for appending file data to a png that would allow you to read it as a png, but also to read it as only the appended data (I can't remember the second file format, but it had some variable length header and looked for a flag to find the readable part, similar to how mp3 tagging works)
Zip files store their directory metadata at the end of the file and encodes the locations of file entries as relative offsets, so you can append a zip file to a png without any modifications and the result is both a valid png and a valid zip.
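A quick way to see this for yourself, as a rough sketch in Python with only the standard library: build a tiny PNG by hand, glue an in-memory zip onto the end, and open the result as a zip. The `chunk` helper and the `hello.txt` name are invented for the demo, and it relies on Python's `zipfile` reader tolerating leading bytes in front of the archive.

```python
import io
import struct
import zipfile
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(ctype: bytes, body: bytes = b"") -> bytes:
    """One PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# A hand-built 1x1 greyscale PNG.
png = (
    PNG_SIGNATURE
    + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + chunk(b"IEND")
)

# An ordinary zip archive, created in memory.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as zf:
    zf.writestr("hello.txt", "hidden payload")

# Plain concatenation: image first, archive after IEND.
polyglot = png + zip_buffer.getvalue()

# The zip reader locates its directory from the END of the data,
# so the PNG bytes in front are simply skipped.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    print(zf.namelist())
    print(zf.read("hello.txt"))
```

An image viewer stops at IEND and an archive tool starts from the tail, so neither notices the other half.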
@@remuladgryta I remember hearing about this like 10-15 years ago. Somebody made a png containing text explaining all this.
I think the zip file contained source code for some kind of cryptography that was in the process of being banned.
In context back then, really clever. In hindsight, kinda terrifying given what else could be shoved in the zip. Like a zip bomb, or something self executing
(IIRC, several languages can be packed as self executing zips, including Java)
@@remuladgryta that was the one, yup... been so long
@@ttthttpd more than 20 when I heard about it, since it was before the international standard. and yeah for the time very hairy
new to me that the old extra data isn't lost on the save
I may have noticed recently that cropping an image for a job registry did not reduce the size of the file. I dismissed that as totally irrelevant.
the IEND seems like try except. it doesn't make it more robust. it just makes the bugs invisible.
Saw them two in a same picture . Instant like.
once you send the photo online, it only sends the cropped part. you need access to the raw data on someone's hard drive to get access to uncrop the image. the file you upload can be much smaller than the original: and we havent somehow just used some sort of super magic compression method. its simply not there. maybe there was an issue on pixel where it was sending the full image for some reason, but it would be extremely obvious that this was happening based on the filesize alone. mass hysteria happening here
This reminds me of the 2008 Underhanded C Contest
careful when lifting phones...
within the second of timestamp 6:03 the phone screen lights up.
And with the max resolution I can read the guidelines back at the wall and I can just about almost read more than that
Thanks - I did check there wasn't anything important on the screen, tho some might wonder what P&T is lol -Sean
I've always been paranoid about this kind of stuff, so whenever I want to crop something, I take a screenshot, copy the bit I want, and paste it into a new file before sending it lol
My general treatment of images I took on a phone is to import them into a computer for cropping, if I'm going to crop it at all. I can use the GIMP to take a section to the clipboard, open a new file with the contents of the clip and go from there. It's then up to me what I do with the original image-either leave it or delete it. Either way, my cropped image shouldn't contain any extraneous data outside the cropping area. However, I'm aware you don't always want to get back to a computer for that to do the "heavy lifting". It's just the way I've always edited images.
On the bright side, it's not that difficult to detect the presence of extra data past the end. No need for a deep understanding. Just parse it at a high level. Since each chunk specifies its size, you can skip that far, and then you should be at the next chunk (I'm simplifying slightly, but only very slightly). Repeat until you find the IEND chunk, and then see if its end is the end of the file.
Love your channel
5:49 Mike is a menace 😂
I am dumbfounded that this has not been noticed before. Why have we not noticed that the files we crop do not get smaller? This makes no sense to me. I have used Snipping Tool day after day after day at work, and I have never noticed that the files were unnecessarily large.
Nobody noticed that their cropped photos were still six megs??
It would be a fun project to write an AI that figures out the width from just the flattened RGB data. Humans can do this very easily. If I give you a slider to play with, you can just adjust it until the rows line up perfectly. Initially nothing makes sense, then there will be a point where the photo is just slanted, and you correct it. An AI should be able to do that based on the repetition, and the fact that consecutive lines have similar information, be it text, photos, line art, geometric shapes, texture, etc. Unless the photo is random noise, you should be able to adjust the width until you get a glitchy image, then fine-tune it until you get a perfect one. I've reverse-engineered width this way before from flat byte arrays.
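You may not even need an AI for the basic case. Here is a rough brute-force sketch of the "consecutive lines are similar" idea: for every candidate width, compare each value with the one a full row later and keep the width with the smallest average difference. Everything here is an assumption for illustration: `guess_width` is a made-up name, the data is a single greyscale channel rather than RGB, and the test image is synthetic.

```python
import random


def guess_width(flat, min_width, max_width):
    """Return the candidate width whose rows line up best.

    Score = mean absolute difference between each value and the value
    `width` positions later (the pixel directly below, if width is right).
    Ties go to the smaller width, so multiples of the true width lose.
    """
    best_width, best_score = None, float("inf")
    for width in range(min_width, max_width + 1):
        n = len(flat) - width
        if n <= 0:
            break
        score = sum(abs(flat[i] - flat[i + width]) for i in range(n)) / n
        if score < best_score:
            best_width, best_score = width, score
    return best_width


# Synthetic image: a random row pattern that brightens slightly each row,
# so vertically adjacent pixels are similar and horizontal neighbours are not.
TRUE_WIDTH, HEIGHT = 37, 40
rng = random.Random(0)
base_row = [rng.randrange(256) // 2 for _ in range(TRUE_WIDTH)]
flat_pixels = [base_row[x] + y for y in range(HEIGHT) for x in range(TRUE_WIDTH)]

print(guess_width(flat_pixels, 2, 200))
```

Real screenshots are messier (three or four bytes per pixel, compression artefacts in a recovered stream), so treat this as the slider experiment automated, nothing more.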
There is a technique named "RarJpeg" that relies on the fact that some sites don't crop or otherwise check uploaded images. So it's possible to simply append a RAR archive to a JPEG image to upload arbitrary files. This is especially useful when a site doesn't support uploading arbitrary files but does support uploading images.
"Acropalypse" is a very dyslexia unfriendly word. I basically cannot read that word, it comes out as Apocalypse almost no matter what I do.
Mmh, on Samsung phones you can actually revert cropping. I don't know if they do that through additional metadata (like, original image + stack of post-processing settings), or using a separate backup copy. But I thought that the bug was related to that, sharing the image with the "cropped info" instead (which I used to be scared by).
Yeah, I thought I was going crazy when I found out about this "bug" because samsung phones tout this as a feature of cropping (I think with all formats) and it's mysterious to me how they do that.
As you said, they could be doing it with a separate copy but I've always been skeptical of cropping an image to delete unwanted parts of the image (save from a proper image editing tool) though on samsung phones cropping *does* reduce file size.
@@Tomyb15 For my phone, this is an option in the settings, "Save original screenshots" with the description
"This lets you revert to the original screenshots after editing them in Gallery, but it uses more storage space."
So from that description, I assume that they simply save the original screenshots in a way that they're _hidden_ from your default Gallery app, but are still very much there. I doubt they're saved in the same file, that just seems like it'd be more trouble than it's worth.
That's just called saving the original crop.
I know the Samsung Galaxy A10e has this error. I think it was to make it easier to revert the image back to its original, but it should've given people the option in settings.
I noticed I could take a photo on my phone, crop it, send it to my mom's phone, open the photo editor, and it would give me the option to revert the cropped image back to the original.
There is historical precedent where the EXIF data held a low-resolution thumbnail, and cropping in LR wouldn't crop the thumbnail. So you would leak content outside the crop. It also revealed whether you had done any image manipulation.
Aggressive overwriting is also a problem due to the wear on SSD disks from this.
Makes me wonder what else could be jammed in between those IEND footers and just be completely ignored by encoders and security tools...
Well, that's frustrating, I have limited data access and have been cropping images to reduce the data I'm trying to send.
imagine people finding out that for a long time deleting files on a windows would just flag the occupied space to be rewriteable and removing filesystem references to recognize the space as a file while leaving the data of the file completely intact until some other process trying to find some space to save a file to randomly choses this space because it was convenient for some reason lol
Ah dammit, I should have waited until I'd watched the whole video lol, of course this was mentioned
I cropped a pdf in preview a couple of days ago (I think before this bug was announced) and got a warning saying “the data is hidden, but not erased. It may be visible on other operating systems”. I guess Apple were aware of this kind of issue already?
Ah, Mike "I know I'm being a diva" Pound
Very interesting. Makes me want to learn more about file formats.
@6:45 "image metadata".. Correction; its data that Facebook and Twitter don't want to give away for free, but they will happily sell it to anyone who asks.
The safest way to crop photos is to take a screenshot instead.
Every file api I’ve ever used has had truncate by default. The implications of changing an existing default like that should be predictable: bad.
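For anyone who hasn't been bitten by this, here is a small Python sketch of the difference. It writes a long "screenshot" to a temp file, then overwrites it with a shorter "crop", once without and once with the truncate flag. The `overwrite` helper and the byte strings are invented for the demo; only the `os.open` flags are the real thing.

```python
import os
import tempfile


def overwrite(path: str, data: bytes, truncate: bool) -> None:
    """Write data at the start of an existing file, optionally truncating first."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    if truncate:
        flags |= os.O_TRUNC  # stated explicitly, not left to a default
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def read_all(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()


original = b"FULL-SCREENSHOT-WITH-SECRET"
cropped = b"CROPPED"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shot.png")

    overwrite(path, original, truncate=True)
    overwrite(path, cropped, truncate=False)
    without_truncate = read_all(path)

    overwrite(path, original, truncate=True)
    overwrite(path, cropped, truncate=True)
    with_truncate = read_all(path)

print(len(without_truncate), without_truncate)
print(len(with_truncate), with_truncate)
```

Without the flag the file keeps its old length and the tail of the original is still sitting behind the new data, which is the whole bug in miniature.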
I definitively agree with the security issues but I am unable to understand why no one got disturbed by the size aspect of that ???
Really strange breakthrough, the point about how many security concerns there are regarding images uploaded to the net innocently containing potentially undiscovered private content is particularly strange...
Hope some helpful resolutions (yes, shamefully intended) are found
“Oh dear” 8:36
It's also worth noting, if this doesn't get brought up, that PNG being able to contain extra data after the image ends is actually an intentional feature that was designed to support things like data embedding, such that you could store something like a 3D model or an image library archive as a PNG file and be able to preview the contents of it in a standard image reader, without needing any proprietary software or special formatting or complicated documentation... you just stuff a zip or tar file after the IEND block and have the picture be representative of that file's contents, and you can easily preview everything in your file explorer's thumbnail without even opening it...
It is a security problem, but its one that is really a fault of people writing software for being too lazy to handle correctly, in my opinion.
The correct way to store extra data in a PNG file is with a custom chunk type. The PNG spec states that the IEND chunk must appear LAST.
also the IEND block isn't just 4 characters, its actually an 8 byte binary identifier that happens to have IEND readable in the middle of it, however the actual identifier has a very specific set of bytes in a specific order, and its very VERY unlikely to ever show up at random in a file in the same way your exact IPv6 address is extremely unlikely to show up at random in a file unless it somehow was put there to represent your IPv6 address, and it can't show up in a PNG file (that was generated to spec anyway) at random, because the file format specification does specify that no matter how the file is generated, that particular string must not appear as a result of any of the stages of file generation, and there are specific checks to be done to split compressed segments apart if it somehow does, so that they don't end up in the file...
However... the PNG specification... is like a thousand pages or more long... and most people don't actually support the full format... most people write quick and dirty code to make a file that can successfully be read as a PNG, without actually meeting the standard. I've run into myriad problems with software because program A and B each use their own variation of PNG, but program C only knows how to read program A's version and its own C version.... because there are a thousand different ways a PNG file can be made, and most software only supports stuff that's on the first page of the spec. Those other versions are important though, because they have certain features and better compression that other versions don't support, even though they're supposed to all be supported.
PNG foundation has its own library that handles every version and method of PNG, but for various reasons, most people choose to use a 3rd party PNG library or roll their own instead.
I mostly think it's really weird that the default moved away from truncate. That just seems kind of silly to me. Secure by default should definitely be the standard. And not just security, but also not wasting disc space.
A problem I have with truncating writes is that when I have zero free space (it happens often) and I edit and save a file, it becomes zero bytes. I think it can be mitigated by modifying filesystem drivers to not release truncated blocks until the file descriptor is closed.
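Short of changing filesystem drivers, one common application-level workaround is to write the new contents to a temporary file in the same directory and only then rename it over the original. A minimal sketch, with `atomic_save` being a made-up helper name:

```python
import os
import tempfile


def atomic_save(path: str, data: bytes) -> None:
    """Replace `path` with `data` without ever truncating the original in place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # On any failure (e.g. disk full) the original file is left untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "image.png")
    with open(target, "wb") as f:
        f.write(b"FULL-SCREENSHOT-WITH-SECRET")

    atomic_save(target, b"CROPPED")

    with open(target, "rb") as f:
        saved = f.read()
    leftovers = os.listdir(workdir)

print(saved, leftovers)
```

The trade-off is that it needs room for the new copy, so with zero free space the save fails, but it fails with the old file intact instead of leaving a zero-byte one. The new file also never contains leftovers from the old one.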