Stanford Fake Image Scandal - The Xerox Hypothesis

  • Published: 1 Nov 2024

Comments • 626

  • @danielschein6845 · 3 months ago · +241

    I don’t think the western blot machine did this. I think the image it took was probably uploaded, copied, shrunk, and otherwise moved around dozens of times before it finally ended up in a published paper. Any one of these steps could have gone through a bad image compression algorithm.
    There is a simple solution here. Publish the original photo. It shouldn’t have any artifacts.

    • @realGBx64 · 3 months ago · +2

      Well, that depends on what your upload size limits are.

    • @dxq3647 · 3 months ago · +14

      I completely agree. During my time as a researcher, this kind of issue happened to everyone at some point. Sometimes the original document can look very different from the one that is printed. Sometimes the digital copy fails to show you a change but the printed version does.

    • @unbekannter_Nutzer · 3 months ago · +16

      Uploading, copying and moving files does not compress them.
      Furthermore, the Xerox compression in question only works if you have multiple near-duplicate regions in the same picture; otherwise you won't save any space on individual pictures.
      I doubt word processors have such compression built in. Of course it might be hard to reconstruct which machines were in use 10, 20, or 30 years ago. Maybe it is sometimes possible to investigate via the accounting department; I guess such machines are expensive and used for many years.

    • @Dudemon-1 · 3 months ago

      ...if it's still available.
      The paperless zealots are destroying records left and right.

    • @bruceanderson9003 · 3 months ago · +1

      "Publish the original photo" means reducing the amount of information; bad move. Everything in the process has some pixel limit, and every digital step is different from the original.
      The original negatives from chemical film are your only choice. Then differences down to the size of a silver halide grain can be detected. Everything else is suspect because the physical chain of custody is broken.
      Remember: standard lithographically printed pictures are 80 dpi. Who cares what a Xerox did; these were from magazines with four color plates that had to be kept aligned with each other.

  • @wompastompa3692 · 3 months ago · +227

    Reminds me of the (perhaps apocryphal) story about how Mandelbrot's prints kept coming back different from what he expected because the print technicians kept "cleaning up" the blobs they thought were errors.

    • @speed65752 · 3 months ago · +31

      That's a terrible way to lose your sanity😂.

    • @PhilFogle · 3 months ago · +16

      It's not that uncommon in science for outlier points to be discarded because they were considered instrument error... but the issue here is about text being altered by the software, not about images being duplicated. He's not off the hook...

    • @diatonicdelirium1743 · 3 months ago · +3

      @@PhilFogle One outlier can be just that, two are suspicious, and three don't exist without a good explanation.

  • @Unhelpful · 3 months ago · +350

    I watch these videos as consolation for not getting into the Ivies as an undergrad.

    • @metrazol · 3 months ago · +14

      Eh, it was overrated...

    • @tainicon4639 · 3 months ago · +23

      Having worked with people who went to Ivies… they are often less educated than the state school people. More confident… but less prepared for their work.

    • @denvan3143 · 3 months ago · +19

      The Ivies have been overtaken by plagiarists who can’t answer the question “What is a woman?”

    • @tainicon4639 · 3 months ago · +8

      @@denvan3143 Could it be because, to paraphrase Tom Lehrer, “there has never been a greater hotbed of celibacy” than the Ivy League?

    • @theangledsaxon6765 · 3 months ago

      @@denvan3143 Haha yes! They’re all stupid because they’re progressives and I don’t like liberals!

  • @joshmeyer8172 · 3 months ago · +81

    I, for one, find it completely plausible that a systemic data compression issue has been compromising scientific research for years or even decades without anyone noticing. Just another reason why Elisabeth Bik's work is so important.

    • @marenjones6665 · 3 months ago

      Who is she? Do you have a link for her work?

    • @justinsayin3979 · 3 months ago

      @@marenjones6665 YouT*be has a sister corporation called G**gle that can help you with that. And Pete Judo has done videos about her before.

    • @kagitsune · 2 months ago · +3

      I also find it really plausible. We know that malice, toxic incentives, and plain intellectual laziness exist in academia, and I can say from personal experience that they exist in product engineering too. 👀

    • @nozrep · 2 months ago

      Yes, I agree it is totally plausible and possible. Nevertheless, I also posit that scientific fraudsters would quickly use it as another cover and another excuse if they get caught. Because, well, humans are clever sometimes. So in my view it is important to catch the machine errors while continuing to hold science frauds accountable when they are caught, and not allow them to use a real mechanical copier issue or digital scanner issue as additional cover for their fraud. It is almost as if the complexity of the situation squares itself, or something.

    • @nozrep · 2 months ago

      @@marenjones6665 No link, but ummmm have you heard of the Google search bar in which one can type words? I mean… you do appear to be capable of typing a comment.😏

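Screening for the duplicated regions that image-forensics work looks for can be partially automated, at least for exact copies. Below is a minimal, hypothetical sketch using only the standard library: it hashes fixed-size tiles of a grayscale image and reports tiles that occur more than once. Real forensic tooling must also cope with rotation, rescaling, and recompression noise, which this ignores.

```python
import hashlib

def duplicated_tiles(image, tile=4):
    """Find exact duplicate tile-sized regions in a grayscale image.

    A crude sketch of duplicated-region screening: hash each non-overlapping
    tile and report collisions as (first_location, duplicate_location) pairs.
    """
    h, w = len(image), len(image[0])
    seen, dupes = {}, []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patch = bytes(image[y + dy][x + dx]
                          for dy in range(tile) for dx in range(tile))
            key = hashlib.sha256(patch).hexdigest()
            if key in seen:
                dupes.append((seen[key], (y, x)))
            else:
                seen[key] = (y, x)
    return dupes

# An 8x8 synthetic "blot image" whose top-left tile is pasted bottom-right.
img = [[(x * 7 + y * 13) % 251 for x in range(8)] for y in range(8)]
for dy in range(4):
    for dx in range(4):
        img[4 + dy][4 + dx] = img[dy][dx]

print(duplicated_tiles(img))  # [((0, 0), (4, 4))]
```

The exact-hash approach only catches pixel-perfect copies; that is precisely why a lossy re-encode in between can hide or, conversely, manufacture such matches.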
  • @gaithdroby500 · 3 months ago · +108

    Correction: some labs still use actual film for western blotting and develop the exposed films like in the olden days. Those exposed films need to be scanned.

    • @firstlast-pt5pp · 3 months ago · +4

      Xerox, and that model, are not the only ones on the market.

    • @renerpho · 3 months ago · +4

      @@firstlast-pt5pp The bug at one point affected basically all Xerox models. Given Xerox's market share, the chance that a machine with this bug was used to scan images like these is high. Whether the bug would do anything to those images (which don't contain text) is another matter...

    • @davidmam · 3 months ago · +7

      You don't use a photocopier/document scanner to scan blots. You would use a film scanner saving to a lossless format if you want to quantify anything.

    • @sehrgut42 · 2 months ago · +5

      @@davidmam Oh honey. You think labs have MUCH more money than they do.

    • @churblefurbles · 2 months ago

      @@sehrgut42 Scanners have been cheap for a long time, since they weren't built for speed like copiers, and there is no need for that kind of compression when the device is directly connected to a computer.

  • @brendanloconnell · 3 months ago · +216

    I work at a massive medical university, and all of our gel scanning equipment still prints to a thermal printer most of the time. Theoretically we can save to a USB drive, but most of the time we just print the image out and scan it, due to requirements that thumb drives be encrypted. So it is plausible. Still probably need to retract some of those papers.

    • @mattmexor2882 · 3 months ago · +22

      Yeah. I think it's a hypothesis that can't be quickly dismissed. There's a good chance things were printed and scanned. That happened a lot, even in cases where it didn't have to be done. And the background of images produced by the blot recording and printing processes could be quite different from a photograph. One would need to explore the pattern matching algorithm more fully.

    • @dsfs17987 · 3 months ago · +20

      It seems much more likely that the compression applied when creating the digital document (the PDF or Word file where the picture was inserted) would produce this effect, but the photo/scan would need to be quite low quality in the first place. The scientists I work with don't care about file size, so they use TIFF, even though the image they are saving comes from a digital camera sensor and was compressed to JPEG somewhere along the way; it isn't the raw sensor picture.
      That being said, even low-quality JPEG compression is very unlikely to produce similarities at that scale, because it works in 8x8 or 16x16 pixel blocks. The likelihood that a 200x300 px area happens to encode identically in a JPEG when the source image was different is just very small; not completely impossible, just very, very unlikely.
      That Xerox thing was someone at Xerox going overboard with the idea of saving memory, and it would have been noticed in ordinary image compression, because image compression has been used orders of magnitude more often than those dodgy Xeroxes.

    • @Pho7on · 3 months ago · +7

      Mine stored the images, but you had to explicitly select "raw" output. A cursory search of the manual said it uses open source software and some patented chemiluminescence algorithm. Looking through the patents, it doesn't specify how it adjusts the image, just that it does. It's possible the open source software included some compression that caused duplication, or their proprietary algorithms did so. It seems plausible to me that this is a widespread issue and maybe our monitors have only now caught up to some bad images :)

    • @MrIgorkap · 3 months ago · +9

      @@Pho7on But Occam's razor is the elephant in the room.

    • @Pho7on · 3 months ago · +10

      @@MrIgorkap You may find artifacts in all sorts of images: compression, generative, or otherwise. I don't see how "naive use of software or firmware" requires more assumptions than a conspiracy of individuals in a Nobel-prize-winning lab to cover up messy blots.

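dsfs17987's point about 8x8 blocks can be illustrated with a toy version of JPEG's per-block transform: even a small intensity difference between two otherwise identical blocks usually survives coarse quantization, so distinct source content rarely encodes to identical data. This is a naive sketch, not the real libjpeg pipeline (which uses per-frequency quantization tables and entropy coding):

```python
import math

def dct2(block):
    # Naive 2-D DCT-II of an 8x8 block (the transform JPEG applies per block).
    N = 8
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
            cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
            out[u][v] = cu * cv * s
    return out

def quantize(coeffs, step=16):
    # Coarse uniform quantization (real JPEG uses a per-frequency table).
    return [[round(c / step) for c in row] for row in coeffs]

# Two hypothetical "bands": same shape, intensities differing by only 8.
band_a = [[50 if 2 <= x <= 5 else 200 for y in range(8)] for x in range(8)]
band_b = [[58 if 2 <= x <= 5 else 200 for y in range(8)] for x in range(8)]

qa = quantize(dct2(band_a))
qb = quantize(dct2(band_b))
print("identical after quantization:", qa == qb)  # False
```

Even with an aggressively coarse step, the DC (average brightness) coefficient alone keeps the two blocks distinguishable, which is why block-transform codecs blur but do not swap content, unlike symbol-matching codecs such as JBIG2.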
  • @RobertHawthorne · 3 months ago · +74

    I'm retired now, but for years I worked for an engineering company and did a lot of Xeroxing of documents with a lot of numbers. Over time there were cases where people had issues with some of my calculations as shown in copied documents. Nothing major, but always a pain in the posterior to have to go back and show my original work was right. It never occurred to me or anyone I worked with that the Xerox machines could produce this kind of error. Damn!

    • @Dudemon-1 · 3 months ago · +8

      A job I had as an undergrad was copying maps with a rectifying projector, because the Xerox machines distorted things.
      But that was more about scaling than image element distortion.

  • @wkgurr · 3 months ago · +40

    The issue is not so much whether dodgy software leads to image degradation that might be interpreted as image manipulation. The issue is this: the degradation very often seems to occur exactly where the authors of the paper require it to occur, and in a way that supports the hypothesis put forward in the paper concerned. If this were an issue of dodgy compression algorithms, these image alterations should appear randomly and not predominantly in places that fit the authors' hypothesis. If a band on a Western blot in the lower left corner is identical to a band in the top right corner of the same blot, this is certainly not an error in the image processing algorithm but an active image manipulation performed by a human with intent to deceive.

    • @Pho7on · 3 months ago · +8

      It's definitely suspicious, but I would start by looking at other western blot images published in the same period and captured by the same instrument. Is it so vital that we trash a lab as quickly as possible before we investigate whether this is a systemic issue in the field? Given my own training, I am questioning it myself.

    • @absurdengineering · 3 months ago · +3

      Um, that’s precisely how maximum PDF compression works in Acrobat. No human involvement needed. It extracts isolated “glyphs” from the document and “deduplicates” them according to a visual similarity criterion that makes sense for Roman letters and numbers. Unfortunately, it will unify stuff that looks like blots. That is easy to check: in the PDF file the deduplicated blots will refer to the same bitmap resource. That way you know the PDF renderer did it and not the author of the source document.

    • @wkgurr · 3 months ago

      @@absurdengineering I have been publishing scientific papers for 20 years and I can assure you PDF or any other software compression will not magically place appropriate bands in a Western blot in the appropriate place. Western blots are done by humans, and if they are fake they have been faked by human intervention. The issue is a bit more complex than could be explained by any Xerox hypothesis (which is a red herring). If you had done the kind of research that supports scientific hypotheses with Western blots, you'd know what I'm talking about. Saying "oh, it was the PDF compression algorithm that did it" is like saying "the dog ate my homework." A proper and correct Western blot is an effort that can easily take weeks or months to produce, so the incentive to fake is quite pronounced. To you a band on a Western might just be a smudge on a white background, but these bands are the end result of a very complex and lengthy process. You should go to a lab and ask to run a few Westerns yourself. You'd soon understand. You'd be there for at least a few months and your results would still look dismal. And you'd be tempted to fake.

    • @wkgurr · 3 months ago

      @@Pho7on Only a person who has not produced Western blots by their own effort can argue this way. Go to a lab and run your own Westerns and you'll find out what the issue is. It is not at all an issue of image capture; image capture is totally fine. It is when bands get duplicated (or inverted and then duplicated, as some "clever" "scientists" do) and then appear in the exact place where they would have to appear if the author's hypothesis were correct. This is when the fish starts to smell. The Xerox hypothesis has been put forward by people who never got their hands dirty running their own Western blots.

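absurdengineering's check (deduplicated blots referring to the same bitmap resource) can be approximated without Acrobat. The sketch below is a toy regex scan over raw PDF bytes, counting how often each image XObject object number is drawn; the `fragment`, the `/ImN` names, and the helper are all fabricated for illustration, and real PDFs (with compressed streams) need a proper parser such as pikepdf or pdfminer.

```python
import re

def image_xobject_uses(pdf_bytes):
    """Map image XObject object numbers to how often each is drawn.

    Toy heuristic: if one bitmap resource is painted in several places,
    producer-side deduplication (e.g. JBIG2-style symbol coding) is a
    live possibility. Assumes uncompressed content streams.
    """
    # Resource declarations like: /Im3 12 0 R  (name -> object number)
    decls = dict(re.findall(rb"/(Im\d+)\s+(\d+)\s+0\s+R", pdf_bytes))
    # Drawing operations like: /Im3 Do
    draws = re.findall(rb"/(Im\d+)\s+Do", pdf_bytes)
    counts = {}
    for name in draws:
        obj = decls.get(name, b"?")
        counts[obj] = counts.get(obj, 0) + 1
    return counts

# A fabricated fragment: the same object (12) is declared under two
# names and drawn twice -- the signature of a deduplicated bitmap.
fragment = (b"<< /XObject << /Im1 12 0 R /Im2 12 0 R >> >>\n"
            b"stream\n/Im1 Do\n/Im2 Do\nendstream\n")
print(image_xobject_uses(fragment))  # {b'12': 2}
```

If two visually identical blots in a figure resolve to the same object number, the renderer or producer deduplicated them; if they are separate objects with identical pixels, the duplication happened upstream of the PDF.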
  • @JohnVKaravitis · 3 months ago · +235

    Imagine having your criminal conviction be the result of this bullshit.

    • @BobfromSydney · 3 months ago · +13

      This sounds like the paper version of the Horizon IT system.

    • @davelordy · 3 months ago · +1

      @@BobfromSydney I think you mean the 'H8rizQn 1T syst8m'

    • @small_joys2022 · 3 months ago · +4

      Highly unlikely

    • @v3rlon · 3 months ago · +3

      Or having an axe murderer's conviction overturned because of this.

    • @MrSpirit99 · 3 months ago

      @@small_joys2022 It's not unlikely at all. That bug was there for decades, and it affected every Xerox machine in the world, no matter the scan quality (it was just less likely at higher settings). Now imagine every document that was scanned and no longer exists in its original form...

  • @wallycola5653 · 3 months ago · +252

    I think you are assuming that he was using the most up-to-date Western blot imaging machines, but on many old machines, Western blots and other gel electrophoresis images are often printed physically and then physically documented in a lab notebook. Those might then have been Xeroxed. Source: I've worked in many labs with older equipment.

    • @DJVARAO · 3 months ago · +13

      You mean photo scans? The Xerox issue messes with characters, not with the actual WB images. It's a silly excuse.

    • @meneldal · 3 months ago · +28

      @@DJVARAO It's not doing OCR but pattern matching, so it could happen on anything the algorithm finds. It's really hard to tell whether that specific version would affect this without testing.

    • @justhecuke · 3 months ago · +16

      @@DJVARAO No, bands look very similar, so it wouldn't be surprising for a too-coarse pattern matching algorithm to match some bands together.
      In the presence of poorly implemented code, many silly things are possible.

    • @gwordscience9465 · 3 months ago · +7

      @@justhecuke What is even the difference between a letter “l” or a hyphen in certain fonts, and a blot? The faulty algorithm clearly doesn’t know, or else the pattern matching would not occur.

    • @falsch828 · 3 months ago · +8

      Scanners are not the only things with the potential to apply JBIG compression. It's part of the PDF standard too.

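The failure mode meneldal and justhecuke describe (a too-coarse symbol matcher unifying similar marks) can be sketched in a few lines. This is a toy stand-in for JBIG2's symbol dictionary, not the actual codec: each binary patch either reuses a stored symbol that is "close enough" by pixel distance, or is added as a new entry.

```python
def hamming(a, b):
    # Number of differing pixels between two same-sized binary patches.
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def deduplicate(symbols, max_diff):
    """Greedy JBIG2-style symbol matching (a sketch, not the real codec).

    With too loose a max_diff, two genuinely different marks come back
    identical -- the Xerox failure mode.
    """
    dictionary = []
    output = []
    for sym in symbols:
        for stored in dictionary:
            if hamming(sym, stored) <= max_diff:
                output.append(stored)   # substitute the stored symbol
                break
        else:
            dictionary.append(sym)
            output.append(sym)
    return output

# Two different "bands": similar smudges differing in 2 pixels.
band_a = [[0,1,1,1,1,0],
          [1,1,1,1,1,1],
          [0,1,1,1,1,0]]
band_b = [[0,1,1,1,1,0],
          [1,1,1,1,0,1],
          [0,1,1,1,0,0]]

strict = deduplicate([band_a, band_b], max_diff=0)
loose  = deduplicate([band_a, band_b], max_diff=4)
print(strict[0] is strict[1])  # False: the two bands are kept distinct
print(loose[0] is loose[1])    # True: band_b silently replaced by band_a
```

Nothing in the matcher knows what a "character" is; any sufficiently similar blobs, whether digits or blot bands, can be unified, which is the point the replies above are making.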
  • @AUser-t6n · 3 months ago · +98

    Super dodgy that he cited an unrelated paper to support the theory. Meanwhile we haven't yet seen a single verified example of _complex shapes_ getting duplicated in such a way.

    • @michaellew1297 · 3 months ago · +18

      The blocks erroneously replaced in the architectural drawings show replacement of areas that are more complex than individual characters.

    • @hairyott3rr · 3 months ago · +16

      Lol, if you think a barely tangentially related citation to support a point is "super dodgy", welcome to academia. Go pick a paper and see how many of the citations are actually exactly what they imply. Just another result of the perverse incentives in the industry of academia: you get recognized/rewarded/paid by the citation (impact factor). Scratch a lot of backs, especially your own.

    • @gwordscience9465 · 3 months ago · +3

      The problem is not that the citation had nothing to do with the Xerox bug. Instead, both examples show how honest errors can just happen, be beyond the researcher’s control, and be hard to notice.

    • @trapkat8213 · 3 months ago · +7

      @@hairyott3rr There is much more at stake here than when covering prior art in an ordinary scientific paper. He has been accused of fraud, and in his defense he cites only one paper, and it turns out that paper is irrelevant. I also find that highly suspicious.

    • @TheYoshieMaster · 2 months ago · +1

      The shapes shown near the start of the video to have been duplicated are actually not very complex looking, at least to a computer: black and white horizontal smudges. A single letter is often a substantially more complicated shape.

  • @BronwynKirby-d5u · 3 months ago · +147

    As some other people commented: for older papers, yes, it is possible the images were captured as physical "photos" which needed to be scanned.

    • @archip8021 · 3 months ago · +10

      Some of the duplications are full color images, so I don't think so.
      Also, none of the lossy compression algorithms mentioned use detection of characters in the image.
      Widely used are TIFF files, and to a lesser extent PNG and the raw files from the machines. These are all lossless image formats, saving the image data as is.

    • @bytesizebiotech · 3 months ago

      @@archip8021 I always use TIFF for exporting my data.

    • @hwhack · 3 months ago · +3

      @@archip8021 Use of color is irrelevant; it simply isn't a factor. An example is the color banding that happens in x264.
      The algorithms mentioned all use pattern matching, which is similar to character matching.
      You've provided no evidence the machines are using lossless compression. Many machines (even my oscilloscope) use lossy compression by default.

    • @anmolt3840051 · 3 months ago · +2

      @@hwhack Yes, exactly. Lossless compression should be used for such sensitive images.

    • @marionettekent · 3 months ago · +1

      We are still using films. I like them better in the sense that they are sometimes faster, and usually more sensitive compared to the imagers I have used so far.

  • @KyriosHeptagrammaton · 3 months ago · +32

    This still raises the huge question of "What are peer reviewers doing, exactly?"
    They should have caught this ahead of time. Otherwise, why even include the diagrams in the document?

    • @diatonicdelirium1743 · 3 months ago · +2

      Those peer reviewers didn't know what they were supposed to be looking for, except maybe spotting that a repeated image is suspect when it claims to be a different sample. You would not notice this unless it is on the same page or you happen to have both pages open at the same time.

    • @franzprussia2567 · 3 months ago · +1

      The error was in the noise of the empty space of a western blot; no one looks there.

  • @profdc9501 · 3 months ago · +19

    If one of these old Xerox scanners still works, maybe one could scan in some Western blots and see if this happens.

    • @renerpho · 3 months ago · +2

      Good idea. Watching the original Xerox video, it was initially a bit difficult to reproduce the bug, but once it was understood, reproducing it became quite easy.

  • @FiLo-nb3pr · 3 months ago · +64

    As a scientist who has been publishing papers for over two decades (not as many as a Nobel laureate, but still quite a few), all of them filled with pictures and graphs: no two of my figures even look similar enough to warrant further checking. So, yeah, there's definitely something fishy there... The most probable explanation is that one of his students tried to cook things up. No matter how carefully you check, you can never redo all of their calculations; at some point you just check plausibility. I can report having caught a student doing this (luckily before submission), but obviously I can't guarantee that no such case slipped my attention. Practically no one can.

    • @dxq3647 · 3 months ago · +1

      Sometimes it is as simple as the digital version and printed version showing two different things. I've had a case where my digital document had the most recent figure, while the printed version (of the digital document) had a much older figure. Sometimes I email things back and forth and the formatting gets completely destroyed.

    • @FiLo-nb3pr · 3 months ago · +2

      @@dxq3647 I see. But isn't this a case which leads to even less similarity between figures?

    • @TheEudaemonicPlague · 3 months ago · +8

      @@dxq3647 If emailing loses formatting, you're doing it entirely wrong. I keep seeing comments here making ridiculous statements like this. If all of these are coming out of academia, we're in trouble.

    • @dxq3647 · 3 months ago · +5

      @@TheEudaemonicPlague Lmao, academia is a shitshow. I will be the first to admit it.

    • @dxq3647 · 3 months ago

      @@FiLo-nb3pr Could be a simple editing mistake: copied the wrong image over and never realized it.

  • @BobfromSydney · 3 months ago · +98

    I'm completely infuriated by the idiotic design of the compression algorithm. Scanning documents is very important in business and legal contexts, and a dodgy scanned image could result in someone getting the wrong bank account details added to a financial service, with payments going to the wrong place. I'm sure there are many other contexts where a wrongly scanned copy of something like a contract could have extremely harmful consequences. What the hell were the idiots at Xerox thinking? Just spend a few more dollars installing more memory in the giant cubic-meter-sized machines, you greedy bastards.

    • @passerby4507 · 3 months ago · +19

      You probably don't understand how big a raw image is. A 1920x1080 color image works out to 6 MB, which was literally impossible to handle in earlier machines. What you should be upset about is how bad the compression algorithm is.

    • @dmitriitsunenko9055 · 3 months ago · +24

      Exactly my thoughts. How did the algorithm make it to production when it cannot even replicate the alphabet properly?

    • @bjorntorlarsson · 3 months ago · +2

      For a decade or so now, everyone scans by just photographing the document with their smartphone and pressing that "share" icon. I can't imagine any security problems emerging from that procedure... unless I think about it.

    • @BobfromSydney · 3 months ago · +5

      @@passerby4507 A bit of both. I can see why the algorithm is more "efficient" than JPEG or TIFF etc., but I don't think it's justified to attempt such a high level of compression when you are losing so much fidelity. As for image sizes being impossible to handle: that's a hardware limitation caused by using less, or inferior, hardware to build the copier.

    • @KitagumaIgen · 3 months ago

      @@bjorntorlarsson It might be somewhat of a risk as soon as one saves an image in some such lossy compression file format.

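passerby4507's figure checks out as back-of-envelope arithmetic:

```python
# Uncompressed frame size for a 1920x1080 image at 3 bytes per pixel
# (24-bit RGB), the assumption behind the "6 MB" claim above.
width, height, bytes_per_pixel = 1920, 1080, 3
raw_bytes = width * height * bytes_per_pixel
print(raw_bytes)                    # 6220800
print(round(raw_bytes / 10**6, 2))  # 6.22 (about 6 MB per page side)
```

A 600 dpi letter-size scan is far larger still, which is why scanner firmware of that era leaned so hard on aggressive compression.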
  • @veranikakv · 3 months ago · +15

    As someone who works in a lab with pretty old equipment, I never needed to make a *copy of a print of a picture* of my gels. Some western blots use photo film; some transilluminators just take a picture with the attached camera and save it or print it out on thermal paper. There is no need to make a Xerox copy of a print, as you can always print the original.

    • @allangibson8494 · 3 months ago · +7

      You do need to scan printouts to incorporate images into documents.
      Ensuring you have the right image in the right place is the author's responsibility.

    • @veranikakv · 3 months ago · +4

      @@allangibson8494 Yes, you're right. I misunderstood the mechanism of how and when the error occurs, my bad.

    • @firstlast-pt5pp · 3 months ago

      @@allangibson8494 He is in no hurry to show the originals 😊 and Xerox, and that particular model, were not the only ones on the market at the time.

    • @diatonicdelirium1743 · 3 months ago

      @@veranikakv No, actually insightful! It shows that methods and data can be beyond suspicion until that rare event of publication/multiplication. This is why proof-reading is no fun; I've done my share of it for my daughter's publications.

  • @Oler-yx7xj · 3 months ago · +25

    If there was some scanning or a weird compression format involved, I would assume there should be either the original from which the copy was taken, or the original image file in the original format, left over. Given that this appears to affect a lot of papers, there should be originals for at least one of them.

    • @MDNQ-ud1ty · 3 months ago

      The hypothesis is BS used to cover up the fraud. Don't you think that if these extremely expensive copiers were so error-prone, no one would buy them, especially Harvard? If it were one error in 10,000 where it didn't matter, then fine, but when there are many errors constantly and a whole context of fraud, no.
      What this is is the fraudsters trying to create plausible deniability... and it will probably work, because people love to cover for fraudsters. No one wants to believe the entire system is fraudulent, and since so many people actually partake in the fraud, they too want it covered up.

  • @CristianConsonni · 3 months ago · +37

    You have never scanned a document in your life? Lucky you! Besides this, the problem is not the scanning per se, but the image compression. We do not know what image processing pipeline they used, and of course in principle the compression can have happened after the images were extracted from the machines. In the latter case, the team should be able to provide the (higher quality) original images. In any case, it should be fairly easy to prove or disprove the hypothesis if they document their image processing pipeline in detail and provide the original images.

    • @othmanhassanmajid8192 · 3 months ago · +1

      Remembering using lino cuts, cyclostyling, Xerox, fax, photocopies and 😂 OCR.

    • @dnch · 3 months ago · +7

      Exactly. And while they are scientists, people would be surprised how technically illiterate they can be.

  • @strech5412 · 3 months ago · +49

    A 25-year-old doing the work that should have been done by 50-year-olds… 25 years ago. Thanks, Dr Judo!

  • @rabidwallaby84 · 3 months ago · +8

    The fact that it was the images that were wrong tells me it's probably not the printers...

  • @johntrombley2647 · 3 months ago · +33

    We used to use a scanner or photocopier for gels, Western blots, TLC, and the like. Small lab, low budget.

    • @dxq3647 · 3 months ago · +3

      Even well funded labs can be very stingy when it comes to equipment.

    • @bigboi1004 · 3 months ago

      It's not an amazing workflow to begin with, but I don't find it implausible. And nobody using a machine called a *copier* would expect it to make copies with randomly altered text (and potentially images, when they get treated as symbols), so I doubt most people who scan a thing scrutinize the copier's output that closely.
      What I wonder is why Xerox would make this a thing rather than just build their car-sized cubes with more memory. Their machines are used for all kinds of important documents where altering text can have huge consequences. What happens when someone scans a prescription and a patient gets 80 mg when they should get 30? What happens when money that should go to account #662957010 goes to #662957070 instead? Here, scientific papers' authors are being accused of fraud; not only are some genuine, rigorous scientists getting false accusations, but this potentially provides plausible deniability for fraudsters.

  • @kablamo9999 · 3 months ago · +7

    What a terrible solution to implement. You expect that a copy machine would do just that: copy something.

    • @renerpho · 3 months ago · +2

      That's what made the bug so hideous. There are companies and archives that scan all their incoming mail and then throw away the originals because they don't want to store paper. The only reason we haven't heard about many such cases is that companies have no interest in making such issues public, instead dealing with them internally.

  • @darkaryn · 3 months ago · +13

    That is one big reason why you do quantitative measurements: no matter how your images get messed with, there are tables and graphs. And if those are fraudulent too, well... Never trust pretty images.

  • @squib3083 · 3 months ago · +15

    Xerox scanning issues have absolutely nothing to do with what is obvious image manipulation.

  • @artembrodskiy4876 · 3 months ago · +16

    Pete Judo out here doing the work of saints at 25 years old. Time to start my influencer career.

  • @albertyu750
    @albertyu750 3 месяца назад +18

    Used gel imaging systems (ChemiDoc and iBright) before. I'm 99% sure the system just takes a raw image, as you can export the image as TIFF. It is possible that repeating artifacts can be introduced if you export in another file type (JPEG or PNG). We also don't know the image processing work flow that the WB images underwent prior to ending up on the published paper. Though, if they are even semi-diligent, they should've kept raw image files around and kept a record of the image processing work flow. Seeing as the published paper is years old now, unfortunately raw images and recorded work flows are going to be near impossible to find.

    • @AlexandruVoda
      @AlexandruVoda 3 месяца назад +3

      PNG is a lossless compressed format so it should not exhibit any issues.
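
For what it's worth, PNG's pixel data is stored with DEFLATE compression, which is lossless by construction. A minimal sketch (using Python's zlib directly as a stand-in for a full PNG encoder, with made-up "image" bytes) shows the round trip recovers every byte:

```python
# PNG stores pixel data with DEFLATE (zlib) compression, which is
# lossless: decompression recovers the input bit-for-bit. zlib is used
# here directly as a stand-in for a full PNG encoder.
import zlib

pixels = bytes(range(256)) * 64        # arbitrary stand-in "image" bytes
compressed = zlib.compress(pixels, 9)  # maximum compression level
assert zlib.decompress(compressed) == pixels   # bit-for-bit identical
print(len(pixels), "->", len(compressed), "bytes, zero loss")
```

So any "repeating artifact" in a PNG must already have been present in the pixels fed to the encoder, not introduced by it.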

    • @albertyu750
      @albertyu750 3 месяца назад +1

      @@AlexandruVoda true

    • @avematthew
      @avematthew 3 месяца назад +2

      I agree, but those raw image files are so big, haha. Every time I clean my machine I am so tempted to delete old, irrelevant, Western images.
      I never have, because I always think about how much of a pain it was to get them and feel like I should keep them.
      I could certainly see someone deciding to delete them, even if I think it's the wrong choice.
      I feel better about how much hard drive space they take up after watching this, that's for sure.

    • @vylbird8014
      @vylbird8014 3 месяца назад +3

      The offending algorithm is JBIG2, and it would be used within a TIFF file.
      TIFF is a container format, so it supports a variety of different compression methods both lossy and lossless. JBIG2 is one of them.
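
Because TIFF only names the container, the actual codec is recorded in a tag inside the file (tag 259, Compression, in the first IFD). A rough stdlib sketch of reading that tag; the codec codes shown (1 = uncompressed, 5 = LZW, 7 = JPEG, 8 = Deflate) are from the TIFF spec, and the demo file is hand-built purely for illustration:

```python
import struct

# A few Compression (tag 259) codes from the TIFF specification.
COMPRESSION_NAMES = {1: "uncompressed", 5: "LZW", 7: "JPEG", 8: "Deflate"}

def tiff_compression(data):
    # Byte order mark: b"II" = little-endian, b"MM" = big-endian.
    order = {b"II": "<", b"MM": ">"}[data[:2]]
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    assert magic == 42, "not a TIFF file"
    (n_entries,) = struct.unpack_from(order + "H", data, ifd_offset)
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i          # IFD entries are 12 bytes
        tag, typ, count = struct.unpack_from(order + "HHI", data, entry)
        if tag == 259:
            # A SHORT value fits inline in the 4-byte value field.
            (value,) = struct.unpack_from(order + "H", data, entry + 8)
            return value
    return 1  # TIFF default: no compression

# Hand-built minimal little-endian TIFF with Compression = 5 (LZW).
header = struct.pack("<2sHI", b"II", 42, 8)
ifd = (struct.pack("<H", 1)
       + struct.pack("<HHIHH", 259, 3, 1, 5, 0)  # tag, type SHORT, count, value
       + struct.pack("<I", 0))                   # no next IFD
demo = header + ifd
print(COMPRESSION_NAMES.get(tiff_compression(demo), "other"))  # LZW
```

Two files with the same `.tif` extension can therefore contain anything from raw pixels to lossy JPEG or JBIG2 data, which is the point being made above.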

  • @aloice
    @aloice 3 месяца назад +2

    Adobe Acrobat's OCR feature does something similar. So don't think of this as a hardware issue; it's 100% software, and I think there's no one organization to blame. It'd be a lack of awareness as people run their documents through whatever pipeline is required to publish.

  • @GhostFS
    @GhostFS 3 месяца назад +1

    Two things:
    1) It's not an analog problem, it's a fully digital problem, so the fact that no scanning is involved doesn't invalidate it; it makes it worse. Taking a picture with a phone and sending it via WhatsApp could be affected, as that involves several compression and decompression steps.
    2) Organic vs. non-organic, or letter/number, doesn't make a big difference. Two organic elements that are similar could be confused just as easily as letters. It probably happened in the carnivorous plant too; not in the picture as a whole, but with many similar tentacle-like structures and small bulbs, bulbs could be duplicated if you zoom to pixel resolution.

  • @alexhajnal107
    @alexhajnal107 3 месяца назад +8

    What continues to astonish me is that lossy compression is ever considered acceptable for technical or scientific data and documents.

    • @renerpho
      @renerpho 3 месяца назад +1

      If this is related to the Xerox bug then lossy compression isn't the issue. The bug was such a big deal because, despite what Xerox initially claimed, it appeared even in scans made with the highest possible quality setting (marketed as being essentially lossless).

    • @alexhajnal107
      @alexhajnal107 3 месяца назад

      @@renerpho I've encountered JPEG compression used on schematics and blueprints on myriad occasions.

  • @xelaxander
    @xelaxander 3 месяца назад +7

    If you understand German, the talk by David Kriesel on how the Xerox issue was discovered is hilarious.

  • @mickmoon6887
    @mickmoon6887 3 месяца назад +4

    Exactly. Scientific journals never publish the full methodology for their studies, including the type of scanners and sensor equipment used for their papers. I don't know how this isn't common practice despite having peer review processes in modern academia. It's also one of the reasons why modern science has replication issues: how can you even replicate if the full methods are not given transparently and fully?

    • @AlexandruVoda
      @AlexandruVoda 3 месяца назад +2

      I am in favour of full transparency of the entire process. Going forward it should be demanded of all studies.

  • @Pengochan
    @Pengochan 3 месяца назад +3

    I think the argument about the Xerox issue was more generically referring to the type of problem than specifically to Xerox scanners. I.e. somewhere in the chain from generating the image to publication, some compression or enhancement algorithm modified the images. The reasoning that this should have affected other work too is quite convincing.
    But it does bring up another issue, that we should have certified applications and process chains to faithfully reproduce information. Due to recent developments in and increased application of AI-enhancements of images the issue might arise again.
    And yes, I can easily envision that at some point during the final editing of a paper, some AI-based image enhancement "improves" a blurry image to meet the print/publishing standards.
    OTOH, if the journal is at fault, producing the original images that were sent to the journal could clarify that, and knowing scientists, it's not so unlikely that the data still exists somewhere.

  • @BewegteBilderrahmen
    @BewegteBilderrahmen 3 месяца назад +2

    The issue with the scanners is text-related, and while it's technically possible that within the same set of scans/copies the scanner would recognise similarities and erroneously copy image parts, there is little evidence of the scanner having sufficiently large memory to randomly copy blots all the time.
    The issue in the presentation was about numbers and letters being scanned wrongly in the same batch of copies.

  • @davidshettlesworth1442
    @davidshettlesworth1442 3 месяца назад +9

    Thank you for an excellent video. I learned a lot today. Carry on.

    • @bjorntorlarsson
      @bjorntorlarsson 3 месяца назад

      I love what he does!
      This should be big news. Science fraud, I hate it instinctively! From back in the 17th century, with so few means to detect science fraud, I haven't heard of any. The people involved had moral standards and honor back then. They feared God.

  • @MStrong95
    @MStrong95 3 месяца назад +3

    JBIG2 is certainly an interesting image compression algorithm... I could see a hypothetical situation where a heavily compressed JPG image gets a round of JBIG2-like compression from PDF software due to default settings, or say a less technically savvy person trying to shrink the PDF file size so that their paper is small enough to meet upload requirements. I feel like this wouldn't necessarily be an issue, but then again not everyone is going to notice multiple rounds of lossy image compression.

  • @a_martynovich
    @a_martynovich 3 месяца назад +1

    There doesn't have to be a Xerox scanner/printer involved. You write your paper in, say, LaTeX, insert the images, this all then gets converted into PDF where the compression happens, and JBIG2 is one of the supported formats. Had he shown the actual raw images vs the images after the compression where the artifacts are visible, we wouldn't have to doubt anything.

  • @Asiago9
    @Asiago9 3 месяца назад +1

    As someone from Gen Z, it surprises me you haven't ever scanned in a document before, because it's so much clearer than if you take a picture of it

  • @platynowa
    @platynowa 3 месяца назад +11

    I am a molecular biologist. Western blot machines do not scan anything; they have a digital camera built in, and they take photos. The older types really had a normal digital camera mounted inside the machine.

    • @mariapospelova4096
      @mariapospelova4096 3 месяца назад

      What file format are the pictures from the machines, and how big are they typically?
      I would imagine they are processed/resized/converted to a different format afterwards.

    • @platynowa
      @platynowa 3 месяца назад +3

      @@mariapospelova4096 They are TIFF by default. But we always compare the photo with what we see in the transilluminator. It is simply good laboratory practice to adjust brightness, contrast etc. so the photo looks the way the gel really looks.

    • @hexarith
      @hexarith 3 месяца назад +11

      @@platynowa Ah, TIFF… the most misunderstood format out there. If you didn't know, let me explain: TIFF by itself is just a container format, not unlike ZIP. And the images inside a TIFF can be encoded in all sorts of ways. You can have uncompressed raw, LZW raw, but also JPEG, and even JBIG2 (the algorithm in the Xerox issue) is supported just as well. TIFF is a terrible choice for scientific data, and it's a minor scandal that this mess has become the de-facto standard in the world of biomedical imaging.
      Don't use TIFF; unless you know - for sure, and not based on ill assumptions - exactly what you're doing. Chances are, you don't know what you're doing.
      So don't use TIFF!

    • @trapkat8213
      @trapkat8213 3 месяца назад

      @@hexarith I am amazed that a piece of equipment designed for this kind of analysis, which I admittedly know nothing about, doesn't come with a lossless image compression algorithm, or at least the option to use one. My cheap digital camera can do that!

    • @gj9169
      @gj9169 3 месяца назад

      @@trapkat8213 It probably comes with the option to use it. But even when I worked in software development for image acquisition, analysis and manipulation, people didn't have widespread knowledge of these things. Also, I had to take courses in a biochemistry lab at university, not long ago, and the samples were literally put in a scanning machine. The scanning machine then only sees a blurry gel, which it probably recognizes as background and tries to compensate for, and a few stripes which could just as well be "I" or "l" characters. So this whole thing seems really plausible. BUT if that's true... well, then they all need to refresh their computer science courses. And set their scanning machines to image mode instead of text mode. #digitalNatives

  • @junorus
    @junorus 3 месяца назад +3

    Why do you stick so much to the Xerox scanner? Any machine that does any type of compression could produce similar artefacts. That is why raw data should always be saved and preserved, not only e.g. a JPG. I can easily see valid data being processed into pictures and the compression inducing artefacts. Did that actually happen this time? No idea. Is it dodgy that multiple images seem to have the same type of artefact? Yes, but also no, since the image processing is likely the same within the research group, so the same bad practice will happen again. This is why good practices must be promoted.

  • @tony9146
    @tony9146 3 месяца назад

    Keep up the good work! It is important to report on the issues with these institutions that are held to be so immutable and arbiters of reality.

  • @HydeSkull
    @HydeSkull 27 дней назад +1

    Fraud occurs in science because
    1. It's easy to do.
    2. Academics is a power hierarchy
    3. Peer review is not designed to catch fraud or bad science
    4. Peer review is a method to navigate the power hierarchy
    5. Political pressure
    6. University pressure
    7. Follow the money

  • @Suburp212
    @Suburp212 3 месяца назад +2

    JBIG2 was one of the worst error-introducing events in global history. We still do not know which books and databases are now corrupted, possibly including things like NASA space mission calculations. Unrelated to this story, but still a super interesting event.

  • @c.gerdes-wocken
    @c.gerdes-wocken 3 месяца назад +4

    There might be machines that print out the western blot images, requiring scanning, or some type of image compression software may have been used in the workflow, maybe even unnoticed. A lot of older software had to do such tricks behind the user's back, as the computing power of PCs back then was quite limited. It could also be the layout software of the journal's publisher, as the files get optimized for printing. But I still think Occam's razor points more towards somebody intentionally messing with the data. But you are right: if there were such a pattern algorithm involved somewhere in the processing of western blots, that would be a giant problem for all the corresponding science fields.

  • @mariapospelova4096
    @mariapospelova4096 3 месяца назад +3

    It doesn't have to involve a Xerox scanner. JBIG2 is a compression algorithm that can be implemented by various software products. Pictures made by the machine were likely processed after they were made: resized/converted/cropped. The article print layout could also be converted to PDF, for example, before printing. PDF can use JBIG2.
    Besides that, there could be other algorithms that have similar effects.
    So the explanation actually sounds fairly plausible to me.

    • @readme373
      @readme373 2 месяца назад

      A chemiluminescence imaging machine wouldn't be saving in JBIG2 format, sorry

  • @erdngtn9942
    @erdngtn9942 3 месяца назад +1

    Imagine you get another 10 years in prison on top of your sentence because of this error. Unless it's all BS.

  • @MI-wc6nk
    @MI-wc6nk 3 месяца назад +5

    It should be easy to reproduce or falsify.

  • @Houshalter
    @Houshalter 3 месяца назад +1

    The xerox scanner corruption issue is a massive issue for old documents that have been digitized. The originals are often destroyed, and few people know about the issue. So many records are now untrustworthy and corrupted, and no one knows.

  • @press2701
    @press2701 3 месяца назад +5

    OK, but seriously: if image fidelity, resolution/contrast, is CRITICAL to an experimental thesis, would you reproduce it using sloppy 70s-era xerography, digital image compression, whatever? I think not. My PhD thesis had a few critical x-ray images. I spent over $200 (this was 1990, when $200 was a lot of money to a grad student) to MAKE sure the images were high quality in every physical thesis copy. (The dept insisted, not me).
    So, sorry, I'm not buying it. The Stanford folks were either sloppy and stupid (unlikely). Or maliciously trying to bias their story in a paper, hoping nobody would notice. Or if somebody did, they would let it slide, 'everybody does it', thinking. Just like current politics "we lie, yeah, ok, but what about their lies" under the guise of "it's ok to lie if the other guy is too". No shame left in our institutions, or am I (hopefully) wrong?

  • @alanklement5165
    @alanklement5165 3 месяца назад +3

    Another great video. Keep up the work.

  • @v4thjhr
    @v4thjhr 3 месяца назад +6

    If this researcher were honestly and truly trying to get to the bottom of the issue, instead of just offering any excuse to try to get this to blow over, he would sit down with as many people as he can who were responsible for the process of producing the blots and getting the results into the form used in his papers. Walk through the steps in detail, show the work being done in a way that's reproducible, and at the end you should be able to see cases of such artifacts occurring. If the artifacts are never seen no matter how hard these different people try to produce them, then it becomes much more likely that a manual process (i.e. pasting in manually) is what happened. But then doing all this just to arrive at that conclusion would be an even bigger embarrassment for him.

  • @MechMK1
    @MechMK1 3 месяца назад +13

    Honestly, this at least seems plausible. It's still possible the study was fake, but it's not evident beyond reasonable doubt.
    I'd like to see the result of replication studies.

    • @readme373
      @readme373 2 месяца назад

      If you've ever used a chemiluminescence machine for western blots... 0% chance they would be saving files in JBIG2 format

  • @AlexandruVoda
    @AlexandruVoda 3 месяца назад

    Kudos for addressing this. I was among the ones who commented about this on the previous video. Without knowing more, it was a plausible hypothesis, and it would indeed have very wide-ranging consequences. It would be ideal if we could have access to the original images as output by the machine. I doubt it outputs JPEGs. The machine probably doesn't apply any compression and probably uses TIFF. Meaning the compression step would happen when the paper was assembled and the images were transcoded into something importable. Or some extra editing (data manipulation) happened in between.

  • @johankritzinger4206
    @johankritzinger4206 14 дней назад

    Extremely worrying, both ways. More scrutiny is obviously needed.

  • @RenatoUtsch
    @RenatoUtsch 3 месяца назад

    Note that it's not only when the image is formed by the machine that compression happens. When you insert the image in a word document, or convert it to a PDF, the images are always recompressed in the process. It was common to also have low file size limits for what you could submit to journals, so it's also possible that images would be compressed before being attached to the paper (which would be further compressed when converted to pdf). So a scanner is not necessary for this kind of artifact to happen.

  • @celeste3296
    @celeste3296 3 месяца назад

    I'm only 28 and I've been nearly brought to tears by dealing with company processes that require printing something and scanning it in somewhere else. I've had a boss who does press releases by printing the document, then using the scan-and-email feature of the printer to send it to me. I've been handed documents to fill out for very important things like health insurance that are almost unreadable because they are a copy of a copy of a copy of a copy of a copy.
    You are really, really underestimating the immense inefficiency of people who don't get tech when you say there being no need to print and then scan is evidence this isn't what happened.

  • @kris1123259
    @kris1123259 3 месяца назад +6

    As a software engineer I've seen my fair share of glitchy compression algorithms, and if the CrowdStrike mess taught me anything, it's that it is possible for these bugs to be present in a lot of machines all across the world.

  • @diatonicdelirium1743
    @diatonicdelirium1743 3 месяца назад

    Not just Xerox, I verified similar problems with our HP office machines many years ago.
    You're also forgetting that the machines are actively searching for 'repeats'; the plant image happens to be random enough not to yield matches (or matches that are actually close enough to the original).

  • @daria.746
    @daria.746 3 месяца назад +3

    I agree with many people that it's not unlikely the images were printed and rescanned. Even in today's more modern world I know my university still heavily depends on printing even when not necessary. HOWEVER:
    It is still very unlikely that the machines manipulated images, as this is NOT what has been detected so far, despite heavy investigation. Additionally, it is even more unlikely that the algorithm would alter the results exactly in such a way that it benefits Südhof's argument. It would be interesting to check whether the image manipulation correlates with especially relevant parts of the paper, as this would strongly suggest fraudulent intent, which is still the most realistic assumption.

  • @angrytigger83
    @angrytigger83 3 месяца назад +21

    Throwing all medical research under the bus to save your reputation for the win

  • @cafulbror
    @cafulbror 3 месяца назад +2

    I could imagine there being a compression algorithm in the writing/editing process for these papers, as they will have been shared widely to draft, edit and review. However, the response to this is to find the original figures, not just immediately say it's likely due to some phenomenon seen in the compression of numbers/letters. I also think it's very strange that a compression algorithm would duplicate an image in a way that supports the hypothesis of the paper.

  • @lesialyls
    @lesialyls 3 месяца назад +3

    Keep digging, the problem is far worse than you realise.

  • @treelight1707
    @treelight1707 3 месяца назад +2

    So, if the Western blots were a Xerox issue, what about the duplicate images that were rotated by 90 degrees? Do Xerox scanners do that too?

    • @renerpho
      @renerpho 3 месяца назад

      Rotate matching patterns? Yes.

    • @treelight1707
      @treelight1707 3 месяца назад

      ​@@renerpho 😄

  • @fluffymcdeath
    @fluffymcdeath 3 месяца назад +1

    Software stupidity is a very real concern and not an out there type of hypothesis. How could bad software go out the door and not be noticed for so long? We just saw a small update shut down airlines and hospitals etc. Sometimes testing is not done (or at least done adequately). If a thing looks superficially like it's working it can escape into the wild and be there for quite a while before it is discovered. The JBig2 thing is a demonstration of how far a bad behaviour can go before someone notices and investigates.

  • @tessatalmi4252
    @tessatalmi4252 2 месяца назад

    Some old western blot photo boxes only print you a copy of the picture and don't give you the digital version...like really old ones..

  • @CapsAdmin
    @CapsAdmin 3 месяца назад +1

    If the pattern matching compression algorithm is general and not just on characters, the obvious thing to look for are duplicates and artifacts in places that would have an impact on the results of the study.

  • @IOJFJM
    @IOJFJM 3 месяца назад +1

    I used films to register WBs until as recently as 2013. The developed films needed scanning. However, with the background being identical, it's still very unlikely.

  • @joshuapatrick682
    @joshuapatrick682 2 месяца назад +1

    Someone citing something unrelated to the topic? That’s the Academic way I remember!!!

  • @jjones503
    @jjones503 Месяц назад

    Makes me wonder how many IRS audits failed and fines were applied because of this Xerox issue.

  • @amansawhney3318
    @amansawhney3318 3 месяца назад +1

    I think it is reasonable to expect that, for some of the examples where the image is all white, a compression algorithm would replicate the same noise pattern multiple times. Even the transform approach used by JPEG would probably do something similar.

    • @meneldal
      @meneldal 3 месяца назад +1

      JPEG is too stupid, you can't have references to other blocks. But low quality JPEG can result in similar noise being compressed the same (but they always would end up the same way, even on entirely different images, as long as they are aligned with the block boundary).
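
The block-alignment point can be illustrated with a toy model: a coarse per-block quantizer standing in for JPEG's 8x8 DCT blocks (not real JPEG; `BLOCK`, `quantize_blocks`, and the pixel values are invented for the sketch). Identical noise that lands on the same block boundaries compresses to identical output, while a one-pixel shift breaks the match:

```python
# Toy stand-in for JPEG's block-based lossy coding (not a real DCT):
# quantize each fixed-size block independently. Identical block-aligned
# inputs always produce identical outputs; a one-pixel shift does not.
BLOCK = 8

def quantize_blocks(row, q=16):
    # Split a 1-D pixel row into BLOCK-sized chunks, coarsely quantize each.
    blocks = [row[i:i + BLOCK] for i in range(0, len(row), BLOCK)]
    return [tuple(q * (p // q) for p in blk) for blk in blocks]

noise = [18, 35, 52, 69, 86, 103, 120, 137]     # fixed "noise" pixels
aligned = quantize_blocks(noise + noise)         # same noise twice, aligned
shifted = quantize_blocks([0] + noise + noise)   # shifted by one pixel

print(aligned[0] == aligned[1])   # True: identical compressed artifacts
print(shifted[0] == shifted[1])   # False: the blocks no longer line up
```

Which is why identical-looking compression artifacts say more about identical inputs (or a copy-paste) than about the codec inventing duplicates on its own.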

  • @chrisstott3508
    @chrisstott3508 3 месяца назад

    It's like the Excel recalculation bug, but much worse, because we quite reasonably assume that copiers just copy!

  • @MrFugasi
    @MrFugasi 3 месяца назад +5

    LaTeX was introduced in 1985 to make it easier to produce books and articles, including typesetting as well as the inclusion of digital pictures. In 2004, when I was in school for my Master's degree, it was in full use and was the standard for all journals. You would have created your paper in LaTeX and submitted it to the journal in LaTeX format. Digital cameras were a thing back then, and to take a picture, print it on film, and then scan the picture back in on a crappy Xerox printer is just an absurd notion and defies belief. The level of incompetence needed to do such a thing even 20 years ago would have resulted in such poor quality pictures that they would have been unsuitable for the journal. This Xerox issue may have been a thing, but to think that someone didn't look at their own papers after publishing is absurd. This may be an excuse if it happened once, but for it to happen repeatedly is not acceptable and pushes into fraud.

    • @renerpho
      @renerpho 3 месяца назад

      When you turn your LaTeX document into a PDF, one of the supported image formats is JBIG2...

  • @demibee1423
    @demibee1423 29 дней назад

    Imagine legal documents or treaties or contracts where the scanning/compression converts "and" to "or". Imagine if the original was then destroyed after scanning, to save space. Archivists have been fighting this battle for years.

  • @markolson4660
    @markolson4660 3 месяца назад +2

    I can easily see an algorithm like this being created where, basically, OCR is being done on the image and the image of the text is replaced by smaller instructions on how to generate it.
    The PDF file format has this capability, and when you have a PDF of some text and turn on OCR, the resultant PDF (which does *not* throw away the image of the text) looks exactly like it should, but if you do a cut and paste of the image's text, you get the OCRed text which may or may not closely resemble the image's text. If you use a PDF editor to edit the text, depending on the font used, you will often see the edited paragraph in a different font. There can be a lot of complexity going on under the hood.
    The idea that a compression algorithm would do that to parts of images is just scary.

    • @dxq3647
      @dxq3647 3 месяца назад

      Yes. I've seen this as well.

    • @vylbird8014
      @vylbird8014 3 месяца назад +1

      PDF is a bit of a mess internally, which is why copy-pasting seldom works right. It was designed for a very specific purpose - to provide a consistent and reproducible presentation of a printed page for the publishing industry. Before PDF different renderers and printers might interpret postscript instructions very slightly differently, which was a real headache - a 1% difference in rendered font size means a document might look great when the editor prints it off to review, then destroy a ten-thousand-document bulk print run when the edge of the text isn't quite fitting on the page. So PDF fixes that by embedding everything into the file - a cut-back postscript language, images, fonts, color space definitions, transformations, information on paper size, everything. Unfortunately that is all it was made for - the idea of getting text back out of the document was never important. So it stores instructions on how to render the page, but not things like reading order - are those two adjacent lines of text supposed to be a continuous sentence, or parts of two columns? PDF doesn't care. It's since been revised many times in an effort to fix this problem to address problems like screen readers not working, but the limitation is too fundamental to solve. It's just part of the design.
      Incidentally, one of the compression methods supported within PDF is the JBIG2 of number-mangling fame. But any sensible person writing software producing PDFs knows never to use it in lossy mode. And ideally never use it at all.

  • @kvikende
    @kvikende 3 месяца назад +1

    I love your videos. Thanks for making them!

  • @yeetyeet7070
    @yeetyeet7070 3 месяца назад +1

    that David Kriesel talk is a local legend

  • @theondono
    @theondono 3 месяца назад

    Just an important fact for the people discussing whether it was the actual machine: the duplicates were among *different* samples, and thus would be in different printouts from the machine.
    The scanning error would need to happen on the full document (or at least one containing all the images) for cross-sample reproduction errors to occur.

  • @dotnet97
    @dotnet97 3 месяца назад

    Videos on things like this help a lot with the anxiety of publishing as someone still relatively new to research. If people like this feel no shame publishing potential lies and not doing everything they can to disprove accusations, I shouldn't be panicking over imagined situations where, 10 years down the line, I'm called out for possibly having made some mistake in published peer-reviewed work that I did completely honestly, and for which the important code to reproduce it is all open source (plus retaining the code used for the original figures).

  • @gzaffagnini
    @gzaffagnini 3 месяца назад

    Thanks for this interesting video. FYI, before the digital Western Blot imagers came around, the only way to reveal Western Blots was to develop them on actual photographic film in a dark room (pretty much like developing photos the traditional way). The films were then digitised by scanning them or taking a picture of them with a digital camera. So, theoretically the incriminated blots _may_ have passed through a Xerox scanner to be digitised if they were originally developed on film. As you pointed out, however, it remains unclear (at least to me) whether the pattern matching algorithm of the Xerox machine would have made the same mistakes with photographic patterns (bands of a Western Blot) as with the letters. Not taking sides here, just adding some relevant information (I'm old enough to have developed WBs on films myself...).

  • @ABaumstumpf
    @ABaumstumpf 3 месяца назад

    Thanks for checking that.
    I had commented under the initial video because, well, it is something that should be checked and ruled out, or the liars will simply keep on lying as long as they have any chance of deflecting.

  • @erdngtn9942
    @erdngtn9942 3 месяца назад

    Who else could this have happened to? Not just science but law, law enforcement, etc. It seems a huge lawsuit should be incoming.

  • @blackbeard3449
    @blackbeard3449 3 месяца назад +8

    Even if we assume the xerox scanner hypothesis is true, that only makes this a case of gross negligence instead of actual malice, not much better if you ask me.

  • @unixux
    @unixux 3 месяца назад +1

    You should’ve probably talked to someone with a bit of knowledge. The issue isn’t limited to “old Xerox scanners”, it’s very pertinent to all equipment that has any imaging internally. They all rely on some sort of internal representation and absolute majority use compression, of which jbig2 is extremely common. If I had to estimate, 90% of all equipment and 100% of all budget imaging equipment can have these problems.
    Just open any google book scan or any bad quality book scan for that matter.

  • @nezbrun872
    @nezbrun872 3 месяца назад +1

    "I'm 25, I've literally never scanned a document in my life"
    You remind me of a graduate we took on 30 years ago. He was getting complaints from customers that documents he'd scanned to send out to them had the right number of pages, but all the pages were blank.
    Can you guess why?

    • @firstlast-pt5pp
      @firstlast-pt5pp 3 месяца назад

      Hospitals in north America still "fax" medical documents in 2024

    • @quill444
      @quill444 3 месяца назад

      He was using white toner. ⬜⬜⬜

  • @mrackerm5879
    @mrackerm5879 3 месяца назад

    A very common (and sloppy) method to edit something out of a picture is to grab a portion of the image from another location and do a simple copy and paste. For most things, it doesn't matter such as when the image is only trying to convey an idea, but when the image is the data, it matters.

    • @zurabee788
      @zurabee788 3 месяца назад

      Basic grad student here .. would it be better to save a portion separately and insert it as a picture then??

  • @SomeOne-p6f
    @SomeOne-p6f 3 месяца назад

    The very thing that you don't want in a scanner: pattern matching. Who thought that would be a good idea? I can only think that it was management, trying to increase profits and use less memory. The fallout from this is just off the charts.

  • @abuferasabdullah
    @abuferasabdullah 3 месяца назад

    Unbelievable. Great episode 👍🏼

  • @yiskanight
    @yiskanight 25 дней назад

    Another great video, thanks! One thing: my university has tons of printers with a combined scanner available. Most students have a laptop and don't bother owning a printer (I think that's common?), and many profs here like hard copies handed in. Printing is 10c a page, scans 3c. It's super common to scan on a big ol' printer, so I'm wondering why this is strange to some.

  • @skipper6528
    @skipper6528 3 месяца назад +6

    Best channel for university malpractice

  • @zagaberoo
    @zagaberoo 3 месяца назад

    Finding no duplication within a *single* organic image doesn't really say much about a table with several potentially highly-similar organic images in it.

  • @osakaandrew
    @osakaandrew 3 месяца назад +1

    Recall the fMRI software issue that caused swaths of published papers to get pulped. I haven't a clue how likely that would be here, but it's hardly unprecedented.

  • @simonxag
    @simonxag 3 месяца назад +7

    This is a lot more plausible than you seem to be assuming.
    1. That compression algorithm is beyond appalling. Even JPEG is lossy: it would have compressed the image (including the photo), but information would be lost and (worse) artefacts would have been created. This Xerox algorithm can't handle the photo, but it seems to look for similar discrete elements against a white background and substitute one for another (6 is not 8, but what the hell!!!). The substituted element is not altered; the software doesn't know an "8" from a blot, it just copies. Apply that to blots and you have (by definition) duplication.
    2. You are very young, and computer technology has moved extremely fast. The "paperless office" was, for a long time, a joke. As an undergraduate I coded with punched cards, and later as an office worker the fax machine was how you sent documents rapidly. In Blade Runner's sci-fi future Deckard still asks for a "hard copy". Much of the complete shift to an online world happened as late as 2020 (something must have happened that year!).

    • @Wlerin7
      @Wlerin7 3 месяца назад +3

      >something must have happened that year
      If only someone were still alive who remembered. Alas, so much has been lost, like tears in rain.

    • @vylbird8014
      @vylbird8014 3 месяца назад +1

      You actually figured it out. That is indeed how JBIG2 works. It creates a 'glyph dictionary' containing the pixel layout of each symbol that appears on the page, plus a list of where to draw each one to reconstruct the original image. In lossless mode the dictionary contains every unique glyph, even if two glyphs differ by a single pixel, so there's no problem. In lossy mode it will condense symbols that are very, very similar, differing only by a pixel or two, since such small differences are usually just the result of scanning noise, moiré, alignment or dirty paper. But that also creates the possibility of condensing symbols that really do differ by only a couple of pixels, like 8, B, 6 and 0.
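      The glyph-dictionary mechanism described above can be sketched in a few lines of Python. This is a toy illustration, not real JBIG2: the bitmaps, the threshold, and the function names are all invented for the example.

      ```python
      # Toy sketch of lossy symbol substitution: glyphs whose bitmaps
      # differ by only a few pixels share one dictionary entry, so an
      # "8" can silently be drawn in place of a "6".

      def hamming(a, b):
          """Count differing pixels between two equally-sized bitmaps."""
          return sum(p != q for ra, rb in zip(a, b) for p, q in zip(ra, rb))

      def build_dictionary(glyphs, threshold):
          """Return (representative bitmaps, index of the entry used for each glyph)."""
          dictionary = []   # stored representative bitmaps
          placements = []   # which dictionary entry each input glyph is drawn with
          for g in glyphs:
              for i, rep in enumerate(dictionary):
                  if hamming(g, rep) <= threshold:
                      placements.append(i)   # lossy: reuse the similar glyph
                      break
              else:
                  dictionary.append(g)
                  placements.append(len(dictionary) - 1)
          return dictionary, placements

      # Two 5x3 bitmaps for "8" and "6", differing by a single pixel.
      EIGHT = [(1,1,1),(1,0,1),(1,1,1),(1,0,1),(1,1,1)]
      SIX   = [(1,1,1),(1,0,0),(1,1,1),(1,0,1),(1,1,1)]

      # Lossless (threshold 0): both glyphs are kept distinct.
      d_lossless, p_lossless = build_dictionary([EIGHT, SIX], threshold=0)
      # Lossy (threshold 2): the "6" is drawn with the stored "8" bitmap.
      d_lossy, p_lossy = build_dictionary([EIGHT, SIX], threshold=2)
      ```

      With threshold 0 both glyphs survive; with a small positive threshold the "6" is silently rendered from the stored "8" entry, which is exactly the substitution failure Kriesel documented.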

  • @jasonsurra8077
    @jasonsurra8077 3 месяца назад +1

    he cheated. good try. he will never admit it. his life is too comfortable and he is a celebrity. he just needs to reproduce the results under supervision to prove that he did not rather than cite the xerox problem. cheap shot at defending himself.

  • @currawong2011
    @currawong2011 3 месяца назад +2

    I would have thought that if such machine error were not uncommon, a vast spectrum of papers in many areas would be showing the same errors... so far, seemingly not.

    • @Sashazur
      @Sashazur 3 месяца назад +2

      Papers would be a drop in the bucket compared to business content. You would think that if this was a real issue it would have come up already.

    • @vylbird8014
      @vylbird8014 3 месяца назад

      Most programmers in a position to be developing software that might use JBIG2 are also knowledgeable enough about the subject to know the danger and avoid ever using JBIG2 lossy mode. Xerox are a notable exception. It's not known exactly how such a well-known issue was missed during development, but someone decided to use lossy JBIG2 in a production product.

    • @renerpho
      @renerpho 3 месяца назад

      @@Sashazur The original Xerox talk discusses this. David Kriesel expected no such reports to come out, even though he had been told (under condition of anonymity) by people working for multiple companies that they were affected and were looking for advice on how to deal with it. These companies have no interest in making such issues public, even if they weren't at fault. I actually expected to hear about this in science specifically, because there's a more open mindset compared to corporations.

    • @renerpho
      @renerpho 3 месяца назад

      @@vylbird8014 Correct me if I'm wrong, but the big issue with the Xerox bug was that, contrary to what Xerox had initially claimed, the bug would also appear when you set the scan quality to "highest", which they marketed as essentially lossless.

  • @mina86
    @mina86 3 месяца назад

    10:00 - no one programs stuff from scratch. If they need a machine to take pictures, they design the machine and take an existing piece of code for handling the pictures. It's not unthinkable that whoever manufactured it ended up using the same or similar code to Xerox's. This is not to say that the defence stands, but rather to point out that insane things sometimes get programmed.
    Not to mention the image was probably further processed by other tools, and any one of them could have used this compression algorithm.

  • @drmadjdsadjadi
    @drmadjdsadjadi 3 месяца назад

    I am beginning to think that Sudhof’s Nobel Prize research needs to be examined more thoroughly.

  • @anathardayaldar
    @anathardayaldar 3 месяца назад

    1:32 "Many of you are far more qualified than I am."
    lol

  • @binaryboy
    @binaryboy 2 месяца назад

    This potential problem is deeper than a scanner issue. It's the compression itself, which means it could happen even with a software-to-software conversion. Don't stop at the scanner.

  • @sebastianahrens2385
    @sebastianahrens2385 3 месяца назад

    There is a very interesting talk by David Kriesel at the CCC (a German computer science convention) from YEARS ago, where Xerox machines were responsible for the same kind of thing. Numbers suddenly changed in documents, causing an untold amount of economic damage.
    IF that happened again here, it appears those things are unreliable AF.

  • @falrus
    @falrus 3 месяца назад

    If there was a physical paper original involved, theoretically it might be found and rescanned. We have to archive our lab notebooks.