I love the conclusion of this video. In popular media there is often an obsession with "great minds" and "geniuses" and "breakthroughs", but as you point out science is actually a process of many small contributions and little steps forward (and backward). Very well said!
"Genius breakthroughs" have their place too, at least in certain fields. While they may receive a disproportionate amount of public attention, they also have great impact and influence on research itself. This is clearly the case for example in theoretical physics and mathematics.
THIS! Thank you, this is one of my biggest pet peeves: Hollywood's and the general media's obsession with depicting big breakthroughs as acts of singular genius by brilliant minds who were enlightened by a stroke of genius in a moment of clarity. I assume people love this for the same reason they love 'brilliant' billionaires advancing the world. If it's those genius people who will progress the world with their brilliant minds, then I (who ofc am not a supergenius) don't actually need to do anything. 'I couldn't anyway, that is for geniuses!' So it's quite comforting: it avoids feeling any pressure and having to concern yourself with 'boring' ish like 'process' and 'incremental advancements' or what role I could play in making the world better.
@@jameson44k 'Genius breakthroughs', in my experience, are scientific advancements made over months and years, explained through one single moment in that years-long process. Like claiming the assassination of Archduke Franz Ferdinand caused World War 1.
@@DarkHarlequin Popular media may depict it that way, I don't really pay attention to them. Mature scientists and mathematicians, on the other hand, tend to care about the steps to breakthroughs and about correct attributions for discoveries.
As a PhD candidate in physics, I can tell you that after reading hundreds of papers, we all learn one very important thing -- none of us know how to write well :)))
"The first law of papers states that science is a process. Do not look at where we are now, but look at where we will be two more papers down the line." - Two Minute Papers, in basically every video :) Great video!
@@olgierdvoneverec4135 Not exactly. Simulating water in TMP's context is about graphics processing which has more to do with fluid dynamics, whereas the simulation here is actually about the molecular dynamics algorithm, one of the most common few-body chemistry algorithms in existence.
Great video. I'm assuming the water simulation paper ranks so high because it set the precedent for computer modelling of molecular interactions through, e.g., Monte Carlo simulations. This is a technique used by pretty much all chemical simulations - I used it in my research project on Metal Organic Frameworks.
Yep. I have cited this paper myself. It is the standard water model paper that is commonly cited when using any of the standard water models (SPC, TIP3P, TIP4P), and since water is the most common solvent in biological and chemical systems.... well. And whenever you tune the interaction parameters of the existing models or develop your own water model, this is the gold standard data set you compare against. Also, these water potentials are commonly used in most force field (i.e. interaction model) families. - A PhD student in computational chemistry
One of my dad's friends, who works in programming, spent the better part of 12-ish years creating, optimising and utilising modelling programs that simulate water waves. My super vague memory of his explanation is that surface waves are affected by the forces of everything around them, so simulating them was extremely difficult with the minimal processing power they had back then.
@@eryana9437 could not agree more, as a fellow computational chemist. If you use molecular dynamics simulation for example you may end up citing this paper.
Completely ignoring the content of the video (which is really amazing) but I'm loving your editing. Graphics are super engaging, matching the pacing so nicely. I literally have been taking notes of cool ways you keep everything fresh with different animations etc:D Ultimately the content matters more than the editing, but I think this is a great example of how editing can make the content just that little bit more engaging.
100% agree!!! Love the awareness and the comeback of science as I think it was meant to be in the past. Trying to understand the world as a symbiosis between nature and humanity. All equal all important all to be taken care of.
"Earth Scientists don't like writing papers" As an earth scientist, very true. I'd rather go hit a rock with a hammer. But in all seriousness, I suspect the lack of earth science on the list comes from it being such a new and fast changing field. By the time any one paper gets cited ~500 times another paper describing a better method has been written!
One other method I've heard of to rank papers is not by their citations, but by the number of citations their citations had, so 'child' citations as it were. This could drastically change the list, removing many of the 'method' papers and potentially including papers which spawned entire branches of scientific enquiry.
It would also be interesting to see some kind of h index of the work (so instead of looking at the h-index of the author over his papers, you would do that over the children of a paper)
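For anyone who wants to play with these two ideas, here is a minimal sketch of both: counting 'child' citations (citations received by the papers that cite a given paper) and taking an h-index over those citing papers. The citation pairs and the function names are made up for illustration; a real analysis would need an actual citation database.

```python
from collections import defaultdict

# Hypothetical toy data: each pair (citing, cited) means "citing cites cited".
CITATIONS = [
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("E", "B"), ("F", "B"), ("G", "B"),
    ("H", "C"), ("I", "C"),
    ("J", "M"), ("K", "M"), ("L", "M"), ("N", "M"),
]


def build_cited_by(pairs):
    """Map each paper to the set of papers that cite it."""
    cited_by = defaultdict(set)
    for citing, cited in pairs:
        cited_by[cited].add(citing)
    return cited_by


def direct_citations(paper, cited_by):
    """Number of papers that cite `paper` directly."""
    return len(cited_by.get(paper, ()))


def child_citations(paper, cited_by):
    """Total citations received by the papers that cite `paper`."""
    return sum(direct_citations(child, cited_by)
               for child in cited_by.get(paper, ()))


def child_h_index(paper, cited_by):
    """Largest h such that h citing papers each have at least h citations."""
    counts = sorted(
        (direct_citations(child, cited_by) for child in cited_by.get(paper, ())),
        reverse=True,
    )
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank
    return h


cited_by = build_cited_by(CITATIONS)
for paper in ("A", "M"):
    print(paper,
          direct_citations(paper, cited_by),
          child_citations(paper, cited_by),
          child_h_index(paper, cited_by))
```

In this toy data, paper M has more direct citations than paper A, but nobody builds on the papers citing M, so A comes out ahead on both 'child' measures. That is roughly the method-paper versus branch-spawning-paper distinction described above.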
Yeah, doing a serious analysis of that graph would be awesome. Deepest lengths favored older papers and difficult and complex processes, more branching suggested x, Nobel prize studies stood out for this reason, and overloading the computer and crashing with no results favored earth science
The funny part is that scientific fields are so quirky and divided, with bad search functions, that he wasn't even able to just get all the most cited papers by himself. It might be because of the monetisation models of peer-reviewed journals, or the fragmentation that comes from having so many journals.
Interestingly with some of these big discoveries like DNA's structure, I think a lot of these concepts almost become like "public domain" science, so to speak. People learn these things as a part of their formal didactic learning, and it seems superfluous in the modern day to dig out and cite these papers.
People have also stopped citing the bioscience technique papers in this video (like SDS-PAGE, western blot...) because they too have become "public domain" knowledge, simply because they're so widely used.
Yes, exactly! Many of these things are public knowledge now, we all know what DNA is (sort of), so when you mention DNA in a paper, you don't have to cite it to prove that DNA is a real thing. So over time, it just becomes accepted. Kinda unfair but at the same time, I would love to discover something that can become public knowledge like that :D
Yeah. If you cited Origin of Species for the concept of Evolution, it would look like your paper had a minimum literature review requirement and you were padding
I expect the list is definitely missing some of the big Machine Learning papers: Hochreiter and Schmidhuber (1997), which introduced LSTMs, is at 60k+ citations, Vaswani et al. (2017), introducing transformers, is at 35k+, Bahdanau et al. (2014), introducing attention, is at 20k+, and Krizhevsky et al. (2012), which popularised deep CNNs for image recognition, is over 100k now.
I'd wager that a big part of why ML papers are doing so well is the same as why most of the papers in this list are there: their application to the biomedical field.
@@GalaxyInfernoCodes It also has a very similar structure to other experimental sciences, where certain techniques (skip-connections, attention, CNNs, GANs, ReLUs, RNNs) are standards. These are the things that usually work (->citation) or, surprisingly, perform badly (->citation). The amount of stuff that generally works is minuscule, so you will see the same papers cited all the time for theoretical analysis, architecture upgrades or applications.
Did not know about the Higgs boson paper, interesting. The Jorgensen et al. paper (which I've cited in my thesis and MD papers) is so high because anyone who has ever performed solution-based molecular dynamics simulations (e.g. protein MD) will have cited it when justifying the water model they used.
There's also the problem of "fake citations" which are not, as one would expect, works cited but not actually read. Fake citations are citations that the author was blackmailed into adding if they wanted to see their own paper published. This apparently happens a lot, especially in universities, and goes hand in hand with tenured professors blackmailing PhD candidates and junior researchers into listing them as co-authors on research they didn't take part in.
Oh wow. I didn’t know that happens-that is, the blackmailing of Ph.D. students by their tenured professors. 😨 Currently in grad school but in the social sciences, though.
@@layla-talmedina5733 these things are rare, but I've heard enough stories that I'm pretty confident that any abusive situation like that that you can think of has probably happened. Thankfully, not most of the time, but universities are very good at keeping their scandals hidden.
It sadly seems to be pretty common. And in the cases I encountered, it's not because people are forced to do it, but because some authors, especially in the social sciences, seem to have an inner urge to splash around fancy quotations, mostly from famous philosophers. Unfortunately for them, I studied philosophy and have read some of the works they cited. They obviously hadn't read these works, because the citations were either used in cases where they didn't make any sense at all, or were completely misinterpreted to fit the author's thesis, or were simplified to a point where they didn't make any sense anymore, which was then pointed out as a flaw, presumably to show what great advances the authors had made in their fields. Well, at least the latter was done to Plato's "Politeia"; Plato himself, with high probability, pulled the same trick with the positions of his/Socrates' philosophical "opponents", the Sophists.
It happens sometimes in smaller fields when papers get submitted for review. Since the field is so small, it is very likely that the experts selected for review also have relevant papers in the field. Typically the reviewers remain anonymous, but you can often tell when one of the reviewers tells you that you also should have cited the 4 papers from that specific author. Not hard to guess in that case who one of the reviewers is :) Most often you just want your paper published and don't want to argue with a reviewer, so you just include the citations and are done with it.
Can you find the top 100 for humanities subjects? I think it would be really interesting to see how they break down between subjects, even if you didn't read them all.
Imagined Communities by Benedict Anderson (about the origins of nationalism) is widely understood to be the most cited work in the humanities. It is a must read!!
I literally had a lecture about the contents of this video this morning! What amazing timing. It was mostly focussing on ecology/conservation scientific papers.
I love the idea that becoming a great researcher does not mean finding something great yourself, but discovering something that helps others find things.
As a 2nd year chemistry undergraduate in the UK, I found this video to be very interesting - for me, the single greatest scientific publication of all time would have to be E.J. Corey’s formulation of retrosynthetic analysis, which has enabled the synthesis of very complex organic molecules to be broken down into known precursors, giving access to natural products that are otherwise scarce and not bountiful from plant sources.
As a chemist who used Density Functional Theory in my own undergraduate thesis: it is pretty simple, actually. But you really need to see a physical system and its model from a quantum mechanical POV. I think that is particularly hard for you because climate and weather modelling use a classical mechanics POV instead of a quantum mechanical one.
@@MrNicoJac imagine throwing electrons at a bunch of nuclei (protons and neutrons) and finding the optimal way in which they’d want to sit. From there, you can figure out the quantum mechanical makeup of the system (which is otherwise very expensive to compute accurately) and determine properties of the chemical system! :)
The first rule of DFT: if you don’t know what you’re doing, just use B3LYP/6-31+G* as a black box solution. The second rule of DFT: when the first rule inevitably breaks down, call a computational chemist ;)
@@AdreaSnow this. As an organic chemist - B3LYP is almost all I need for single molecules. Also, I might ask, your use of the term “expensive” refers to “takes long to compute/a lot of computing power”?
I love your conclusion about how science is ultimately a collaborative human endeavor that's done incrementally! I'm doing my PhD in physics education research, and my work isn't intended to cause huge breakthroughs or get tons of citations. It's designed to be a tool for teachers and professors to make the environment of physics better and to make learning it more fulfilling. I love doing research because I don't just have a direct impact on a few classrooms or departments, I have a tangential impact on the thousands of students who learn physics every year. I have so much love and respect for the people who work in the labs every day and publish their increments to move science as a whole forward. It's a shame that instead of celebrating that, physics continues to project the idea of geniuses who change everything overnight.
Hey, in Cognitive Science, all studies measuring reaction time essentially have to measure and report the distribution of handedness among participants. We use "handedness inventories" because handedness is actually more of a spectrum than a binary if you dig into it, and so we all cite that article! I've cited it myself.
You should try to use PageRank to find the most impactful papers: many big discoveries are big because they launch their own sub-field (e.g. Einstein's theory of relativity). This induces a "Cambrian explosion" in that field, which often leads to many papers being published and superseding each other. In total, the number of citations for the original paper will be low, because it was replaced with an updated version which is now cited by everybody, but in terms of total impact these are way more crucial to science. PageRank is a way of modelling these transitive citations on a large scale (you could also compute the transitive closure of citations, but I have a feeling that this is going to blow up computationally).
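A rough sketch of what that could look like, assuming a tiny invented citation graph: plain power-iteration PageRank in pure Python, where each paper passes its score on to the papers it cites. The graph, the paper names and the damping value of 0.85 are illustrative assumptions, not real data or a tuned setup.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    `links` maps each paper to the list of papers it cites, so score
    flows from citing papers to cited papers.
    """
    papers = set(links)
    for cited in links.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            out = links.get(p, [])
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            else:
                # Papers that cite nothing spread their score evenly.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical graph: "origin" is cited only by two successor papers,
# which are themselves widely cited; "method" has more direct citations,
# but from papers that nobody cites.
links = {
    "succ1": ["origin"], "succ2": ["origin"],
    "a1": ["succ1"], "a2": ["succ1"], "a3": ["succ1"],
    "b1": ["succ2"], "b2": ["succ2"], "b3": ["succ2"],
    "c1": ["method"], "c2": ["method"], "c3": ["method"],
}

ranks = pagerank(links)
for paper in sorted(ranks, key=ranks.get, reverse=True)[:4]:
    print(paper, round(ranks[paper], 4))
```

Here "origin" ends up ranked above "method" even though it has fewer direct citations, because the score of its successors flows back to it. On a real citation graph with millions of papers you would want a sparse-matrix or library implementation rather than nested Python loops.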
As someone who switched from a PhD in molecular biology (and was familiar with most of those top bio papers in the list and have used their techniques) into deep learning research in the last few years, I must point out that the mentioned "dominance" of the biosciences (11:04) in terms of citations is a total mischaracterization at this point. Initially, I thought I'd leave my comment there, but seeing as I'm a scientist I wanted to see if I could corroborate the claim with some evidence. So I went online and found 33 deep learning papers within *just the last 10 years* that have >12,000 citations (or just about; 2 papers have ~11k). I also pulled out 6 classic deep learning papers that should also fall into this top 100 list (well over 12k citations each). That makes 39 deep learning papers to match the 39 bioscience papers in Nature's top 100 list back in 2014, except deep learning did it largely in *just 10 years* (2012-2022; or 6 years (!!) if you go by publication date -> 2012-2018) whereas biology has had ~70 years to do the same. In other words, deep learning is at least an entire order of magnitude faster at accumulating citations than the biosciences at this point. However, Simon's observation that a lot of top cited papers describe methods is also absolutely true in deep learning! 
Below I include the list of 39 papers from deep learning in the format [title], [first author], [year] - [citation count at time of writing; April 16 2022]:
New (>= 2012; past 10 years): count = 33
Explaining and harnessing adversarial examples, Goodfellow et al., 2014 - 11,109
Semi-Supervised Classification with Graph Convolutional Networks, Kipf et al., 2016 - 14,122
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford et al., 2015 - 11,824
Mastering the game of Go with deep neural networks and tree search, Silver et al., 2016 - 12,604
Human-level control through deep reinforcement learning, Mnih et al., 2015 - 18,995
Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al., 2014 - 23,057
Attention is all you need, Vaswani et al., 2017 - 39,581
Faster R-CNN: towards real-time object detection with region proposal networks, Ren et al., 2015 - 41,452
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe et al., 2015 - 35,811
Adam: A Method for Stochastic Optimization, Kingma et al., 2014 - 103,576
ImageNet Classification with Deep Convolutional Neural Networks (AlexNet), Krizhevsky et al., 2012 - 106,136
Distributed Representations of Words and Phrases and their Compositionality (Word2Vec), Mikolov et al., 2013 - 32,822
Generative Adversarial Networks, Goodfellow et al., 2014 - 43,242
Deep Residual Learning for Image Recognition (ResNet), He et al., 2015 - 113,256
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., 2018 - 36,781
Visualizing and understanding convolutional networks, Zeiler et al., 2013 - 15,007
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, He et al., 2015 - 15,184
Dropout: A simple way to prevent neural networks from overfitting, Srivastava et al., 2014 - 35,027
Auto-encoding variational Bayes, Kingma et al., 2013 - 19,708
Rethinking the inception architecture for computer vision, Szegedy et al., 2016 - 19,059
Going deeper with convolutions, Szegedy et al., 2015 - 38,930
Very deep convolutional networks for large-scale image recognition, Simonyan et al., 2014 - 76,765
You only look once: Unified, real-time object detection, Redmon et al., 2016 - 23,594
Fully convolutional networks for semantic segmentation, Long et al., 2015 - 26,340
Fast R-CNN, Girshick et al., 2015 - 19,658
Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al., 2014 - 23,117
Sequence to sequence learning with neural networks, Sutskever et al., 2014 - 18,404
Learning phrase representations using RNN encoder-decoder for statistical machine translation, Cho et al., 2014 - 18,504
Glove: Global vectors for word representation, Pennington et al., 2014 - 27,318
Efficient estimation of word representations in vector space, Mikolov et al., 2013 - 28,197
SSD: Single shot multibox detector, Liu et al., 2016 - 21,032
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard et al., 2017 - 12,698
Mask R-CNN, He et al., 2017 - 18,084
Older (
I enjoy reading very old papers that contain few references or none at all. For example, Riemann's famous paper about the number of primes where he formulates the Riemann hypothesis mentions Euler, Gauss, and Dirichlet in passing but has only one reference: (Jacobi, Fund. S. 184)
I think mathematics especially has a culture where you only cite something if you use a result from it, while in some other sciences it is much more common to give a brief overview of the field where you give lots of citations.
What I think is a fascinating thing about scientific papers is the age range, depending on what you're studying. The extremes are earth science, in which you can use papers that are over two hundred years old, and nanotechnology, in which anything before 2015 is considered outdated.
I didn't expect this video to make me cry. I feel like I've just seen science in a new light. The quote by Newton still has me tearing up as I type this. Humans are a collaborative animal, and it is so beautiful to view science as a human endeavor, built up in little steps. Thank you for this video.
I think you might be wrong on the machine learning prediction. Since the release of the rankings you used, a few AI papers have already cracked the top 5, with over 100k citations (and these papers are only 6 years old). Machine learning papers are only getting more dominant in the field. Speaking as a computer science researcher.
@@emmazhang2418 yeah but the prediction was made as an 'acknowledging change since the list was made' section of the video. So he was accounting for recent changes. He even directly shows one of the machine learning papers I was referencing in the video despite it not being on his list.
I think your discovery is spot on: science does not progress by the work of genius, but by the solid and honest work of those who have dedication, if not necessarily talent. We celebrate the seminal points and the singular achievements because humans are biased that way. It contributes to the general misunderstanding of science and technology.
Reminds me of a comment one of my old professors made, which went something like: “My most downloaded paper is closing in on 30 downloads. That means everyone in that particular field has downloaded it at least twice.” And I feel THAT sums up so much of academia.
Super interesting! I used some of those techniques myself! I never even thought about the fact that they could be the most cited papers. This is the type of information I didn't know I wanted to know. Great video as always!
Listening to this video while reading papers in my lab ☺️ Molecular biology is definitely the most exciting field. It’s funny hearing Simon say techniques I’m very familiar with but he isn’t. The tables are definitely turned compared with the atmospheric physics.
As a materials scientist, I would probably say that the graphene and nanotube papers are mostly about the methods, rather than some fundamental discovery. At least in the same sense as the bio papers. Just because something is a method doesn't mean it wasn't a revolutionary finding in its own right.
For the water paper, water in molecular dynamics (MD) simulations can have important interactions with your protein/carbohydrate/compound of interest. The water molecules are treated classically as a predefined number of point charges (the number in the name for the TIP models), as treating them with QM is too costly. A water model with fewer simulated points is noticeably faster but potentially less accurate. Every MD paper's methods section will cite the water model used and, as mentioned in some comments below, this is the initial gold standard paper comparing different water models, indicating they don't vary THAT much (depending on context, of course).
Jorgensen was my orgo professor at Yale. If his long-winded ramblings were any indication, his work on simulating water was the basis of a lot of computational chemistry for biomolecules. Basically, all other work on simulating carbohydrates, proteins, etc. needs to use this work on water to figure out what conformations molecules will take in an aqueous environment. I think it was especially important for understanding the thermodynamics of protein folding.
Man, I'm a professional biotechnology engineer and this is a gold mine to me! I'm very happy I found this video, thank you for supporting science. Greetings from Chile!
Seems like many of the recent papers from the machine learning/CS field are missing from the list. For example, AlexNet has more than 100k citations in less than one decade. If you list papers by rate of citations (citations / no. of years since the paper was published), I assume that list will be dominated by deep learning.
I think you are right that deep ML/DL would dominate the list when using the year-on-year rate of citations. But ranking papers from such a vast umbrella as science is very difficult. It's like the debate over the top 10 sportspeople of all time. It will always come with some sort of unwanted/wanted bias.
Not sure if that's true. There are a lot of papers on SARS-CoV-2 that have been cited tens of thousands of times in the short timeframe since early 2020. 100K citations in a decade is "only" 10K per year.
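To make the "citations per year" idea concrete, here is a small sketch. It uses three of the citation counts quoted in the deep learning list earlier in this thread (as of April 2022) and simply divides by the years since publication; the function name and the choice of 2022 as the reference year are my own assumptions.

```python
def citations_per_year(citations, published, as_of=2022):
    """Average citations per year since publication (at least one year)."""
    years = max(1, as_of - published)
    return citations / years


# (title, publication year, citation count), figures as quoted in this thread.
papers = [
    ("AlexNet", 2012, 106_136),
    ("Adam", 2014, 103_576),
    ("ResNet", 2015, 113_256),
]

by_rate = sorted(
    papers,
    key=lambda p: citations_per_year(p[2], p[1]),
    reverse=True,
)
for title, year, cites in by_rate:
    print(f"{title}: {citations_per_year(cites, year):,.0f} per year")
```

A flat average like this hides the fact that citations usually ramp up over time, so a fairer comparison would probably look at citations within a fixed window after publication.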
Your videos always communicate science in enjoyable and engaging ways, where it's much easier to understand these topics. I'm actually a biology student, so this video was really interesting. I do research in a plant biotechnology lab, and yes, we do use assays 🙂
Very good video, I am a social scientist specialising in bibliometrics and the mapping of science. What you say about single authored papers is actually very unique.
I think a great thing to look at would be in-domain highest Z scores of citations. The citation norms between different areas are just too big to actually compare them in this way. I'd also be very interested in seeing some of the softer-science papers found using this method.
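As a sketch of what in-domain Z-scores could look like, the snippet below standardises each paper's citation count against the mean and standard deviation of its own field. The papers, fields and counts are invented for illustration, and since real citation counts are heavily skewed, a log transform before standardising might be a better choice in practice.

```python
from statistics import mean, pstdev


def field_z_scores(papers):
    """Return {title: z-score of its citations within its own field}.

    `papers` is a list of (title, field, citations) tuples.
    """
    by_field = {}
    for _, field, cites in papers:
        by_field.setdefault(field, []).append(cites)

    stats = {field: (mean(c), pstdev(c)) for field, c in by_field.items()}

    scores = {}
    for title, field, cites in papers:
        mu, sigma = stats[field]
        scores[title] = 0.0 if sigma == 0 else (cites - mu) / sigma
    return scores


# Hypothetical data: a high-citation field and a low-citation field.
papers = [
    ("bio-1", "biology", 1000),
    ("bio-2", "biology", 2000),
    ("bio-3", "biology", 9000),
    ("math-1", "mathematics", 10),
    ("math-2", "mathematics", 20),
    ("math-3", "mathematics", 90),
]

scores = field_z_scores(papers)
for title in sorted(scores, key=scores.get, reverse=True):
    print(title, round(scores[title], 2))
```

In this toy example "math-3" with 90 citations stands out within its field exactly as much as "bio-3" with 9000, which is the kind of cross-field comparison raw counts cannot give you.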
Great video! There's a Machine Learning paper that has over 100k citations. It was published in 2016 and it's called "Deep residual learning for image recognition", He, Kaiming, et al.
I am a material scientist working with density functional theory and felt pretty offended at 7:00! Now in all seriousness, you got me interested in looking up those physical chemistry papers, because they might be up my alley :)
Simulating water could be high up because of computer and movie graphics. It is one of those problems a lot of people have to solve when simulating reality. I guess a lot of people have tried to find better ways, building on-top-of the solution suggested in that paper.
I think (though this is just a hypothesis) that, as the water paper is an early form of molecular dynamics simulation, it might be a good reference for new MD programs. I also had to learn about that paper, so I guess people also really like it xp
Reading a scientific paper is like randomly grabbing a Harry Potter book out of sequence, opening it at a random page and trying to make sense of the plot without any prior knowledge of the franchise.
As a geology student it is a little sad to see zero Earth science papers in the list, but I think it was worth it for that little quip towards the end there lol
This video is incredibly admirable - not only is it good watching and, of course, good for your channel, but I think these types of videos are really necessary in our age - science is changing (and not always for the better) and being grounded in how and why certain papers make it into history is always a great thing to have at our fingertips.
Great video and I can appreciate that a lot of difficult reading went into making this one! I'm a PhD student at Cambridge, doing some molecular simulation in my project, so I can hazard a guess that the reason the "Comparison of simple potential functions for simulating liquid water" paper is so highly cited is this: the choice of force field/potential function is crucial to molecular simulations (both molecular dynamics and Monte Carlo techniques), as it's what describes the forces between atoms in our in-silico system (simulation box). It was actually surprisingly difficult to get a force field to accurately reproduce the properties of water through simulation - in fact, there are some 80,000+ studies of this alone in the molecular simulation literature. Secondly, water is in pretty much any biological system and most chemical systems, so there must be a lot of scientists who go back and cite this when describing the choice of force field in their study - perhaps I should too! Hope that explains it :)
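For readers wondering what a "simple potential function" for water looks like in code, here is a rough sketch of a rigid three-site model: point charges on the oxygen and the two hydrogens interacting through Coulomb's law, plus a single Lennard-Jones term between the oxygens. The numbers are TIP3P-like values quoted from memory purely for illustration, so please check them against the paper before using them for anything real.

```python
import math

# Illustrative, TIP3P-like parameters (assumed, not taken from this thread).
COULOMB_K = 332.06        # kcal*Angstrom/(mol*e^2), approximate
Q_O, Q_H = -0.834, 0.417  # partial charges in units of e
SIGMA = 3.15              # Lennard-Jones sigma for O-O, Angstrom
EPSILON = 0.152           # Lennard-Jones well depth for O-O, kcal/mol
R_OH = 0.9572             # O-H bond length, Angstrom
ANGLE_HOH = 104.52        # H-O-H angle, degrees


def water(ox, oy, oz):
    """Sites of one rigid water as (charge, position), oxygen first."""
    half = math.radians(ANGLE_HOH / 2)
    hx, hy = R_OH * math.sin(half), R_OH * math.cos(half)
    return [
        (Q_O, (ox, oy, oz)),
        (Q_H, (ox + hx, oy + hy, oz)),
        (Q_H, (ox - hx, oy + hy, oz)),
    ]


def pair_energy(mol_a, mol_b):
    """Interaction energy of two water molecules in kcal/mol."""
    energy = 0.0
    # Coulomb interaction between every pair of charged sites.
    for qa, pos_a in mol_a:
        for qb, pos_b in mol_b:
            energy += COULOMB_K * qa * qb / math.dist(pos_a, pos_b)
    # One Lennard-Jones 12-6 term between the two oxygens.
    r_oo = math.dist(mol_a[0][1], mol_b[0][1])
    sr6 = (SIGMA / r_oo) ** 6
    energy += 4 * EPSILON * (sr6 * sr6 - sr6)
    return energy


first = water(0.0, 0.0, 0.0)
for separation in (2.5, 3.0, 4.0, 8.0):
    second = water(0.0, 0.0, separation)
    print(separation, round(pair_energy(first, second), 3))
```

A simulation evaluates something like this for every nearby pair of molecules at every time step, which is why the number of sites per water and the choice of parameters matter so much.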
Your videos are such an inspiration for me! As a newish YouTuber in a similar sector, I have learnt so much from you, and your content really is incredible! Massive appreciation!
Water simulation covers things like Oceanic shipping-cost optimizations, avoiding infrastructural pipes bursting from water-hammers, visual effects, and particle flow simulation in general, which extends to pretty much anything that involves movement in a fluid, like all of aerodynamics for example, which means it also gets used in most vehicle design. TLDR: Fluid simulation is a very massive, very profitable field, and many methods are probably based on the ones in the paper.
As a person who recently graduated with a PhD, I wonder how much time it took to even skim through those papers. A scientific paper is not an easy read even if it's in your area of research, and here so many were from fields unfamiliar to you. P.S. Great job! Interesting video. Now I wonder if some scraper with a greedy algorithm (i.e. jump to the reference with the most citations) starting from some random papers could find most entries from this list.
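The greedy idea from the P.S. could be sketched like this: start at some paper, repeatedly jump to its most-cited reference, and stop when there is nowhere new to go. The reference lists and citation counts below are invented, and a real scraper would have to pull them from a citation database, which this sketch does not attempt.

```python
def greedy_walk(start, references, citation_count, max_steps=50):
    """Follow the most-cited unvisited reference from each paper.

    `references` maps a paper to the papers in its reference list;
    `citation_count` maps a paper to how often it has been cited.
    Returns the list of papers visited, starting with `start`.
    """
    path = [start]
    seen = {start}
    current = start
    for _ in range(max_steps):
        candidates = [r for r in references.get(current, []) if r not in seen]
        if not candidates:
            break
        current = max(candidates, key=lambda r: citation_count.get(r, 0))
        seen.add(current)
        path.append(current)
    return path


# Hypothetical data for illustration only.
references = {
    "new-paper": ["review", "niche-study"],
    "review": ["classic-method", "niche-study"],
    "niche-study": ["obscure-note"],
}
citation_count = {
    "review": 500,
    "niche-study": 50,
    "classic-method": 30_000,
    "obscure-note": 3,
}

print(greedy_walk("new-paper", references, citation_count))
```

Because the walk only ever follows one reference per step, it tends to get funnelled into the dominant method papers of whatever field it starts in, so you would probably need many random starting points across fields to recover most of a top-100 list.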
Congratulations on your graduation! So as I'm sure you found during the PhD, a lot of the paper reading was truthfully 'topping and tailing', i.e. reading the abstract, introduction and summary/conclusions. For a fair few papers though I did make an effort to read the main body to try and understand, though for those out of my academic background, it was at this point that I had to tap out as it just became impenetrable. Having said that, for a surprising number of papers, they were actually easy enough to understand all the way through!
The water simulation paper is definitely in the field of statistical mechanics/thermodynamics. So I don't think it's used for macroscopic simulations like the flow of the river. I'm afraid the dreaded physical chemistry is involved, which inherently scares me as a biochemistry student.
6:55, a DFT paper by Axel Becke!? He's a chemistry prof at my school (Dalhousie), and I only recently learned that he's one of the biggest names in DFT when my solid state physics prof said "Yeah, the guy who developed everything we're learning in this topic? He's in the building next door." lolol
Biology is the coolest science: it literally draws together the knowledge of all the other sciences, chemistry, physics, maths and even computer science and statistical analysis. I love how interdisciplinary it is. Amazing. I'm so proud to be part of it.
I am studying physical chemistry for my PhD, and the explanation that I got for using different water models is that they essentially localize the partial charge of the oxygen atom in water in different places, which allows someone performing computer simulations to capture different chemical properties more accurately.
Since you found the neighbour-joining method of phylogenetic analysis interesting, you might find maximum likelihood and Bayesian phylogenetic analyses even more interesting. These techniques, along with a few others, have made biology and computational analysis almost inseparable.
As you indicated, some spots on the list would be overtaken by AI papers. A wonderful example would be the paper "Imagenet classification with deep convolutional neural networks" by Krizhevsky et al. (cited over 100k times), published in 2012. It brought deep neural networks to the foreground of AI research, overcoming existing hardware limitations and probably marked the end of the latest "AI Winter".
Is it really that difficult to find top ranked papers in classic scientific database search tools like Scopus or WoS? (I wouldn’t even think about any other way before going there).
Great video! Interesting review for sure! One thing that is really interesting to discuss here is gender inequality and how most (if not all) of these authors are men. Although the number of women in STEM has grown significantly (or they have actually started to be recognized for their work), there is still a long way to go...
Very true, we need progress in gender equality. It takes time before women spread in male-dominated fields in large enough numbers that all the young girls can find a female role model. And this is just an example of a reason why this is a hard problem.
Something to note is that there are a LOT of papers out there that are just…junk. They don’t really add anything to the body of work and are published for the sake of putting them on the resume. In fact, these papers are often published as ‘I double checked someone else’s work and found that there was evidence to support it’. These papers get almost no citations, if any, since they are only there to pad a resume. Science is a cutthroat world underneath the veneer of collaboration, and many places won’t hire you without a certain publication count.
It's important to remember that a paper can also have a great number of citations if it's published by well known people but has gross inaccuracies or illogical conclusions. Other papers may cite it with the intention of contradicting it.
The insight regarding the quote "we stand on giants from our past", was brilliant! We actually stand on multitudes of unknown heroes. You've become a legend in my book 🥳
I think your meta-scientific thoughts would actually be worthwhile sharing in a journal dedicated to the philosophy or history of science. This video cannot be cited, but I'm sure many people who don't even know your channel exists would love to cite your findings.
For those that don’t know, the first author generally writes the article, the second/third are the editors, and all that follow just give feedback and have worked on the project, doing the lab work and analysis to help the main research move forward.
Interesting that you mentioned machine learning. As someone who works with various machine learning algorithms frequently and is busy finishing a master's thesis on this topic, I can say that the relatively new transformer model and subsequently the BERT model (Bidirectional Encoder Representations from Transformers) have seen an immense number of citations, given the generality and overall success of these models. The paper on transformers ("Attention is all you need", Vaswani et al., 2017) has already reached 37,497 citations at the time of this post, and the BERT paper ("BERT: Pre-training of deep bidirectional transformers for language understanding", Devlin et al., 2018) has reached 35,000 citations, according to Google Scholar anyway. Given how new these papers are, that is insane. Although I am new to this channel, I think it would be interesting to discuss these papers and the overall development of machine learning as one of your video topics. These models also currently bring state-of-the-art performance in various fields, especially in NLP, so it is still very relevant too. Amazing job reading all those papers btw, that must have been exhausting.
11:50 That paper contains the parameters for several water models that have been used in many molecular dynamics (MD) simulations, either to compare water properties obtained through simulation and experiment or to study all kinds of systems where water is present. Papers related to MD always (or should, at least) cite the source of the force field parameters. So whenever an MD simulation computes water using, for example, the SPC, TIP3P or TIP4P water model (very, very commonly used), Jorgensen's paper will be cited.
For the water simulation paper, it's probably so high up because it sets the ground for computational biology, as all biological and biochemical reactions/interactions in cellular organisms take place in aqueous media. This can include protein modelling, DNA/RNA modelling, determining cellular biochemical reactions such as cellular respiration and photosynthesis, and the modelling of drug interactions in pharmaceutical science and medicine (and very importantly, vaccines). I've come across the importance of modelling water-based systems a lot during my PhD studies in Chemistry. Excellent video as always :)
Great video Simon! One thing I think it’s interesting to consider beyond what you covered in the video is the dynamics of the citations that many of these top papers get over time. If you look in the Nature article describing these top 100 papers, you can scroll through graphs of the number of citations each paper has got over the years, and there are some interesting trends. For most of the papers, rather than steadily accumulating citations over time in an ever-increasing manner, the citation counts peak 10-20 years after the original publication date. For example, No.19 on the list is a paper describing the molecular biology technique called “Southern Blotting”, used to detect specific sequences of DNA in a sample, that was first published in 1975. You can see the graph of citations over time increases to more than 2,000 citations per year around the year 1987, and then declines to less than 100 per year in the present day. This drop after the late 1980s was likely caused by the invention and propagation of a technique that now everyone has heard of: the Polymerase Chain Reaction (PCR), which is basically a much easier method of doing the same thing as a Southern blot, and is still widely used to this day. So in some cases this “peaking” of citation counts can represent one foundational technique being superseded by a new and improved technique, but I also wonder how a much more vague factor plays into this: a discovery or technique becoming such common knowledge that people stop bothering to cite the original paper. I’ve published a few papers featuring molecular biology techniques for working with DNA, but we didn’t cite the original 1953 Watson and Crick paper describing the structure of DNA, nor the 1944 Avery et al paper that established DNA as the genetic material.
These are fundamental to my work, but to the point of being so established that they fall into the same category as “the sky is blue” and “water is wet”: we just don’t feel the need to bother citing the original papers. Even when we use PCR, we don’t cite the original papers describing the technique. If every paper that used PCR cited the original papers from the 1980s, I’ve no doubt they would be among the most cited ever, but no one seems to bother. Of course, if all scientific papers referenced all their statements, no matter how mundane, they would become quite a chore to read, but I think it’s interesting to consider how this distorts the record of how influential some papers have been.
I suspect the findings of the water simulation paper are used for the simulation of more complex molecules. For example, you don't want to calculate protein dynamics in a vacuum, so you include a "water box" around it to give it a more native environment.
Well done! I always thought this would be interesting, but couldn't be asked to do it myself. And great communication of what was in the literature for the general population.
As a PhD student in Cancer Biology, I find this so interesting to watch. As soon as I saw the top six papers I thought pretty much exactly what Alex said. We gotta understand that RNA/DNA/Protein trinity! Such a fun video!
This was marvellous. I'm not a science guy, but I can appreciate good work and a fascinating story. Along with some history of course, and all of that I loved in this video. Thank you and thank the men and women who wrote those Scientific papers.
11:23 I've cited that! (The Oldfield paper on the Edinburgh handedness inventory) In cognitive neuroscience studies, it is important to have only right-handed participants since left-handed participants have a higher rate of reversed lateralization. For example, the language center of the brain is typically in the left hemisphere, but left-handed people have a notably higher rate of it being in the right hemisphere (although right-handed people can have reversed lateralization and not all left-handed people do). So the Edinburgh handedness inventory is a simple survey to give participants (e.g. "How do you open a jar?" "always right-handed; usually right-handed; either; usually left-handed; always left-handed"), that spits out a quantitative evaluation of "how right-handed" they are. So that's probably a large part of why it is up there.
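For anyone curious how a survey "spits out" a number, here is a small sketch of the laterality quotient, LQ = 100 × (R − L) / (R + L), computed from answers on the five-point scale described above. The mapping from answers to right/left marks follows the usual "++ / +" scoring convention and is an assumption of this sketch, as are the names `MARKS` and `laterality_quotient`.

```python
# Each answer contributes (right_marks, left_marks).
# Assumed convention: a strong preference earns two marks on that side,
# a weak preference one mark, and "either" one mark on each side.
MARKS = {
    "always right": (2, 0),
    "usually right": (1, 0),
    "either": (1, 1),
    "usually left": (0, 1),
    "always left": (0, 2),
}


def laterality_quotient(responses):
    """Return a score from -100 (fully left-handed) to +100 (fully right-handed)."""
    right = sum(MARKS[answer][0] for answer in responses)
    left = sum(MARKS[answer][1] for answer in responses)
    return 100 * (right - left) / (right + left)


# Hypothetical participant answering four items (writing, throwing, ...).
answers = ["always right", "usually right", "either", "usually left"]
print(round(laterality_quotient(answers), 1))
```

A study can then set its own inclusion cutoff on this score, which is exactly why the inventory gets cited in so many methods sections.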
I love the conclusion of this video. In popular media there is often an obsession with "great minds" and "geniuses" and "breakthroughs", but as you point out science is actually a process of many small contributions and little steps forward (and backward). Very well said!
"Genius breakthroughs" have their place too, at least in certain fields. While they may receive a disproportionate amount of public attention, they also have great impact and influence on research itself. This is clearly the case for example in theoretical physics and mathematics.
THIS! Thank you, this is one of my biggest pet peeves: Hollywood's and the general media's obsession with, and depiction of, big breakthroughs as acts of singular genius by brilliant minds who were just enlightened by a stroke of genius in a moment of clarity.
I assume people love this for the same reason they love 'brilliant' billionaires advancing the world. If it's those genius people who will progress the world with their brilliant minds, then I (who ofc am not a supergenius) don't actually need to do anything. 'I couldn't anyway, that is for geniuses!' So it's quite comforting/avoids feeling any pressure and having to concern yourself with 'boring' ish like 'process' and 'incremental advancements' or what role I could play in making the world better.
@@jameson44k 'Genius Breakthroughs' in my experience are scientific advancements over months and years explained through one single moment in that years-long process. Like claiming the assassination of Archduke Ferdinand caused World War 1.
@@DarkHarlequin Popular media may depict it that way, I don't really pay attention to them. Mature scientists and mathematicians, on the other hand, tend to care about the steps to breakthroughs and about correct attributions for discoveries.
I agree very much. Indeed.
As a PhD candidate in physics, I can tell you that after reading hundreds of papers, we all learn one very important thing -- none of us know how to write well :)))
Define “well”.
Lol. Need a hand?
bro , phd machine learning here , what is writing ?
@@wadougamer 😂😂😂😂🤣🤣🤣🤣🤣
Hahahah as a scientific translator (sorry to all scientists y'all are very very smart), but it's true
The et al family sure does produce a lot of stellar researchers! excited to see what they do in the future.
The spread of disciplines they are knowledgeable in is also astounding! From economics to physics to biology they have dabbled in everything!
@@Thats_quite_cool They even did take credit in my papers :O
If this is a joke then ok. ET AL means 'and others'
you just killed the mood @@overlordprincekhan
@@overlordprincekhan such is the nature of a scientist to ruin a joke
"The first law of papers states that science is a process. Do not look at where we are now, but look at where we will be two more papers down the line."
- Two Minute Papers, in basically every video :)
Great video!
Hold on to your papers!!! What a time to be alive :)
Glad to see a fellow scholar in this comment section
@TLM brah
Coincidentally, I think a look at his videos can give a good idea as to why simulated water is so high on citations.
@@olgierdvoneverec4135 Not exactly. Simulating water in TMP's context is about graphics processing which has more to do with fluid dynamics, whereas the simulation here is actually about the molecular dynamics algorithm, one of the most common few-body chemistry algorithms in existence.
Great video. I'm assuming the reason the water simulation paper is so high is probably because it sets the precedent for the process of computer modelling molecule interactions through e.g Monte Carlo simulations. This is a technique used by pretty much all chemical simulations - I used it in my research project on Metal Organic Frameworks.
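Since Monte Carlo keeps coming up: below is a minimal sketch of the Metropolis idea on a toy system, a single particle in a harmonic well, rather than anything resembling a real water simulation. All names and parameter values are illustrative assumptions. Random trial moves are accepted or rejected based on the energy change, and averages over the resulting samples estimate thermal properties.

```python
import math
import random


def metropolis_mean_x2(n_steps=200_000, kT=1.0, k=1.0, step=1.5, seed=0):
    """Estimate <x^2> for a particle with energy E(x) = 0.5 * k * x^2 at temperature kT."""
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * k * x * x
    total = 0.0
    for _ in range(n_steps):
        # Propose a small random displacement.
        trial = x + rng.uniform(-step, step)
        trial_energy = 0.5 * k * trial * trial
        d_e = trial_energy - energy
        # Metropolis rule: always accept downhill moves,
        # accept uphill moves with probability exp(-dE / kT).
        if d_e <= 0 or rng.random() < math.exp(-d_e / kT):
            x, energy = trial, trial_energy
        # The current state is counted whether or not the move was accepted.
        total += x * x
    return total / n_steps


estimate = metropolis_mean_x2()
print(f"sampled <x^2> = {estimate:.3f} (exact value for this toy system is kT/k = 1)")
```

A molecular simulation replaces the one-line energy function with a sum over all pairwise interactions, which is where the water model parameters come in.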
Monte Carlo simulation in electrical engineering too.
Yep. I have cited this paper myself. It is the standard water model paper that is commonly cited when using any of the standard water models (SPC, TIP3P, TIP4P), and since water is the most common solvent in biological and chemical systems.... well. And whenever you tune the interaction parameters of the existing models or develop your own water model, this is the gold standard data set you are comparing against. Also, these water potentials are commonly used in most force field (i.e. interaction model) families.
- A PhD student in computational chemistry
One of my dad's friends who works in programming, spent the better part of 12-ish years creating, optimising and utilising modeling programs that simulate water waves.
My super vague memory of his explanation was that, because surface waves are affected by the forces of everything around them, simulating them was an extremely difficult thing to achieve with the minimal processing power they had back then.
@@eryana9437 could not agree more, as a fellow computational chemist. If you use molecular dynamics simulation for example you may end up citing this paper.
Yeah I’m in biophysics and we use Monte Carlo simulations for microscopy. Pretty sure it’s big in finance too 😂
Completely ignoring the content of the video (which is really amazing) but I'm loving your editing. Graphics are super engaging, matching the pacing so nicely. I literally have been taking notes of cool ways you keep everything fresh with different animations etc:D Ultimately the content matters more than the editing, but I think this is a great example of how editing can make the content just that little bit more engaging.
It was a 15-minute long video, but it felt like even less than 5. 😱
Reminds me of Johnny Harris
@@gbombmr6125 He's a legend. Probably the only YouTuber that can get me to watch 20 min videos in one sitting.
Hey it's Dr. Trefor Bazett! You're like one of my favorite Mathematics YouTubers :)
thanks for your math lessons on youtube.
Your channel is a treasure chest that should be protected at all cost!
His channel is like a big density functional theory.
@Elsa ♪ WTF
100% agree!!! Love the awareness and the comeback of science as I think it was meant to be in the past. Trying to understand the world as a symbiosis between nature and humanity. All equal all important all to be taken care of.
I’ve seen that protected at all costs comment all over the internet
the one piece.
"Earth Scientists don't like writing papers"
As an earth scientist, very true. I'd rather go hit a rock with a hammer. But in all seriousness, I suspect the lack of earth science on the list comes from it being such a new and fast changing field. By the time any one paper gets cited ~500 times another paper describing a better method has been written!
Very true!
Ooga booga
Ordinarily I would expect "I'd rather go hit a rock with a hammer" to be a sarcastic statement, but I suppose you mean that literally.
One other method I've heard of to rank papers is not by their citations, but by the number of citations their citations had, so 'child' citations as it were. This could drastically change the list, removing many of the 'method' papers and potentially including papers which spawned entire branches of scientific enquiry.
It would also be interesting to see some kind of h index of the work (so instead of looking at the h-index of the author over his papers, you would do that over the children of a paper)
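Both ideas in this thread (counting "child" citations, and taking an h-index over the papers that cite a given paper) are easy to sketch. The citation data below is a made-up toy example, and `child_metrics` is a hypothetical helper, not an established metric implementation.

```python
def h_index(counts):
    """Largest h such that at least h of the counts are >= h."""
    h = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def child_metrics(paper, cited_by):
    """Summarise how influential the papers citing `paper` are themselves.

    cited_by maps a paper id to the list of papers that cite it.
    """
    children = cited_by.get(paper, [])
    child_counts = [len(cited_by.get(child, [])) for child in children]
    return {
        "direct": len(children),             # ordinary citation count
        "second_order": sum(child_counts),   # citations received by the citing papers
        "child_h_index": h_index(child_counts),
    }


# Toy data: "method" is cited often by papers nobody builds on,
# "idea" is cited rarely but by papers that are themselves well cited.
cited_by = {
    "method": ["a", "b", "c", "d"],
    "idea": ["x", "y"],
    "x": ["p", "q", "r"],
    "y": ["p", "s"],
    "a": ["z"],
}

print(child_metrics("method", cited_by))
print(child_metrics("idea", cited_by))
```

In this toy example the "idea" paper loses on direct citations but wins on both second-order measures, which is the reshuffling effect suggested above.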
Yeah, doing a serious analysis of that graph would be awesome. Deepest lengths favored older papers and difficult and complex processes, more branching suggested x, Nobel prize studies stood out for this reason, and overloading the computer and crashing with no results favored earth science
The funny part is that scientific fields are so quirky and divided, with bad search functions, that he wasn't even able to just get all the most cited papers by himself. It might be because of the monetisation models of peer-reviewed journals, or the fragmentation that comes from having so many journals.
Interestingly with some of these big discoveries like DNA's structure, I think a lot of these concepts almost become like "public domain" science, so to speak. People learn these things as a part of their formal didactic learning, and it seems superfluous in the modern day to dig out and cite these papers.
People have also stopped citing the bioscience technique papers in this video (like SDS-PAGE, western blot...) because they have also kind of become "public domain" knowledge, simply because they are so widely used.
Yes, exactly! Many of these things are public knowledge now, we all know what DNA is (sort of), so when you mention DNA in a paper, you don't have to cite it to prove that DNA is a real thing. So over time, it just becomes accepted. Kinda unfair but at the same time, I would love to discover something that can become public knowledge like that :D
Yeah some things are just assumed as prior knowledge. I was once looked at funnily for citing a paper from the 1800's
Yeah. If you cited Origin of Species for the concept of Evolution, it would look like your paper had a minimum literature review requirement and you were padding
Really good point
As soon as you said biology I just knew Alex was inevitable
so true
oh you're here?!
You should do your own Top 100 bioscience research papers, we'll love it
@@munawwarshaikh8010 seconded!
I guess Alex had 'assay' about biology...hehehe...I'll walk myself out, thanks.
I expect the list is definitely missing some of the big Machine Learning papers: Hochreiter and Schmidhuber (1997) which introduced LSTMs is at 60k+ citations, Vaswani et al (2017) introducing transformers is at 35k+, Bahdanau et al (2014) introducing attention is at 20k+ and Krizhevsky et al (2012) which introduced CNNs for image recognition is over 100k now
Yeah, I was also surprised; Resnet 100k+, dropout 30k+, BERT 30k+
Makes sense, the ML paper "market" has been booming in recent years.
I'd wager that a big part of why ML papers are doing so well is the same as why most of the papers in this list are there: their application to the biomedical field.
@@isaacjackiw9711 ML is applicable to literally every industry on earth, not just biomedical
@@GalaxyInfernoCodes It also has a very similar structure to other experimental sciences, where certain techniques (skip-connections, attention, cnns, GANs, ReLUs, RNNs) are standards. These are the things that usually work (->citation) or, surprisingly, perform badly (->citation).
The amount of stuff that generally works is minuscule, so you will see the same papers cited all the time for theoretical analysis, architecture upgrades or applications.
I remember how excited I was when a paper of mine went over 100 citations for the first time :)...... these numbers are insane!
That's nothing. My paper of theoretical quantum calculus in higher dimensions of n has over 10k citations.
@@leochinchillaa impressive!
That's awesome! Great accomplishment :)
Epic win!! Idk why that other guy was trying to put you down, but that's great!!
@@leochinchillaa
love reading this then going to op’s channel and seeing “The fortnut experience”
Did not know about the Higgs boson paper, interesting. The Jorgensen et al. paper (which I've cited in my thesis and MD papers) is so high because anyone who has ever performed solution-based molecular dynamics simulations (e.g. protein MD) will have cited it to justify the water model they used.
There's also the problem of "fake citations" which are not, as one would expect, works cited but not actually read.
Fake citations are citations that the author was blackmailed into doing if they wanted to see their own paper published.
This apparently happens a lot, especially in universities, and goes hand in hand with tenured professors blackmailing PhD candidates and junior researchers into listing them as co-authors of research they didn't take part in.
Oh wow. I didn’t know that happens-that is, the blackmailing of Ph.D. students by their tenured professors. 😨 Currently in grad school but in the social sciences, though.
@@layla-talmedina5733 these things are rare, but I've heard enough stories that I'm pretty confident that any abusive situation like that that you can think of has probably happened. Thankfully, not most of the time, but universities are very good at keeping their scandals hidden.
It sadly seems to be pretty common. And in the cases I encountered, it was not because people were forced to do it, but because some authors, especially in the social sciences, seem to have an inner urge to show off with fancy citations, mostly from famous philosophers. Unfortunately for them, I studied philosophy and have read some of the works they cited. They obviously hadn't read these works, because the citations were either used in cases where they didn't make any sense at all, or were completely misinterpreted to fit the author's thesis, or were simplified to a point where they didn't make any sense anymore, which was then pointed out as a flaw, presumably to show what great advances the authors had made in their fields. Well, at least the latter was done to Plato's "Politeia", and Plato himself very probably played the same trick with the positions of his/Socrates' philosophical "opponents", the Sophists.
Your first sentence doesn't make any sense.
It happens sometimes in smaller fields when papers get submitted for review. Since the field is so small, it is very likely that the experts selected for review also have relevant papers in the field. Typically the reviewers remain anonymous, but you can often tell when one of the reviewers tells you that you also should have cited the 4 papers from that specific author. Not hard to guess in that case who one of the reviewers is :) Most often you just want your paper published and don't want to argue with a reviewer, so you just include the citations and are done with it.
Can you find the top 100 for humanities subjects? I think it would be really interesting to see how they breakdown between subjects, even if you didn't read them all
Agreed! I don’t even do humanities but I’m so curious !!
My guess it will be more big ideas but ones that provide dominant frameworks. Idk what subjects would dominate though. My guess is epistemology?
Imagined Communities by Benedict Anderson (about the origins of nationalism) is widely understood to be the most cited humanities book. It is a must-read!!
In the discounter store, toilet paper section.
No fuck humanities
I literally had a lecture about the contents of this video this morning! What amazing timing. It was mostly focussing on ecology/conservation scientific papers.
I love the idea that becoming a great researcher does not mean to find something great yourself but to discover something that helps others finding things.
As a 2nd year chemistry undergraduate in the UK, I found this video to be very interesting. For me, the single greatest scientific publication of all time would have to be E.J. Corey’s formulation of retrosynthetic analysis, which has enabled the synthesis of very complex organic molecules to be broken down into known precursors, giving access to natural products that are otherwise scarce and not bountiful from plant sources.
As a chemist who used Density Functional Theory in my own undergraduate thesis: it is pretty simple, actually. But you really need to look at a physical system and its model from a quantum mechanical POV. I think that is particularly hard for you because climate and weather modelling use a classical mechanics POV instead of a quantum mechanical one.
cfd and finite element analysis is already a bottomless well. Going to a quantum mechanics pov sounds scary
Can you explain the general idea to someone who never made it past highschool level physics? 😅
@@MrNicoJac imagine throwing electrons at a bunch of nuclei (protons and neutrons) and finding the optimal way in which they’d want to sit. From there, you can figure out the quantum mechanical makeup of the system (which is otherwise very expensive to compute accurately) and determine properties of the chemical system! :)
The first rule of DFT: if you don’t know what you’re doing, just use B3LYP/6-31+G* as a black box solution
The second rule of DFT: when the first rule inevitably breaks down, call a computational chemist ;)
@@AdreaSnow this. As an organic chemist - B3LYP is almost all I need for single molecules. Also, I might ask, your use of the term “expensive” refers to “takes long to compute/a lot of computing power”?
I love your conclusion about how science is ultimately a collaborative human endeavor that's done incrementally! I'm doing my PhD in physics education research, and my work isn't intended to cause huge breakthroughs or get tons of citations. It's designed to be a tool for teachers and professors to make the environment of physics better and to make learning it more fulfilling. I love doing research because I don't just have a direct impact on a few classrooms or departments, I have a tangential impact on the thousands of students who learn physics every year.
I have so much love and respect for the people who work in the labs every day and publish their increments to move science as a whole forward. It's a shame that instead of celebrating that, physics continues to project the idea of geniuses who change everything overnight.
Hey, in Cognitive Science in all studies measuring Reaction Time we essentially have to measure and report the distribution of handedness among our participants. We use "handedness inventories" because it's actually moreso a spectrum than a binary if you dig into that and so we all cite that article! - I've cited it myself.
You should try to use page-rank to find the most impactful papers:
Many big discoveries are big because they launch their own sub-field (i.e. Einstein's theory of relativity). This induces a "cambrian explosion" in that field, which often leads to many papers being published and superseding each other.
In total, the number of citations for the original paper will be low, because it was replaced with an updated version which is now cited by everybody, but in terms of total impact these are way more crucial to science.
PageRank is a way of modelling these transitive citations on a large scale (you could also compute the transitive closure of the citation graph, but I have a feeling that this is going to blow up computationally)
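A rough sketch of what that could look like, using plain power iteration on a tiny, made-up citation graph. The paper names, the damping factor of 0.85 and the fixed iteration count are all assumptions for illustration; a real analysis would need a real citation database.

```python
def pagerank(cites, damping=0.85, iterations=100):
    """Power-iteration PageRank on a citation graph.

    cites maps each paper to the list of papers it cites (no duplicates),
    so rank flows from the citing paper to the cited paper.
    """
    papers = set(cites)
    for refs in cites.values():
        papers.update(refs)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their rank evenly over all papers.
        dangling = sum(rank[p] for p in papers if not cites.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        for p in papers:
            refs = cites.get(p, [])
            for ref in refs:
                new_rank[ref] += damping * rank[p] / len(refs)
        rank = new_rank
    return rank


# Toy graph: "original" founded a sub-field but was superseded by "update",
# which everyone cites instead; "method" is a tool paper with two direct citations.
cites = {
    "original": [],
    "update": ["original"],
    "a": ["update"],
    "b": ["update"],
    "c": ["update"],
    "method": [],
    "m1": ["method"],
    "m2": ["method"],
}

ranks = pagerank(cites)
for paper, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{paper:9s} {score:.3f}")
```

In this toy graph "original" has only one direct citation against two for "method", yet it comes out on top, because the rank of the heavily cited "update" paper flows back to it.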
This is an excellent idea.
As someone who switched from a PhD in molecular biology (and was familiar with most of those top bio papers in the list and have used their techniques) into deep learning research in the last few years, I must point out that the mentioned "dominance" of the biosciences (11:04) in terms of citations is a total mischaracterization at this point. Initially, I thought I'd leave my comment there, but seeing as I'm a scientist I wanted to see if I could corroborate the claim with some evidence. So I went online and found 33 deep learning papers within *just the last 10 years* that have >12,000 citations (or just about; 2 papers have ~11k). I also pulled out 6 classic deep learning papers that should also fall into this top 100 list (well over 12k citations each). That makes 39 deep learning papers to match the 39 bioscience papers in Nature's top 100 list back in 2014, except deep learning did it largely in *just 10 years* (2012-2022; or 6 years (!!) if you go by publication date -> 2012-2018) whereas biology has had ~70 years to do the same. In other words, deep learning is at least an entire order of magnitude faster at accumulating citations than the biosciences at this point. However, Simon's observation that a lot of top cited papers describe methods is also absolutely true in deep learning!
Below I include the list of 39 papers from deep learning in the format [title], [first author], [year] - [citation count at time of writing; April 16 2022]:
New (>= 2012; past 10 years): count = 33
Explaining and harnessing adversarial examples, Goodfellow et al., 2014 - 11,109
Semi-Supervised Classification with Graph Convolutional Networks, Kipf et al., 2016 - 14,122
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford et al., 2015 - 11,824
Mastering the game of Go with deep neural networks and tree search, Silver et al., 2016 - 12,604
Human-level control through deep reinforcement learning, Mnih et al., 2015 - 18,995
Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al., 2014 - 23,057
Attention is all you need, Vaswani et al., 2017 - 39,581
Faster R-CNN: towards real-time object detection with region proposal networks, Ren et al., 2015 - 41,452
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe et al., 2015 - 35,811
Adam: A Method for Stochastic Optimization, Kingma et al., 2014 - 103,576
ImageNet Classification with Deep Convolutional Neural Networks (AlexNet), Krizhevsky et al., 2012 - 106,136
Distributed Representations of Words and Phrases and their Compositionality (Word2Vec), Mikolov et al., 2013 - 32,822
Generative Adversarial Networks, Goodfellow et al., 2014 - 43,242
Deep Residual Learning for Image Recognition (ResNet), He et al., 2015 - 113,256
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., 2018 - 36,781
Visualizing and understanding convolutional networks, Zeiler et al., 2013 - 15,007
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, He et al., 2015 - 15,184
Dropout: A simple way to prevent neural networks from overfitting, Srivastava et al., 2014 - 35,027
Auto-encoding variational Bayes, Kingma et al., 2013 - 19,708
Rethinking the inception architecture for computer vision, Szegedy et al., 2016 - 19,059
Going deeper with convolutions, Szegedy et al., 2015 - 38,930
Very deep convolutional networks for large-scale image recognition, Simonyan et al., 2014 - 76,765
You only look once: Unified, real-time object detection, Redmon et al., 2016 - 23,594
Fully convolutional networks for semantic segmentation, Long et al., 2015 - 26,340
Fast R-CNN, Girshick et al., 2015 - 19,658
Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al., 2014 - 23,117
Sequence to sequence learning with neural networks, Sutskever et al., 2014 - 18,404
Learning phrase representations using RNN encoder-decoder for statistical machine translation, Cho et al., 2014 - 18,504
Glove: Global vectors for word representation, Pennington et al., 2014 - 27,318
Efficient estimation of word representations in vector space, Mikolov et al., 2013 - 28,197
SSD: Single shot multibox detector, Liu et al., 2016 - 21,032
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard et al., 2017 - 12,698
Mask R-CNN, He et al., 2017 - 18,084
Older (
I enjoy reading very old papers that contain few references or none at all. For example, Riemann's famous paper about the number of primes where he formulates the Riemann hypothesis mentions Euler, Gauss, and Dirichlet in passing but has only one reference: (Jacobi, Fund. S. 184)
@Miles Doyle no
@TLM Mathematics is real, this crap is not.
References != citations
I think mathematics especially has a culture where you only cite something if you use a result from it, while in some other sciences it is much more common to give a brief overview of the field where you give lots of citations.
@Miles Doyle why u r gae?
What I think is fascinating about scientific papers is how the useful age range depends on what you're studying.
The extremes are earth science, in which you can use papers that are over two hundred years old, and nanotechnology, in which anything before 2015 is considered outdated.
Would love to see a philosophy/social sciences version of this, you could definitely learn a lot!
So would I!
Social sciences? Yuck
@@Nick-lx4fo weirdo
@@Nick-lx4fo We live in a society
@Rohan saxena I’m not very good at making YouTube videos, but you’re right, maybe I will when I have the time
i didnt expect this video to make me cry. i feel like ive just seen science in a new light. the quote by newton still has me tearing up as i type this. humans are a collaborative animal and it is so beautiful to view science as a human endeavor, built up in little steps. thank you for this video
I think you might be wrong on the machine learning prediction. Since the release of the rankings you used, a few AI papers have already cracked the top 5, with over 100k citations (and these papers are only 6 years old). Machine learning papers are only getting more dominant in the field. Speaking as a computer science researcher.
he did say the list was outdated
@@emmazhang2418 yeah but the prediction was made as an 'acknowledging change since the list was made' section of the video. So he was accounting for recent changes. He even directly shows one of the machine learning papers I was referencing in the video despite it not being on his list.
Absolutely love this concept. Would be great to see the top 10 - 100 papers per decade or maybe across a few different disciplines.
I think your discovery is spot on: science does not progress by the work of genius, but by the solid and honest work of those who have dedication, if not necessarily talent. We celebrate the seminal points and the singular achievements because humans are biased that way. It contributes to the general misunderstanding of science and technology.
Reminds me of a comment one of my old professors made, which went something like: “My most downloaded paper is closing in on 30 downloads. That means everyone in that particular field has downloaded it at least twice.” And I feel THAT sums up so much of academia.
Super interesting! I used some of those techniques myself! I never even thought about the fact that they could be the most cited papers. This is the type of information I didn't know I wanted to know. Great video as always!
Listening to this video while reading papers in my Lab ☺️Molecular biology is definitely the most exciting field.
It’s funny hearing Simon say techniques I’m very familiar with but he isn’t. Definitely tables reversed with the atmospheric physics.
Hahahaha I came here to comment this, how the tables have turned 😂🙌🏼
As a materials scientist, I would probably say that the graphene and nanotube papers are mostly about the methods, rather than some fundamental discovery. At least in the same sense as the bio papers. Just because something is a method doesn't mean it wasn't a revolutionary finding in its own right.
Woo! A fellow Materials scientist :3
Such a cool concept of a video, liked and subscribed.
For the water paper, water in molecular dynamics (MD) simulations can have important interactions with your protein/carbohydrate/compound of interest. The water molecules are treated classically as a predefined number of point charges (the number in the name for the TIP models), as treating them with QM is too costly. A water model with fewer simulated points is exponentially faster but potentially less accurate. Every MD paper's methods will cite the water model used and, as mentioned in some comments below, this is the initial gold standard paper comparing different water models, indicating they don't vary THAT much (depending on context of course).
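To make the "point charges" idea above concrete, here is a minimal sketch of a rigid three-site water model. The charges and geometry are the values commonly quoted for TIP3P (treat them as an assumption and check them against the force field you actually use), and the function names `three_site_water` and `dipole_debye` are made up for illustration. It only builds the three sites and computes the resulting dipole moment; it is not an MD engine.

```python
import math

# Commonly quoted TIP3P-style parameters (assumption: verify against your force field files).
Q_H = 0.417            # partial charge on each hydrogen, in units of e
Q_O = -0.834           # partial charge on oxygen, in units of e
R_OH = 0.9572          # O-H distance in angstrom
ANGLE_HOH_DEG = 104.52 # H-O-H angle in degrees

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*angstrom expressed in debye


def three_site_water():
    """Return (label, charge, position) for the three point charges of one rigid molecule."""
    half = math.radians(ANGLE_HOH_DEG) / 2
    return [
        ("O", Q_O, (0.0, 0.0, 0.0)),
        ("H1", Q_H, (R_OH * math.sin(half), R_OH * math.cos(half), 0.0)),
        ("H2", Q_H, (-R_OH * math.sin(half), R_OH * math.cos(half), 0.0)),
    ]


def dipole_debye(sites):
    """Magnitude of the dipole moment of a set of point charges, in debye."""
    components = [sum(q * pos[i] for _, q, pos in sites) for i in range(3)]
    return math.sqrt(sum(c * c for c in components)) * E_ANGSTROM_TO_DEBYE


sites = three_site_water()
total_charge = sum(q for _, q, _ in sites)
print(f"net charge: {total_charge:+.3f} e, dipole: {dipole_debye(sites):.2f} D")
```

A four- or five-site model would add extra charge sites to the same list, which is why the site count shows up in the model names.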
You learned that popularity is a bad judge of good science. I am glad that someone out there is testing what common sense tells us is true.
Jorgensen was my orgo professor at Yale. If his long-winded ramblings were any indication, his work on simulating water was the basis of a lot of computational chemistry for biomolecules. Basically, all other work on simulating carbohydrates, proteins, etc. needs to use this work on water to figure out what conformations molecules will take in an aqueous environment. I think it was especially important to understanding the thermodynamics of protein folding
Most nostalgic view on biotechnology of the last century!
I know this is going to be worth my time already!
Man, I'm a professional biotechnology engineer and this is a gold mine to me! I'm very happy I found this video, thank you for supporting science. Greetings from Chile!
Seems like many of the recent papers from the machine learning/CS field are missing from the list. For example, Alexnet has more than 100k citations in less than one decade. If you list the rate of citations (citations/no. of years since paper published) I assume that list will be dominated by deep learning.
I think you are right that deep ML/DL would dominate the list when using the rate of citations y on y. But ranking papers from such a vast umbrella as science is very difficult. It's like the debate over the top 10 sportspeople of all time. It will always come with some sort of unwanted/wanted bias.
@@antaripgiri142 totally agree.
Also, as Simon mentioned citation doesn’t necessarily mean impact.
Not sure if that's true. There's a lot of papers on SARS CoV2 that got cited several 10K times in the short timeframe from early 2020. 100K citations in a decade is "only" 10K per year.
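The "citations per year" idea from this thread can be sketched in a few lines. The counts and publication years below are the ones quoted in the list earlier in the comments; the reference year `AS_OF_YEAR = 2022` is an assumption (the comments don't say when the counts were taken), and the names `Paper`, `citations_per_year` and `rank_by_rate` are made up for illustration.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    title: str
    year: int
    citations: int


def citations_per_year(citations: int, years: float) -> float:
    """Average citations per year over the given number of years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return citations / years


def rank_by_rate(papers, as_of_year):
    """Sort papers by average citations per year since publication, highest first."""
    return sorted(
        papers,
        key=lambda p: citations_per_year(p.citations, as_of_year - p.year),
        reverse=True,
    )


AS_OF_YEAR = 2022  # assumption: the year the counts were read off

PAPERS = [
    Paper("Learning phrase representations using RNN encoder-decoder", 2014, 18_504),
    Paper("GloVe", 2014, 27_318),
    Paper("Efficient estimation of word representations in vector space", 2013, 28_197),
    Paper("SSD", 2016, 21_032),
    Paper("MobileNets", 2017, 12_698),
    Paper("Mask R-CNN", 2017, 18_084),
]

ranked = rank_by_rate(PAPERS, AS_OF_YEAR)
for p in ranked:
    rate = citations_per_year(p.citations, AS_OF_YEAR - p.year)
    print(f"{rate:8.0f} per year  {p.title} ({p.year})")
```

Note how the ordering changes compared with raw counts: the paper with the most total citations in this small list is not the one with the highest yearly rate, and the result shifts again if you pick a different reference year.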
i see you aim for this to be a big video in terms of views, i hope it happens as i quite enjoyed it! Great title and thumbnail
Something I've not even thought about when citing papers during my degree.
Really well put together video, thanks for sharing!
Your videos always communicate science in enjoyable and engaging ways, where it's much easier to understand these topics. I'm actually a biology student, so this video was really interesting. I do research in a plant biotechnology lab, and yes, we do use assays 🙂
Very good video, I am a social scientist specialising in bibliometrics and the mapping of science. What you say about single authored papers is actually very unique.
As a biologist, I can confirm that we use Assays all the time but also that we HATE doing it. It's a chore.
The production value of this is crazy. Appreciate your work a lot!
I think a great thing to look at would be in-domain highest Z scores of citations. The citation norms between different areas are just too big to actually compare them in this way. I'd also be very interested in seeing some of the softer-science papers found using this method.
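A rough sketch of the in-domain z-score idea: standardise each paper's citation count against the other papers in its own field, so a standout paper in a low-citation field can be compared with one in a high-citation field. The two fields and their counts below are purely hypothetical toy numbers, not real data, and `z_scores` is a made-up helper name.

```python
from statistics import mean, pstdev


def z_scores(counts):
    """Standardise citation counts within one field: (count - field mean) / field std dev."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        # Every paper has the same count, so none stands out.
        return [0.0 for _ in counts]
    return [(c - mu) / sigma for c in counts]


# Hypothetical toy data: one field that cites sparingly, one that cites heavily.
TOY_FIELDS = {
    "field_a": [10, 20, 30],
    "field_b": [1000, 2000, 3000],
}

TOY_Z = {field: z_scores(counts) for field, counts in TOY_FIELDS.items()}
for field, scores in TOY_Z.items():
    print(field, [round(z, 3) for z in scores])
```

In this toy case the top paper in each field gets the same z-score even though the raw counts differ by a factor of 100. Real citation distributions are heavily skewed, so in practice one might standardise log counts or use percentiles instead.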
Great video! There's a Machine Learning paper that has over 100k citations. It was published in 2016 and it's called "Deep residual learning for image recognition", He, Kaiming, et al.
This is an interesting finding, Simon. Thank you for services in the realm of science community! 😄
I am a material scientist working with density functional theory and felt pretty offended at 7:00!
Now in all seriousness, you got me interested in looking up those physical chemistry papers, because they might be up my alley :)
Simulating water could be high up because of computer and movie graphics. It is one of those problems a lot of people have to solve when simulating reality. I guess a lot of people have tried to find better ways, building on-top-of the solution suggested in that paper.
I don’t think people understand the work you put in for making this video, awesome work mate keep doing your thing🙌🏼
I think (though this is just an hypothesis) that as the water paper is an early form of molecular dynamics simulation it might be a good reference for new md programs, I also had to learn about that paper so I guess people also really like it xp
Reading a scientific paper is like randomly grabbing a harry potter book out of sequence and opening said book at a random page and trying to make sense of the plot without having prior knowledge of the franchise
simon can you do this challenge on physics papers alone? or maybe math and physics?
+
4:40 It's pronounced "colour"-ri-metry.
I can't imagine reading 100 papers. Good job dude. Subscribed.
Density functional theory.
As a geology student it is a little sad to see zero Earth science papers in the list, but I think it was worth it for that little quip towards the end there lol
This video is incredibly admirable - not only is it good watching and, of course, good for your channel, but I think these types of videos are really necessary in our age - science is changing (and not always for the better) and being grounded in how and why certain papers make it into history is always a great thing to have at our fingertips.
Great video and I can appreciate that a lot of difficult reading went into making this one! I'm a PhD student at Cambridge, doing some molecular simulation in my project, so I can hazard a guess that the reason the "Comparison of simple potential functions for simulating liquid water" paper is so highly cited is this: the choice of force field/potential function is crucial to molecular simulations (both molecular dynamics and Monte Carlo techniques), as it's what describes the forces between atoms in our in-silico system (simulation box). It was actually surprisingly difficult to get a force field to accurately reproduce the properties of water through simulation - in fact, there are some 80,000+ studies of this alone in the molecular simulation literature. Secondly, water is in pretty much any biological system and most chemical systems, so there must be a lot of scientists who go back and cite this when describing the choice of force field in their study - perhaps I should too! Hope that explains it :)
your videos are such an inspiration for me! As a newish YouTuber in a similar sector I have learnt so much from you and your content really is incredible! Massive appreciation!
amazing! i need to do this as a challenge someday!
Water simulation covers things like Oceanic shipping-cost optimizations, avoiding infrastructural pipes bursting from water-hammers, visual effects, and particle flow simulation in general, which extends to pretty much anything that involves movement in a fluid, like all of aerodynamics for example, which means it also gets used in most vehicle design.
TLDR: Fluid simulation is a very massive, very profitable field, and many methods are probably based on the ones in the paper.
As a person recently graduated with PhD I wonder how much time did it take to even skim through those papers.
A scientific paper is not an easy read even if it's in your area of research, and here so many were from fields unfamiliar to you.
P.S. Great job! Interesting video. Now I wonder if some scraper with a greedy algorithm (i.e. jump to the reference with the most citations) starting from some random papers could find most entries from this list.
Congratulations on your graduation! So as I'm sure you found during the PhD, a lot of the paper reading was truthfully 'topping and tailing', i.e. reading the abstract, introduction and summary/conclusions. For a fair few papers though I did make an effort to read the main body to try and understand, though for those out of my academic background, it was at this point that I had to tap out as it just became impenetrable. Having said that, for a surprising number of papers, they were actually easy enough to understand all the way through!
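The greedy scraper idea from the P.S. above can be sketched on a toy graph. Everything here is hypothetical: the paper names, the reference lists and the citation counts are invented, and `greedy_walk` is a made-up function name. A real version would need an actual bibliographic data source, which is not shown.

```python
def greedy_walk(start, references, citations, max_steps=10):
    """From `start`, repeatedly jump to the not-yet-visited reference with the most citations."""
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        candidates = [r for r in references.get(current, []) if r not in visited]
        if not candidates:
            break  # dead end: no unvisited references left
        current = max(candidates, key=lambda r: citations.get(r, 0))
        path.append(current)
        visited.add(current)
    return path


# Hypothetical toy graph: paper -> papers it references.
TOY_REFERENCES = {
    "recent_paper": ["method_paper", "niche_paper"],
    "method_paper": ["classic_assay", "niche_paper"],
    "niche_paper": [],
    "classic_assay": [],
}

# Hypothetical citation counts for the same papers.
TOY_CITATIONS = {
    "recent_paper": 12,
    "method_paper": 4_000,
    "niche_paper": 30,
    "classic_assay": 300_000,
}

print(greedy_walk("recent_paper", TOY_REFERENCES, TOY_CITATIONS))
```

Since references point backwards in time, a single walk can only reach papers older than its starting point and ends at the first paper whose references are all visited or missing, so one would presumably run it from many random starting papers and collect the end points.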
This was an incredible video Simon!! Thanks for taking the time to make it! :)
The water simulation paper is definitely in the field of statistical mechanics/thermodynamics. So I don't think it's used for macroscopic simulations like the flow of the river. I'm afraid the dreaded physical chemistry is involved, which inherently scares me as a biochemistry student.
Hmm, interesting, almost all steps to make pharmaceutical active ingredients are carried out in non-aqueous solvents
6:55, a DFT paper by Axel Becke!? He's a chemistry prof at my school (Dalhousie), and I only recently learned that he's one of the biggest names in DFT when my solid state physics prof said "Yeah, the guy who developed everything we're learning in this topic? He's in the building next door." lolol
Biology is the coolest science, it literally draws the knowledge of all the other science, chemistry, physics, and maths and even computer science and statistical analyses together. I love the interdisciplinary it has. Amazing. I'm so proud to be part of it.
I am studying physical chemistry for my PhD and the explanation that I got for using different water models is that essentially it localized the partial charge of the oxygen atom in water in different places which allows one performing computer simulations to capture different chemical properties more accurately.
Since you found the neighbour-joining method of phylogenetic analysis interesting, you might find maximum likelihood and Bayesian phylogenetic analyses even more interesting. These techniques, along with a few others, have made biology and computational analysis almost inseparable.
As you indicated, some spots on the list would be overtaken by AI papers. A wonderful example would be the paper "Imagenet classification with deep convolutional neural networks" by Krizhevsky et al. (cited over 100k times), published in 2012. It brought deep neural networks to the foreground of AI research, overcoming existing hardware limitations and probably marked the end of the latest "AI Winter".
Is it really that difficult to find top ranked papers in classic scientific database search tools like Scopus or WoS? (I wouldn’t even think about any other way before going there).
Beautifully made analysis, with interesting and sound conclusions too.
Great video! Interesting review for sure! One thing that is really interesting to discuss here is gender inequality and how most (if not all) of these authors are men. Although the number of women in STEM has grown significantly (or they have actually started to be recognized for their work), there is still a long way to go...
Very true, we need progress in gender equality. It takes time before women spread in male-dominated fields in large enough numbers that all the young girls can find a female role model. And this is just an example of a reason why this is a hard problem.
Something to note is that there are a LOT of papers out there that are just…junk. They don’t really add anything to the body of work and are published for the sake of putting them on the resume. In fact, these papers are often published as ‘I double checked someone else’s work and found that there was evidence to support it’. These papers get almost no citations, if any, since they are only there to pad a resume. Science is a cutthroat world underneath the veneer of collaboration, and many places won’t hire you without a certain publication count.
It's important to remember that a paper can also have a great number of citations if it's published by well known people but has gross inaccuracies or illogical conclusions. Other papers may cite it with the intention of contradicting it.
The insight regarding the quote "we stand on giants from our past", was brilliant! We actually stand on multitudes of unknown heroes. You've become a legend in my book 🥳
I think your meta-scientific thoughts would actually be worthwhile sharing in a journal dedicated to the philosophy or history of science. This video cannot be cited, but I'm sure many people who don't even know your channel exists would love to cite your findings.
it's funny, I am learning the top-cited papers in my molecular biology class now. kinda amazing
I want to resist the urge...but yknow...first
For those that don’t know, the first author generally writes the article, the second/third are the editor, and all that follow just give feedback and have worked on the project, by doing the lab work and analysis to help the main research move forward.
I'm glad I cited your channel. Great content and editing. You've just earned a sub. Cheers
Interesting that you mentioned machine learning. As someone who works with various machine learning algorithms frequently and is busy finishing a master's thesis on this topic, I can say that the relatively new transformer model and subsequently the BERT model (Bidirectional Encoder Representations from Transformers) have seen an immense number of citations, given the generality and overall success of these models. Currently, the paper on transformers (the "Attention is all you need" paper, Vaswani et al., 2017) has already reached 37,497 citations at the time of this post and the BERT paper has reached 35,000 citations ("BERT: Pre-training of deep bidirectional transformers for language understanding", Devlin et al., 2018), according to Google Scholar anyway. Given how new these papers are, that is insane. Although I am new to this channel, I think it would be interesting to discuss these papers and the overall development of machine learning as one of your video topics. These models also currently bring state-of-the-art performance in various fields, especially in NLP, so it is still very relevant too. Amazing job reading all those papers btw, that must have been exhausting.
11:50 That paper has the parameters to compute/simulate some water models that have been used in many molecular dynamics (MD) simulations, either to compare water properties obtained through simulation with experimental ones or to study all kinds of systems where water is present.
The papers related to MD always (or should, at least) cite the source of the force field parameters. Then whenever one MD simulation has water computed using, for example, SPC, TIP3P or TIP4P water model (very, very commonly used), that Jorgensen's paper will be cited.
This was a fascinating video, and the conclusion really drove it home.
For the water simulation paper, it's probably so high up because it sets the ground for computational biology, as all biological and biochemical reactions/interactions in cellular organisms take place in aqueous media. This can include protein modelling, DNA/RNA modelling, determining cellular biochemical reactions such as cellular respiration and photosynthesis, and the modelling of drug interactions in pharmaceutical science and medicine (and very importantly, vaccines). I've come across the importance of modelling water-based systems a lot during my PhD studies in Chemistry.
Excellent video as always :)
best idea ever on youtube 💗 just I wish you do the same focusing on renewable energy field maybe top 10 or 20
Great video Simon!
One thing I think is interesting to consider beyond what you covered in the video is the dynamics of the citations that many of these top papers get over time. If you look in the Nature article describing these top 100 papers, you can scroll through graphs of the number of citations each paper has got over the years, and there are some interesting trends. For most of the papers, rather than steadily accumulating citations over time in an ever-increasing manner, the citation counts peak 10-20 years after the original publication date.
For example, No.19 on the list is a paper describing the molecular biology technique called “Southern Blotting”, used to detect specific sequences of DNA in a sample, that was first published in 1975. You can see the graph of citations over time rises to more than 2,000 citations per year around the year 1987, and then declines to less than 100 per year in the present day. This drop after the late 1980s was likely caused by the invention and propagation of a technique that now everyone has heard of: the Polymerase Chain Reaction (PCR), which is basically a much easier method of doing the same thing as a Southern blot, and is still widely used to this day.
So in some cases this “peaking” of citation counts can represent one foundational technique being superseded by a new and improved technique, but I also wonder how a much more vague factor plays into this - a discovery or technique becoming such common knowledge that people stop bothering to cite the original paper.
I’ve published a few papers featuring molecular biology techniques for working with DNA, but we didn’t cite the original 1953 Watson and Crick paper describing the structure of DNA, nor the 1944 Avery et al paper that established DNA as the genetic material. These are fundamental to my work, but to the point of being so established that they fall into the same category as “the sky is blue” and “water is wet” - we just don’t feel the need to bother citing the original papers. Even when we use PCR, we don’t cite the original papers describing the technique. If every paper that used PCR cited the original papers from the 1980s, I’ve no doubt they would be among the most cited ever, but no one seems to bother.
Of course, if all scientific papers referenced all their statements, no matter how mundane, they would become quite a chore to read, but I think it’s interesting to consider how this distorts the record of how influential some papers have been.
I remember briefly meeting Simon at Uni, really nice dude. Keep up the good work!
The Editing is crazy🔥💯
I was surprised to see physical chemistry (my field) as being relatively high on the list.
Imagine being a lab tech in 1951 going through a bad breakup and then O H Lowry, et al. drops Protein measurement with the Folin phenol reagent
I suspect the findings of the water simulation paper are used for the simulation of more complex molecules. For example, you don't want to calculate protein dynamics in a vacuum, so you include a "water box" around it to give it a more native environment.
Well done! I always thought this would be interesting, but couldn't be asked to do it myself.
And great communication of what was in the literature for the general population.
As a PhD student in Cancer Biology, I find this so interesting to watch. As soon as I saw the top six papers I thought pretty much exactly what Alex said. We gotta understand that RNA/DNA/Protein trinity! Such a fun video!
This was marvellous. I'm not a science guy, but I can appreciate good work and a fascinating story. Along with some history of course, and all of that I loved in this video. Thank you and thank the men and women who wrote those Scientific papers.
11:23 I've cited that! (The Oldfield paper on the Edinburgh handedness inventory)
In cognitive neuroscience studies, it is important to have only right-handed participants since left-handed participants have a higher rate of reversed lateralization. For example, the language center of the brain is typically in the left hemisphere, but left-handed people have a notably higher rate of it being in the right hemisphere (although right-handed people can have reversed lateralization and not all left-handed people do). So the Edinburgh handedness inventory is a simple survey to give participants (e.g. "How do you open a jar?" "always right-handed; usually right-handed; either; usually left-handed; always left-handed"), that spits out a quantitative evaluation of "how right-handed" they are. So that's probably a large part of why it is up there.
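As a sketch of how such a survey "spits out" a number: the usual laterality quotient is LQ = 100 × (R − L) / (R + L), where R and L are the right-hand and left-hand points summed over all items. The mapping from the five answer options above to points is an assumption made for illustration (two points for "always", one for "usually", one to each hand for "either"), and `laterality_quotient` is a made-up function name; check the actual questionnaire and scoring instructions before using this for anything real.

```python
# Assumed mapping from an answer to (right points, left points).
WEIGHTS = {
    "always right": (2, 0),
    "usually right": (1, 0),
    "either": (1, 1),
    "usually left": (0, 1),
    "always left": (0, 2),
}


def laterality_quotient(responses):
    """Return LQ in [-100, 100]: +100 is fully right-handed, -100 fully left-handed."""
    right = sum(WEIGHTS[r][0] for r in responses)
    left = sum(WEIGHTS[r][1] for r in responses)
    if right + left == 0:
        raise ValueError("need at least one response")
    return 100 * (right - left) / (right + left)


# Hypothetical participant answering four items.
answers = ["always right", "usually right", "either", "always right"]
print(f"LQ = {laterality_quotient(answers):.1f}")
```

A study would then apply its own cut-off on LQ to decide who counts as right-handed; the threshold varies between labs and is not something this sketch tries to fix.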
Interesting observations drawn on what is considered the top cited paper. Very cool!