There are several points to learn from the most cited paper:
- Practical usability: make non-scientists want to use it (that's why breakthroughs aren't most cited, they aren't immediately usable outside academia)
- Concise, readable: leave details that are non-essential for use in separate sections so they can be easily skipped (that's why related literature and proofs are separate)
- Graphs, images: if you want to say something important, make it into a graph/image, that's what people scanning through will actually notice
- Advertise it: big conferences, cite/link/use it in open-source libraries, etc.
In other words, if you want to get lots of citations don't write for academia, write a manual that people on the periphery (not strictly working for universities etc.) can notice, read, understand, and easily use.
Nice takeaways. Appreciate your insightful comment.
- it also needs to be on a topic many, many people care about
- it needs to be published at the right time as it solves a pressing problem.
Spot on. Covers everything.
Make a paper about making papers and become the meta paper publisher 😂
Write a paper about citations and cite that paper in your paper, making it the first metacitation
Worked for Andrew Garfield
Bro, that would be epic. Probably would go in a psychology journal?
paper
You coincidentally stumbled across a fundamental fact: reviewing ML papers is a quite successful strategy for YouTube channels ;)
Two more points:
First, the Adam paper entered a feedback loop, where its popularity resulted in a lot of deep learning tutorials on the internet mentioning it. Then, a lot of people with no idea about optimization algorithms picked it because it was recommended in a tutorial, further increasing its citations.
Second, the name is in the title. That wouldn't matter for lesser known methods, but when you want to use the software implementation of a method you know nothing about (based on a tutorial you read), it is very easy to figure out which paper to cite when the name is in the title.
I have a friend who is a scientist at DeepMind. He says there are two criteria for measuring the performance of researchers. One is coming up with significant research ideas that may contribute to the development of a general AI, and the second is convincing fellow researchers to work on those ideas. So their success criteria are qualitative rather than quantitative.
The Systematic Review of Systematic Reviews: A Systematic Review
I would assume that most of the people citing this paper never read it. Adam is just the standard optimization algorithm in deep learning and is integrated into every deep learning library. People just use it without even knowing exactly how it works, and they mention it in their papers. So obviously it gets cited a lot. It doesn't matter at all how well written the paper is in this case.
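To illustrate that point, here is a minimal sketch of how Adam is typically consumed in practice: as a one-line library call rather than something you implement yourself. The toy model and fake batch below are placeholders; only `torch.optim.Adam` itself is the real PyTorch API.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                                    # toy model, purely illustrative
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # "just use Adam"

x, y = torch.randn(32, 10), torch.randn(32, 1)              # fake batch of data
loss = nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
loss.backward()
optimizer.step()   # one Adam update, no knowledge of the algorithm required
```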
Two things off the top of my head:
1. At 12:45, about the paper being a method paper. This is very typical in CS, where a paper will introduce a problem and then propose a method to solve it, rather than presenting discoveries alone. Maybe it will also prove some result.
2. Also worth mentioning: the original Adam paper had an incorrect proof that was eventually corrected (probably why its arXiv version has been revised a few times).
I never knew the second thing before. Thanks. I think the main factors are: 1. The field of ML is fast-paced and has been booming over the last decade. 2. GD-like optimization methods are universal in ML papers, thus most ML papers will cite one. 3. Adam is implemented in famous ML libraries. So I think the real factor is that it is a successful optimization method which can be applied universally across the ML field.
Your first point is common in science as well. I can't speak for physics, but at least in biology and chemistry, some of the most highly cited papers are methods and protocols. I suppose that results are only relevant for so long, while methods are cited every single time someone uses them.
Saying that this is "only" a method paper would be a great understatement. Firstly, because a lot of ML papers are "just method" papers. You develop a new method and test it on a set of well-known datasets to show that your method works better than others.
Another factor in the adoption of Adam (which is basically used everywhere all the time as the go-to optimization technique now) was that it was really easy to implement in popular machine learning libraries, which some other optimizers were not. Also... it's simply a great idea wrapped in a very well written paper.
It was the era of the machine learning boom. Previously, many fields used non-machine-learning techniques to model data, but from 2015 onward many of those techniques are barely used anymore, since deep learning models, for example, can do all the intermediate steps of data extraction, and you do not need many layers of algorithms to model the data; you just need one. To be honest, I think luck is also a great factor. Of course these researchers are really smart and hardworking etc., but it was the right time. It was an era of moving from pre-machine-learning methods to machine learning methods.
Interesting. Thanks for the comment!
Having the Related Work section right before the Conclusion is common in computer sciences. This style may be specific to research subareas or simply advisors. I previously worked on crypto side channels and now on embedded system security. My previous advisor preferred "related work" as a subsection in the Introduction, while my current one prefers the other.
I 100% agree with you about the advantage of method papers over discovery papers. However, it is not strange to see short but impactful papers in theoretical computer science. Hao Huang's paper in 2019 is an example. It is 5 pages long, the main body (a proof) is two pages long, and the math is at most graduate-level. But it solves the 30-year-old Sensitivity Conjecture for Boolean functions. The paper's value lies in its simplicity against a long-standing problem.
Luck is such a massive factor ultimately. The contribution of the Adam paper in terms of new ideas doesn't really stand out to me. It is a small improvement over previous approaches which themselves are just various ways of implementing momentum into stochastic gradient descent (a very intuitive concept). It's a nice paper for sure but there are probably thousands of papers which have more substantial new ideas and yet end up with like 10 citations. You have to have exactly the right idea at the right time in history to have big impact.
Also, it is so field dependent. In certain fields like ML, there are just way more papers overall. Let's be honest, most of them have barely any novelty and just tweak existing methods a bit. They appear in lower tier conferences and that's it. And then for a lot of the top papers by the big companies like DeepMind, the main "novelty" is that they threw 10x the compute power on the problem compared to anyone else. Like obviously you get better results if you spend $100 million on GPU clusters.
It is luck in the sense that, by chance, the authors were the first to stumble upon this relatively simple and, in hindsight, obvious algorithm that just so happened to be the most robust deep learning optimizer out there for practically all use cases. As such, it is the de facto choice for training basically any deep learning method and hence is cited in essentially *every* deep learning paper that is or ever will be released. It's almost as if they had stumbled upon the idea of 'matrix multiplication' and written a paper about it. That would also be a very highly cited paper.
Brutal black pill
Adam is very important; everyone who works with DL knows that without Adam it won't work as well, or it will take ages to train, making it just impractical.
It's a paper with many flaws (and a wrong proof), but its impact is unquestionable.
However, RMSProp was about as good as Adam, but it never even got a publication lol (it is cited from a blog post and a Coursera video).
I think this highlights the importance of practicality in paper publishing and research. I use Adam, or nowadays its better-performing derivatives, almost daily, or at least weekly, in ML engineering.
I must admit I no longer understand it 100%, but I have seen that it is the most robust optimization algorithm. I have only re-read the papers from my master's thesis two or three times since.
Other optimization techniques are many times harder to grasp.
I absolutely love this paper. You can open it, rewrite the algorithm presented in your programming language of choice, and it just works :D
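For anyone curious what "just works" looks like, here is a minimal NumPy sketch of the update from Algorithm 1 of the paper. The function and variable names are mine, and the defaults follow the paper's suggested hyperparameters.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a parameter array; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```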
Geoffrey Hinton was the PhD supervisor of Jimmy Ba (one of the two Adam paper authors). I feel like this could have contributed heavily to the paper's success, as Geoffrey is literally called "The Godfather of AI". He is one of the most cited people ever and has an enormous influence on the whole scientific community.
This is not even published in a journal!! It is just a conference paper that was not even published in a specialized transactions journal. You can only find its source on the arXiv preprint server. I suspect that it was pushed by arXiv and Cornell University and their social media connections.
It is not surprising that method papers are cited more. Discoveries may be interesting or important but in practice it is the methods that have more impact. You can take a method and use it in a meaningful application across a variety of disciplines but what can you do with a discovery?
I think you should start inviting guests who were cited numerous times.
The Related Work section showing up late in the paper is advice I've seen from CS researcher Simon Peyton-Jones. He has popular Microsoft Research talks on YouTube describing this format. As you noted, it allows the reader to dive into your own original content as quickly as possible.
*Special Relativity is not a highly cited paper because it's common knowledge.*
It goes beyond the citation realm, like Newton's laws, Schrödinger's equation, Maxwell's equations, etc.
Nobody cites those anymore because they are bigger than citations.
It's a nice video, and I would like to highlight a point that I believe should also be looked at, especially where the publication counts of the ADAM authors in the year they published their paper were compared. I believe the number of publications per author could also be evaluated for the years before the initial ADAM publication. The question is how intensely they focused on this project before it reached publication, and whether we can trace that through the number of papers they published before the ADAM paper. If they really focused on one big project, it should show up in their publication counts leading up to the ADAM paper. So I believe this would be another interesting aspect to look at on this matter.
Good storytelling and narrative. I like your videos! Thanks
Maybe this is such a highly cited paper, also because the proposed method is very good?
Bro, ResNet’s paper (2015) has 200k citations
Correct. scholar.google.com.tw/citations?view_op=view_citation&hl=zh-TW&user=DhtAFkwAAAAJ&citation_for_view=DhtAFkwAAAAJ:ALROH1vI_8AC
I love this channel
Great work my friend
I’m not even a research student and I love this channel. I want to make special note of how good you are at communicating your ideas.
You said somewhere you are from Russia, and honestly, you speak better than most native English-speaking professors
My professor for the deep learning course is one of the authors of this paper. Pretty cool
Make a video on how to correctly study a research paper... I think it is an underrated skill
Thanks! Noted.
“Attention is All You Need” is up there in the number of citations
You either care about the science (screw citations and impact factor)... or about money. You can't sit on two chairs at the same time.
The answer is very simple: because AI! It’s a go-to method in the most powerful new approach that we have, an approach that is applicable to everything.
I can't help it, but this guy looks like Sheldon from The Big Bang Theory 13:04
Wow man your videos are fantastic.
I think citation counts are not as neutral as they look, especially in the current publishing atmosphere requiring literature reviews. If you apply a popular method, there is no harm in citing its original paper, and this causes citation inflation. ADAM is an optimization method and a fundamental one, so it's not surprising it has high citations.
While the sheer number of citations is still surprising, the seemingly unorthodox setup of the paper itself certainly isn't if you come from a computer science background. The truly fundamental papers tend to be more mathematical in nature, but many primarily serve as more formal documentation of source code. Since software can be iterated upon so quickly, papers within the field are at their most useful if you can read and replicate them in your own work within an afternoon.
"Attention Is All You Need" has almost 100k citations combined over the preprint and published versions. It's catching up quickly.
Can you elaborate more on 15:50, where you said that Google facilities aid in general research processes? It's very interesting. Great video
Hi! It is an interesting phenomenon indeed. In this particular case, if you check the CVs of Durk Kingma or Jimmy Ba, you will see that they did an internship/fellowship with Google around 2014-2015, that is, during their PhD studies. I don't know all details, obviously. But probably Google had a significant influence on their research. Many other famous papers in the field (for example, "Attention Is All You Need") are fully or partially affiliated with Google. Again, I don't have much experience here. But it seems that getting an internship (or any collaboration) with companies like Google can boost your academic career. Andrey
Just leaving a comment here, but I think this video was very long (even though I finished it because you're nice to listen to and the topic is interesting). Surely you should present the things we can learn from the Adam paper FIRST, and then explain the backstory, go through the paper, etc.
I looked up two acquaintances of mine who I knew had published; there was a similar difference between those two to the one mentioned in this video, maybe for the same reasons.
Not to be nit-picky but the 2015 ResNet paper (Deep Residual Learning for Image Recognition) is cited over 200k times and is more widely considered *the* most cited recent paper. (I don’t know if you clarify this later in the video but I had to type this because of the title :P)
I'm a historian / pol sci phd type-thing - I might just go ahead and cite it just to get on the bandwagon.
Soft kitty,
Warm kitty,
Little ball of fur.
Happy kitty,
Sleepy kitty,
Purr Purr Purr
what
@@nataliemreow from the tv show big bang theory
I just reacted to your last year video....same subject ;-)
So as soon as I saw that the paper was about machine learning - it was no surprise. Machine learning is the most fashionable research area on the entire planet.
I don't know, man. I look at you and you look like a naive first-year PhD student, not a dude with 6 years of academia experience. The main reason everyone cites those guys is that Adam is everywhere, everyone is using it; that's the main reason, and the quality of the text is secondary. I use it even though I know just the basic idea behind it, not every single mathematical detail. I use it as a built-in algorithm in software. The reason nobody cites your papers is simple: you do not make much of an impact in the field, or your field might be too narrow or not in the trend. You might do something decent but useless for society at this point in time. Some of my papers which I thought would be dead are my most cited, and vice versa. One paper was a students' paper about a solar plant; I helped them a bit with thermodynamics and they included my name. For a couple of years it was dead, and then recently it started collecting citations exponentially, because solar became popular and people are citing recent solar papers, not because that paper was top-notch cutting edge, etc., just a not-bad student paper, MSc level. And currently it has 7 citations which it collected in the past 6-7 months or so. Remember, the h-index is all about the trend, how wide the field is and how popular the topic. I got mine in my first year, when I started publishing, and I just live with it, not being jealous of others or searching for logic in citation rates. I just publish what I consider might be interesting to share with others and don't cry when I don't get anything in return. You might also get some citations on your old papers. You never know. Look at Boris Delaunay and Delaunay triangulation :)
Nice video
right place right time
You forgot to mention that the proof of convergence is incorrect :p
"Attention Is All You Need" (the Transformer paper) was published less than 10 years ago and has been cited 110k times
Wow, this number of citations seems almost beyond belief! Even Lotfi A. Zadeh, who introduced fuzzy logic in his 1965 paper, has roughly 140k citations according to Google Scholar.
looks like some papers go viral, or equivalent of that.
wait till AI starts publishing papers.
Haha, Churkin
Jokes aside, nice content :)
AI has been attracting massive funding for a decade, so it's not surprising that people are publishing and citing papers in that field.
And there is a mistake in the paper. The bug in the proof was found years later.
Can you provide a link to the source for your comment?
@@RexPilger arxiv.org/pdf/1904.09237.pdf
@ChuScience
Yeah, Adam is a less sensitive version of Adagrad.
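A rough sketch of what "less sensitive" means here, as I read the comment: Adagrad divides by a monotonically growing sum of squared gradients, so its effective step size can only shrink, while Adam (like RMSProp) uses a decaying average, so the step size can recover. The names below are illustrative only, and momentum and bias correction are omitted for brevity.

```python
import numpy as np

def adagrad_step(param, grad, g_sum, lr=1e-2, eps=1e-8):
    g_sum = g_sum + grad ** 2                 # accumulator grows forever, steps only shrink
    return param - lr * grad / (np.sqrt(g_sum) + eps), g_sum

def adam_like_step(param, grad, v, lr=1e-3, beta2=0.999, eps=1e-8):
    v = beta2 * v + (1 - beta2) * grad ** 2   # exponential moving average, can decay again
    return param - lr * grad / (np.sqrt(v) + eps), v
```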
The sunk cost of your PhD is the cause of your bias. The bias is that academia is currently significant. It's not. The university-industrial complex profits from tuition. Tuition at highly rated universities has an inelastic price curve. The ever-rising profits are not shared with the highly educated laborers. They are economically dependent useful idiots. The profits are invested into alumni-preferred areas like sports and football.
You know this is true for literature and philosophy depts. It's now true for the science depts.
Wait what, alumni prefer sports and football? I'm not from the US so that sounds wild to me
Graduate studies provide the time and space to think. That is the value of it. Now, most university administrators and indeed most professors don't think that. And apparently you don't think that either. That is fine. The point is that advances tend to come from people who have the time to think and do wild things, profit be damned. In universities or corporations. So, it is rather difficult to judge the value of such a place. For example, what is the value of Bell Labs in the old AT&T? Was it more valuable than Google right now? If you look at the bottom line, of course it was not. But if you looked at what they accomplished, that is a very different story.
It is not always about the cost. I have a PhD but I did not become a professor. Wanting to have time and space to think and wanting to be a professor are two different things. I don't care if most people getting PhD want to be a professor. You don't have to, and I did not become one.
KennyS brother
Sir Humphrey: Yes and No.
we use it in machine learning
10:38 Just skipped randomly, I got my answer 😂
Edit: Twice now.
Probably worked out how to add two numbers together in the most obtuse way ever thought of... so after that, people who never thought about it that way go: hey, I've got another idea... I'll add 1 to that one and subtract 1 from the other... BUT I'll cite the original. See where we are going here? A citation is more than likely an acknowledgement of a lack of original thought.
Put it this way: Diego Maradona didn't need to go to an Institute of Sport to learn how to play soccer. He was a genius on the pitch, a natural (forget his off-field life). In other words, no citations required; it was obvious.
Andryukha, you seem to have lost some weight, as if in your own looking-glass world
Ok Sheldon Cooper
T490s?
T14s
Bro, ain't no way the dude's surname is Churkin
Noting the fact that you had never heard of it, I find its relevance questionable.
It's only cited that much because of the AI/deep learning boom. Power systems don't have enough hype.
They pay attention to the quantity but not to the quality 😂 which is totally the wrong criterion
TBH, those figures are sh!!t. Not good quality. I think it was just hyped by some means, and then it got referenced. There are millions of algorithms discovered, and maybe Google sponsored the marketing campaign 😂 It doesn't even have that much importance, though it got published.
Similar to the VASP authors. Many people use it; thus, they get many citations.