What happens when people like Sussman are gone? How do we get this knowledge back? He knows all the stuff from the 60s and 70s, and I graduated with a physics degree knowing about 20% of what he spent his life figuring out.
Having had a period of my life where I had a huge interest in programming languages in general, the history of programming languages, and programming language design, I would say that we do have a lot of knowledge recorded: in papers, in books, and in the actual programming languages that exist and have existed.
Isn't the main thing that all of them had a firm understanding of either math or CS? There are people doing really friggin cool things today, and in 15 years we will see their ideas in the languages we use.
Take delimited continuations. Which languages have them? OCaml, some Schemes, GHC Haskell? Scala has the weird shift/reset primitives, but that still counts. Delimited continuations are over 30 years old, yet they are obviously a primitive everyone should have (a rough sketch of the idea is below).
Anyway, people will keep finding ways to express complex things, and then abstractions will let us mortals use those cool things, adapted to particular domains, to simplify our programming lives.
But I might be too optimistic
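If you've never seen them: Python doesn't have shift/reset, but here is a deliberately tiny, one-shot emulation with a generator, just to give a flavor of "capture the rest of the computation up to a delimiter". The names reset and body are mine, and it supports only a single shift per reset, so treat it as a sketch of the idea, not the real primitive.

```python
def reset(body):
    """Run `body` (a generator) up to its first yield, treating that yield as a
    single `shift`: the yielded function receives the rest of the body as a
    one-shot continuation `k`."""
    gen = body()
    try:
        handler = next(gen)
    except StopIteration as stop:        # the body never shifted
        return stop.value
    def k(value):
        try:
            gen.send(value)              # resume the body with `value`
        except StopIteration as stop:    # body finished; its return value is k's result
            return stop.value
        raise RuntimeError("this toy supports only one shift per reset")
    return handler(k)

def body():
    x = yield (lambda k: k(10) * 2)      # roughly (reset ... (shift k (* 2 (k 10))))
    return x + 1

print(reset(body))                       # k(10) -> 11, doubled by the handler -> 22
```

In a language with real delimited continuations, k can be called any number of times, which is where most of the power comes from.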
I have thought about that a number of times. There is a perspective that we will lose when all of these people who were there at the beginning are gone. The Windows operating system used to have a better user interface than it does now. I and people my age still remember that. Knowing that it was better in the past, there's a chance that people will come to their senses and bring it back, or improve it in a way that recognizes that. If that doesn't happen soon, then that understanding might be gone forever. Or two hundred years from now, someone will invent something that was common 10 years ago and think it is a modern marvel, while really, people were just so distracted by other things that they forgot about it--like how you can hold the middle mouse button and then scroll by dragging, just to give a concrete example.
Joe Armstrong as an example.
We build on the shoulders of Giants.
11:15 Nothing brings fear to my heart more than a floating point number
- Gerald Sussman 2011
:D
If you have a few days on your hands, try reading "What Every Computer Scientist Should Know About Floating-Point Arithmetic". (There are several versions online, none of them free of typos. When you read surprising things, triple-check.)
It enlightened me soooo much. I got a glimpse of what it takes to understand a program that uses floating point and why it's so easy to get it wrong. It's unsettling.
As the name suggests, I think every computer scientist should read it.
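If you want a quick taste before committing a few days, here are two of the standard surprises the paper explains (well-known examples, not lifted from the paper itself):

```python
# 0.1 and 0.2 are not exactly representable in binary floating point,
# so their sum is the nearest double to 0.3, not 0.3 itself.
a = 0.1 + 0.2
print(a == 0.3)        # False
print(f"{a:.20f}")     # 0.30000000000000004441

# Absorption: adding 1 to 1e16 rounds straight back to 1e16,
# because the spacing between adjacent doubles at that magnitude is 2.
x = 1e16
print((x + 1) - x)     # 0.0
```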
This is why static typing is a trap. Semantics about entities and their relations should not be static but dynamic. That's where flexibility and adaptability come from. Too many devs throw that away out of fear. “We never let you run a certain kind of mistake” feels safe, but in reality, “we can correct any kind of mistake while the program runs” is safer, and unlocks so many possibilities (a silly sketch of what I mean is below). Thanks to Sussman for making that trade-off so clear in this talk!
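Concretely, a deliberately silly Python toy (nothing from the talk): the loop hits a broken function, rebinds it on the fly, and keeps going without a restart.

```python
def convert(x):
    return x / 0            # the "mistake" we shipped

for x in [1, 2, 3]:
    try:
        print(convert(x))
    except ZeroDivisionError:
        def convert(x):     # live fix: rebind the name and retry, loop never stops
            return x / 2
        print(convert(x))   # 0.5, then 1.0 and 1.5 on later iterations
```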
Wow. This is a phenomenal virtuoso performance, showing extremely expressive programming systems that most of us can only dream about, as we struggle with horrific legacy code in our day jobs!
Indeed. Sadly it went right over the head of half of the commenters here.
*spends three weeks trying to push a single 5 loc change to production and dealing with paperwork and management*
*comes home to write 2400 loc 6502 emulator in one day to relax*
Having glanced at functional programming and Lisp, I had never found a reason or the motivation to invest time in learning them. Until now!
Should be required viewing for any software company.
Second time watching this. Great talk! I disagree with the title, though. It's not _computing_ we're bad at; it's _reasoning_. Computing is planned; reasoning is unplanned.
Considering his "memory is free" comments, it is interesting that my current job is optimizing a code base to reduce the client program's footprint, use less memory, reduce network packet size and latency, and use multithreading to reduce user wait time for computations.
He made a remark that, surely, there are applications which still require peak performance. A bit later he describes a much more specific goal: to reach low latency, no matter the means. If your app stays under the 100ms bar of human perception all the time, for all users, you don't really need any further performance increases. That's the idea.
It's amazing how math can be described using any language in any medium and everything still works as expected.
40:34 fits well within a "strange loop" conference. twist ourselves we do and must indeed
Great words on not being religious about a particular paradigm from 39:00 min
What's really amazing about this is that he's effectively describing a sheaf theoretic approach before that was even popular.
Please say more about this!
The Wikipedia article is only written for people who already know what sheaves are... could anyone explain?
@@LowestofheDead This and the other lectures in the same series are not perfect, but they explain a lot: ruclips.net/video/90MbHphnPUU/видео.html
I can see what they meant by "sheaf-theoretic approach": the set of constraints of a given problem forms a topological space (you can take intersections and unions of constraints, basically equivalent to AND and OR in logic), and the degrees of knowledge you have about your problem (the intervals of approximation in the video) maybe form a sheaf over that space. Or in any case they have interesting algebraic properties that can be exploited.
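Whatever the right name for the structure, the concrete mechanism in the video is easy to play with: each source of knowledge contributes an interval, merging is just intersection, and an empty intersection flags a contradiction. A minimal Python sketch (my own toy, not Sussman's code):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def merge(self, other: "Interval") -> "Interval":
        # Combine two pieces of partial knowledge by intersection.
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        if lo > hi:
            raise ValueError(f"contradictory intervals: {self} vs {other}")
        return Interval(lo, hi)

# Barometer says 44..48 m, timing the fall says 41..46 m: together 44..46 m.
print(Interval(44, 48).merge(Interval(41, 46)))   # Interval(lo=44, hi=46)
```

Merging is commutative and associative and only ever narrows the interval, which is the "monotonic" part people mention elsewhere in the thread.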
1:32 I think the main difference between a genome and a computer program is that the genome doesn't really determine everything a cell can do; a lot of it only works through interaction between what the cell is "told" by its genome and the environment. You can see evidence for this, for example, in fingerprints, which are different even between identical twins. Computer programs can also have emergent complexity, but only by data interacting with other data, like in a cellular automaton. Cells, though, also have physical and chemical interactions with the environment outside of the body.
Yes, a genome doesn’t do any computing at all. It’s a descriptive language.
24:31 - It's funny, he's a futurist by looking 100 million years into the past.
Great lecture!
This talk is actually less about programming than it is about automated deduction systems. Which is exactly what I'm interested in because it's what modern machine learning is exceedingly bad at.
ML works solely by approximation. It's not so bad to get an answer of 99 meters instead of 100 m for a building's height. But it's very bad to have a system that can mix a bit of the current weather into the barometer reading. ML is (usually) continuous in *every* dimension; most models don't allow for discontinuity, which is unfortunately the basis of symbolic reasoning.
This man is literally describing the theoretical basis for modern Explainable AI, so I'm not sure what you're referring to.
love this talk 🎉
saw this computer news today and it reminded me of it
```
What has to happen for mixing and matching different companies’ chiplets into the same package to become a reality?
Naffziger: First of all, we need an industry standard on the interface. UCIe, a chiplet interconnect standard introduced in 2022, is an important first step. I think we’ll see a gradual move towards this model because it really is going to be essential to deliver the next level of performance per watt and performance per dollar. Then, you will be able to put together a system-on-chip that is market or customer specific.
```
24:49
Anyone got an article for that 3-elbowed salamander?
No, but the work of Michael Levin on regeneration seems relevant here.
@@atikzimmerman wow cool! thanks for the reference. It is so good sometimes to open the comment section
Really nice and helpful... Thanks!
I wish this video had citations. Does anyone know what paper he alluded to at 43:05, re: cells merging information monotonically? By Radul?
Given two things: A) I'm on a 12-year-old machine that can't run anything programmed in the past 5 years, and B) I had faster webpage loading times in the days of 56k modems than I have today with 4G LTE.
Both of these facts tell me that SPEED AND MEMORY ARE NOT FREE!
If you can't run your program on a decade old machine, you're doing it wrong.
I think you’re missing the point - he’s demoing a language that would originally have run on the old computer he shows, so now he has millions of times more speed and memory available to explore different ways to solve problems, with the primary constraint on coding no longer being performance but flexibility, redundancy, reliability, etc. This isn’t the same as what is going on in the web and presumably the Electron apps you might be referring to, which are just plain inefficient and consume all available memory and CPU for zero return. He could easily have run those Lisp programs interactively on a machine from the 1980s, and he demoed them in this video on a machine that is now over a decade old!
They are free from a developer's perspective. One could allocate 16 GB to solve a problem that could be solved in 64 KB. The problem you are facing is that you want to consume products created by entities that don't care what the speed and memory cost their consumers.
Everyone remembers Dijkstra. Few remember Mills.
"But in the future it's gonna be the case that computers are so cheap and so easy to make that you can have them in the size of a grain of sand, complete with a megabyte of RAM. You're gonna buy them by the bushel. You could pour them into your concrete-and you buy your concrete by the megaFLOP-and then you have a wall that's smart. So long as you can just get some power to them, and they can do something, that's gonna happen."
I just so want to ask him how he would program a computer to calculate irreducible ternary operations... I bet he would even have an answer!
Is this the same version served by InfoQ or is there some remastering/cleaning process in the mix?
It’s a re-edit
Excellent ideas.
Biological 'code' is very, very dense by the look of it.
It's around 3.1 billion base pairs of DNA, which is where he got his gigabyte.
But ACGT is base 4 so it's more like
9,000,000 terabytes of binary code?
That is very very complex code for growing elbows and everything else in the expected places.
You can maybe cut that down by ignoring some non-coding DNA, but you can also think of those sections as compiler tuning parameters affecting the transcription rates of neighboring coding sequences and the folding stickiness of the chromosomes.
...time to practice some LISP.
It's pretty fun thinking about biology like a computer.
Contains its own compiler code to make ribosomes.
Converts DNA to RNA like source code to AST for a compiler.
Each cell as its own processor with state memory from chemical signals and damage.
Proteins as small programs vying for processing time to work in a crowded cell before they are killed.
Each cell flooding the system with messages easily lost in the form of chemical signals.
A ton of parallel I/O processing of all of those signals, noisy networks.
Trashing a whole processor if it gets a virus before it can send out too many bad virus packets to the system...
Not sure if it's a useful model to work from, though. Self-destructing Pi Zeros that detect an errant process would be pricey.
You've made a mistake in converting the size between bases: it's ×2, not ^2 (one base-4 symbol is two base-2 symbols), so a gigabyte is about right (3.1 billion pairs, each of which is one base-4 symbol, so two bits each -> 6.2 billion bits).
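The back-of-the-envelope check, assuming 3.1 billion base pairs, 2 bits per base, and ignoring ploidy and any compression:

```python
base_pairs = 3.1e9
bits = base_pairs * 2           # each of A/C/G/T is one base-4 symbol = 2 bits, not 2**2
bytes_ = bits / 8
print(bytes_ / 1e9, "GB")       # ~0.775 GB, i.e. roughly the gigabyte from the talk
```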
❤
17:59 - YES! - That's what I missed at uni. Math is so _unclear_ compared to code. They even put parts of it in sentences around the formulas. Unintelligible mess.
It depends on what notation you use: math can be very rigorous and clear. Even the sentences can have a clearly defined meaning.
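Both can be precise; the point Sussman keeps making is that notation you can run forces the ambiguity out. A trivial example of the same statement in both forms (my own, not from the talk):

```python
# The formula  s_n = sum_{k=1}^{n} 1/k^2  written as an executable definition.
from math import pi

def sum_inverse_squares(n: int) -> float:
    return sum(1 / k**2 for k in range(1, n + 1))

print(sum_inverse_squares(100_000))   # ~1.6449
print(pi**2 / 6)                      # the limit, pi^2/6 ~= 1.644934
```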
🤔 drop the stopwatch or the barometer?
If you drop the stopwatch how are you going to measure time? Using the barometer?
Drop both, simultaneously. If you hear one loud bump instead of two distinct ones, your assumptions are correct. Otherwise, go back and add air resistance to your model, or pump the air out to get a vacuum instead.
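Or go full barometer-joke and use the stopwatch on the drop itself: ignoring air resistance, h = ½·g·t². With a made-up 3-second fall:

```python
g = 9.81              # m/s^2
t = 3.0               # seconds on the stopwatch (made-up number)
h = 0.5 * g * t**2    # free-fall distance, no air resistance
print(f"{h:.1f} m")   # ~44.1 m of building, one barometer down
```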
But ... it's more fun to compute!
imagine how crazy effective these abstractions would be when they're running on a TPU...
Nobody gonna bring up the fact that he's talking about making a Pi supercluster around @6:00 out of 50k computers for a million bucks? My math has been known to be wrong, but which board is available at ~$20 a pop?
The RockPi S has a quad-core processor, so you'd only need 12,500 of those, and they start at ~$10 USD. So you could get that done for well under $1M.
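For anyone checking the arithmetic (board prices are the numbers quoted above, not something I've verified):

```python
budget, nodes = 1_000_000, 50_000
print(budget / nodes)            # $20 per machine at the quoted budget

cores_needed = 50_000
boards = cores_needed // 4       # quad-core boards, per the reply
print(boards, boards * 10)       # 12500 boards, ~$125,000 at ~$10 each
```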
The premise of free processing and memory is wrong on its face. There is no company and no government that doesn't have processing and memory as a consideration. Has this man done anything of note in the private sector? Try thinking this way in game development. And processing cost is always about the cost of implementation. If you had infinite resources you could just run an equation forever, but, of course, you would be long dead along with everyone else.
Your premise falls flat for almost all modern websites. For a private developer it's cheaper to make the user spend more time than it is to spend more on developers. Bigger projects can consider system requirements; most of them do not. Just throw it on the web and be gone.
This all assumes intelligence is mechanical. It assumes programming can go beyond a simple mechanical function. What if the brain doesn't think - what if the mind operates the brain like an engineer controls a machine? I am hearing in this lecture a comparison of a neuron to a circuit or a sensor, and the arrangement of circuits can do what? They just do what they do because someone designed them to do what we already know how to do. I really don't know what this technology can expand into.
Intelligence definitely can be mechanical; we already have mechanical AI systems that can exceed human capabilities at many intelligent tasks, and there is no apparent limit to it. But I feel that life is not merely mechanical, or perhaps there is some transcendence from the mechanical to the living.
@@ssw4m I guess it all depends on our definition of intelligence. If I use your definition, a hinge on a door is intelligent. For me, it would be the ability to solve problems. I would say a computer doesn't solve problems; it just follows a program, and the computer is unaware of what it is doing. The problem is actually solved by the programmer. Every time the door is opened, the hinge doesn't perform an act of intelligence, does it? Every time a computer runs an algorithm it is not an intelligent act; it is no more intelligent than what a hinge does - it only did what it was programmed to do.
@@ssw4m many intelligent tasks like what?
@@sedevacantist1 Sure. But people who follow algorithms might have something to say about you telling them they're not intelligent :) You could also say that the ability to change approaches is a sign of intelligence. Which is obviously true enough - but again, humans routinely _don't_ do that. You could say that the ability to look at the same data and produce different results (i.e. creativity) is a sign of intelligence - but then again, if a program does that, you're going to complain it's buggy. Funnily enough, again, just like with humans.
Understanding how neural networks work is a great insight into how our brains work. Even extremely simplistic models of neurons already display the same features we observe in animals. Look at how data is stored in neural networks - and you'll see where intelligence comes from. It took a lot of evolution to make intelligence work even remotely reliably; again, humans are stupid more often than not - and even when they stumble upon something truly smart, they are very likely not to notice, or to be ridiculed for it. Our standards for software are a lot higher than evolution's :) Every time you read data from a _real_ neural network, you also modify it. We specifically disable that behavior in our models, because it's inconvenient.
What reason do you have to believe anything about this is _not_ mechanical?
@@solonyetski There are many well-known examples: playing chess and other games, solving protein folding, finding new methods of matrix multiplication, generating rational written content based on wide knowledge (much faster, and better, than most humans can), and generating artistic images (much faster, and better, than most humans can). I think AGI is not very far away at all, and we more or less already have all the pieces.
Isn't what he describes (a database of locally consistent computational worldviews which allow global inconsistency) essentially Douglas Lenat's Cyc project? (en.wikipedia.org/wiki/Cyc)
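Cyc is a big hand-curated ontology; what the talk describes is closer to a truth-maintenance system, where every fact carries the premises that support it and a contradiction only rules out that particular combination of premises, leaving the other worldviews alive. A toy Python sketch of that idea (mine, not Cyc's or Sussman's actual machinery), reusing a made-up building-height example:

```python
from itertools import combinations

# Each fact is tagged with the premise set that supports it.
facts = {
    frozenset({"barometer-reading"}): ("height", 46.0),
    frozenset({"stopwatch-timing"}):  ("height", 44.1),
    frozenset({"blueprints"}):        ("height", 45.0),
}

nogoods = set()   # premise combinations discovered to be jointly inconsistent

def consistent(p1, p2, tolerance=1.5):
    (k1, v1), (k2, v2) = facts[p1], facts[p2]
    return k1 != k2 or abs(v1 - v2) <= tolerance

for p1, p2 in combinations(facts, 2):
    if not consistent(p1, p2):
        nogoods.add(p1 | p2)          # only this worldview is ruled out; the rest survive

print(nogoods)   # the barometer+stopwatch combination is the only nogood
```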