"Turning the database inside out with Apache Samza" by Martin Kleppmann
- Published: Oct 5, 2024
- Databases are global, shared, mutable state. That's the way it has been since the 1960s, and no amount of NoSQL has changed that. However, most self-respecting developers have got rid of mutable global variables in their code long ago. So why do we tolerate databases as they are?
A more promising model, used in some systems, is to think of a database as an always-growing collection of immutable facts. You can query it at some point in time - but that's still old, imperative style thinking. A more fruitful approach is to take the streams of facts as they come in, and functionally process them in real-time.
This talk introduces Apache Samza, a distributed stream processing framework developed at LinkedIn. At first it looks like yet another tool for computing real-time analytics, but it's more than that. Really it's a surreptitious attempt to take the database architecture we know, and turn it inside out.
At its core is a distributed, durable commit log, implemented by Apache Kafka. Layered on top are simple but powerful tools for joining streams and managing large amounts of data reliably.
What do we have to gain from turning the database inside out? Simpler code, better scalability, better robustness, lower latency, and more flexibility for doing interesting things with data. After this talk, you'll see the architecture of your own applications in a completely new light.
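The idea in the abstract above (an append-only log of immutable facts, with views derived by functionally processing the stream) can be illustrated with a small sketch. This is not Samza or Kafka code: `Fact`, `CommitLog`, `apply_fact`, and `materialize` are hypothetical names invented for illustration, and the log lives in memory rather than being distributed or durable.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Fact:
    """One immutable event in the log (hypothetical shape)."""
    offset: int
    user: str
    delta: int


class CommitLog:
    """Toy in-memory stand-in for a durable commit log: append and read only."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def append(self, user: str, delta: int) -> Fact:
        fact = Fact(offset=len(self._facts), user=user, delta=delta)
        self._facts.append(fact)
        return fact

    def read_from(self, offset: int = 0) -> Iterable[Fact]:
        # Return a copy so that readers cannot mutate the log.
        return list(self._facts[offset:])


def apply_fact(view: Dict[str, int], fact: Fact) -> Dict[str, int]:
    """Pure function: build a new view from the old view plus one fact."""
    updated = dict(view)
    updated[fact.user] = updated.get(fact.user, 0) + fact.delta
    return updated


def materialize(facts: Iterable[Fact],
                initial: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Fold a stream of facts into a materialized view (balance per user)."""
    return reduce(apply_fact, facts, dict(initial or {}))


log = CommitLog()
log.append("alice", 5)
log.append("bob", 3)
log.append("alice", -2)

# Build the view from the start of the log.
view = materialize(log.read_from(0))

# A new fact arrives; update the view incrementally from offset 3.
log.append("bob", 1)
updated_view = materialize(log.read_from(3), view)
```

The view is only a cache of the log: replaying from offset 0 gives the same result as the incremental update, so the view can be discarded and rebuilt at any time.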
Speaker: Martin Kleppmann @martinkl
Martin is a committer on Apache Samza (a distributed stream processing framework), a software engineer at LinkedIn, and an author at O'Reilly (currently writing a book on designing data-intensive applications). He invented the infamous "LinkedIn Intro" email proxy. Previously he co-founded and sold two startups, Rapportive and Go Test It. He is based in Cambridge, UK.
2017 and this is still incredibly interesting. Thank you.
2024 and it is still not possible to wish away state. Many thanks for this work!
Amazing speaker, explaining a difficult concept with simplicity. 2024, still interesting!
2021 still awesome! Thank you!
Martin Kleppmann The God of Distributed Systems! Thanks Strange Loop for sharing this
This is the most important talk of the century about all kinds of future computing logic
Brilliant talk. I disagree only with the comment "Kill REST APIs" but do agree with reducing the focus on request/response systems. Req/Res is an implementation detail of HTTP; REST can work over websockets. REST is a concept for building distributed systems, and it is in no way limited to APIs or HTTP (req/res). That being said, most "REST" APIs are implemented incorrectly since they lack hypermedia controls in their message structure.
Still Relevant! Awesome content Martin :)
really digging these hand written slides
I like Martin Kleppmann, he's a very bright person and a good teacher. I disagree though with his statement "Kill REST". In this talk he proposes to use streams over REST but imo this is all use case dependent. Also the idea of a stream is not so new, publish/subscribe communication flow is pretty widely used already, just think about web sockets. Think about an application that doesn't need to be updated about any CRUD operations within the DB in real time (like 95% of applications). Would you still introduce a complex stream based backend over simple REST?
I really enjoy this kind of thinking. Thanks for the talk.
Genius! Way ahead of his time
CQRS?
+maverick88NL Totally! However, I bet that in the coming period many people will feel uncomfortable switching from CRUD to CQRS. The worst issue I encountered was the essential separation of the write model from the read model, especially how to fit everything together with a specific technology. Implementation of the CQRS pattern can be ridiculous sometimes. :)
MQTT is a publish/subscribe server with open source JavaScript web socket libraries that's been around for a long time. I used it in the public safety sector for officers to subscribe to streams published by the dispatch center. I guess I'm fuzzy on how this differs from that other than fuzzing the lines between the MQTT server and the database it may ride on.
Excellent presentation.
So Event Sourcing then?
"Martin Kleppmann - Event Sourcing and Stream Processing at Scale" - ruclips.net/video/avi-TZI9t2I/видео.html
I wonder what Martin Kleppmann thinks of relay and graphql
more like redux ...
I wonder if this guy knows Datomic. I think it's exactly what he wants :)
+Matúš Lešťan He mentions this IN THE VIDEO...
+Matúš Lešťan He compares to Datomic at 43:05
+Matúš Lešťan Yes, He has mentioned Datomic in his book Designing Data Intensive Applications, 2nd chapter.
If he wants it, chances are he probably built it.
good presentation. very interesting.
Does anyone know what software was used to draw these slides? Would be great for university
thanks folks
iPad Pro with pen would suffice :)
24:22 "Kappa" Architecture. :)
Amazing! Thanks!
Kafka streams seems to have killed Samza
I read the phrase "When a client reads from a materialized view, it can keep the network connection open." from Martin's book "Making sense..." and wondered where that was coming from. How does a materialized view offer such a feature?
databases are so 1970's
Turing machines are so 1940's
Pants are so 1800's
Wheels are so 4000 BC