28:00 An interesting hack I've heard to deal with pauses is to have every node on the cluster do them in sync. If everybody is collecting garbage at the same time, nobody's around to take the new leader spot.
Thanks very much for this. I watched your video from 10 years ago too; very informative. Do you have any thoughts on whether GPUs have helped or can help?
Evolution [2:07]
Design [2:43]
State Machines [2:51]
Replicated State Machines [3:30]
Distributed Event Log [4:39]
Rich Domain Models (DDD) [5:29]
Data Structures (CS) [6:03]
Time & Timers [6:51]
Fairness [9:20]
(Example graph: old) [10:03]
(Example graph new: fairness) [13:19]
Gateway by classification of customer [13:43]
Sharding Matching Engine [14:27]
Migration by Asset Class [15:10]
OTC => Exchange Traded
Resilience [16:19]
Fault Tolerance [16:25]
Primary + Secondary vs Consensus [16:31]
Paxos / Viewstamp Replication / Virtual Synchrony [20:07]
Raft paper [21:11]
Raft Safety Guarantees [22:09]
(Raft Consensus Module graph) [23:03]
Append Position/Index [24:21]
Commit Position/Index [25:03]
Service Position/Index [26:21]
(lose node) [26:46]
Importance of Code Quality & Model Fidelity [28:19]
Robustness [29:52]
Performance [31:28]
Latency distribution awakening [31:44]
Systemic & queueing events [32:27]
Garbage Collectors [34:29]
Memory Access Patterns & Data Structures [36:26]
Binary Codecs [38:00]
Spectre & Meltdown [39:30]
Greatly increased cost for system calls, page faults, & context switching [40:40]
Advances in Hardware [42:09]
New IO APIs [44:56]
Mechanical Sympathy [46:05]
Does programming language choice matter? [46:22]
culture & design around the language
example: Java [46:47]
great, but design of some APIs hurts
Deployment [48:11]
Continuous Delivery [48:22]
24*7 Operations [50:10]
Flexible Scaling [51:08]
Wrapping up... [52:27]
Coming back to this again. Just as interesting the 2nd time 😊
This is a really good talk!!!!
I just lost myself on the slide numbers about fault tolerance at 25:04. What number divided by 2, plus 1, is equal to 25?
Great talk!
Thank you!
awesome...nano event log resolution = nano market structures to trade with
39:00 -- talk gets overvolted following amateurize v. arbitrage v. amortize wyrding. Then BSD IO gets called out v. frame codecs.
Viewstamped replication is actually the simplest of protocols.