Somebody needs to add a load balancer in front of Axel, all requests are being routed to Axel (he is still handling them pretty well though) :P
lol :D :D :D
HAHA
hahahaha :D
Lmao
Lollllzzz :D
horizontal scaling (13:00 - 21:00)
load balancing & caching (21:00 - 29:00)
shared session state (29:00 - 34:00)
RAID (36:00 - 40:00)
shared storage tech (42:00)
database replication (43:00)
load balancing tech (44:00 - 45:00)
session affinity (46:00 - 51:00)
in-memory caching (59:00 - 1:00)
data replication - active:passive (1:11 - 1:14), active:active (1:16 - 1:21)
partitioning (1:21 - 1:34)
data center redundancy (1:33 - 1:39)
security (1:39 - 1:44)
I found it hard to read, so here it is a list (and corrected the hours):
horizontal scaling (13:00 - 21:00)
load balancing & caching (21:00 - 29:00)
shared session state (29:00 - 34:00)
RAID (36:00 - 40:00)
shared storage tech (42:00)
database replication (43:00)
load balancing tech (44:00 - 45:00)
session affinity (46:00 - 51:00)
in-memory caching (59:00 - 1:00:00)
data replication - active:passive (1:11:00 - 1:14:00) active:active (1:16:00 - 1:21:00)
partitioning (1:21:00 - 1:34:00)
data center redundancy (1:33:00 - 1:39:00)
security (1:39:00 - 1:44:00)
Shaoin
thx a lot
you guys like doing that? try reclipped.com for amazing note taking and sharing on videos
@@shaoin3295 Thank you very much! This is really helpful
This is a piece of history for the modern internet. He mentioned AWS EC2 like a VPS, 12 years have passed but the internet world has completely changed.
I'm legitimately terrified that I'll be in a system design interview, they'll ask me a question, and I'll immediately turn and ask, "Axel?"
😂😂
Dude, Axel by now is the interviewer
Its been 7 years for me. I loved it then and still do. The entire course launched my career!
Being a professional in this space with over 20 years of experience, this is a really good talk, covers basically most of the concerns of high performance web system design.
So it is still relevant as learning material in today's time?
What I learned from all of this is that if you have a problem in your system, just add a load balancer; if the problem persists, add more load balancers; if the problem is the load balancer itself, add a load balancer to the load balancer.
Perfect
if the problem persists, then call Axel!
😂
Call AXL from Guns and Roses! He will solve it with his guitar :P
This guy has such a great Socratic teaching style! At first I was annoyed by the constant questioning, but then I started participating myself (even though I didn't watch the rest of the course) and had a lot more fun!
horizontal scaling (13:00 - 21:00)
load balancing & caching (21:00 - 29:00)
shared session state (29:00 - 34:00)
RAID (36:00 - 40:00)
shared storage tech (42:00)
database replication (43:00)
load balancing tech (44:00 - 45:00)
session affinity (46:00 - 51:00)
in-memory caching (59:00 - 1:00:00)
data replication - active:passive (1:11:00 - 1:14:00) active:active (1:16:00 - 1:21:00)
partitioning (1:21:00 - 1:34:00)
data center redundancy (1:33:00 - 1:39:00)
security (1:39:00 - 1:44:00)
@Shaoin
Where is Axel today in 2024?
Wow! glad to know people ask about me. really learnt a lot in that lecture
Prof. Malan is great
I like that he pauses and asks questions whenever he is starting to introduce a new concept, gives listeners a chance to think about it before he gives the solution so that they know why the solution provided makes sense.
Don't most teachers do this? Either way, David Malan is a world class teacher and computer scientist overall
@@Dakid015 True. But he seems genuinely interested in listening to what others have to say, and understanding their thought process, which is a great skill in a teacher.
I just love David Malan’s teaching style. He’s soooo my favorite professor for CS! How does he make complex topics so easy to understand? Amazing!
Plot Twist : Axel, Jack and Isaac are the only kids in class.
17:07 Louis got one!
and Ben!
LOL!!!
So funny
ha ha ha
This is a really good lecture. 14 minutes in this lecture and I must say the professor knows how to put together the stuff he's teaching. And Axel is everywhere xD
Axel is homie
Axel sold out and is now working at citadel
I used to watch david malan cs50 course back in 2021 when i was in 2nd semester of CS. degree. Now for system design in 2024 :)
It is the most popular tutorial video for system design and recommended almost everywhere. Thank Professor David Malan and the uploader.
This is one of the most informative and well-presented CS lectures I have ever seen
In this class, Axel is the anti-Jon Snow. He knows everything!
You know everything, Axel.
U
Good lecture. I like this style of teaching when you go deeper and deeper based on simple questions.
Nice, never imagined a CS lecture would come close to hitting a million views. I'm an engineer with 10+ years of experience and I've used practically all the stuff he's talking about in production, but sometimes it's just good to hear some theory.
donnemartin's system-design-primer required topics:
horizontal scaling (13:00 - 21:00)
load balancing & caching (21:00 - 29:00)
load balancing tech (44:00 - 45:00)
data replication - active:passive (1:11:00 - 1:14:00) active:active (1:16:00 - 1:21:00)
partitioning (1:21:00 - 1:34:00)
Thank you!
Nobody:
Literally Nobody:
David: Axel?
Regarding the approach discussed around 45-50 minutes in (cookies determining which backend server a LB should route a request to): he presents it as the perfect solution, but he ignores the problem he stated earlier when discussing other approaches. If you have a power user who creates heavy requests and they all go to the same server (as determined by the cookie), the load won't be balanced too well. It also ignores the fact that load on backend servers may change over time (if a given server was underutilized when the cookie was created, that doesn't mean the load will stay that way for the whole lifetime of the cookie).
Great and engaging lecture overall! Kept me focused and answering his follow-up questions in my head :)
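The concern in the comment above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `StickyBalancer` class, the `server_id` cookie name, and the request costs are all hypothetical, not taken from the lecture or from any real load balancer. It assigns a server round-robin on a user's first request and pins every later request to that server via the cookie.

```python
from collections import Counter
from itertools import cycle


class StickyBalancer:
    """Toy load balancer: round-robin on first visit, cookie-pinned afterwards."""

    def __init__(self, servers):
        self._next_server = cycle(servers)
        self.load = Counter({name: 0 for name in servers})

    def route(self, cookies, cost=1):
        # No cookie yet: pick the next server in rotation and remember it.
        if "server_id" not in cookies:
            cookies["server_id"] = next(self._next_server)
        server = cookies["server_id"]
        # Every later request from this user lands on the same server.
        self.load[server] += cost
        return server


def simulate():
    lb = StickyBalancer(["s1", "s2", "s3"])
    power_user = {}
    light_users = [{} for _ in range(6)]

    lb.route(power_user)              # first visit, pinned to s1
    for cookies in light_users:       # one cheap request each
        lb.route(cookies)
    for _ in range(50):               # heavy requests, all pinned to s1
        lb.route(power_user, cost=10)
    return dict(lb.load)


if __name__ == "__main__":
    print(simulate())
```

In this toy run nearly all of the work ends up on the power user's server, which is the imbalance the comment describes. Real balancers can soften this with cookie expiry or by combining stickiness with load-aware assignment, but that is outside what this sketch models.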
As a tenured engineer, this was really good to listen to (even if I knew most of it from experience). It also made me a bit jealous of the quality of education at Harvard.
best teacher of CS, and best student of CS , both in same class!
Detailed Summary for [CS75 (Summer 2012) Lecture 9 Scalability Harvard Web Development David Malan] by Merlin
[00:08] Scalability and Web Hosting
- Explaining different options to deploy applications and handle traffic online, including shared web hosting and virtual private servers (VPS).
- Outlining vertical scaling and horizontal scaling as ways to increase resources and handle traffic spikes, and discussing the advantages of Amazon Web Services (AWS) for automating scaling.
[07:52] Horizontal scaling means using cheaper, slower machines instead of expensive, high-end ones to stay below the ceiling of what is possible
- Multiple slower, cheaper machines replace one high-end, expensive machine when building out the topology
- Using SAS drives or SSDs rather than parallel ATA or SATA to speed up data read/write times on databases
[22:24] DNS round robin can lead to uneven server load
- Caching can contribute to disproportionate load on certain servers
- Sophisticated load balancing approaches can mitigate this issue
[30:13] Storing sessions using RAID technology can increase site performance and provide redundancy
- RAID technology can be used to store session data on a file server
- RAID 0 can improve performance by striping data across multiple identical hard drives, while RAID 1 mirrors data for redundancy
[43:51] Load balancers can be expensive, but there are software alternatives.
- Software like HAProxy can be used for load balancing.
- Cookies can also be used for maintaining sticky sessions, without compromising on privacy.
[50:41] Using file based caching on static content may improve performance, but sacrifices space due to redundancy
- File based caching means less regeneration of content
- Redundancy in basic HTML tags can lead to increased disk space usage
[01:04] Using a memory engine can help implement a cache efficiently
- Memory engine tables are stored in RAM and can be used to write keys and values to implement a simple cache
- Archive engine tables are slower to query but get compressed automatically, making them useful for storing log files
[1:11:32] Having multiple slave databases serves as a redundancy for data loss
- Databases attached to a master allow for a copy of every row that's in the master database, which can be further utilized as a redundancy if one database dies
- Master-master setup is another redundancy solution to keep multiple databases in sync
[1:25:37] Load balancer can ensure sticky sessions with cookie-based routing
- Adding a load balancer to route traffic and store session cookies can ensure sticky sessions
- Multiple master databases and cross-connecting them with load balancers adds redundancy but also complexity
[1:33:13] Having multiple data centers and load balancing can improve uptime and reduce failure possibilities.
- Avoid creating loops in network redundancy with two switches per server/ device.
- Distribute load across different data centers and use global load balancing for higher uptime.
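The RAID 0 and RAID 1 point in the summary above can be made concrete with a toy model. This is a rough sketch that treats each drive as a Python list of blocks; the function names are hypothetical, and real RAID works at the block-device level with parity, controllers, and rebuild logic that are not modeled here.

```python
def raid0_write(blocks, n_drives):
    """Stripe blocks across drives: block i goes to drive i % n_drives."""
    drives = [[] for _ in range(n_drives)]
    for i, block in enumerate(blocks):
        drives[i % n_drives].append(block)
    return drives


def raid0_read(drives):
    """Reassemble striped data by interleaving the drives in order."""
    n = len(drives)
    total = sum(len(d) for d in drives)
    return [drives[i % n][i // n] for i in range(total)]


def raid1_write(blocks, n_drives):
    """Mirror blocks: every drive holds a full copy."""
    return [list(blocks) for _ in range(n_drives)]


if __name__ == "__main__":
    data = ["A", "B", "C", "D"]
    print(raid0_write(data, 2))  # each drive holds half the data
    print(raid1_write(data, 2))  # each drive holds all the data
```

With striping, each drive only handles half of the blocks, which is where the performance gain comes from, but losing either drive loses the data. With mirroring, any single surviving drive still holds everything, at the cost of doubling the storage used.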
thank you so much for putting this out online, it's been amazingly useful, totally relevant in 2023
So much wisdom in one man. This is a real teacher. I want him to be my masters lecturer. He is so good.
you mean Axle, right?
@@AbhishekRaj174 :D
Axel is wasting his money on this course
:D
I was curious about this guy. Searched around and apparently he quit Harvard after this class and founded a company called Newsle, which got acquired by LinkedIn pretty soon. Very impressive...
nice find. thanks
all the questions were too easy and no one else was bothering to engage
It says he studied at Harvard until April 2011, but this course is from 2012. This may be a different Axel.
Love this content and professor, it reminds me of my favorite professor in my undergrad for computer architecture. What made him so great was a superb delivery on a communication level, with a tremendous talent for speaking and enunciating. He knew his material inside and out, and he also knew how to draw the class in by making the class think extremely hard as a whole by presenting new challenging problems and inviting conversation. Bravo and well done. These profs are gems.
Thank you for the lecture
10 years later it's still very good
And that cache server on axel is running smooth
Amazing lecture! Great professor and engaged students. Thanks for sharing this.. still relevant in 2024
Axel is unstoppable!!
Great lecture. How I Wish I studied CS from professors like him.
I wish I could attend his live class someday. An amazing way to build up an understanding of System Design.
I picture Axel just sitting there holding his hand up the entire lecture
Is Axel a teaching assistant making sure there is no awkward silence?
🤣
And it helps us to concentrate also
I don't think Axel needs to take this course. He seems to know everything.
anti-Jon Snow doesn't mean he knows everything, it means he knows at least something. Simple negation :P
one of the best lectures I have ever attended to learn cs things
Axel is a beast
very nice. I don't usually watch 1 hour+ videos on a single go, but this had me glued. Great lecture.
Best teacher ever!
@@gustavogianotti4128 No. Malan
I wish my prof was as good as him.
Enjoying his lecture even after 10 years of upload.
What if this is just a student presenting a project and Axel is the real teacher ?
😂😂😂
David is THE INSTRUCTOR.
*Axel
🤣🤣🤣
I wish my university and professors were this good. Thanks Harvard and David Malan
9 years later, still relevant to full extent.
It was recommended over a website, and this lecture is very informative and refreshes many jargons and provides some more insights as well. Very much recommended and thanks for this lecture Professor.
This is the best lecture to learn System Design. My life has changed!
This lecture can never get old!
I think Axel flunked the course and is retaking it !! :P
lol
@mark that's why he knows. he goes back to crush it again.
It is one of the most useful resources for system design
Axel is now an engineering manager at LinkedIn. How old is this video???!!!
2012, it's in title.
Wow, that was extremely clear, especially the wrap-up example he did at the end!
Meanwhile, the other students started playing a game where they drink shots every time the professor says 'Axel'. Thank you @David Malan for a great lecture.
Axle here really answering majority of the questions
amazing storyline, progressively building it up from scratch
There should be an Axel meme. Every time someone has a question, they ask Axel.
This Axl guy had nearly all the answers, definitely smart enough to get into Harvard.
Didn't know Moriarty teaches at Harvard
+Ramón López LOL
+Ramón López Sherlock is Axel, in fact.
LOL I was thinking the same thing!
You Rock Archn. I am completely impressed by your dedication to help the world for ensuring quality education. Love you sir!!! Great great job!!
That was an amazing class! Thank you very much, @David Malan!
this is ultimate... I watched an entire thing.. It's a good class to attend after CS50
Loved listening to the class, much more awake to this than any of my classes.
Need to give Axel a microphone next time I think.
Prof’s session has remarkable stickiness to the Axel node.
Jack and Exel? They're probably off chasing unicorns in the land of glitter and rainbows! 🌈🦄
Axel did Axellent
nice !
lol
Great class, way better than any class I had in my college
@1:30:00 diagram is not equivalent to previous one because if Alice is sent to server 1, her data is now on both db servers due to master-master replication. So cross-connect is only needed if a db server fails, to ensure the other db server is used.
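The point in the comment above can be sketched in a few lines. This is a minimal illustration under loud assumptions: `DbNode`, `make_pair`, and `read_with_failover` are hypothetical names, replication here is synchronous and in-process, and there is no conflict resolution, none of which reflects how a real database implements master-master replication.

```python
class DbNode:
    """Toy database node that forwards each write to its peer."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.peer = None
        self.alive = True

    def write(self, key, value, _replicated=False):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.rows[key] = value
        # Forward once to the peer; the flag stops the write bouncing back.
        if not _replicated and self.peer is not None and self.peer.alive:
            self.peer.write(key, value, _replicated=True)

    def read(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.rows[key]


def make_pair():
    """Create two nodes that replicate to each other (master-master)."""
    db1, db2 = DbNode("db1"), DbNode("db2")
    db1.peer, db2.peer = db2, db1
    return db1, db2


def read_with_failover(primary, fallback, key):
    """The cross-connect: only used when the primary is unreachable."""
    try:
        return primary.read(key)
    except ConnectionError:
        return fallback.read(key)


if __name__ == "__main__":
    db1, db2 = make_pair()
    db1.write("alice", "cart: 3 items")
    db1.alive = False
    print(read_with_failover(db1, db2, "alice"))
```

Because Alice's write to `db1` is already on `db2`, the fallback path only matters once `db1` is down, which matches the comment's reading of the diagram.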
I usually listen to these videos on 1.5 or 2x, but I legitimately considered putting it on .75x lol
I'll upload them today. Thanks for letting me know!
Such a good lecture, but the only complaint is the voices of the students are SO low!! I can't hear them.
Did you at least hear Axel????
YES I HEARD AXEL LOUD AND CLEAR
The most engaging lecture I've watched in recent times!
The teacher is excellent, this needs to be noted
great video from the archive. long live the internet.
Holy moly! What a prof! Brilliant explanations!
All Computer Science lectures begin with a lot of curiosity and enthusiasm, but always, and I mean always, end with buildings burning down (1:34:30)
This instructor is just great! Does anybody know if the guy has a channel (YouTube or whatever) of his own? And damn... someone bring a cooler to Axel.
Thanks for this lecture, Axel.
this man is great, how interactive he is.
Born too late for the comp sci boom. Born too soon to avoid the comp sci recession.
Currently I’m a junior at a company but the economy sucks right now. I’m studying system design topics to keep up my knowledge.
Great lecturer! Everything is very clear
Axel is clearly the favorite student!
Top notch content! Thank you so much for uploading this 🍀
Great lecture, thanks for uploading it! It's incredible that even in 2013 he didn't mention JavaScript at all!
Do you think NodeJS is mature enough for Backend? Or Perhaps C# is.
@@adityap8387 Yes, node is being used in the backend by many big companies like Netflix and PayPal. However, the most critical backend tasks are still done in Java.
@@adityap8387 yup, definitely mature enough
this is a fantastic lecture, great for prepping for interviews and what not
I was looking for a video that explains an end-to-end solution. I really appreciate you giving the lecture and uploading the video.
David Malan great lecturer . I was already knowledgeable now I know more. ++
Am I the only one coming back to this before each System Design interview?
Just wow! Very informative and useful. Thank you for making this available to us.
11 years still relevant
After a certain point, I started calling out to Axel for an answer in my head whenever there was an awkward silence. Basically, I am cheering for Axel.
Excellent lecture Sir. It was a pleasure listening to you and the topic.
I still have a question about a user who puts a lot of load on a single server. At first we were worried that someone might send many heavy requests to a single server, so we introduced a load balancer. However, once we use the cookie to decide which server stores that user's session, we still keep sending those heavy requests to that single server.
In the real world, a single user shouldn't be able to bring a whole server down like this. I just keep wondering at what point we forgot about this problem.
Even though I’ve been working as a SWE for a while now, this video makes me wish I had majored in CS vs EE. This stuff is so much more interesting to me than EE crap.
Well, I major in Materials Science . Even more crappier than EE.
EE is also interesting. It's just that most of the cool stuff in EE is expensive to experiment with, afaik, whereas with CS you can spin up a cluster and test all the theories mentioned in this lecture on your own at very low cost (almost free). For EE, even if you want to do small-scale electronics (I am from ECE, hence the electronics), even a breadboard costs money.
1:30:00 " ... Functionally this is equivalent" - not really, since the horizontal connection between db1 and db2 presumably represents bidirectional replication.
I enjoyed this lecture! Thanks David.
The lecture is a gem
Axel... Axel... Axel is everywhere.
This is a very useful video to learn about infra architecture..