When should you shard your database?

  • Published: 24 Nov 2024

Comments • 156

  • @wh264
    @wh264 4 years ago +153

    Thanks for the excellent content. I've summarized this for my own understanding.
    Before you shard, try the following first:
    0. Understand what your actual problem is before optimizing (reads too slow vs. writes too slow). Analyze your slowest queries and see why they are slow: ruclips.net/video/-qNSXK7s7_w/видео.html Create indexes on appropriate columns and tune your data schema.
    1. Horizontal Partitioning - Pick a partition key (usually the primary key) and split the data into different ranges. This creates smaller B-trees for the indexes.
    2. Vertical Partitioning - When you have columns that you rarely access, cut those columns out of the main table. This makes reads faster for frequent queries and slower for infrequent ones, and makes your B-trees smaller (using less memory as well).
    Partitioning Video: ruclips.net/video/QA25cMWp9Tk/видео.html
    ----
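
    A minimal sketch of the range-based horizontal partitioning described in point 1 above, assuming an integer primary key. The boundary values, the partition names and the partition_for helper are hypothetical and chosen only for illustration; a real database does this routing internally.

```python
from bisect import bisect_right

# Hypothetical exclusive upper bounds of the first three key ranges.
PARTITION_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
# One more name than bounds: the last partition takes every key >= 3_000_000.
PARTITION_NAMES = ["orders_p0", "orders_p1", "orders_p2", "orders_p3"]


def partition_for(primary_key: int) -> str:
    """Return the name of the partition whose key range contains primary_key."""
    # bisect_right counts how many bounds are <= primary_key,
    # which is exactly the index of the partition that owns the key.
    return PARTITION_NAMES[bisect_right(PARTITION_BOUNDS, primary_key)]


if __name__ == "__main__":
    for key in (42, 1_000_000, 3_500_000):
        print(key, "->", partition_for(key))
```

    Each partition covers a narrower key range than the whole table, which is why the index B-tree built over it stays smaller.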

  • @gregt0m
    @gregt0m 3 years ago +45

    Hussein, you have a great style of presentation with proper tone, cadence, painting pictures without use of displays, and humor thrown in the right places. All this with no sense of arrogance exuded. Love your videos.

  • @dipunjgupta8082
    @dipunjgupta8082 3 years ago +26

    Most underrated channel on youtube. Sometimes I get bored from work and I come here to learn something interesting. You don't even know how much your videos mean to me. Thanks a lot Hussein!
    I am gonna use social distancing analogy a lot from now on 😂

  • @varatharajandhamotharan1511
    @varatharajandhamotharan1511 3 years ago

    I would say underrated channel. He is not teaching, but he is discussing in a very informative way.

  • @tahoemph
    @tahoemph 2 years ago +11

    One thing you don't mention which is important to understand is that read replicas cause some load on your primary for replication. Much like anything else, as you said, it isn't free. But it is fairly cheap.
    One use case you missed for sharding is data sovereignty. Sometimes data can be split into groups by location (e.g. zip country code) which not only can help with performance but can meet legal requirements for where data lives.
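
    One way to picture the data-sovereignty case above is a lookup that routes each record to a regional shard by country code. This is only a sketch: the shard names, the country-to-region mapping and the shard_for_country function are assumptions for illustration, not something shown in the video.

```python
# Hypothetical mapping from ISO 3166-1 alpha-2 country code to a regional shard.
REGION_SHARDS = {
    "DE": "eu-shard",
    "FR": "eu-shard",
    "US": "us-shard",
    "IN": "in-shard",
}


def shard_for_country(country_code: str) -> str:
    """Return the shard that must hold data for the given country."""
    code = country_code.upper()
    if code not in REGION_SHARDS:
        # Fail loudly instead of falling back to a default shard: silently
        # storing data in the wrong region could break a residency rule.
        raise ValueError(f"no shard configured for country {code!r}")
    return REGION_SHARDS[code]


if __name__ == "__main__":
    print(shard_for_country("de"))
    print(shard_for_country("US"))
```

    Raising on an unknown country is a deliberate choice in this sketch, since a default shard would hide exactly the kind of mistake the legal requirement is meant to prevent.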

  • @abdelrahmanshehata7942
    @abdelrahmanshehata7942 2 years ago +1

    You are such a genius !!!
    You started by answering the question very early in the video, I like it.
    Then you started explaining everything very nicely !!!
    Perfect !!! Go on maaaaaaan

    • @hnasr
      @hnasr  2 years ago +1

      I'm glad you liked it!
      Good luck!

  • @ganeshkhirwadkar4127
    @ganeshkhirwadkar4127 3 years ago

    Being a non-backend developer, I still feel, at least while watching your videos, that I am one of them!!! Superb and easy explanation.

    • @hnasr
      @hnasr  3 years ago +1

      ❤️ thanks Ganesh!

  • @mramakrushnaYT
    @mramakrushnaYT 4 years ago +1

    That's wonderful, Hussein. Understanding the why before going for a specific tech..

    • @hnasr
      @hnasr  4 years ago

      Thanks Rama!

  • @punerealestatebuilder
    @punerealestatebuilder 2 years ago

    I saw many videos, but the way you explained horizontal/vertical partitioning in just 30 sec is going to be with me forever.

  • @adityajoardar578
    @adityajoardar578 3 years ago +2

    This channel is addictive

  • @harshitbajpai4942
    @harshitbajpai4942 3 years ago

    The content is excellent for learning and really clears up lots of stuff. But if anyone watching this video has an interview lined up (which is most probably true), don't explain in this fashion the strategy you choose to justify the problem you would be solving.

  • @zorsen117
    @zorsen117 4 years ago +37

    16:11 do you really want to do this with your life? I don't know man, I should have been a cook or something.

    • @dejangegic
      @dejangegic 3 years ago

      @Jamison Grotzinger I will kindly ask you to fuck off

  • @ahsaanali4512
    @ahsaanali4512 2 years ago

    I really love watching your videos even though these topics are not part of my job, because I know I'll definitely learn something new. So keep updating us and keep uploading; much appreciated.

  • @abhileo17
    @abhileo17 2 months ago +1

    made it so easy! awesome

  • @section9999
    @section9999 4 years ago +7

    Wow a whole ton of good stuff here. Props to you good sir!

    • @hnasr
      @hnasr  4 years ago +3

      DataSurgeon 369 😊🙏 enjoy thanks for your comment

  • @natem889
    @natem889 3 years ago

    Loved the way you explained. After a long time, I listened to some video for the whole duration.

  • @sundeepdharma
    @sundeepdharma 4 years ago +16

    Sharding is not really needed; as you mentioned, we can go for partitioning and local indexes within the partitions themselves. If writes are heavy, there are various options in enterprise products. Long ago I worked with an Oracle RAC setup with VPLEX managing the data storage for different nodes. Avoid writing component logging or audit-trail logging to an RDBMS; write it to NoSQL instead. I personally think only business data (OLTP) should live in the RDBMS; everything else can go to a log database (NoSQL), Splunk, Datadog, etc.

  • @vnaveenkumar982
    @vnaveenkumar982 3 years ago

    The best content on the internet, with crazy presentation skills. It was wonderful, Hussein.

  • @aphroditesempai2186
    @aphroditesempai2186 3 years ago +1

    Hi, thanks for your efforts. It's very hard to find experienced devs sharing their industrial challenges and providing good insights. Hoping to learn more. Keep up the work. Fighting! ☺👏👏

  • @karthikeyansrinivasan52
    @karthikeyansrinivasan52 4 years ago +11

    Another fantastic video from this great guy to start another beautiful day!!!

  • @banxt
    @banxt 9 months ago

    “Predictably Irrational” is a very good book! :-p

  • @101kawsar
    @101kawsar 2 years ago

    My man often mentions Django, I love it :)

  • @hrayrpetrosyan5330
    @hrayrpetrosyan5330 3 years ago

    Thank you so much, Hussein! You're doing such a good job.

  • @neeravarora530
    @neeravarora530 4 years ago +2

    Awesome video, Hussein. Every time I learn something new from your videos. Thanks for sharing your knowledge.

    • @hnasr
      @hnasr  4 years ago

      I am glad you are! Thanks for your comment

  • @MarcMcRae
    @MarcMcRae 4 years ago +1

    Very good background knowledge leading into the sharding explanation. Nicely explained. Thank you!

  • @vishalsrane
    @vishalsrane 2 years ago

    knowledge we get here is pure gold. Thank you 🙏

  • @rodrigocaballerohurtado5367
    @rodrigocaballerohurtado5367 4 years ago +7

    Hussein: You can no longer perform transactions with sharding
    Me: thanks captain, that's it for me on the subject

  • @nadertarek4822
    @nadertarek4822 2 years ago

    When I find this DENSE content really enjoyable, just like I'm watching Netflix, that means one thing: you are really GREAT!!! Thank you so much, Hussein.

  • @pradeepgupta4647
    @pradeepgupta4647 10 months ago

    My search ends here, to clear my doubt thank you.

  • @ssksarraju
    @ssksarraju 3 years ago +1

    Great video. Love the way you explain things. I wish you had elaborated a little more, or perhaps you could make a new video, on why transactions are tough with sharding.

    • @hnasr
      @hnasr  3 years ago +1

      Correct, that would require another video because it's a deep topic. Thanks for your comment ❤️

  • @dexterlohnes
    @dexterlohnes 1 year ago

    Great video! Thanks so much for taking the time to make this.

  • @stephennjuguna3793
    @stephennjuguna3793 3 years ago

    Just Wow! Learning so much from you Hussein

  • @aashishgoyal1436
    @aashishgoyal1436 4 years ago +2

    Thanks a lot Hussein. Really nice and engaging video with an awesome explanation.
    You deserve more views.
    Gonna share it with my peers. Cheers!

    • @hnasr
      @hnasr  4 years ago

      Aashish Goyal thank you 🙏

  • @libranpal
    @libranpal 3 years ago +1

    Just to be clear, you don't sacrifice transactional capability per se. You can't run transactions across shards, but if the sharding is designed so that no client depends on data in more than one shard, transactional capability isn't lost. Think of a cloud company where the database is sharded per tenant and every tenant has private data. Say shards are created on a range of tenant names (a-c for shard1, d-e for shard2, etc.); you aren't going to lose any capabilities here.
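
    A small sketch of the per-tenant range sharding this comment describes, assuming tenant names start with a Latin letter. The a-c and d-e ranges come from the comment; the third range (f-z on shard3) and the shard_for_tenant helper are assumptions added so the example is complete.

```python
from bisect import bisect_left

# Last letter of each tenant-name range, in order.
# a-c and d-e are taken from the comment; f-z -> shard3 is an assumed filler.
RANGE_ENDS = ["c", "e", "z"]
SHARDS = ["shard1", "shard2", "shard3"]


def shard_for_tenant(tenant_name: str) -> str:
    """Pick the shard that holds all data for one tenant."""
    if not tenant_name:
        raise ValueError("tenant name must not be empty")
    first = tenant_name[0].lower()
    if not "a" <= first <= "z":
        raise ValueError(f"unsupported tenant name: {tenant_name!r}")
    # bisect_left finds the first range whose last letter is >= first.
    return SHARDS[bisect_left(RANGE_ENDS, first)]


if __name__ == "__main__":
    for tenant in ("acme", "Contoso", "delta", "zeta"):
        print(tenant, "->", shard_for_tenant(tenant))
```

    Because every row of a tenant resolves to the same shard, a transaction that touches only that tenant's data runs against a single database and keeps its normal guarantees.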

  • @WeiLiuhaha
    @WeiLiuhaha 2 years ago

    It's such a pleasure to watch! Fun and knowledgeable!

  • @cloud15487
    @cloud15487 4 years ago +3

    Lol I love the way you explain stuff. Subscribed!

  • @phillbaska
    @phillbaska 4 years ago

    Great video as usual. Great sense of humour and your explanations are very easy to follow.

  • @АйбарЖоламанов
    @АйбарЖоламанов 2 years ago

    Great video! I respect that you mentioned go

  • @EzequielRegaldo
    @EzequielRegaldo 3 years ago

    That's why I love MongoDB. Sharding is a breeze, but it could be better with relations. I want it, I need it. The Mongo router rocks.

  • @robertkozik4845
    @robertkozik4845 3 years ago +3

    Just pointing this out: you can overwhelm a MySQL server's IO capacity very quickly with php-fpm at scale, because each request spins up its own database connection. Unless you're throwing the write requests into a message queue and then bulk inserting its contents via some stateful service, each connection = at least 1 IO event. And if you're co-locating, odds are you don't want to set your io_capacity parameter above 2000 IOPS because of SSD burnout, and with php-fpm's concurrency model (or lack thereof) it's not hard to hit that threshold at even a small scale. For php-fpm sites of scale you'll typically see 30-50k IOPS, so even a cloud-based solution would be cost prohibitive.
    Not trying to be a jerk or anything, because this requires a lot of specific knowledge of a particular programming language's execution model. But having said that, it's totally possible to overwhelm IO without a few million daily page views, since you've got to keep in mind that 80% of the traffic hits in 20% of the day. That's also basically the mark at which Web 2.0 companies of yesteryear started sharding their LAMP apps.
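
    The 80/20 point can be turned into rough arithmetic. The sketch below estimates peak load for a hypothetical site with 3 million daily page views. The comment only says each request costs "at least 1 IO event", so the figure of 15 IO events per page view is purely an assumption, picked to show how the 2000 IOPS ceiling could be crossed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def peak_requests_per_second(daily_views: float,
                             traffic_share: float = 0.8,
                             time_share: float = 0.2) -> float:
    """Average request rate during the busy window (80% of traffic in 20% of the day)."""
    busy_seconds = SECONDS_PER_DAY * time_share
    return daily_views * traffic_share / busy_seconds


def peak_iops(daily_views: float, io_per_view: float) -> float:
    """Peak IO events per second, given an assumed IO cost per page view."""
    return peak_requests_per_second(daily_views) * io_per_view


if __name__ == "__main__":
    views = 3_000_000   # hypothetical daily page views
    io_per_view = 15    # assumed IO events per page view, not from the comment
    print(f"peak requests/s: {peak_requests_per_second(views):.1f}")
    print(f"peak IOPS:       {peak_iops(views, io_per_view):.0f}")
```

    Under these assumptions the busy window averages roughly 139 requests per second, and the assumed IO cost per view pushes the total past 2000 IOPS; with a different cost per view the result scales linearly.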

    • @ethanj1533
      @ethanj1533 2 years ago

      So just don’t use fpm?

  • @judylee9452
    @judylee9452 3 years ago

    You are amazing . You are so wise. Thank u.

  • @krisna44
    @krisna44 3 years ago

    Your presentation is excellent

  • @sanjaybhatikar
    @sanjaybhatikar 2 years ago

    THOU SHALT NOT over-engineer too early. I am definitely putting that up on my wall :)

  • @Wherrimy
    @Wherrimy 3 years ago +1

    Thanks, I'll stick with sharting

  • @kapilrbagul
    @kapilrbagul 4 years ago +14

    Excellent one... just wanted to know your approach to be on top of latest technology. Which resources do you use? Which technology podcast do you listen? Can you make video on this topic? 😊

  • @abhay626
    @abhay626 3 months ago

    Amazing, thanks Hussein!

  • @RAYGUNWOD
    @RAYGUNWOD 4 years ago +1

    Fantastic video - thank you so much Hussein!

  • @yahyaandizan9561
    @yahyaandizan9561 4 years ago

    Good content for the basic understanding. Excellent

    • @hnasr
      @hnasr  4 years ago

      Glad it was helpful!

  • @abhishekt800
    @abhishekt800 2 years ago

    loved it..thanks for the explanation

  • @bashashaikabdul
    @bashashaikabdul 2 years ago

    Great explanation

  • @ReyAlexam
    @ReyAlexam 3 years ago

    I learned a lot by this. Thank you

  • @UnaliverOfChildren
    @UnaliverOfChildren 8 months ago

    Remember, when you are confused and you don't know why it's slow, always start with distributed caching

  • @KuriaNdungu
    @KuriaNdungu 3 years ago

    Excellent content bro

  • @kartikgupta3234
    @kartikgupta3234 13 days ago

    Great content !!

  • @devendranarayan9748
    @devendranarayan9748 3 years ago +1

    Sometimes you sound like Gru 😂
    Great content.

  • @sscapture
    @sscapture 1 month ago

    You're such a cool guy!!!!!

  • @Euquila
    @Euquila 3 years ago

    sharding is like sharting, unpleasant but sometimes necessary

  • @tarekali7064
    @tarekali7064 4 years ago +1

    Great videos! Thank you for making them!

    • @hnasr
      @hnasr  4 years ago

      Thanks Tarik!

  • @DheerajKumar-wk9xi
    @DheerajKumar-wk9xi 3 years ago

    Good content, Hussein, but it would be great if you used a whiteboard or some pictorial content instead of showing everything in the air.

  • @thulasipb123
    @thulasipb123 3 years ago

    Excellent video. Thank you

  • @omose14
    @omose14 2 years ago

    Awesome bro!

  • @sanjaybhatikar
    @sanjaybhatikar 2 years ago

    Beautiful, thank you :)

  • @coding3438
    @coding3438 3 years ago +1

    It’s difficult to focus when the only thing I can’t take my eyes off is the silver ps2

  • @TheGdhungana
    @TheGdhungana 4 years ago

    Glad to see you bro.. I was wondering what you look like..!

  • @mustafaalmulla
    @mustafaalmulla 4 years ago

    FYI, MongoDB has ACID transaction support with sharding

  • @viraj_singh
    @viraj_singh 1 year ago

    so my takeaway is
    sharding is for scaling write queries (which is rare)
    partition is for scaling read queries

  • @krorrarst9350
    @krorrarst9350 3 years ago

    Life Saver!

  • @shaheerzaman620
    @shaheerzaman620 4 years ago

    Great video Hussein. Can you please make a video on indexing and Acid transactions? That would be great! Thanks.

    • @hnasr
      @hnasr  4 years ago

      shaheer zaman thanks Shaheer! Check out my ACID video here: Relational Database ACID Transactions (Explained by Example)
      ruclips.net/video/pomxJOFVcQs/видео.html
      I still need to work on the indexing video, coming soon :)

  • @donlywaybv
    @donlywaybv 3 years ago +1

    I'm your subscriber :D

  • @yashverma7084
    @yashverma7084 2 years ago

    thanks for the video

  • @SyedHaris007
    @SyedHaris007 4 years ago

    Please make a video on Elasticsearch

  • @desiaclementslewis8318
    @desiaclementslewis8318 3 years ago

    thank you

  • @sahersalamh9032
    @sahersalamh9032 4 years ago

    Amazing video. Please make an Arabic channel; we are missing this content here, dude.

  • @lucavogels
    @lucavogels 4 years ago

    Really great!

  • @serhiihorun6298
    @serhiihorun6298 3 years ago

    Cool thank you!

  • @kapssul
    @kapssul 4 years ago +3

    Could you make a video about the impact of 5G on cross-shard queries, given that 5G latency is below 1 millisecond?
    1 - Could that bring sharding and cloud computing to the next level?
    2 - What will happen to data center infrastructure, and also to database query languages, in the future?
    3 - Will that reduce the data management cost for business owners?

    • @hnasr
      @hnasr  4 years ago +4

      What amazing, thought-provoking questions, Kapssul! Love them. That is going to take some time to research and answer because I have no idea. 5G is indeed a revolutionary tech and it will turn software engineering on its head. Thanks!!

    • @kapssul
      @kapssul 4 years ago +3

      @@hnasr I am on the way, you are already there.. you could grab the information quickly and more accurately, and share with us bit by bit your thoughts.. keep posting.. thanks for that quality content.. we learn a lot..

  • @andreigatej6704
    @andreigatej6704 4 years ago +3

    Great content! Thank you! Do you have any particular resources for learning BE concepts? (except for a job :D)

  • @crabjuice47
    @crabjuice47 3 years ago +1

    Bro you are fucking brilliant and that you are willing to teach others what you know in such a good engaging way I love it. God Bless you even if you don't believe in God or not. :)

  • @pavelkravchenko2810
    @pavelkravchenko2810 3 years ago +1

    Why put the logic on the client? You could make a service that handles these requests from the client and sends them to the right DB server. Or am I misunderstanding something?

  • @momardiouf9141
    @momardiouf9141 4 years ago

    Great content, Hussein, and thanks for sharing; we learn a lot with you.
    I just want to come back to a previous question asked by another person in the comments. He says: "just wanted to know your approach to be on top of latest technology. Which resources do you use? Which technology podcast do you listen? Can you make video on this topic? "
    Can you answer this, please? It will help us again.
    Thanks again for this great content.

    • @hnasr
      @hnasr  4 years ago +1

      Momar Diouf thank you Momar! Appreciate you 🙏
      I learn by listening to podcast, watching videos , reading and implementing the thing . I always ask why a tech exists before I ask what.
      I made few videos on the topic of learning check them out
      When Learning Backend Engineering Ask Why, not What (Minute Engineering)
      ruclips.net/video/67DglLwnBTU/видео.html
      My Preferred Method of Learning Backend Engineering Technologies
      ruclips.net/video/4NsWnT_-FoE/видео.html
      Learning at Home, Consistent Hashing, Empathy with Engineers and More - Software Chat
      ruclips.net/video/6PrR6SW4QGM/видео.html
      Advice for Junior backend engineers who just started their new jobs in software companies
      ruclips.net/video/V3C0VvNrFZ8/видео.html

    • @momardiouf9141
      @momardiouf9141 4 years ago

      @@hnasr Thanks for the reply. I will check the links
      Thanks

  • @nikolakolarov1416
    @nikolakolarov1416 4 years ago

    Great video! Are you planning to make a separated video about Vitess?

    • @hnasr
      @hnasr  4 years ago +1

      Nikola Kolarov thank you! Yes I am planning to make a video on Vitess

    • @nikolakolarov1416
      @nikolakolarov1416 4 years ago

      Hussein Nasser Awesome

  • @AbleToLiveHere
    @AbleToLiveHere 4 years ago

    Your videos are great. Thank you

    • @hnasr
      @hnasr  4 years ago

      THANK you so much! appreciate it

  • @ackrman
    @ackrman 1 year ago

    @14:00 is it still the case? I read that MySQL supports transactions across shards via XA (distributed) transactions. And MySQL NDB Cluster supports ACID.

  • @MohamedZiada
    @MohamedZiada 4 years ago

    Always a great video, Hussein, thank you. Journaling and logging all the traffic for a website/application could be write-heavy for the database, right?

    • @hnasr
      @hnasr  4 years ago +1

      Thanks Mohd! And correct, logging is a write-heavy database operation, so you would choose an LSM-based DB such as RocksDB or MyRocks. Check out my database engines video for more details on this topic.

  • @shivashankar_1998
    @shivashankar_1998 3 years ago

    Can we use the TimescaleDB extension with Postgres for faster writes?

  • @minhthinhhuynhle9103
    @minhthinhhuynhle9103 2 years ago

    Does replication increase read capacity via a load-balancing mechanism???
    I have been searching Stack Overflow for months and cannot find any appropriate article about this. Even the MongoDB legacy docs do not mention read load balancing 😢

  • @zaheerkhan8097
    @zaheerkhan8097 2 years ago

    One question I had: do databases like Postgres provide a facility for automatic sharding? If yes, what process do they follow while doing so?

  • @peop.9658
    @peop.9658 1 year ago

    Can you name the YouTube podcast you referred to?

  • @lionelarucy4735
    @lionelarucy4735 3 years ago

    Love your videos, what do you think of the 'New' SQL Databases such as CockroachDB and using them instead of complicating your life with sharding?

    • @hnasr
      @hnasr  3 years ago +1

      I need to do my research on them, but I do think there are use cases for them. I still don't know what is new about them, so I can't really comment.

    • @lionelarucy4735
      @lionelarucy4735 3 years ago

      @@hnasr Cool, I look forward to hearing your take on them, they're what I opt for in situations where I think I'd have to shard because sharding isn't fun. I've also tried others like Citus which are pretty great but a bit more work to get the most out of.

  • @nathanbenton2051
    @nathanbenton2051 4 years ago

    Awesome video and thank you! What are 'file descriptors'? I think I heard you correctly at ~ 8:15.

    • @hnasr
      @hnasr  4 years ago +1

      Thank you!! File descriptors are handles to the TCP connection. I don't know much about them (which is good; it means I need to read more about them and probably make a video).
      Here is the wiki en.wikipedia.org/wiki/File_descriptor

  • @snaidu70
    @snaidu70 4 years ago

    Hussein, it is not true that writing is always fast. It depends on the number of indexes you have on the table.

    • @hnasr
      @hnasr  4 years ago

      Correct, good point 👍 The more indexes you have, the more work you need to do to update those indexes. And this could be even slower if it is a B-tree index compared to an LSM tree.

  • @uwemnkereuwem6272
    @uwemnkereuwem6272 2 years ago

    How about multi-master replication or bi-directional replication?

  • @Faz13able
    @Faz13able 3 years ago

    Hello, I have a log server. There are no transactions; it's straight writes and reads. No updates or deletes will be performed, and no rollback is necessary. The problem is that the minimum log size is 400+ GB, consisting of raw text data only, and the DB grows by at least 40 GB per day. So as you can see, there are a lot of writes to the database. There are so many writes to one instance that I cannot read from the database. It's a MongoDB database, and I searched and found that I can use sharding to distribute the reads and writes. Mongo sharding comes with a tool, mongos, which will route my queries from the client based on, let's say, timestamp. So should I proceed with this plan, or should I partition first? Also, the pipeline is at the development stage, so if you recommend it, I can still change the DB to Postgres or another DB technology. Thanks.

  • @dotcore1150
    @dotcore1150 1 year ago

    Plz do something practical 🎉

  • @elmeroranchero
    @elmeroranchero 3 years ago

    is there any risk of outdated data if the slaves do not sync on time? how do we deal with this?

  • @nitreall
    @nitreall 3 years ago

    When exactly though? Share some numbers

  • @DipanjalMaitra
    @DipanjalMaitra 2 years ago

    You are awesome. I have learned a lot from you. Thank you for being there.
    But I have a little confusion at 10:41. I agree the DB requests from clients will pass through the reverse proxy, but the proxy should be SQL-aware like ProxySQL. Can Nginx or HAProxy be configured as a SQL-aware LB?

  • @uchennanwanyanwu2777
    @uchennanwanyanwu2777 4 years ago +9

    0:21 ...complicating your life with sharding...very funny.