The Basics of Database Sharding and Partitioning in System Design

  • Published: Jan 7, 2025

Comments • 54

  • @tryexponent
    @tryexponent  1 year ago +2

    Make sure you're interview-ready with Exponent's system design interview prep course: bit.ly/3YTjsjH

  • @jay_wright_thats_right
    @jay_wright_thats_right 1 year ago +33

    Animations to visualize what she is saying would make this video perfect!

  • @chellamgmoorthy
    @chellamgmoorthy 1 year ago +16

    I didn't know what database sharding was. This video gave me a good number of topics to research and learn. Thanks for the video!

  • @harshita9936
    @harshita9936 3 months ago +2

    This video was great. Short. Crisp. To the point.

  • @MuhammadAsif-nx7om
    @MuhammadAsif-nx7om 1 year ago +11

    Great and to the point explanation, No bluff
    Thanks

  • @JacquesJoubert-c8z
    @JacquesJoubert-c8z 2 months ago

    This was an amazingly informative video for getting a high-level overview of what database sharding is, thank you!

  • @AbhishekKumar-b1j1x
    @AbhishekKumar-b1j1x 10 months ago +1

    Some people are very beautiful with a helping hand, thank you ❤

  • @ayushgogna9732
    @ayushgogna9732 11 months ago

    you guys are amazing i recently found your channel i am learning a lot and i am loving it

  • @oefzdegoeggl
    @oefzdegoeggl 1 year ago +4

    A few things to add. I prefer partitioning on a key that is guaranteed not to distribute badly, so "first letter of name" is a bad idea. Better to use the record ID and group, say, 100k of them into a partition. Then, before storing partitions on different servers, there are a few more things to do first. One is to split modifying queries from read-only queries (which has to be done at the application level) so that a simple read replica (trivial to set up in Postgres) can be used. Next is a split at the logical level: for example, keep the users' core data on db1 and chat messages on db2. Leaving out foreign keys and using weak references instead, with a periodic cleanup job that resolves broken links, is a good idea; it also eliminates issues when restoring a backup that was cut at a bad moment.

    • @goofballbiscuits3647
      @goofballbiscuits3647 9 months ago +1

      Coming from a decade+ of data work with health records, I have to bump this comment. Name, location and birthdate combined still aren't unique. Messing up data with potential tromps like this is straight up lethal in some fields.
      Remember, friends: bad data is worse than no data.
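
A minimal Python sketch of the ID-based grouping suggested in @oefzdegoeggl's comment above, assuming positive integer record IDs. The names `PARTITION_SIZE`, `partition_for`, and `server_for`, and the server labels, are hypothetical and do not come from the video or the comment.

```python
PARTITION_SIZE = 100_000  # records per partition ("100k" in the comment); an assumed, tunable value


def partition_for(record_id: int, partition_size: int = PARTITION_SIZE) -> int:
    """Map a positive integer record ID to a zero-based partition number.

    IDs 1..100_000 land in partition 0, 100_001..200_000 in partition 1, and so on.
    """
    if record_id < 1:
        raise ValueError("record_id must be a positive integer")
    return (record_id - 1) // partition_size


def server_for(partition: int, servers: list[str]) -> str:
    """Assign partitions to servers round-robin (one possible placement, not the only one)."""
    return servers[partition % len(servers)]


if __name__ == "__main__":
    servers = ["db1", "db2"]  # hypothetical server labels
    for rid in (1, 100_000, 100_001, 250_000):
        p = partition_for(rid)
        print(rid, "-> partition", p, "on", server_for(p, servers))
```

Because sequential IDs fill partitions one after another, each full partition holds the same number of records regardless of how names or locations are distributed. Note that the newest partition may still receive most of the writes.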

  • @edmoregosha8937
    @edmoregosha8937 9 months ago +1

    The video script explains the basics of database sharding and partitioning in system design. It discusses how sharding can help manage large amounts of data by breaking it up into smaller partitions spread across multiple servers. The script also highlights the advantages and disadvantages of sharding in terms of scalability, performance, and operational complexity.
    Key moments:
    00:32 Traditional databases encounter limitations with increasing data size, necessitating sharding to enhance scalability and performance.
    -Geo-based sharding partitions data based on user locations, reducing latency by routing users to the closest node.
    -Range-based sharding divides data by key value ranges, simplifying partition computation but potentially leading to uneven splits.
    -Hash-based sharding uses hashing algorithms to evenly distribute data across partitions, reducing hotspots but potentially separating related rows.
    -Automatic sharding dynamically manages data partitioning for higher performance and scalability, but manual sharding at the application layer increases development complexity.
    03:55 Sharding enables scaling, faster queries, and system availability, but poses challenges like complex management, hot spots, and high operational costs.
    -Advantages of sharding include scalability, faster queries, and improved system availability during outages.
    -Disadvantages of sharding involve complex data relationships, potential hot spots, and operational costs for maintaining high availability.
    Generated by sider.ai
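
To make the range-based versus hash-based distinction in the summary above concrete, here is a small Python sketch. It assumes string keys and a fixed shard count; `range_shard` and `hash_shard` are hypothetical names, and the letter-range rule is only one illustrative way to define ranges.

```python
import hashlib


def range_shard(name: str, num_shards: int) -> int:
    """Range-based: split A-Z into contiguous letter ranges, one per shard.

    Simple to compute, but real names are not spread evenly over the alphabet,
    so some shards can end up much larger than others.
    """
    first = name[:1].upper()
    if not ("A" <= first <= "Z"):
        return num_shards - 1  # assumed catch-all for empty or non A-Z names
    return (ord(first) - ord("A")) * num_shards // 26


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based: hash the key and take it modulo the shard count.

    Spreads keys roughly evenly, but neighbouring keys usually land on
    different shards, so range scans have to touch several shards.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for name in ("Alice", "Mike", "Nina", "Zoe"):
        print(name, "range:", range_shard(name, 4), "hash:", hash_shard(name, 4))
```

One caveat with the plain modulo approach: changing `num_shards` remaps most keys, which is why schemes such as consistent hashing are often used when shards are added or removed.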

  • @octavian0704
    @octavian0704 1 year ago +3

    very well described, thanks for sharing.

  • @Deepz007
    @Deepz007 1 year ago +4

    Great video on sharding, but partitioning wasn't mentioned or discussed.

  • @netspie
    @netspie 6 months ago +1

    Just memorize every word and say in the job interview.. unbelievable..

  • @cristinasanchez9029
    @cristinasanchez9029 1 year ago

    Greatly explained, I subbed

  • @bantamchick
    @bantamchick 2 months ago

    Do watch this video with closed captioning on for unintentional comic effect, because for some reason CC does not always know what sharding is, so it sometimes captions it as "sharting". So you get to learn about "manual vs automatic sharting".

  • @SankalpCollege-f2o
    @SankalpCollege-f2o 10 months ago

    Great video!

  • @devkiosk
    @devkiosk 1 year ago +3

    Awesome explanation.

  • @robbybankston4238
    @robbybankston4238 11 months ago

    I would think that another potential disadvantage applies if you are using commercial rather than open-source operating systems or databases, where licensing costs also increase as the number of servers increases.

  • @josephkabemba3211
    @josephkabemba3211 1 year ago +2

    Crystal clear

  • @altruistization
    @altruistization 4 months ago

    You did not mention eventual consistency as a drawback of sharding?

  • @rmuneeb1
    @rmuneeb1 8 months ago +2

    Until her hands moved I thought she was an AI robot 😂

  • @vladyslavsosnov8412
    @vladyslavsosnov8412 1 year ago

    Awesome, thanks

  • @goofballbiscuits3647
    @goofballbiscuits3647 9 months ago +5

    Sorry, everyone...
    I parted *_and_* sharded 😢

  • @pieter5466
    @pieter5466 1 year ago +2

    Good video but confusing use of the term 'partition', which is different than 'shard'.

    • @vaishnaves1723
      @vaishnaves1723 3 months ago

      Sharding is also data partitioning. Partitioning can mean different things based on context, similar to “consistency” (which is different in the context of ACID and CAP)

  • @mandydawson6199
    @mandydawson6199 1 year ago +12

    Who is she and how do we get more videos with her?

  • @samislam2746
    @samislam2746 3 months ago

    you're strong

  • @caitlinmclaren2695
    @caitlinmclaren2695 1 year ago

    Monolithic Databases??

  • @marcello4258
    @marcello4258 1 year ago +1

    It sounds like you mixed up partitioning with sharding.
    And commodity hardware does not have ECC - don’t run a db on it.

    • @mick7827
      @mick7827 1 year ago

      So each partition is stored within the same database server, which makes it easier, whereas sharding requires multiple database servers?

    • @deletevil
      @deletevil 5 months ago

      "commodity hardware does not have ECC - don’t run a db on it"
      SQLite is a file-based database. It doesn't have to reside in the non-paged part of RAM. High-energy cosmic radiation can corrupt only the volatile memory cells, not the storage.
      Also, modern commodity hardware has some level of ECC for CPU cache memory: single-bit ECC support for L2 cache and multi-bit ECC for L1 cache (at least my 10-year-old Intel i7 has). A whole query operation will probably fit into the CPU cache unless the data size for the columns exceeds the L2 cache size. Good luck exceeding that: say the L2 cache is 256 KB and only half of it is available for our query operation at this moment, with all the data for the columns; it would still take more than 100 columns, each containing >1000 bytes, to surpass that cache boundary. Domains with queries that large are not a thing on commodity hardware anyway. Hospital billing, hotel management, restaurant billing? Nah.
      Take a worst-case memory access time of, say, 100 nanoseconds to fetch the data from RAM into the L2 cache. Radiation would have to corrupt those exact memory bits inside the RAM within that 100 nanoseconds during the fetch cycle. Then the data has to be written back to the disk (a worst-case disk access time of 50 ms, i.e. 0.05 s, is assumed). It's extremely unlikely, next to impossible, for that radiation to randomly flip those specific memory cells, out of billions of memory cells in the RAM, that belong to the SQLite update/delete query function, which completes its execution and saves the data to disk within some tens of milliseconds at most (including all the overhead of system calls).
      SQLite for Desktop is your friend.
      However, if you intend to use any of the client-server architecture based database like MySQL etc then your statement is valid indeed.

  • @sriranjitharaghuraman1646
    @sriranjitharaghuraman1646 11 months ago +1

    Some visualization would have gone a long way

    • @tryexponent
      @tryexponent  11 months ago

      Thanks for the feedback!

  • @AvinashRaj
    @AvinashRaj 1 year ago +58

    Well thanks for reading the script.

    • @vivekkaushik9508
      @vivekkaushik9508 10 months ago

      😂😂😂

    • @codermccoderson
      @codermccoderson 9 months ago +9

      A lot of these YT educators write down the material before speaking to the camera. What’s your point?

    • @daphenomenalz4100
      @daphenomenalz4100 6 months ago +1

      Every single youtuber has to be prepared bruh, they can't just speak everything from mind and stutter when thinking :|
      It's not a reaction video

    • @maulikshah9078
      @maulikshah9078 19 days ago

      😂😂haha

  • @junyulu4648
    @junyulu4648 5 months ago

    That's enough YouTube for today.

  • @lakshminarayanacharan837
    @lakshminarayanacharan837 8 months ago

    You are looking so cute 🥰

  • @satvikkhare1844
    @satvikkhare1844 1 year ago +8

    Reading from a teleprompter is not teaching!! Sure, it gave me topics that I can look up myself.

    • @codermccoderson
      @codermccoderson 9 months ago +1

      A lot of youtube educators have their material scripted before speaking to the camera? What’s your point?

  • @junyulu4648
    @junyulu4648 5 months ago

    her name pls

    • @MJ-cf9nl
      @MJ-cf9nl 4 months ago

      It is: NoneOfYourBusiness

  • @sk-vs9nt
    @sk-vs9nt 7 months ago

    am in love with this lady what her id

  • @KeshavmurthyRamachandra
    @KeshavmurthyRamachandra 10 months ago

    you got the definition of Sharding wrong. understood you never did sharding in your life.