Megabytes to Petabytes: the journey of scaling a database

  • Published: Sep 13, 2024

Comments • 9

  • @chrisg9886
    @chrisg9886 29 days ago +2

    Thanks for the info, very helpful

  • @jhonsen9842
    @jhonsen9842 25 days ago

    I had never heard of sharding for MySQL, which is a transactional DB. How do they handle consistency? Great tutorial

    • @PlanetScale
      @PlanetScale  25 days ago

      If you want to learn more about Vitess (used for sharding MySQL), check out the course on our website:
      planetscale.com/learn/courses/vitess/what-is-vitess
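For context on the consistency question: Vitess keeps single-shard transactions on plain MySQL and routes each row to a shard via a "vindex" on a chosen column. A minimal sketch of a VSchema for one sharded table (table and column names here are hypothetical):

```json
{
  "sharded": true,
  "vindexes": {
    "hash": { "type": "hash" }
  },
  "tables": {
    "users": {
      "column_vindexes": [
        { "column": "user_id", "name": "hash" }
      ]
    }
  }
}
```

With this, every query filtered on `user_id` can be routed to exactly one shard, where normal MySQL transactional guarantees apply.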

  • @Xaoticex
    @Xaoticex 28 days ago +1

    Great vid, explain the vertical sharding part more

    • @PlanetScale
      @PlanetScale  28 days ago

      What do you want to know? We might have some other resource I can point you at. Or it could be a good opportunity for a future video.

    • @Xaoticex
      @Xaoticex 28 days ago

      @@PlanetScale How is a row that has a relation to a row on another shard handled, and vice versa? And when many shards have many relations with each other, what patterns are applied in that situation?

    • @PlanetScale
      @PlanetScale  28 days ago +1

      @Xaoticex You are probably referring to "scatter-gather" queries, or queries with cross-shard joins. Good to know that there's interest in this kind of content. Here's some additional reading if you're interested: developer.squareup.com/blog/cross-shard-queries-lookup-tables/
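To make "scatter-gather" concrete, here is a minimal sketch in Python, using in-memory SQLite databases to stand in for shards (all names and the schema are hypothetical; in a real deployment, a router such as Vitess's vtgate does this fan-out for you):

```python
import sqlite3

# Each "shard" is its own database with the same schema.
shards = [sqlite3.connect(":memory:") for _ in range(3)]
for i, db in enumerate(shards):
    db.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, total REAL)")
    db.execute("INSERT INTO orders VALUES (?, ?, ?)", (i, i * 10, 100.0 * (i + 1)))

def scatter_gather(sql, params=()):
    """Fan the query out to every shard, then merge the partial results."""
    rows = []
    for db in shards:
        rows.extend(db.execute(sql, params).fetchall())
    return rows

# A query that is not filtered on the sharding key cannot be routed
# to a single shard, so it must be scattered to all of them.
all_orders = scatter_gather("SELECT id, total FROM orders ORDER BY id")
print(len(all_orders))  # one row gathered from each of the 3 shards
```

The lookup-table pattern from the linked post exists to avoid exactly this: by mapping a secondary key to a shard, a query can be routed to one shard instead of all of them.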

  • @Sweetironman
    @Sweetironman 17 days ago

    I have a question for you. I'm working with a table that has more than 3 billion rows, and 20-25 million new rows are added each month. We have created 5 indexes for efficient searching, but accessing even a single month of data takes too long. Access to the most recent week of data is fast (we mostly access recent data within the current month), but once the recent records grow past 3-4 million rows, after 4-5 days the query becomes very, very slow again. I'm thinking of creating a separate table for each year to balance the load, but I can't understand why it works well for the first 4-5 days of the month and then slows down again after 4-5 million new rows. It doesn't make sense to me. Any suggestions would be really appreciated.

    • @PlanetScale
      @PlanetScale  16 days ago

      Interesting scenario! Hard to say for sure without more details, but a few things could be happening here:
      (1) The queries fetching only recent results are able to mostly hit data that is already in memory. The ones that fetch larger and older data sets might have significantly more I/O, leading to drastically slower query times.
      (2) You said you created 5 indexes ... are you confident they are being used appropriately? If you are using PlanetScale, you can use Insights to figure this out. If not, you can try running some EXPLAINs to see what's going on.
      You might find this useful for investigating: planetscale.com/blog/identifying-and-profiling-problematic-mysql-queries
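For point (2), EXPLAIN is the quickest check on whether an index is actually used. A sketch with hypothetical table, column, and index names:

```sql
-- If the `key` column of the EXPLAIN output is NULL, MySQL is not
-- using any index for this query and is scanning rows instead.
EXPLAIN SELECT *
FROM events
WHERE created_at >= '2024-08-01'
  AND created_at <  '2024-09-01';

-- For a month-range query like this, you want `key` to show an
-- index whose leading column is the date column, e.g.:
-- CREATE INDEX idx_events_created_at ON events (created_at);
```

Also check the `rows` estimate in the output: a huge estimate on a "recent data" query is consistent with explanation (1), where the working set has outgrown what fits in the buffer pool.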