How Discord Stores TRILLIONS of Messages

Поделиться
HTML-код
  • Опубликовано: 16 янв 2025

Комментарии • 120

  • @geob7o
    @geob7o Год назад +21

    Hats off for the on call team during these processes 😅

  • @RakeshBitling
    @RakeshBitling Год назад +49

    This channel is underrated.. actually a lot of useful videos are available

    • @areeburrehmankhan1166
      @areeburrehmankhan1166 Год назад +1

      Yes dude he has taught me so much like wow .

    • @wilfredv1930
      @wilfredv1930 Год назад

      It is not underrated, for the subject and 500k subs is pretty well rated actually.
      It is a very well known resource.

  • @areeburrehmankhan1166
    @areeburrehmankhan1166 Год назад +55

    Amazing content man . Also your animation are really helpful in making things more understandable. Keep up the great work.

  • @ariganeri
    @ariganeri Год назад +19

    Really good content. Just one point - writes go to both the local SSDs and the persistent disks at the same time. Write-mostly in Linux md means that reads go to the other disks, so in this case the local SSDs.

    • @locfuho3899
      @locfuho3899 Год назад

      I don't think that is needed in case of sending and receiving messages, as long as data sync from the write disk come in order, we only need eventual consistency.
      For example, in a group chat, every body sending and receiving messages, right?
      As long as the messages stored correctly in the write DB, they can eventually be synced to the read DBs.
      I mean there is no hard consistency here.

  • @yadav117uday
    @yadav117uday Год назад +274

    you should have discussed how they moved from mongodb to cassandra, its was a bigger engineering challenge

    • @gradientO
      @gradientO Год назад +9

      Can you brief about why nosql makes sense in this context instead of relational dbs?

    • @terencepan2232
      @terencepan2232 Год назад +31

      ​@@gradientOscaling writes in relational db is hard or not as possible

    • @flygonfiasco9751
      @flygonfiasco9751 Год назад +7

      @@gradientOthere’s a lot of overhead in maintaining the relationships in a relational database and there’s overhead in ensuring strong consistency

    • @truevelvett
      @truevelvett Год назад +4

      @@terencepan2232 You can definitely shard but the operational complexity is insane

    • @kamal-xd7id
      @kamal-xd7id Год назад

      Because its not even written in original blog post from Discord.

  • @ivanlee172
    @ivanlee172 Год назад +11

    One thing missing here is how they setup the monitoring system to have such analysis from the running system, which is also an important part. Would like to hear more about this.

  • @king0s
    @king0s Год назад +13

    my takeaways and a question:
    ScyllaDB with C++ under the hood and no GC was a part of it.
    Request coalescing is another part.
    Selecting and deciding on optimal Google Cloud services i.e Persistent disk is another part.
    The two layered RAID setup is another part.
    Modifying the Linux kernel to do write on the Persistent disks and reads from the local SSDs was another part(But I'm wondering how they made the changes to the Linux kernel of the Google VMs? anyone help me out here, do they bake the changes to a VM image and deploy the image to the Google VMs?)
    And finally for the migration using a data migrator written in memory safe and highly performant Rust was another great decision.

  • @_jeh.k
    @_jeh.k 2 месяца назад

    Those animations look sick!!🔥🔥🔥🔥

  • @piyushkatariya1040
    @piyushkatariya1040 Год назад +27

    The real beauty of ScyllDB lies in Seastar framework which bypasses OS kernel to achieve close to metal speed. Also, Cassandra 5.0 which will release in few months from now won't have this issues and will much higher throughput as it will be able to compile against Java 17 SDK and might also probably run Java 21 which has Generational ZGC garbage collector which gives you sub milliseconds latency. Cassandra also have the advantages of defining aggregate function in Java or JS and run in Casandra DB instance itself rather than fetching all data in application server which is super expensive.

    • @bruterasta
      @bruterasta Год назад +2

      On the blog post it was pretty clear, that they were tired of GC alltogheter.

    • @piyushkatariya1040
      @piyushkatariya1040 Год назад +2

      @@bruterasta Generational ZGC is auto tunable.

  • @zl7289
    @zl7289 Год назад +5

    I’m just curious, what tools are you using to make this beautiful diagram and fancy effects 😮❤

  • @sharmilak5109
    @sharmilak5109 Год назад +1

    Hi your contents are worthy, it was such a meticulous way of explanation of concepts,thank you

  • @NapoleonPosada
    @NapoleonPosada Год назад +1

    Amazing. I have read about the features of Scylla but this a definitive proof that is an excellent database.

    • @ankeshkapil3129
      @ankeshkapil3129 Год назад

      But costly to run

    • @hagaiak
      @hagaiak 11 месяцев назад

      @@ankeshkapil3129 How is a more efficient database more costly to run? The more performant a DB is, the cheaper it is to run, even if you're talking about small workloads

  • @0xmmn
    @0xmmn Год назад

    kudos to make a video to the point and easy to grasp not like other channels doing long videos to hack the youtube algorithm.

  • @flatdinos
    @flatdinos Год назад

    next month I'll migrating one of my database, it's so painful to planning and flow designing of migration, but this video help a lot.

  • @SimarMannSingh
    @SimarMannSingh Год назад +3

    QUICK QUESTION: I've seen this style of YT video somewhere else too, somewhere in a tech video I remember. Is there a tool you're using to make these kind of videos? Or did you (or someone you hired) edited the video yourself?

  • @khiariyoussef3226
    @khiariyoussef3226 Год назад +15

    I may not have enough experience with such tasks, but how do they make sure they pick the right database that fits their performance requirements (at that scale) ?

    • @peterpeter9230
      @peterpeter9230 Год назад +18

      Sometimes these companies have experts (maybe one of the developers of such DB) but at such a scale it's mostly these companies, who test if these DBs can widthstand such a hight load. So I'd call it precision guess work. Compare existing solutions in terms of performance etc., maybe run a extensive tests and then just give it a go.

    • @NachitenRemix
      @NachitenRemix Год назад +6

      I guess they have TONS of metrics which cann give them the info they exactly need on what they need and don't need to optimizr

    • @MaulikParmar210
      @MaulikParmar210 Год назад +6

      They didn't - there are Petabyte scale deployments used by many organisations that are larger than Discord uses. Your DB is going to be as good as its integration with the system and use case. This case study only shows discord lacks expertise or is reluctant to take in advisory from experienced people. This is true for Twitter and airbnb, too, when they switched DBs, thinking it would help but didn't.
      It requires a lot of experience and architectural knowledge of DB itself to make the rught call. In reality tho who make right call are usually not in like light or advertise about their milestones. So it's not an open book that can be simply explained with diagrams while implementation details would vary a lot.
      P.S. Even large orgs make mistakes or lack skills. They sometimes need external help, and when they dont, they often run into performance issues that are talked about in the public domain.
      How they pick it? It's mostly biased towards teams experience and decision makers bias towards tech they know until they are ready to move on and do all migrations to new tech, it's always moving, changing so there's no specific reciepie for that.

    • @georgesmith9178
      @georgesmith9178 Год назад +9

      This actually follows some money logic. First you pick a tool that is good enough and FREE, like Cassandra. If the business develops slowly, then you can get by with this initial solution just fine. But, when you have a booming growth, as is the case with Discord, the free solution is not designed for that type of scale, so you start to think about high-availability, low-latency, and so on, and usually end up with some commercial solution, that may be even be based on your original free solution, but with enterprise features on top. In the case of Discord, they just picked a DB with better engine, and they leveraged a different storage solution in the cloud, ergo a better performing storage. Of course, the explosive growth guarantees money is no longer an issue - you can afford the enterprise solution and the better performing storage because customers are paying :). There is you logic :).

    • @alooooooola
      @alooooooola Год назад

      it think the fact that they changed the behavior of read and write of an SSD make the different. I dont really think changing the DB make that much different, yes it may be better but not so much without the technic of override SSD behavior. And since this is Discord, they migrated from a garbage collector language and move to a non-garbage collector language before (Go to Rust), it could also affect their choice in this case.

  • @JacobSamro
    @JacobSamro Год назад +2

    Scylla is a saviour ❤

  • @javisartdesign
    @javisartdesign Год назад +1

    Cool, interesting to watch! learning all the time!

  • @georgesmith9178
    @georgesmith9178 Год назад +1

    Awesome content. Please, tell me what you use for the graphics and animations. They are so fluid.

  • @nareshgb1
    @nareshgb1 Год назад

    "discord found themselves in a bit of a pickle"....good one :)

  • @zitronenlolli1
    @zitronenlolli1 Год назад +1

    great content!!

  • @gus473
    @gus473 Год назад +1

    💯 Great episode! Interesting takes on this 🐘 project! 😎✌️

  • @leomysky
    @leomysky Год назад +1

    Thank you for the video!

  • @RahulGupta_Grahul
    @RahulGupta_Grahul Год назад

    Nice explainer. Can you also please link the OG blogpost in the description?

  • @beofonemind
    @beofonemind Год назад

    Thanks for this, very helpful.

  • @kiransingh2935
    @kiransingh2935 Год назад +2

    I have literally never seen a company be happy with choosing Cassandra. I know of engineering orgs who have an entire pod of SREs dedicated to nothing but keeping the Cassandra alive.

  • @tubenzr
    @tubenzr Год назад

    what an Excellent team to be able pull off that humongous data 🥶

  • @rhaba
    @rhaba Год назад

    Is it possible that there's a typo at 4:48 and following? There are two md0 devices.

  • @strtale
    @strtale 4 месяца назад

    Discord doing the backend: 👹
    Discord doing the UI/UX: 🥺

  • @BenThatOneGuy
    @BenThatOneGuy Год назад +2

    Quality content as always, love this channel

  • @punkerIII
    @punkerIII Год назад

    Thank you for the video.

  • @ketaminefairy
    @ketaminefairy Год назад

    someone please explain more on the RAID 0 setup? What is the safety net under that? Am I missing/not understanding something?

  • @bjugdbjk
    @bjugdbjk Год назад

    Could you make a video on the Data services part which written in Rust ? That will be a quite interesting to many folks !!

  • @soorkie
    @soorkie Год назад

    I love your content. Do we have a discord server for this channel and this community? I would really love to engage more.

  • @blasttrash
    @blasttrash Год назад

    3:28 is that like a cache on query parameters?

  • @willianrocha8615
    @willianrocha8615 Год назад

    Only 9 days what a huge achievement

  • @aunghtayoo337
    @aunghtayoo337 Год назад +1

    "of course. Migration production data is no joke!". Thanks for the quality contents.

  • @dougphilips8807
    @dougphilips8807 Год назад +1

    I am confused the difference between "Request coalescing" and what your other videos call caching? Since "Request coalescing" is part of the success here I'd like to understand the difference better, thanks!

  • @MinhHoang-fu7tt
    @MinhHoang-fu7tt Год назад

    Amazing topic!!!

  • @jerryking4777
    @jerryking4777 Год назад

    wow, can i ask what after effect template is used for presentation? I would like to present my work in school.

  • @thememmer
    @thememmer Год назад

    What tools do you use to create these videos

  • @delucabruno
    @delucabruno Год назад

    kudos to the Discord team 👏

  • @gigakoresh
    @gigakoresh Год назад +5

    Inspiring story. It would be really cool to see how Telegram solves their backend challenges. They have an even greater scale, smaller team with less funding and fastest latency of any messenger. I wish they shared more of their backend, that system is engineering marvel, just like Discord.

    • @pegasusgemini6541
      @pegasusgemini6541 Год назад

      Me too, i'm really impressed on how Telegram is very responsive and optimized

  • @ayotomiwasalau6373
    @ayotomiwasalau6373 Год назад

    What pipeline tool did they use for the data migration from Cassandra to Scylla?

  • @touchwithbabu
    @touchwithbabu Год назад +1

    Love your content and respect for your efforts

  • @carlosm.1233
    @carlosm.1233 Год назад

    What software do you use to create such presentation?

  • @EdwinFairchild
    @EdwinFairchild Год назад

    is discord the only paltform having to store trillions of messages? or data for that matter? how have other companies solved it?

    • @123mrfarid
      @123mrfarid 2 месяца назад

      I think quite a few like meta companies products : whatsapp, insta, fb. Google : google fpm, gmail, google ads, youtube

  • @palkollar7739
    @palkollar7739 Год назад

    what do you use for these animations?

  • @DK-ox7ze
    @DK-ox7ze Год назад +5

    If the writes were going to the persistent disk with high latency, then how were they made immediately available (with low latency) to all the intended members of that messaging group?

    • @daviduzumaki
      @daviduzumaki Год назад

      I'm guessing data was also stored in an LRU cache when it was written to the persistent disk

    • @DK-ox7ze
      @DK-ox7ze Год назад

      @@daviduzumaki In that case they might have to again deal with data loss issues which they were facing, and the reason why they introduced persistent storage in the first place.

  • @pm71241
    @pm71241 Год назад +6

    Hmm ... As a user of ScyllaDB. This doesn't seem like the most daunting DB migration scenario, I could imagine.
    I mean... Had it been any other database than Cassandra, the task would have been magnitudes larger.

  • @bjugdbjk
    @bjugdbjk Год назад

    Rust is a superstar !!

  • @AymanDevops
    @AymanDevops Год назад

    An amazing video thanks man,
    I just have a question about the Data service part, isn't similar to redis or CDN or anyother cache mechanism? I'm just wondered why they called this name and what it does exactly

    • @lord_nikon_010
      @lord_nikon_010 Год назад +1

      I believe is not like a CDN or Redis. I understand that is an API acting as an interface between the Monolith and the database, decoupling one form another - which allowed to use a language (Rust) more efficient to handle the high-performance data operations.

  • @jerkmeo
    @jerkmeo Год назад

    That's just awesome sharing. Thanks!

  • @jasonguo7596
    @jasonguo7596 Год назад

    Is it ok to move older messages to a separate database periodically? So the main database would not be carrying so much data all the time.
    And have dedicated service to read messages from that database with historical messages.

  • @quang.luu.179
    @quang.luu.179 Год назад

    excellent conten.

  • @sagiajaj17
    @sagiajaj17 Год назад

    How is their solution different from RAID10? It is essentially raid1 array mirroring a raid0 set. Am i missing something?

  • @RahulGupta_Grahul
    @RahulGupta_Grahul Год назад

    I think the boxes at ruclips.net/video/O3PwuzCvAjI/видео.htmlsi=3YXYT_VTw2QqriyG&t=343 are a bit misleading. First, probably scylla uses both persistent disks for writes and NVMe(s) for reads, so, really one box should contain both with arrows for reads and writes. Secondly, the message service is partitioned, therefore, different writes go on different instances of scylla instead of a single scylla master node or central cluster which the triangular placement of scylla in diagram suggest.

  • @RisalFajar
    @RisalFajar Год назад

    When there's a new message or edited old message, how do the app receive the changes? Is it by listening to data on the DB (like a Web Socket)?

  • @seeball
    @seeball Год назад

    That's incredible

  • @nexus888
    @nexus888 5 месяцев назад

    If Discord would only create a good UX. Having may servers is a nightmare to go through..

  • @TheSelectmax
    @TheSelectmax Год назад +1

    Thanks for the content. A little introduction was missing what kind of database this is and why it is better, except for C++ under the hood

  • @shinmini99
    @shinmini99 Год назад

    thanks for content :)

  • @repairstudio4940
    @repairstudio4940 Год назад +1

    To the Discord devs 🥂

  • @luckylove72
    @luckylove72 Год назад

    So they didn’t plan to have data warehouse?

  • @jianruan5491
    @jianruan5491 Год назад

    cool,I want to try in my work

  • @abhishekkumar-ei5hl
    @abhishekkumar-ei5hl Год назад

    Please make video in realtime application like figma, Google docs, or multiplayer game

  • @mz7640
    @mz7640 Год назад

    Can't they use AWS dynamoDB?

  • @5p4rk3r
    @5p4rk3r Год назад

    how to run scylla db in the cloud??

  • @vaddimka
    @vaddimka Год назад

    Well it doesn't actually tell HOW it stores trillions of messages (like partitioning, cache organization, hot path issues), just a high level overview of the migration and dbms

  • @carlosharris2681
    @carlosharris2681 Год назад

    Friendly programming noob here: Can someone explain why Garbage Collection-free gives Scylla an advantage over Cassandra? Thanks in advanced.

    • @sergiocoder
      @sergiocoder 7 месяцев назад

      To my understanding, without a garbage collector there are no stop-the-world pauses to collect the garbage in the background and therefore more computing resources are spent on the actual work of DB.

  • @nishant_singh
    @nishant_singh 11 месяцев назад

    Can you make me a pro at syatem design ?? I just love this subject...

  • @dirac7233
    @dirac7233 Год назад +2

    Scylla is basically a C++ version of Cassandra. This story is one more proof of java's inferiority. Why are people still using Java ?? 🤷🏽‍♂️

    • @sergiocoder
      @sergiocoder 7 месяцев назад +1

      Because they can't learn C++ :)

  • @imMavenGuy
    @imMavenGuy Год назад

    9 days - are you kidding me!

  • @sriharsha580
    @sriharsha580 Год назад +2

    What is a garbage collector, how it is useful for DB’s?

    • @cherubin7th
      @cherubin7th Год назад +5

      Cassandra is written in Java, so the garbage collector is automatically part of it.

    • @RicardoSilvaTripcall
      @RicardoSilvaTripcall Год назад +2

      The Garbage Collector is responsible for checking which variables aren't more needed in the program and removing than from memory, the problem is, the Garbage Collector has to run every now and then, and this process requires processing power and will add latency to the main application running, because for the most of the time, the whole application or parts of it has to be stopped and wait for the Garbage Collector to do it's job, even though this process takes a few milliseconds, in a high throughput application this can add up and slow down the whole system.
      Languages like C++ and Rust have a "manual" memory allocation and deallocation, the programmer is responsible to take care of it before hand, so you won't need a garbage collector at runtime, what results in faster executions ...

    • @zakk6182
      @zakk6182 Год назад

      @@RicardoSilvaTripcallappreciate this

  • @joaoguilherme-or1ud
    @joaoguilherme-or1ud Год назад

    Show!!!!

  • @pegasusgemini6541
    @pegasusgemini6541 Год назад

    Waouh!

  • @chrishabgood8900
    @chrishabgood8900 Год назад

    Joy != rust

  • @rolina.azmitiamedrano5137
    @rolina.azmitiamedrano5137 Год назад

    👀👌

  • @2xKTfc
    @2xKTfc Год назад

    So much effort for messages that nobody can ever find again anyway 😂Seriously, Discord scrollback might as well be write-only it's so unusable. :(

  • @shpluk
    @shpluk Год назад +2

    I can agree it is cool technology wise, but 99.9% of discord messages are garbage, and who will ever scroll back to see any of them?
    I admire the engineering effort it just seems to me kind of wasted effort

    • @bluebird3131
      @bluebird3131 Год назад

      Not true at all. If you have message garbage on Discord, it your server choice you have to question, not the tool or the others... 🥱

  • @RohanDas23
    @RohanDas23 Год назад

    How do they store TRILLIONS of messages? Like everyone else, they use DATABASE. Hows that any different of any other services?

  • @nareshb5
    @nareshb5 Год назад +2

    1st viewer😂

  • @leonchen8317
    @leonchen8317 Год назад

    Brother, I love your content, but please, your tone is too flat. It makes me sleepy