Clustered Collections makes Mongo faster but there is a cost

  • Published: Nov 17, 2024

Comments • 30

  • @hnasr 1 year ago +1

    fundamentals of database engineering course database.husseinnasser.com

  • @ketembo 1 year ago +8

    Day 1 of waiting for Hussein to make a video on consensus algorithms

    • @hnasr 1 year ago

      I tried to read into them a few months ago and haven’t picked up the pace.

  • @Ghost_1823 11 months ago

    We are heavily using clustered indexes in our app. One drawback was our use of UUIDs when creating our own clustered index. Thanks, this video helped us avoid a bottleneck.

  • @husreason 1 year ago +2

    Can we please get a video on secrets management? Love the breadth of topics you have covered on your channel (thank you so much!), but this topic seems to be missing, so I'd love to learn it from you!

  • @joshcho96 1 year ago +2

    Thank you so much for your insight every time :) I am learning so much from your videos.

  • @juliussakalys4684 1 year ago

    Whenever possible UUID strings should be converted to binary and stored as binary in the DB itself. This way it takes 16 bytes, compared to "string-stored" 36 bytes.
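
    The size difference mentioned above can be checked with Python's standard `uuid` module. This is only a minimal sketch: the UUID value is an arbitrary example, and how the 16 bytes are actually stored depends on the database driver in use.

```python
import uuid

# An arbitrary example UUID (not taken from the video or the comment).
u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

as_text = str(u)     # canonical hyphenated text form
as_bytes = u.bytes   # raw big-endian binary form

text_len = len(as_text)      # characters in the string form
binary_len = len(as_bytes)   # bytes in the binary form

# The binary form round-trips back to the same UUID.
restored = uuid.UUID(bytes=as_bytes)

print(text_len, binary_len, restored == u)
```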

  • @adarshk7 1 year ago

    About the secondary index being preferred: I could imagine a composite index being more selective, where the extra (> 2) I/Os would cost less than the lost selectivity, maybe more so in range queries. So I guess it depends on your query in the end (and if you wanted custom behaviour you could even go for $hint). What do you think?

  • @tesla1772 1 year ago

    Since B-trees are also stored in files and pages, does the DB fetch the entire B-tree when an index scan/seek has to be done?

  • @EddyCaffrey 1 year ago

    Great video.
    It is a great addition to the database.

  • @marsha363 1 year ago +1

    Awesome talk as always!
    Regarding 18:00, why would you want to query with the _id and another filter when the _id is unique? For a kind of “exists” query?

    • @hnasr 1 year ago +1

      One example is a range query: give me all documents with an id between 10 and 50 where a certain field has a particular value. If that field is indexed, its index will be preferred over _id.
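
      The range query described in the reply above might look like the filter below, written in MongoDB query syntax as a Python dict. This is a sketch under assumptions: the field name `status`, its values, and the sample documents are hypothetical, and the small `matches` helper only imitates `$gte`, `$lte` and equality so the filter can be tried without a server. It says nothing about which index the query planner would pick.

```python
# Hypothetical sample documents; "status" is an assumed field name.
docs = [
    {"_id": i, "status": "active" if i % 2 == 0 else "inactive"}
    for i in range(1, 101)
]

# Range on _id combined with a filter on another field.
query = {"_id": {"$gte": 10, "$lte": 50}, "status": "active"}


def matches(doc, query):
    """Tiny in-memory evaluator: supports equality, $gte and $lte only."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$gte" in cond and not value >= cond["$gte"]:
                return False
            if "$lte" in cond and not value <= cond["$lte"]:
                return False
        elif value != cond:
            return False
    return True


# Collect the _id values of all documents satisfying both conditions.
result = [d["_id"] for d in docs if matches(d, query)]
print(result)
```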

  • @pemessh 1 year ago +2

    Quick question, why did they go with the recordid way in the first place?

    • @hrmeet0509 1 year ago +1

      +1 on the same question

    • @hnasr 1 year ago +2

      If I had to guess, it’s technical debt.
      It comes from their original model when they first shipped MMAPv1: a single B-tree with a DiskLoc pointer directly to disk. That model is simple but had a lot of problems, mainly the use of mmap and the lack of full ACID support and MVCC. In 2014 they bought WiredTiger, which had the B-tree with the recordId, so the easier integration was to replace the DiskLoc pointer with a recordId and keep the rest of the architecture the same. Otherwise it would have required a major rewrite.
      It seems they made this big change in 5.3 as clustered collections.

    • @pemessh 1 year ago

      @hnasr I see. That's interesting. Thank you for the answer.
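
      The recordId layout discussed in this thread can be contrasted with a clustered layout using a toy model. This is not MongoDB or WiredTiger code: plain dicts stand in for B-trees, the class names are hypothetical, and a "lookup" counter is only a rough stand-in for B-tree traversals.

```python
class RecordIdCollection:
    """Toy default layout: an _id index maps to a recordId,
    and a second structure maps the recordId to the document."""

    def __init__(self):
        self.id_index = {}   # _id -> recordId
        self.records = {}    # recordId -> document
        self.next_record_id = 1
        self.lookups = 0

    def insert(self, doc):
        rid = self.next_record_id
        self.next_record_id += 1
        self.records[rid] = doc
        self.id_index[doc["_id"]] = rid

    def find_by_id(self, _id):
        rid = self.id_index[_id]   # first lookup: _id index
        self.lookups += 1
        doc = self.records[rid]    # second lookup: record store
        self.lookups += 1
        return doc


class ClusteredCollection:
    """Toy clustered layout: documents are stored keyed by _id."""

    def __init__(self):
        self.records = {}    # _id -> document
        self.lookups = 0

    def insert(self, doc):
        self.records[doc["_id"]] = doc

    def find_by_id(self, _id):
        self.lookups += 1          # single lookup
        return self.records[_id]


plain = RecordIdCollection()
clustered = ClusteredCollection()
for coll in (plain, clustered):
    coll.insert({"_id": 42, "name": "example"})

plain_doc = plain.find_by_id(42)
clustered_doc = clustered.find_by_id(42)
print(plain.lookups, clustered.lookups)
```

      In this model the cost shifts to secondary indexes, which would have to point at the _id rather than at a compact recordId.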

  • @mohammedabdulbary1577 1 year ago

    another amazing video, love you man ❤

  • @burunkul 1 year ago +5

    Why won't the MongoDB team make the clustered index the default?

    • @hnasr 1 year ago +2

      I envision it being the default in a few years, once they iron out the bugs and limitations, which will make it close to MySQL InnoDB.

  • @bashardlaleh2110 1 year ago

    IDK if my question is valid, but at minute 9:00 it's not clear why you assume that reading a range of IDs from the visible index would be faster than from the hidden index. Why are the chances of those IDs being in one page higher than in the hidden index? Doesn't this depend on how we are writing records? Why are writes to the visible index next to each other but random in the hidden one?!

  • @EddyWilson-k3c 9 months ago

    Can you shard a clustered collection?

  • @ВоробійВіталій 2 months ago

    great, thx

  • @oddym5788 1 year ago

    Where did your books and sword go :(

    • @hnasr 1 year ago

      I moved office, they are on my side now 😄

  • @JinKee 1 year ago

    Why is SQL so much faster than NoSQL?

    • @stevefox7418 1 year ago

      Indexing, structured data etc.

    • @Aditya24234 1 year ago +2

      That depends a lot on your workload. MongoDB can certainly outperform SQL by a huge margin, provided that you have designed a schema that suits NoSQL, and similarly there will be certain workloads where SQL would run faster. A big chunk of that performance also depends on the configuration and the type of deployment you are running.

    • @tonyhart2744 1 year ago +4

      You mean the other way around??? The most scalable databases on the planet use NoSQL: Vitess, Cassandra, ScyllaDB, etc.

    • @jenkins9202 1 year ago +3

      In general it's the opposite: unless you're abusing NoSQL, it should outperform any SQL database due to having relaxed ACID guarantees. You'll find most big tech companies eventually had to migrate to a NoSQL database because SQL became a performance bottleneck at massive scale, e.g. Twitter, Facebook, Instagram, etc.
      Of course it all depends on your domain; some use cases require strong consistency guarantees with relational data, which doesn't leave you much choice but to use an RDBMS.
