The ORDER BY Algorithm Is Harder Than You Think

  • Published: 1 Jul 2024
  • In this video I describe in detail how my implementation of the K-Way External Merge Sort algorithm works. K-Way External Merge Sort is an algorithm used to sort large datasets that don't fit in main memory (usually RAM). Therefore, this algorithm is used by databases like Postgres to process ORDER BY queries when tables don't fit in memory. The algorithm consists of a series of "passes" through one or multiple files and a number of in-memory buffers used to load and process different chunks of a file in each pass. The end result is a file that contains all the requested rows sorted by the keys given in the ORDER BY clause.
    🌐 LINKS
    Algorithm Implementation:
    github.com/antoniosarosi/mkdb...
    ✉️ CONTACT INFO
    Business Email: business@antoniosarosi.io
    Contact Email: sarosiantonio@gmail.com
    Twitter: / antoniosarosi
    Instagram: / antoniosarosi
    LinkedIn: / antoniosarosi
    🎵 MUSIC
    • [Chillstep] Broken Ele...
    • Ptr. - Genesis
    • Digital Road
    • Juno
    📖 CHAPTERS
    00:00 Introduction
    00:22 The Memory Problem
    01:32 Database Tables & Sorting
    03:18 K-Way Data Structures
    04:38 Algorithm Execution (Pass 0)
    06:17 Pass 1
    09:22 Pass 2
    11:07 I/O Complexity
    11:58 Variable Length Data
    13:16 Final Thoughts
    🏷️ HASHTAGS
    #programming
    #computerscience
    #algorithm
  • Science
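The passes the description mentions can be sketched in miniature. Below is an illustrative Python simulation (the linked implementation is in Rust, so nothing here is taken from it); `run_size` and `k` are assumed parameters standing in for the in-memory buffers, and lists stand in for the temporary files a real database would write.

```python
import heapq

def k_way_external_sort(rows, run_size=4, k=2):
    """Sort `rows` as if they didn't fit in memory.

    Pass 0: split the input into runs of `run_size` rows and sort each
    run using only the in-memory buffers. Passes 1..n: merge up to `k`
    sorted runs at a time until a single sorted run (the output file)
    remains.
    """
    # Pass 0: produce the initial sorted runs ("pages" written to disk).
    runs = [sorted(rows[i:i + run_size]) for i in range(0, len(rows), run_size)]

    # Later passes: k-way merge groups of runs. Each pass reads every
    # row once, which is where the logarithmic I/O complexity comes from.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + k])) for i in range(0, len(runs), k)]

    return runs[0] if runs else []
```

For example, `k_way_external_sort([5, 3, 8, 1, 9, 2, 7, 6], run_size=3, k=2)` builds three sorted runs in pass 0 and needs two merge passes to produce the fully sorted output; a real implementation streams runs from temporary files instead of holding lists in memory.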

Comments • 117

  • @paulstelian97
    @paulstelian97 4 days ago +4

    Since you do "limit 100", can't the query planner be smarter and do some sort of limited sorting that drops entries automatically? Like an array that holds 100 entries, where every row is inserted in order and anything pushed past the end is discarded automatically.
    I actually wonder if you can't do this for most reasonable queries too!
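The bounded structure this comment proposes is usually implemented with a size-N heap rather than an array. A minimal sketch, assuming numeric rows (real rows would carry a sort key); the function name is illustrative:

```python
import heapq

def limit_n_smallest(rows, n):
    """Return the n smallest rows in sorted order.

    Maintains a max-heap of size n (heapq is a min-heap, so keys are
    negated): O(total * log n) time, O(n) memory, one pass over the data.
    """
    heap = []  # holds the negated keys of the n smallest rows seen so far
    for row in rows:
        if len(heap) < n:
            heapq.heappush(heap, -row)
        elif row < -heap[0]:  # better than the worst row currently kept
            heapq.heapreplace(heap, -row)
    return sorted(-x for x in heap)
```

This only helps when N fits in memory, which is exactly the caveat raised in the replies below.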

    • @tony_saro
      @tony_saro  4 days ago

      Maybe it can, I didn't implement it myself though.

    • @treelibrarian7618
      @treelibrarian7618 2 days ago +1

      I also had the same thought: for any LIMIT N result that easily fits in RAM, you could do this with no extra disk space and only one pass over the data. Take the first N rows and sort them, then load more data from the file in big enough chunks (10MB?). Check each row against the last entry in the sorted result table. If the checked row sorts before that last entry, keep it in an unsorted pile that just tracks its own last element, and move the last-pointer in the sorted result table back one entry. Continue until the last of the unsorted pile sorts after the current last of the sorted pile, then merge and sort the two together, keeping the first N again. Repeat until the whole table has been scanned.

    • @tony_saro
      @tony_saro  2 days ago +1

      @@treelibrarian7618 Yeah, the algorithm seems pretty obvious, but you'd have to figure out whether the limit actually fits in memory, and the problem is you're dealing with variable-length data (not all rows are the same size). It's not as easy as it seems, but it doesn't change much anyway: you still need an external sort algorithm. I don't even know why I added the limit in the example.

    • @paulstelian97
      @paulstelian97 2 days ago

      @@tony_saro Even variable length has a maximum length, so you can conservatively estimate the maximum for that.

    • @tony_saro
      @tony_saro  2 days ago +1

      @@paulstelian97 You can estimate a max size using the table schema; as long as it doesn't have TEXT or BLOB fields that's pretty reasonable. But anyway, I'm just so dumb that I didn't even think about optimizing LIMIT queries when I wrote the DB. I actually didn't even implement LIMIT; I only put it in the video example because usually you won't SELECT * FROM a giant table, so I thought it'd be more "realistic". Still, as mentioned, that doesn't change anything about the video: when applicable you still need an external sort algorithm. I might pin this comment if more people have doubts about this.

  • @eduardoandrescastilloperer4810
    @eduardoandrescastilloperer4810 6 days ago +37

    Something so simple ended up being so complex and you didn’t even talk about distributed DBs. Top notch

  • @TreeLuvBurdpu
    @TreeLuvBurdpu 4 days ago +26

    I'm watching this just from the perspective of someone wanting to sort tables that might become very large. This explains why it's so important to create indexes on any columns you might want to sort.

    • @tony_saro
      @tony_saro  4 days ago +3

      True, BTree indexes will skip this algorithm altogether.

  • @nirshadnijam2291
    @nirshadnijam2291 20 days ago +37

    This is top tier. The people who invented these algorithms are insane. We take technology for granted. You explain this really well. For a person who doesn't have a formal computer science background, this video helps me understand how the tools I use day to day work under the hood.

    • @tony_saro
      @tony_saro  20 days ago +13

      I have a formal Computer Science background and I still find databases very hard to understand 😂. There's a lot of research that went into them over the past 5 decades. People who came up with these algorithms are definitely insane.

    • @ra2enjoyer708
      @ra2enjoyer708 1 day ago

      @@tony_saro Well, CS is kind of orthogonal to database design. While RDBMSs are more or less based on math, math theory alone won't explain why MongoDB, a NoSQL database, supports join operations, while an egregiously relational PostgreSQL supports egregiously unrelational formats like JSON and XML.
      And SQL databases weren't THAT math-based to begin with; back when PostgreSQL started there was no such thing as "SQL compliant", since every relational database had its own incompatible flavour of SQL. Considering it was also the time when applications were written expecting to talk to the DB directly, the hatred for SQL and the need for ORMs back then make sense, especially since there was no XML crutch to use as a go-to serializable format for messaging.

  • @editxswajal
    @editxswajal 10 days ago +27

    This is what I pay my internet bills for. Btw, great work and explained well.

  • @ra2enjoyer708
    @ra2enjoyer708 1 day ago +1

    This video is pretty good at picturing file operations as something that just works and not a clusterfuck at all.

    • @tony_saro
      @tony_saro  1 day ago +1

      It just works but it's not easy to implement at all 😂

  • @Andre-kh3fl
    @Andre-kh3fl 5 days ago +10

    You are a legend. I've just watched your video coding a database and I'm very impressed. I'd really like to see you coding a compiler/interpreter. I know it takes a while, but you have awesome didactics. Keep going man, your channel is top tier!

    • @tony_saro
      @tony_saro  5 days ago +5

      I will at some point, now I'll focus on smaller projects because the database drained all my energy 😂

  • @alfredovr9111
    @alfredovr9111 10 days ago +17

    What a surprise to come across your videos again! All the years I've spent learning English finally paid off 😂. Best of luck with this channel; I love your videos and learn a lot from them.

  • @ciberman
    @ciberman 9 days ago +4

    You opened a door of new kinds of algorithms for me. I never realized that there is a huge world of algorithms and data structures to work with data that don't fit in memory. Amazing!

  • @elraito
    @elraito 1 day ago +1

    These are actually valuable lessons presented in an awesome way. Man, I just hope you blow up, because we need way more of this type of content.

  • @sebamarin12
    @sebamarin12 6 days ago +4

    You have explained it in such a way that I have understood it even though I am an HTML programmer.

  • @cryptopatrick
    @cryptopatrick 7 days ago +5

    Damn! This guy is next level - insane quality. Bravo!

  • @vinicius707
    @vinicius707 21 days ago +6

    This is some great content, man! I really appreciate the effort you put into it and that you make it available for free on YouTube. Thanks!

  • @josueC235
    @josueC235 3 days ago

    There's nothing more beautiful than seeing your content in English too 🥹🫶
    I seriously love your content, bro ❤

  • @gu1581
    @gu1581 11 days ago +5

    At my work we have a project where an application runs several queries, maps the results into memory via Object-Relational Mapping, and then does stuff like joins, limiting, and SORTING in the application with the query results in RAM. I pointed out the performance difference it would make, but the fact that it's not even possible to do this for bigger tables really adds some fuel to the fire :D Great video; it shows why leaving as much processing as possible to the DBMS is a good idea.

    • @tony_saro
      @tony_saro  11 days ago +3

      If you're selecting only a few megabytes of data that's probably fine, but otherwise the database will do a much better job, it's designed and optimized for that.

    • @TapetBart
      @TapetBart 5 days ago

      You do joins in the application instead of in the database???? Why???

    • @gu1581
      @gu1581 5 days ago +2

      @@TapetBart We used microservices and chose to have one database per service. Turned out that joining across multiple databases is very cumbersome, especially if they have different credentials. So we simply did the joins with ORM in the application... and by "we" I mean my team before I joined the project. Later I suggested we use one schema per service, eliminating all of that. I don't even know why we used microservices in the first place; we don't have to scale out the app for higher throughput, because only a few dozen users use it at once.

    • @TapetBart
      @TapetBart 5 days ago

      @@gu1581 ah, makes perfect sense then.
      I am doing something a bit similar with DuckDB.

  • @lampham7874
    @lampham7874 6 days ago +2

    I'd heard about this kind of algorithm since university but was too lazy to dig into it. Thank you for helping me get the idea 💪

  • @sunkittsui7957
    @sunkittsui7957 5 days ago +1

    Very intuitive and concise explanation!

  • @Israel220500
    @Israel220500 7 days ago +1

    Really nice video, man. Merge sort was my favorite sorting algorithm when I was beginning my journey into computing. Nice to see that what databases use is a variation of it (with a lot more complications hehe). Greetings from Brazil.

  • @levipalait683
    @levipalait683 5 days ago +1

    Very good explanation and amazing animations! Please keep up the content!

  • @m-alghobary
    @m-alghobary 11 days ago +3

    The explanation is so clear, I didn't have to put in any effort to get the idea 👏👏👏
    I subscribed, dropped a like, and I hope you continue producing great content 🙏🏻

    • @tony_saro
      @tony_saro  11 days ago +3

      I will, thanks for the sub and like 👍

  • @urbaniv
    @urbaniv 7 days ago +2

    Awesome. Great work

  • @ashwdq
    @ashwdq 10 days ago +1

    What a great video!! Best of luck with this channel! You have huge potential to grow. Don't give up, keep at it ❤❤

  • @ViniciusVieira13
    @ViniciusVieira13 10 days ago +2

    Great presentation and explanation on this topic. I've happily subscribed

  • @carylandholt
    @carylandholt 4 days ago +1

    Outstanding job! So well presented and extremely clear.

  • @Glovali
    @Glovali 10 days ago +2

    Very well explained. I look forward to more videos!

    • @tony_saro
      @tony_saro  10 days ago +1

      Working on it 👨‍💻

  • @loocheenah
    @loocheenah 7 days ago +2

    This channel is lit, man, instant sub 🎉

  • @alexanderzikal7244
    @alexanderzikal7244 16 hours ago

    Thank You again! A really interesting problem with different sizes.

  •  4 days ago

    Pretty good explanation! I remember your first videos on the Spanish channel, and it's just amazing to see how you're progressing as an engineer!

  • @eric-seastrand
    @eric-seastrand 4 days ago

    Fascinating! Subscribed.

  • @facundopuerto4415
    @facundopuerto4415 1 day ago +1

    Your videos are really good. The animations make them much easier to understand. Cheers!

  • @mayfly0
    @mayfly0 2 days ago +1

    wow such a clear explanation 👍👍

  • @user-ju7co9md1q
    @user-ju7co9md1q 5 days ago +1

    The king has returned; now there's a good reason to learn English.

  • @adokana_
    @adokana_ 9 days ago

    Good video, glad to see you again

  • @brayancortes7333
    @brayancortes7333 2 days ago +1

    Great explanation, keep it up. You're also helping me improve my English. Greetings from Colombia

  • @amj864
    @amj864 3 days ago

    Just subbed, love the video. Hopefully you do many, many more and gain a yuuugggee fan base

  • @genins21
    @genins21 1 day ago +1

    Amazing explanation of the sorting, but I'm actually not sure you need to sort the whole table any time someone asks for the top N values...
    It would make much more sense to select the top 100 and then just sort those 100

    • @tony_saro
      @tony_saro  1 day ago +1

      Check the pinned comment

  • @Mariomm_marti
    @Mariomm_marti 4 days ago +1

    Hey! I think what you did on your Spanish channel was much less niche, and this content definitely looks much more polished and informative. Congratulations!

  • @66ussa77
    @66ussa77 10 days ago

    Excellent as always 💯💪

  • @AqoCyrale
    @AqoCyrale 7 days ago +2

    glad you went in-depth about this topic. will you do more like this with other sub-topics? or do you have a new big project you're working on that you will release in the future when done?

    • @tony_saro
      @tony_saro  7 days ago +1

      Both, next video will be similar to this one and then I'll move to another project

    • @AqoCyrale
      @AqoCyrale 7 days ago +2

      @@tony_saro awesome, looking forward to it. thanks a lot!

  • @consciousmi4842
    @consciousmi4842 11 days ago

    Wow, this was very well explained. Thanks for posting. Can we have more videos, please? Thank you again

    • @tony_saro
      @tony_saro  11 days ago +1

      Sure, I'm working on a video similar to this one focused on algorithms and then I'll start working on another #mkown episode

  • @robertchang-me
    @robertchang-me 8 days ago

    Really good content, like it

  • @LokendraSingh-42
    @LokendraSingh-42 2 days ago +1

    This video is complex as is, and now I'm wondering what happens when we have to implement isolation.
    I guess it'd simply read the required pages during the initial run, and then we can work with that dataset and sort it.
    Here's one more: implementing Postgres's statement timeout.

  • @borislavrangelov4770
    @borislavrangelov4770 7 days ago +1

    Regarding dealing with variable-length data: during pass 0, when outputting a sorted page of tuples, the offset from the start of the file and the page size can be recorded in an in-memory 'page file', which can be just an ArrayList (Java background here; use something that can grow and has O(1) lookup speed). Later stages can use that 'page file' list to look up the position of a page in the file for the cursor to use. Their outputs will generate a new 'page file'.
    Now, if we want to parallelise this algorithm: since we have the offset and size, we can let each worker thread find the location of whichever page range it's going to be working with using that 'page file'. The output will always start from the minimum offset and be the total size of the range, and since the 'page file' is an ArrayList, each thread can insert the new page values at the appropriate index.
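The 'page file' directory described in this comment can be sketched in a few lines. This is an illustrative Python sketch (the names `write_pages` and `read_page` are made up for the example), with `io.BytesIO` standing in for a real file on disk:

```python
import io

def write_pages(f, pages):
    """Append each variable-length page (bytes) to f.

    Returns the directory: one (offset, size) pair per page, recorded
    while writing, so later passes can seek straight to any page.
    """
    directory = []
    for page in pages:
        directory.append((f.tell(), len(page)))
        f.write(page)
    return directory

def read_page(f, directory, i):
    """Fetch page i by seeking to its recorded offset."""
    offset, size = directory[i]
    f.seek(offset)
    return f.read(size)
```

With pages `b"row1row2"`, `b"r3"`, and `b"longer-row4"`, the directory comes out as `[(0, 8), (8, 2), (10, 11)]`, and `read_page(f, directory, 1)` returns `b"r3"` without scanning the file. A worker thread handed a range of directory entries can locate its pages the same way.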

    • @tony_saro
      @tony_saro  7 days ago +1

      That's more or less what I'm doing, but you can't store it in memory only. Let's say you need to sort 1TB and pages are 4KB: you'd need to store the offsets of approximately 250 million pages, and using 4 bytes per offset would already require 1GB of memory. So in case that happens you need to store what you call the "page file" on disk as well. Another very important detail is that the number of pages will change in each run: just like the example in the video where 2 pages produced 3 pages, maybe in the next pass you produce 1 page from 3 pages. It's still parallelizable, but you have to store the offsets for each thread during each run.
      And this is just a basic algorithm I came up with; if you take a look at the Postgres version, it's much more complex than what I've explained here 😂.
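The back-of-envelope numbers in that reply can be checked directly (using the 1TB file, 4KB page, and 4-byte offset figures stated there):

```python
# Directory size for a 1TB file split into 4KB pages, 4 bytes per offset.
TB = 1024 ** 4
PAGE_SIZE = 4 * 1024
OFFSET_BYTES = 4

num_pages = TB // PAGE_SIZE              # 268,435,456 pages (~250 million)
directory_size = num_pages * OFFSET_BYTES  # exactly 1 GiB

# Note: a byte offset into a 1TB file doesn't actually fit in 4 bytes
# (that only addresses 4GB), so 1GB is really a lower bound on the
# directory size; 8-byte offsets would double it.
```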

    • @borislavrangelov4770
      @borislavrangelov4770 7 days ago +1

      @tony_saro Will take a look. I also hadn't considered such large queries, where GBs or even TBs of data are involved.
      The videos are great; they got me thinking about a lot of stuff around DB engine design.
      Looking forward to more videos 😁

  • @SurfingOnTheGuitar
    @SurfingOnTheGuitar 3 days ago +1

    Great video! So for every ORDER BY query, is a separate file with the sorted results produced? Or does it actually sort the records in place (in the DB files) and keep them sorted for later queries?

    • @tony_saro
      @tony_saro  3 days ago

      Separate file unless the results fit in memory

  • @alejandroklever
    @alejandroklever 6 days ago +2

    Another great video. I would really like to see the code of this.

    • @tony_saro
      @tony_saro  6 days ago +2

      It's on GitHub, linked in the description.

  • @victormadu1635
    @victormadu1635 11 hours ago

    Please keep up the good job

  • @user-yi6sb8qo1j
    @user-yi6sb8qo1j 4 days ago +1

    This reminds me of the bottom-up approach to merge sort; tricky. Is this K-way actually done with such an approach? Subscribed and liked; I hope this channel keeps going. These are DSA topics beyond the textbooks, explained well and specialized for DBs. I remember "The Art of Computer Programming" mentioned external sorting too. From various places I've bookmarked some other external sorting algorithms: polyphase merge sort, cascade merge, oscillating sort. I know I don't know them.

    • @tony_saro
      @tony_saro  3 days ago +1

      It's similar to bottom-up; you can represent the passes as a tree

  • @macgyverswissarmykni
    @macgyverswissarmykni 5 days ago +1

    Nice seeing someone else writing Rust code

  • @ChrinovicMukanya
    @ChrinovicMukanya 9 days ago

    I found my new role model

  • @angelffg
    @angelffg 2 days ago

    Great!

  • @valcubeto
    @valcubeto 1 day ago

    Perfectly explained, and that's even though my English isn't very good

  • @sunkittsui7957
    @sunkittsui7957 5 days ago +1

    Are there any other external sorting algorithms like this?

    • @tony_saro
      @tony_saro  5 days ago +1

      Yes, SQLite 3 uses something called a PMA (Packed Memory Array) to sort on disk. I don't know how it works; I haven't researched it enough.

  • @victorramos3110
    @victorramos3110 5 days ago

    How did you become so fluent in English? I've been studying English for years and I'm not even half as fluent...

    • @tony_saro
      @tony_saro  5 days ago +1

      By listening to how natives speak and practicing the pronunciation. The English they teach you in school or at academies is one thing; the English spoken on the street is another 😂.

  • @jaime7295
    @jaime7295 20 days ago +3

    It's a shame there's no interest in these kinds of topics in the Spanish-speaking community :( Most people focus only on web development and don't know the fascinating world of low-level programming

    • @tony_saro
      @tony_saro  20 days ago +8

      Exactly; that's why I've switched to the English-language channel, because this is what interests me.

    • @zeusjean
      @zeusjean 11 days ago

      @@tony_saro

    • @danields_04
      @danields_04 7 days ago

      Best of luck with this new channel.

  • @user-ur4ev7vl6c
    @user-ur4ev7vl6c 17 hours ago

    Hey pal! Thank you very much! I want to create my own DBMS (cloud, embedded and so on) too! If I get some results, can I post my GitHub repo under your comment?)

    • @tony_saro
      @tony_saro  15 hours ago

      I think YouTube will flag it as spam if you add a link in your comment

  • @diegoparra8859
    @diegoparra8859 9 days ago

    Will you go back to the Antonio Sarosi channel?

  • @MrC0MPUT3R
    @MrC0MPUT3R 6 days ago

    Sir, do you have a license for those guns?

    • @tony_saro
      @tony_saro  6 days ago

      They might be illegal ☠️

  • @baranacikgoz
    @baranacikgoz 4 days ago

    I felt like a way worse software engineer after I saw you ...

    • @tony_saro
      @tony_saro  4 days ago +2

      This is leaning more towards pure Computer Science than software engineering.

  • @julionavarro2600
    @julionavarro2600 5 days ago

    Sarosi's double?

  • @edwinroman30
    @edwinroman30 9 days ago

    Hey, there's a man named Antonio who does programming videos too; is he your twin?

  • @rayilisto
    @rayilisto 20 hours ago

    You forgot us, Toñito :c

    • @tony_saro
      @tony_saro  15 hours ago

      I've already talked about that on Instagram and Twitter.

  • @adriankal
    @adriankal 8 days ago

    Wouldn't that be unnecessary if you had an index on name, which could already be kept sorted on insert, so you just take the first 100 elements from it? This algorithm was invented when HDDs were expensive. Today 1GB is cheaper than one minute of your time.

    • @tony_saro
      @tony_saro  8 days ago

      If there's a BTree index and the query sorts by that index only, then yes. Otherwise, if there are 100 queries sorting 1GB at the same time you still need an external sort algorithm, and if there's one query sorting 5TB you still need the same algorithm. All databases have some variant implemented.

  • @ProxyDev
    @ProxyDev 9 days ago

    Amazing work

  • @zeroows
    @zeroows 10 days ago

    thank you

    • @tony_saro
      @tony_saro  10 days ago

      You're welcome, thank you for commenting