when indexes are useless | The Backend Engineering Show

Поделиться
HTML-код
  • Опубликовано: 27 янв 2025

Комментарии • 45

  • @hnasr
    @hnasr  3 года назад +5

    head to database.husseinnasser.com to get a discount coupon for my Introduction to Database Engineering course

  • @carlosdelgado5632
    @carlosdelgado5632 3 года назад +28

    I happen to realize about this about 2 weeks ago when I created an index on a column with "status" and saw no increase of performance at all, as the matter fact it actually decreased so I guess I learned it the hard way 😂

  • @squirrel1620
    @squirrel1620 3 года назад +18

    I was just troubleshooting a long running query and found that one of the tables being joined was missing an index. The SQL EXPLAIN showed exactly what index should be added. Added it and bam, 40s query down to 1s

  • @talhaabdurrahman9407
    @talhaabdurrahman9407 3 года назад +11

    the way you said bahrain made me so happy

  • @emjaytripleo
    @emjaytripleo 3 года назад +2

    Just bought your course Yesterday, looking forward to it

  • @RustinRoberts
    @RustinRoberts Год назад

    Great video. This helped my understand indexing a bit deeper. Thank you

  • @marsilinouzaky2748
    @marsilinouzaky2748 3 года назад +7

    Great video Hussein, hope u have a great vacation.
    I think two more ways indexes can be useless if it's on a column that changes value very very often where maintaining a rebalanced tree will become challenging and introduce a lot of overhead. Also if the rows are not that many then index might not provide a lot of benefits.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 3 года назад

      exactly.
      one of the reasons we generate random test data, is to force the DB engine to use the index.

  • @noorulamin2652
    @noorulamin2652 2 года назад +1

    explained very well thank you Hussein Nasir for such awesome content we really appreciate your content & hard work keep it up 👌

  • @comradepeter87
    @comradepeter87 3 года назад +3

    I had questions regarding the propagation of query from backend to database - why do we introduce so much overhead in this part?
    We use textual commands and pass it to the database for lexing *for every single query* (queries get cached but this only helps if query is exactly the same), then parsing, then actually doing its actual job. But since most of the query is programmatically generated, why don't we have a binary format for this, where we just send a blob C-style structure over the network. It would have all the information database needs without needing to filter through the human junk.
    Also, I am not an engineer (yet!) so no real world experience, but do databases run on different machines than the server? For all examples and usecases encountered by me, they've always been on the same machine, so using the network (especially TCP for localhost networking!) seems unnecessarily expensive. There should be an alternative method communication that doesn't require network but uses some Unix-y message passing of some sorts.

    • @rishabhanand4270
      @rishabhanand4270 3 года назад +8

      yes, databases are best kept away from your server. This way it's decoupled and you can scale each other independently. A server should do exactly what it's name suggests: serve your clients. Whatever requests your client has, it should note it down and send it to the "kitchen" where "food" is prepped, cooked and sent to the clients. Sometimes "food" is also cached :).
      Coming to the network problem: it's not always a full TCP connection establishment, there is a connection pool from your server to the database which is almost always active and your server can just use these existing connections to quickly send over the raw query. Yes, there is still a little overhead but it's still better than keeping your database on the server itself when we talk about scale.
      Finally, we don't send "parsed" queries over the network because then you are parsing it on your Server. As explained by Hussein in this video, the parsing is dynamic, i.e, the database looks at a query and comes up with the most efficient plan based on statistics which it stores in the database itself. So you're loosing all the good database optimizations if you just send the parsed query directly to the database. Also, if tomorrow the database changes it's parser (maybe an upgrade), you will have to upgrade all your server instances to have the same version of the parser. It's a headache nobody wants. Let the database do it's job of caching, looking at your query to plan and executing your query.

    • @bepamungkas
      @bepamungkas 3 года назад

      To beat the server query planner, you'll need actual knowledge of the data. AFAIK, only SQLite allows you to do that by embedding the entire logic into its driver; making it a classic serverless RDBMS.
      note: Of course there is cloud serverless db like cosmos or aurora, but those are a bit more obscure in implementation due to several abstraction layer. SQLite use simplified model where db interface is treated as part of the application itself.
      Regarding your last question: Yes. It is advisable to separate db with backend due to predictability factor. Roundtrip latency is generally an acceptable cost compared to resource exhaustion during peak load (since on single-machine scenario, every additional load will be multiplied by at least two: backend processing and db access). And latency is usually much more predictable compared to load.
      Having said that, for localhost you could use UNIX socket- or named pipes on Windows- to avoid TCP "overhead". However, on obscure edge cases where you need do bug hunting, TCP-based connection generally easier to debug. And using TCP from the get go means you're not locked to single scenario, or worse, made wrong assumptions due to difference in system behavior.

  • @sirajul-anik
    @sirajul-anik 3 года назад

    on point 2, i had figured out this one a few months ago. I was totally shocked.

  • @bashardlaleh2110
    @bashardlaleh2110 3 года назад +2

    The composite index thing is really interesting

  • @Gabriel-sc3mg
    @Gabriel-sc3mg Год назад

    and how does indexes works when we make a join by a PK it will search the results by Index and then search in the other table? great video btw

  • @HSBTechYT
    @HSBTechYT 3 года назад

    I had a question about composite indexes in nosql/mongo
    How does mongo know what's left and what's right ?
    And let's say I have a index for userid property but _id (inbuilt) is also indexed right ?
    So will this affect performance?

  • @lakhveerchahal
    @lakhveerchahal 3 года назад +3

    I'm not sure of this but this might be the case (it can be wrong) -
    When u only do
    select * from employees where condition;
    This * can cause the db to do heap scan instead of index scan because it anyways need to fetch every column from heap, so to skip two hops it goes to table scan in the first place.
    Please correct me if it's wrong.

    • @ili101
      @ili101 3 года назад +2

      Doesn't sound logical to me. If you have million rows and you do "SELECT * WHERE id = 42069" I don't see way it will scan million rows instead of using the index then getting this 1 row from the heap. From what I know select * is only bad for network traffic and CPU/RAM/IO for fetching/formatting the output. Except probably if you select *only* from the indexes then you don't need the heap at all and skip this extra step?

    • @marsilinouzaky2748
      @marsilinouzaky2748 3 года назад +1

      The index doesn't store all columns values so it doesn't really matter what columns you're looking for, instead what matters in this case is the where clause column, the DB will scan this index based on the where column, get the primary key, then it has access to all columns related to this primary key, so specifying one column or * won't make much of a difference if you have an index that will be used to filter your rows.
      Someone can correct me if I'm wrong :)

    • @lakhveerchahal
      @lakhveerchahal 3 года назад

      Yes I guess, it'll still use the index scan 99% of the time, it's just bad for the I/O.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 3 года назад

      selecting only happens after finding the rows.

  • @mihaidorobantu8574
    @mihaidorobantu8574 3 года назад

    Regarding the first name case - can we not just create an index on a normalized version of the first name table?

  • @shiskeyoffles
    @shiskeyoffles 3 года назад +1

    Similar situation!
    Saving in watch later

  • @rishiagrawal4260
    @rishiagrawal4260 3 года назад +1

    can you please make a video on HLS ? and more importantly how you explore/read untouched technology ?

    • @hnasr
      @hnasr  3 года назад +1

      I think this will help ruclips.net/video/1-KmLc0c2sk/видео.html

  • @marawanlotfy1477
    @marawanlotfy1477 3 года назад +1

    hi hussein
    love from egypt man ❤️

  • @siya.abc123
    @siya.abc123 3 года назад +1

    Enjoy your vacation bro, we will all be here when you return.

  • @henrydesousa8264
    @henrydesousa8264 3 года назад

    Regards from Costa Rica.

  • @kooshasangari2555
    @kooshasangari2555 3 года назад

    ⭐ Thanks Man, You are #1 ⭐

  • @benjaminkir
    @benjaminkir 3 года назад

    good video.. even better to display with 1.5 speed

  • @jasdeepsinghgrover2470
    @jasdeepsinghgrover2470 3 года назад

    When the index is implied but not mentioned.. Like object type(having index) is car and we are searching for BMW but we never mentioned object type to be car in the query.

  • @dixztube
    @dixztube 3 года назад

    Trying to learn kotlin by just reading the documentation. Ugggggh. I like JavaScript but need to leave my comfort zone.
    Hope your family is doing well. Thanks for a great channel !

  • @johnnychang3456
    @johnnychang3456 3 года назад +1

    Another one is when you do a LIKE query with percent sign on the left of your where clause. E.g, select sth from xx where xxx LIKE %q%
    Since % means all possibilities, the database has no way to apply any index scan.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 3 года назад

      also, iLike with/without the leading % will cause a full table scan.

  • @emptymeta5849
    @emptymeta5849 3 года назад

    Yo, big fan of your outage content.
    Could we get some content on the recent roblox haloween outage?
    Have you heard anything about that?
    Edit: Didn't see the part about vacation, hopefully we get some good outage content when hes back

  • @rishavagarwal6531
    @rishavagarwal6531 3 года назад

    Thank you sir for the video :)

  • @ahmedtawil7705
    @ahmedtawil7705 3 года назад

    How you make it ?!!
    really when i face a problem , i just find that you talking about in newly videos!! 😂😂
    Thank you Hussein

    • @hnasr
      @hnasr  3 года назад

      I read minds 😍

  • @erlangparasu6339
    @erlangparasu6339 3 года назад

    please give sql code or benchmark result when it useless

  • @hitmusicworldwide
    @hitmusicworldwide 3 года назад

    Hey! Let us know how life is back home ! That sounds fantastic!

  • @Vijay-Yarramsetty
    @Vijay-Yarramsetty 3 года назад

    commenting for the sake of RUclips algorithm, to help him monetize from the video.

  • @rahulbera454
    @rahulbera454 3 года назад

    500th like ❤️

  • @mughees52
    @mughees52 3 года назад

    Welcome back to bahrain in Advance

  • @electronlibre4163
    @electronlibre4163 3 года назад

    Don't we say indices instead of indexes? 😁