Are SQL joins bad for performance?

Поделиться
HTML-код
  • Опубликовано: 26 дек 2024

Комментарии • 40

  • @alifawzi8197
    @alifawzi8197 Год назад +3

    for the last two day's i was watching your content,
    i genuinely appreciate the time and the effort that you put into this vidoes, your content is amazing and it's well explained,
    Thank you so much for sharing such a content.

    • @DatabaseStar
      @DatabaseStar  Год назад +1

      Thanks for the comment! I'm glad you like the videos!

  • @dav.R7
    @dav.R7 5 месяцев назад +1

    I've always had this question: Does the number of joins affect performance or not? This video answered all my questions.

    • @DatabaseStar
      @DatabaseStar  5 месяцев назад +1

      I’m glad it was helpful!

  • @pable2
    @pable2 2 месяца назад +1

    In this case, why wouldn't I add indexes everywhere if it allows for more performance?

    • @DatabaseStar
      @DatabaseStar  2 месяца назад +3

      Good question. When you insert, update, or delete data, the database needs to update the index to ensure it matches the related table. This step takes an extra bit of time. Each extra index will slow down these operations. So we only usually add the indexes we need.
      Unless you’re on a database that is very focused on reading data and does not update data very often, then it may not be a concern.

    • @ReutAslan
      @ReutAslan 2 месяца назад

      Another reason is that each index costs you storage, so you are essentially trading off storage for speed/performance

  • @Sdirimohamedsalah
    @Sdirimohamedsalah 7 месяцев назад

    Thank very much for this constructive demo.
    I have question: when you indexed the columns, it reduced the total cost. But not the total execution time. Why you prefer reducing the total cost over the execution time which is crucial to for an applications in production ?

    • @DatabaseStar
      @DatabaseStar  7 месяцев назад

      Thanks! Good question. I believe it's because the data set was so small and the execution time was small that it didn't really impact the time. On a larger data set you may see a bigger difference in execution time.

    • @Sdirimohamedsalah
      @Sdirimohamedsalah 7 месяцев назад

      @@DatabaseStarthank you for your response.

  • @NotMeEitherOfficial
    @NotMeEitherOfficial 9 месяцев назад

    In your case separated tables win againts the single table with no indexes, in my case separated tables make more cost since each table has about 40 columns, but the single table on provide the columns that needed for the user, lets said 40x4 is 160 columns but in single table we approximitely only combine 20 columns each. I will try to implement this indexes with my company databases, as we seems need this indexes to be implemented. Working on old databases that been laying around for decades with MyISAM engine with millions of rows and try to make the performance faster as it getting slower every single month. Thanks for the video its really helpful. I will also considering to ask management to migrate to other database engine or even other DBMS like PostgreSQL, using Cache is kinda eating too much memory considering our company budget that run all the apps in one server.

    • @DatabaseStar
      @DatabaseStar  8 месяцев назад +1

      Good point, it also depends on how many columns you need to return. Just because the separate tables have 40 columns, doesn't mean you necessarily need to select all 40 columns. But if you need all these columns, then a single table may make more sense like you are using.

  • @higiniofuentes2551
    @higiniofuentes2551 Год назад

    Thank you for this very useful video!

  • @u12blue
    @u12blue 4 месяца назад

    Are all these concepts applicable on cloud services? Wouldnt there be a difference?

    • @DatabaseStar
      @DatabaseStar  4 месяца назад

      They should apply regardless of where your database is hosted.

  • @sdmagic
    @sdmagic 6 месяцев назад +1

    Very well done, sir. I do get asked about the effort made to normalize a database. The insight offered by your very clear explanation will go a long way in helping to answer those queries.
    Your videos and website are a great resource for the database developer community.

    • @DatabaseStar
      @DatabaseStar  5 месяцев назад +1

      Thanks for the kind words! I'm glad you like the video and my channel.

  • @ftet1
    @ftet1 6 месяцев назад

    Hi, can you offer us the DDL scripts that you used to set up your example database? Then we can recreate it directly. That would be great. Thanks and regards \sdohn

    • @DatabaseStar
      @DatabaseStar  5 месяцев назад

      Good idea! The sample database (olympics) is available on GitHub here: github.com/bbrumm/databasestar/tree/main/sample_databases/sample_db_olympics
      The script for the queries used in this video is now on GitHub as well: github.com/bbrumm/databasestar/tree/main/videos/100_joins

  • @TheCodeConnoisseur
    @TheCodeConnoisseur Год назад +1

    Excellent

  • @CanRau
    @CanRau Год назад

    Very insightful thanks a lot ❤ what does noc stand for?

    • @DatabaseStar
      @DatabaseStar  Год назад

      Thanks! NOC stands for National Olympic Committee.

  • @milenfrom
    @milenfrom Год назад +1

    Hi,
    Thank you for the great insight.
    What is the software you use to run the queries and the explain feature?

    • @DatabaseStar
      @DatabaseStar  Год назад +2

      Thanks! I'm using a tool called pgAdmin, which is a common SQL editor for Postgres databases.

  • @tanzimibthesam5861
    @tanzimibthesam5861 Год назад

    Do you think joins can have impact on scalability? Thanks

    • @DatabaseStar
      @DatabaseStar  Год назад

      No I don't think so. However, once your database gets pretty large, you'll be looking into all kinds of techniques to improve performance, and one of them may involve caching or creating summary tables which means fewer joins - but it has tradeoffs.

  • @navjotsingh2457
    @navjotsingh2457 3 месяца назад

    Ty

  • @maxyudin
    @maxyudin Год назад

    👍

  • @CheatingStoriesWithChinwe
    @CheatingStoriesWithChinwe Год назад

    🤔

  • @romanbunshaft8412
    @romanbunshaft8412 10 месяцев назад

    now try ordering by something )

    • @DatabaseStar
      @DatabaseStar  10 месяцев назад

      We can add an Order By but the point still stands.

    • @romanbunshaft8412
      @romanbunshaft8412 10 месяцев назад

      its VERY depends on a query itself. joins are not always faster and better. @@DatabaseStar

  • @luco-games
    @luco-games Год назад

    This is completely wrong.
    0) Operating in "less than a second" units for 100-200k tables is like saying this car was not expensive, it's was less than a million dollars. You need to show execution times millis from the query plan (EXPLAIN ANALYZE).
    1) You execution time seemed to be 5 times faster for the denormalized query.
    2) Joining will ALWAYS be slower than one tabel if you join big tables (sorry but in 2023 few hundred k rows is nothing, even on a local machine).
    3) You didn't explain how actually JOIN works behind the scenes but I get it because it would ruin the whole video.

    • @DatabaseStar
      @DatabaseStar  Год назад +3

      Thanks for the feedback. I wouldn't say it's "completely wrong" because the video demonstrates the concept step-by-step and shows numbers.
      I can create another video that demostrates this with larger tables, as I think it would be more beneficial.
      0) That's a good point, which is why I didn't refer to the time taken when talking about the query, I referred to the cost from the execution plan. For larger tables & longer queries I would have also used the time taken.
      1) I don't think it was 5 times faster, the cost comparison (after indexes) was 822 vs 1,239, so it's a little faster.
      2) I don't think joining will always be slower than one table if working with big tables. That's the point of this video - joining is not always slower if you have a normalised design and indexes, BUT it depends on the query.
      3) I don't need to explain how join works behind the scenes for this video to be useful.

    • @NotMeEitherOfficial
      @NotMeEitherOfficial 9 месяцев назад

      0) they does point that with "cost", as you typed "few k rows is nothing even on local machine", so the execution time will be really thin and can't be use for the actually benchmark since probably other apps consume the gap, as also now days computer has something called multi threading processor.
      1) it does, i guessed the first answer could answer the second point too.
      2) "always" is not the perfect fit, cause in some cases its actually work as he also provided the cost, unindexed tables is one of the problems.
      3) he explain with the flow pgadmin provided, well when people "trying" to explain about something and then it not satisfy you, you will just criticize them?