Great video! One addition: The "EXPLAIN" command is an invaluable tool for optimizing SQL queries. It provides a detailed execution plan, allowing the developers to understand how the database engine processes a query. By analyzing the execution plan, you can address the performance bottlenecks with proper optimizations, e.g. proper indexes.
Thanks for sharing this.
Thanks a lot for the addition, really good :)
@cmertayak - I second that. It's an awesome command that I use all the time at work for optimisation. It's my go-to command for improving query execution.
Thanks for sharing
Oh yes, if you run EXPLAIN in a desktop client like MySQL Workbench, it shows you a detailed diagram of your query plan, which is quite useful.
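A minimal sketch of the EXPLAIN workflow discussed in this thread, using Python's built-in sqlite3 module so it runs anywhere. SQLite's spelling is EXPLAIN QUERY PLAN; MySQL and PostgreSQL use plain EXPLAIN with different output. The orders table, its columns, and the index name are assumptions made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table, invented for this example.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# No index on customer_id yet, so the plan should be a full table scan.
plan_before = query_plan(conn, sql, (42,))

# After adding an index on the filtered column, the plan should switch to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two printed plans is the whole technique: change one thing (here, an index), re-run EXPLAIN, and check whether the access path changed.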
Opt for indexes on columns used in SELECT, WHERE, and JOIN clauses.
Compare against the full column value rather than a partial or computed one (e.g. startsWith).
Avoid ORDER BY on large data retrievals.
Use a small LIMIT with pagination when you need more data.
Could you explain how? What if I need a large amount of data retrieved with ORDER BY? How would I use LIMIT and pagination in this case? Thanks
@sampri22 For data processing and analytics? A better way to do this is to dump your database data and load it into BigQuery or Hadoop; they have better resources for processing large data.
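On the "how would I use LIMIT and pagination" question above, one common approach is keyset (seek) pagination: order by an indexed key and fetch each page starting after the last key already seen. This is only a sketch under assumptions. The orders table and the fetch_page helper are hypothetical, and it only works when the sort key is unique and indexed (here the primary key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; id is the primary key and therefore indexed.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, order_amount) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 26)],
)


def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows with id greater than after_id, in id order."""
    return conn.execute(
        "SELECT id, order_amount FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


pages = []
last_id = 0
while True:
    page = fetch_page(conn, last_id, 10)
    if not page:
        break
    pages.append(page)
    last_id = page[-1][0]  # remember where this page ended

print([len(p) for p in pages])
```

Because each page seeks directly to `id > last_id` through the index, the database never has to sort or skip over the rows of earlier pages, which is the usual weakness of large OFFSET values.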
One of the best SQL videos I have come across, just the way it is put together and the infographics. If you are learning SQL, you really should understand the mechanics behind optimizing queries, how databases work. Just adding more hardware or VM resources will not fix the issue if your queries are not optimized properly.
Very well presented, thanks for explaining SARGAble concept
The way you explained with the animations are Awesome. Great Job. Very Well Explained.
Very profound, please share more on SQL like windows and CTE, your explanation is very approachable.
Thank you, @bytebytego, for breaking down the topic with clear visuals and simple narration!
the way you help us visualize this is next level. Thank you!
Very good intro. Would like a more detailed explanation on more complex queries.
they don't do detailed explanations. it's basically "use indexes". don't sort lots of data. well, thanks.
@jonbaird9718 Agreed, YouTube is made for juniors
Bro, this way of teaching really makes sense. Thanks a lot for these visuals.
Hi Sir, thank you 🙏 for taking the time to explain SQL. Sorry, I am new, and this was very helpful.
Additionally, for the optimizer to "make up" a reasonably good plan (from the various alternatives), it needs to know a bit about the data (value) distribution. This is where STATISTICS / ANALYZE (depending on the DB vendor) come in handy. They help the optimizer estimate the various steps (rows, size of data, etc.) of each plan and figure out which of the different plans is the best candidate to execute. It is therefore important to collect this information on critical columns (usually join and WHERE clause columns). It is also important to keep this information regularly refreshed so that the optimizer does not make bad decisions based on stale statistics. Very bad things can happen with stale statistics.
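To make the statistics point concrete, here is a small sketch using SQLite, where the command is ANALYZE and the collected figures land in the sqlite_stat1 table. Other engines expose the same idea differently (for example ANALYZE in PostgreSQL or UPDATE STATISTICS in SQL Server). The table and index names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and index, invented for this example.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(i % 10,) for i in range(1000)],
)

# Collect statistics; SQLite stores them in sqlite_stat1, which the planner reads.
conn.execute("ANALYZE")

stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'idx_orders_customer'"
).fetchall()
print(stats)
```

The stat column summarises row counts and how selective the index is. If the data changes a lot after this point, those numbers go stale until ANALYZE is run again, which is exactly the risk described above.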
0:08 - Start
1:25 - Using Index on Join Columns significantly improves the Join
1:47 - Next Step is use of Where Clause
2:13 - Lets
3:14 - To write Sargable Queries
4:18 - Optimising
5:21 - Remember - Order of Optimization
Understanding how the DB engine works with indexes is key. You may assume that WHERE purchase_date >= 2022 AND purchase > 100 would perform the same if you have separate indexes on purchase_date and purchase, but a composite index might be required... The order of conditions in the WHERE clause may also be important, as it helps reduce the dataset before the second condition is applied.
WHERE order has no effect on most SQL systems. The only way you can force SQL to filter data first is to use a derived query.
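Both points in this exchange can be checked with a query plan. The sketch below assumes a hypothetical purchases table with a single composite index, and it swaps the original two range conditions for an equality plus a range so that both index columns can be used. In SQLite, at least, writing the conditions in either order produces the same plan.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema; the composite index covers both filtered columns.
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, purchase_date TEXT, amount REAL)"
)
conn.execute("CREATE INDEX idx_cust_date ON purchases (customer_id, purchase_date)")

# Same predicates, written in two different orders.
plan_a = query_plan(
    conn,
    "SELECT * FROM purchases WHERE customer_id = 7 AND purchase_date >= '2022-01-01'",
)
plan_b = query_plan(
    conn,
    "SELECT * FROM purchases WHERE purchase_date >= '2022-01-01' AND customer_id = 7",
)

print(plan_a)
print(plan_b)
```

The column order inside the composite index matters (equality column first, range column second); the textual order of conditions in WHERE does not, because the planner reorders them itself. Other engines should be verified with their own EXPLAIN.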
That's the gap I needed to fill. Couldn't find this info wrapped in the right words and animations as here. Thanks a lot for the content! Hoping to find more relevant videos to expand my knowledge of SQL.
Currently struggling with the performance of my queries on big datasets (~6 million rows). It's not clear how to avoid functions and computations during search and filtering in some cases. That's my biggest struggle so far.
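One common way to avoid a function on the filtered column is to rewrite it as a range on the raw column. The sketch below is an illustration only: the orders table, the ISO-formatted text dates, and the index name are assumptions. It compares a non-sargable filter with its sargable rewrite and checks that both return the same rows.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table with dates stored as ISO 'YYYY-MM-DD' text.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, order_amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
conn.executemany(
    "INSERT INTO orders (order_date, order_amount) VALUES (?, ?)",
    [("2022-12-31", 10.0), ("2023-01-01", 20.0), ("2023-06-15", 30.0), ("2024-01-01", 40.0)],
)

# Function applied to the column: the index on order_date cannot be used for the lookup.
non_sargable = "SELECT * FROM orders WHERE substr(order_date, 1, 4) = '2023'"
# Same filter expressed as a range on the bare column: the index can be used.
sargable = (
    "SELECT * FROM orders "
    "WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'"
)

plan_non_sargable = query_plan(conn, non_sargable)
plan_sargable = query_plan(conn, sargable)
same_rows = sorted(conn.execute(non_sargable).fetchall()) == sorted(
    conn.execute(sargable).fetchall()
)

print(plan_non_sargable)
print(plan_sargable)
print(same_rows)
```

The general pattern is to move the computation to the constant side of the comparison, so the column itself appears bare and the index stays usable.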
Index usage tip: When using params in your query (e.g., select .... where year > ?), databases may not utilize an index if it is unbalanced. For instance, if you have approximately 1 million rows with year = 2022 and only 1000 rows with year = 2023, the database cannot predict whether the parameter will be useful for filtering. To resolve this issue, pass the value directly in the query itself, allowing the execution plan to determine if the index is suitable for the intended purpose.
As I wrote in my comment, a good understanding of how your DB engine works is key. And they are all different. So never assume that a good query on MySQL will be a good query on Postgres, Oracle or any other SQL engine.
this opens the gate for SQL injection, don't do this
@maf_aka I think the idea was not to use prepared statements *where you don't need them.* E.g. if you already have validation in place that ensures your received value is an enum (number, null, etc.) - you can be sure no SQL injection is possible there - so no need to use prepared statements *there.*
Ok, but then you get a different query plan for each (different parameter / set of parameters) query
@lethern2 Yep, but that's why you need to understand how your DB engine works
Thank you. SELECT coming practically at the end of the process was a hard thing to get my head around, let alone remember, when I first began with SQL. Still is TBH. The fact it means something more like 'display' than 'go and get' (what we typically mean by 'select' in conversational English) was a hurdle too. Wish I'd come across this video back then.
This is the best explanation I've ever seen. Big thumbs for you!
Thank you for your time and effort in explaining these subjects. I really like it, and moreover I am able to register the concepts in my mind easily. Thanks again.
Thank you for a fantastic visualization of the SQL queries execution order. That's exactly what I have been missing in the other materials. I really appreciate your style of teaching
You should select from the orders table then join the customers since your where clause is a column in orders table! Your SQL is joining on unnecessary rows from orders & customers!
You are awesome at explaining any concept. Thank you so much
oh my goodness, this is too good for non IT background jumping ship to see where AI will land. Thx. You are my 3blue1brown for IT
Simple and to the point explanation. Love it. Thanks 👍
wow, what an awesome introduction to SQL optimization.
What a video: voice, explanation, graphics, etc. Well done, mate!
The best one in terms of optimization!
I heard it called "predicate pushdown" when you move a condition earlier in the plan
Excellent video explaining basic concepts in very short time..❤
Impressive graphic animation, could you please share how the execution plan animation was done
Superb video! Simple explanation on query optimisation.
Great explanation in a short video.
If ORDER BY comes after the SELECT, then it will work on the data already read from disk, right?
Awesome visualization, I've been loving all the short videos on this channel!
Clarifying Q. The execution order has SELECT happening after HAVING, so this should mean that the calculated column total_spent doesn't exist at the time the HAVING clause is evaluated?
Wow. To the point with knowledge I can use today. Thank you.
*Explanation level is so beautiful!*
Lord Buddha. I'm looking for an active data flow visualization that can shorten data query response times! A great video, it saved me today. Leaving with 1 subscription as a fan! 🔍⚡
LOL, just had an interview and got an exact copy of the example he is showing and explaining 😂😂😂
Thanks, this video really helped.
This query actually does not need to join customers table since all the fields are present in the orders table already. (unless there are invalid / dirty customer_id data in the orders table and you want to filter them out)
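The claim above can be checked on toy data. The schema and rows below are assumptions reconstructed from the comments (customers, orders, order_amount, order_date, total_spent), not the video's exact query. With clean customer_id values, the joined and the join-free versions return the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample rows, invented for this example.
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        order_amount REAL,
        order_date TEXT
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders (customer_id, order_amount, order_date) VALUES
        (1, 700, '2023-02-01'), (1, 400, '2023-03-01'),
        (2, 300, '2023-02-15'), (2, 900, '2022-12-31');
    """
)

with_join = conn.execute(
    """
    SELECT c.customer_id, SUM(o.order_amount) AS total_spent
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    WHERE o.order_date >= '2023-01-01'
    GROUP BY c.customer_id
    HAVING SUM(o.order_amount) >= 1000
    """
).fetchall()

# Every selected column already lives in orders, so the join can be dropped.
without_join = conn.execute(
    """
    SELECT customer_id, SUM(order_amount) AS total_spent
    FROM orders
    WHERE order_date >= '2023-01-01'
    GROUP BY customer_id
    HAVING SUM(order_amount) >= 1000
    """
).fetchall()

print(with_join, without_join)
```

As the comment notes, the two only stay equivalent while every orders.customer_id has a matching customer; an inner join would silently drop orphaned orders.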
cool, didn't think it's possible to include all these concepts in 6 min video. One thing, it's great to watch it when you want to summarise already existing knowledge
Well explained. However, I do miss 1) the generation of multiple query plans and the selection among them (cost estimation) and, as an element of that, 2) the different table access tactics (sequential scan, index access, or index-only).
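The three access tactics named above can be seen side by side in SQLite's plan output. This is a sketch on a hypothetical orders table; the labels SCAN, USING INDEX, and USING COVERING INDEX are SQLite's wording, and other engines name the same ideas differently (for example Seq Scan, Index Scan, and Index Only Scan in PostgreSQL).

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table with one secondary index on customer_id.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# 1) Sequential scan: the filter is on a column with no index.
full_scan = query_plan(conn, "SELECT * FROM orders WHERE order_amount > 100")

# 2) Index access: the index finds the rows, then the table is read for the other columns.
index_lookup = query_plan(conn, "SELECT * FROM orders WHERE customer_id = 5")

# 3) Index-only access: every requested column is in the index, so the table is not read.
index_only = query_plan(conn, "SELECT customer_id FROM orders WHERE customer_id = 5")

for label, plan in [("scan", full_scan), ("index", index_lookup), ("index-only", index_only)]:
    print(label, "->", plan)
```

Which of these the optimizer picks is a cost decision based on its estimates; the sketch only shows what each option looks like once chosen.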
I have always thought that SQL's structure is poorly designed: statements do not start with FROM, and the reference is placed at the end of the statement. For example, in a SELECT it should go just before ORDER BY, in an UPDATE the SET should come after WHERE, etc. They somehow wanted to remedy the problem by introducing the WITH clause, but I'm sure many wish that whoever designed the language had worked a little harder at the time.
thanks, helped clear up some issues I had.
Your presentation is so pleasant to watch, is it manually key-framed in the video editor or are there tools to do that naturally?
Nice bird's-eye view introduction.
It is not clear how to 'use appropriate indexes' to optimize for sorting, and how to implement pagination. Especially in your example where the sort order is made on an aggregate.
Something doesn't add well here. If you notice HAVING clause refers to 'total_spent' which is defined in SELECT, so dependency wise HAVING should be after SELECT and not before it.
good things to practice for the interview. Thanks
Thank you for sharing your knowledge
I feel like this is a bit misleading because sometimes where and select influence the first stage. As you said, when there’s a covering index, the database won’t read the entire table. So the select and where influence what is read from the source.
ORDER BY and LIMIT can also come in at the source if the index can be used for the ordering. You refer to this when you talk about “sorting the whole table”.
CTEs and subqueries are not mentioned, but that’s okay I guess.
Thanks for this kind of explanation
Great video!! Very helpful! Thanku sir!
My app hasn't reached 40 queries per second yet, but I will implement this just in case my app becomes the next Amazon :D
Nice and simple explanation. Thanks
Question: at the end of the video you said not to sort the whole dataset and to use pagination for optimizing ORDER BY and LIMIT. Those are the things I use for pagination! What do you mean by that?
The other thing: according to your video, LIMIT happens after ORDER BY. How can it help when ORDER BY has already happened?!
Btw great videos and content, thank you for these
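One possible reading of "do not sort the whole data" is that an index on the sort column lets the engine read rows in order and stop after LIMIT rows, instead of sorting everything first. The sketch below shows the plan difference in SQLite on a hypothetical orders table. It does not cover sorting on an aggregate such as total_spent, which still has to be computed before it can be ordered.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table, invented for this example.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, order_amount REAL)"
)

sql = "SELECT id, order_date FROM orders ORDER BY order_date LIMIT 10"

# Without an index, SQLite has to sort the rows in a temporary B-tree.
plan_without_index = query_plan(conn, sql)

# With an index on the sort column, rows come out already ordered,
# so reading can stop as soon as LIMIT rows have been produced.
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
plan_with_index = query_plan(conn, sql)

print(plan_without_index)
print(plan_with_index)
```

So LIMIT is logically applied after ORDER BY, but when the ordering comes for free from an index, the physical plan can stop early rather than sorting the full table.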
these videos are amazing!!!! thanks!!!
00:45 Understanding SQL query execution and optimization techniques
01:30 Understanding SQL execution plans can optimize queries for better performance
02:15 Optimizing SQL queries through index usage
03:00 Writing sargable queries is essential for optimizing database performance.
03:45 Sargable queries improve query performance.
04:30 Understanding the SQL execution order is crucial for query optimization
05:15 Optimizing SQL Queries with Indexes
05:57 Understanding SQL execution order is key
Crafted by Merlin AI.
Best explanation ever
Having uses total_spent from the SELECT, so how come HAVING is executed before the SELECT?
I'd say so too. This is an error. The SELECT part is evaluated first, then the HAVING part.
Can you make a video explaining the difference between system design and software architecture?
Thank you, this was really helpful.
Great video, very informative and well explained bravo!
Avoid using non-sargable conditions (functions or calculations) on an indexed column.
If you have to use a function on a column, create a computed column or a function-based index first.
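A small sketch of the function-based index suggestion, using SQLite's expression indexes (available in reasonably recent SQLite versions). The customers table, the lower(email) filter, and both index names are assumptions chosen for illustration. Oracle and PostgreSQL offer similar function-based or expression indexes, and some other engines use computed columns instead.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table with an ordinary index on the raw column.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")

sql = "SELECT * FROM customers WHERE lower(email) = 'ann@example.com'"

# The ordinary index on email does not help, because the filter is on lower(email).
plan_plain_index = query_plan(conn, sql)

# An index on the expression itself makes the same filter searchable.
conn.execute("CREATE INDEX idx_customers_email_lower ON customers (lower(email))")
plan_expression_index = query_plan(conn, sql)

print(plan_plain_index)
print(plan_expression_index)
```

The expression in the query has to match the indexed expression for the index to be considered, so this fits best when the same computed filter is used repeatedly.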
Awesome as usual! Thanks a lot!
This stuff is gold. Thank you for making this available for free. Really appreciate it!
1:26
I always thought that the SELECT happened before HAVING, considering that we can use SELECT aliases in the HAVING filter.
Love your channel. Your videos are great.
In this example the 'total_spent' alias is already in use in the HAVING clause without defining. How is that possible?
yes, I have the same question, it doesn't make sense...
Good insight Thanks!
why don't you make a tutorial on SQL. I would like to watch it and I think it'll help a lot of people. By the way thank you very much for this amazing explanation.
Very good video. It is really helpful.
Thank you for your video.
I have been working in IT for 10 years, but I never knew the order between JOIN and WHERE
until I watched this video.
So good explanations
Thanks for sharing, bro.
Excellent explanation, thanks!
Thanks. Good to know! Useful!
Great. Thanks for sharing..
Amazing. Thank you!
Think the key component missed here is that you are utilizing a SELECT aggregate within the HAVING statement. To me that looks like SELECT has to come before HAVING, would it not?
Fantastic explanation.
Thanks for this! Will there be a transcription soon?
really cool - thanks.
You write the date in the predicate as '2023-01-01' in one case and '01-01-2008' in another - which format is correct?
Hi. The actual plan should be derived from EXPLAIN and EXPLAIN ANALYZE rather than from the query, right?
What tool do you use to generate your animations?
You guys are awesome!
Great video!
hi, can you enable captions/subtitle for this video? thank you!
join before where?? not always!
I still don't understand the difference between the first point noted at 3:19 and the second point noted at 3:23. Would you mind re-explaining it? Thank you!
Should building a CTE and then running a non-sargable query on it also be avoided?
Good explanation
Very simple and to the point, love the visualization too
thanks a lot for your content
Thank you so much!
Thanks for the video, but there is one mistake you've made: in the HAVING clause you can't use the alias 'total_spent', which was defined in the SELECT statement, because SELECT is executed after HAVING. THE CORRECT WAY IS: HAVING SUM(order_amount) >= 1000
So in the above example, which columns should we index?
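On the recurring HAVING question: in the logical order HAVING does come before SELECT, so the portable form repeats the aggregate. Some engines, including MySQL and SQLite, also resolve a SELECT alias inside HAVING as an extension, while PostgreSQL and SQL Server reject it. The sketch below runs both forms in SQLite on an assumed orders table to show that they return the same rows there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table and sample rows, invented for this example.
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 600.0), (1, 500.0), (2, 999.0), (3, 1000.0)],
)

# Portable form: repeat the aggregate in HAVING.
portable = conn.execute(
    """
    SELECT customer_id, SUM(order_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    HAVING SUM(order_amount) >= 1000
    ORDER BY total_spent DESC
    """
).fetchall()

# Alias form: accepted by SQLite (and MySQL), but not by every engine.
with_alias = conn.execute(
    """
    SELECT customer_id, SUM(order_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    HAVING total_spent >= 1000
    ORDER BY total_spent DESC
    """
).fetchall()

print(portable)
print(with_alias)
```

Using the alias in ORDER BY is fine in both forms, since ORDER BY is evaluated after SELECT.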
Why are there no subtitles? I need subtitles. Thank you very much!
Will it be even faster if we always filter with WHERE first and join after?
Won't it throw an error at HAVING total_spent? total_spent is an alias defined in the SELECT clause, and according to the order of execution HAVING is executed before SELECT, so total_spent won't be recognised.
Is that a typo in the first SELECT clause? Should "total spent" be total_spent?
Yes, I think so, and I have another question: HAVING uses total_spent from the SELECT, so how come HAVING is executed before the SELECT? Doesn't make sense...
Amazing!
Thanks!