I think the below one also works. with cte as (select *, row_number() over(partition by buckets) as rn from( select *, ntile(3) over(partition by distance) as buckets from src_dest_distance)) select source,destination,distance from cte where rn
@@kartikeytyagi9119 in the given query two state names are repeated vice versa and i need to fetch the row which was inserted first. if i create buckets, i can get three buckets for three pairs of states, each bucket will have two rows with vice versa state names and if i apply row number for each bucket, i can be able to get first inserted rows(by using rn
This might sound daft, but given the problem/solution set out at 2:20 why not just SELECT * FROM INPUT WHERE SOURCE IN ('Bangalore','Mumbai','Chennai');
I guess this can be also one more approach with greatest and least function: select distinct greatest(source, destination) as source, least(source, destination) as destination, distance from src_dest_distance;
Thank you for this alternative. I was thinking about something similar. This approach to me is more clever and avoids using the row number, which I feel is a bit hacky (even if it works)
It's great to see tutorials like this posted. However, it should be noted that the solution presented only works for a very specific dataset. Two potential failure cases come to mind: 1) Adding the city pair Bangalore Chennai will fail. As others have noted, the fix for this is to make the where clause check both T1.Source = T2.Dest and T1.Dest = T2.Source 2) It assumes that every city pair shows in both orders. If a city pair is only listed in one direction, it won't match on the self join. Others have mentioned using least() and greatest(). Not just an alternative, but using such functions solves both of the problems noted above: SELECT DISTINCT GREATEST(source, destination) AS source, LEAST(source, destination) AS destination, distance FROM cities; Note: tested with MariaDB, other engines may vary.
Problem with MS Sql is it wont let you create row_number() without putting an order by in the over() clause. I solved like this: with CTE as ( select t.*, row_number() over( partition by distance order by distance) as [id] from travel_routes t ) select source, destination, distance from CTE where id=1 May be the CTE is overkill, but for someone who might find this useful.
A self anti semi join sounds easier to me: select * from src_dest_distance a where not exists (select 1 from src_dest_distance a where a.source=b.destination and a.destination=b.source and b.source
I think we can simply write the query using least , Greatest funtions (in Oracle Database) SELECT DISTINCT least(SOURCE, DESTINATION) SOURCE, greatest(SOURCE, DESTINATION) DESTINATION,
DISTANCE FROM table1; It will give the desired results
Thanks for sharing these functions. I'm trying to think through the logic of the code to see what the output would be. I'm not sure but it looks like your code will perform 6 sets of comparisons returning 6 rows of data. How do you reduce it down to the 3 rows desired?
@@olu0mg Least function will always select the first word based on alphabetical order You may watch some videos on these functions, it is quite easy. And As we are using 'Distinct' in the query above, it will remove the duplicate records and return only 3 rows
Actually @techTFQ it would be better if you add t1.dest = t2.source bcz in the sample data we don't have repetition of source but that is mostly possible to uniquely identify we could add this condition
Hey My solution for the same is ;with cte as ( select *, case when source destination then source else destination end as destination1 from src_dest_distance) select source1,destination1,max(distance) from cte group by source1,destination1
@@jacksparrow3595 You can certainly compare strings in SQL. Protap’s suggestion is the best solution to the stated problem. It’s extremely simple, gives correct results, and even has the benefit that the selected rows will always have the two endpoints in alphabetical order, which, if the actual source were quite long, would make the output easier to use.
For this data set it's fine but it would be cool to incorporate the possibility of the same city showing up in a different pair. I guess you could just also add the condition that t2.source = t1.destination
with cte as ( select (case when sourcedestination then source else destination end) as destination,distance from src_dest_distance ) , cte2 as ( select source,destination,distance, row_number() over(partition by source order by distance) as rnk from cte )
select distinct s,d,distance from (select *, case when source >destination then source else destination end s, case when source < destination then source else destination end d from src_dest_distance)x
Again an awesome video with great explanation. Just one suggestion to the join condition. Should we include T1.source = T2.destination and T1.destination = T2.source and T.id < T2.id so that we match only the required rows. The condition T1.source = T2.destination and T.id < T2.id may join non required rows as well which is not present in your data example. Like Bangalore -> Hyderabad and Chennai -> Bangalore. Let me know your thoughts. Thanks
agree, the best answer sees beyond the data given to the broader nature of the data and request, and provides a solution that also is robust for handling future edge cases. This is "find all the unique trips" not just "return these 3 specific rows"... I'd add on that doing an inner join is less ideal for this same reason. It assumes that a "return trip" record will always be present to pair against. Instead, using a left join with the filter condition placed in the WHERE clause (WHERE T2.id is null) would better handle the potential situation of an unpaired entry down the road. Retain a record when no "return trip" match is found is more robust, assuming that "find all the unique trips" is the mission.
@@mudassirasaipillai6584 That does not avoid duplicates. That will gives the same duplicate values. For any city pair, you will have 2 rows, they will have ids of id1 and id2 which will be different, so one will be larger than the other; and they will have a different source. Then for the join in one case you will have T1.id < T2.id, and in the other case you have T1.id > T2.id. By having one of those as the condition you will only get one of the pair. But having t1.source t2.source, the condition will be true in both cases and both directions of the journey will be returned.
with cte as (select *, case when source < dest then concat(source,dest) end as source_dest from city where source_dest is not null) select source,dest,distance from cte;
Another Simpler way to solve this : with xyz as ( select * , lead(destination) over() as LD from src_dest_distance ) Select source , destination , distance from xyz where source = LD ;
I would prefer source < destination, that way the source comes first alphabetically. But neither will return the desired data, where it seems to want to use the first trip in the table.
Taufiq, what if the entry is like: bangalore hyderabad 400 delhi bangalore 1200 will this condition not treat these as duplicate records ? Can we use the condition: T1.source=T2.destination and T1.destination=T2.source ?
we can solve it through not exists operator also right. with cte as (select *, row_number() over as id from src_dest_distance) select * from cte AS t1 where not exists (select * from cte AS t2 where t1.source = t2.destination and t1.destination = t2.source and t1.id>t2.id);
This would do the job SELECT DISTINCT IIF(source > destination, source, destination) AS source, IIF(source < destination, source, destination) AS destination, distance FROM src_dest_distance
Why not filter the original input for SOURCE < DESTINATION? That would eliminate one of the 2 records, and would be much more efficient than a self join...
yep, although it wouldn't change the results, just make sure to use lower function on both sorce and destination to avoid headaches down the line when someone introduces "delhi" and someone's script breaks and nobody knows why
This is an easier solution @techTFQ with cte as (select *, lead(destination,1,destination) over() as sc1 from src_dest_distance) select source, destination, distance from cte where source = sc1;
Easier: select source as destination, destination as source, distance from input union select destination, source, distance from input Explanation: union will automatically remove duplicate rows ;)
with cte as ( select * , lead(destination) over(order by distance) as Lead_dest from src_dest_distance ) Select source , destination , distance from cte where source = Lead_dest ;
Hi, My approach to this Question With cte as (Select *,ROW_NUMBER() over (order by Distance) as RN from Distance) Select Source,Destination,Distance from cte where RN %2 =1 I got it output....but is the approach right??
select s.source,s.destination,s.distance from src_dest_distance as s join src_dest_distance as s1 on s.source=s1.source and ascii(s1.source)>ascii(s.destination)
Looking at the output, it is a alternative row of the input table. In this instance, I believe we can divide the row number by 2. Where ever the value is not equal to zero, the result is the output table. Select * From (Select *, row_number() over () as row from input_table) t1 where row % 2 0
Thank you for the Video. This SQL only works if there are always two sets of records (A->B, B->A). If there is only a single source - destination pair (S->D) it will not be part of the result.
Table name assumed - Location WITH cte as (SELECT * , LAG(Source,1,0) OVER() as comp FROM Location) SELECT Source,Destination,Distance FROM cte WHERE Source NOT IN (SELECT Source FROM Location WHERE Destination=comp)
This is what I came up with before I watched your solution: select src, dest, dist from cities c1 where ( select count(*) from cities c2 where c2.src = c1.dest and c2.src < c1.src) = 0;
select t1.source, t1.destination, t1.distance from src_dest_distance t1 left join src_dest_distance t2 on t1.destination = t2.source and t1.distance = t2.distance where t1.source
with cte as (select row_number()over (order by distance ) as rowid,SOURCE,DISTINATION,DISTANCE from TRAVEL) select SOURCE,DISTINATION,DISTANCE from cte where (rowid%2)0
select source,destination,distance From (select *, LEAD(Source,1,Source) OVER() as Source1, LEAD(Destination,1,Destination) OVER() as Destination1 FROM src_dest_distance) as a WHERE source=destination1 AND destination=source1; Is that correct please check?
hi everything is excellent, but you mentioned the concept used in the title, so anyone who saw this video didn't get the opportunity to think on their own coz as the concept used is already mentioned, but you are doing a great job, to data community
I don't understand why we cannot just do: SELECT * FROM table1 WHERE Source IN('Bangalore', 'Mumbai', 'Chenai') ??? Maybe the problem definition needs to be a bit more precise? Cool video either way :-)
Write a sql query on input table is like Item, no_items Apple, 8 Potato, 4 Banana,6 Tomato,2 And Output table should be like Item, sum_item Vegitables, 8 Fruits, 14
with cte as( select source,destination,distance,row_number() over(order by distance) as row from src_dest_distance) select source,destination,distance from cte where row in (1,3,5) This Query also gave the same result. Is that correct?
No, as in the example the distances are all the same, the order is more or less random. With real distances it would be still a problem, because it is possible that two different pairs of cities have the same distance.
What if you have two paths to the same destination from 1 city? Here is my alternate solution: use append to combine start and destination cities, then use append to combine destination and start cities. Now you have a pair of "unique" (for a given pair of cities) identifiers. Using the same self join and row number, check if start-destination in destination-start before current row.
Suppose there are another data also bangalore to mumbai and mumbai to bangalore. Then how join will be helofull because bangalore will be maped with two data.
MYSQL Solution for Freshers With CTE as (Select Source,Destination,Distances, Case When Source>Destination then concat(Source,Destination) else concat(Destination,Source) end as Batch from src_dest_distances) Select Source,Destination,Distances from (Select *,row_number() over (Partition by Batch) as Rn from CTE) N Where Rn=1;
with cte as( Select *,row_number() over(order by distance ) as rn from src_dest_distance), cte2 as (Select *,row_number() over(order by distance ) as rn from src_dest_distance) select c1.source,c1.destination,c1.distance,c1.rn from cte c1,cte2 c2 where c1.rn
Hi Sir , Please let me know me this will work or not with cte as (select *,least(source,destination) as lst,greatest(source,destination) gr from distance) select source,destination,distance from (select *,row_number() over(partition by lst,gr) rnk from cte) sal where rnk =1;
for this particular output my mind came up with this solution. 💀 is it valid for others test cases as well ? with cte as ( select * , row_number() over() as rn from citydistances ) select `source`,Destination,Distance from cte where rn %2!=0;
Again we insert two rows 7. Bangalore to Hyderabad 400 8.Hyderabad to Bangalore 400 We have to fetch again one row is Bangalore to Hyderabad 400 , in that situation which condition we are apply.
Assuming that your table1 always has both directions, then SELECT * FROM TABLE1 WHERE source < destination will suffice otherwise SELECT DISTINCT CASE WHEN source > destination THEN destination ELSE source END source, CASE WHEN source > destination THEN source ELSE destination END destination, distance FROM table1 would work but be slower but should be significantly faster than a self-join.
Hi Toufik, Can you make a video on "case-insensitive pattern matching in PostgreSQL". I recently faced this issue when using wildcard, unlike mySQL Postgre isn't case-insensitive. Thanks for the resources & study and practice materials you provide, they are very helpful.
I would not hire you if you gave that answer. It solves the problem one time only, based on assumption about the input data. What if the order of input rows was randomized?
Hi sir, I am trying to solve in sql server but its getting following error. Then which col should i consider for order by clause Msg 4112, Level 15, State 1, Line 147, The function 'row_number' must have an OVER clause with ORDER BY.
This solution is not working if we have more than 1 destination with same value Ex: Source Destination. Distance Id "Pune" "Delhi" 1000. 1 "Delhi" "Pune" 1000 2 "BOM" "Delhi" 2000 3 "Delhi" "BOM" 2000 4 Here we have Delhi 2 times in destination (record 1 and 3) Output as per suggested query: "Pune" "Delhi" 1000 "Delhi" "Pune" 1000 "BOM" "Delhi" 2000 We got 2nd record as well bcz id 2< id 3 and source(Delhi)=destination(Delhi)
I think the below one also works.
with cte as
(select *,
row_number() over(partition by buckets) as rn
from(
select *,
ntile(3) over(partition by distance) as buckets
from src_dest_distance))
select source,destination,distance
from cte
where rn
But why we use bucket in this?
@@kartikeytyagi9119 in the given query two state names are repeated vice versa and i need to fetch the row which was inserted first. if i create buckets, i can get three buckets for three pairs of states, each bucket will have two rows with vice versa state names and if i apply row number for each bucket, i can be able to get first inserted rows(by using rn
Since distance is same how can you make buckets by partitioning the distance?
I just learned JOIN today in SQL and this was a great lesson to add to my earlier lesson. I also liked the addition of the ID column. Thank you!
This might sound daft, but given the problem/solution set out at 2:20 why not just SELECT * FROM INPUT WHERE SOURCE IN ('Bangalore','Mumbai','Chennai');
I guess this can be also one more approach with greatest and least function:
select distinct greatest(source, destination) as source, least(source, destination) as destination, distance from src_dest_distance;
Appreciate it. Could you tell me, is it MySQL or SQL server or others ? Thank you
PostgreSQL
Thank you for this alternative. I was thinking about something similar. This approach to me is more clever and avoids using the row number, which I feel is a bit hacky (even if it works)
It's great to see tutorials like this posted. However, it should be noted that the solution presented only works for a very specific dataset. Two potential failure cases come to mind:
1) Adding the city pair Bangalore Chennai will fail. As others have noted, the fix for this is to make the where clause check both T1.Source = T2.Dest and T1.Dest = T2.Source
2) It assumes that every city pair shows in both orders. If a city pair is only listed in one direction, it won't match on the self join.
Others have mentioned using least() and greatest(). Not just an alternative, but using such functions solves both of the problems noted above:
SELECT DISTINCT GREATEST(source, destination) AS source, LEAST(source, destination) AS destination, distance FROM cities;
Note: tested with MariaDB, other engines may vary.
This also works well with SQL server and Databricks SQL
Second theamigo42. This query will not work in those 2 scenarios
Problem with MS Sql is it wont let you create row_number() without putting an order by in the over() clause. I solved like this:
with CTE as
(
select t.*, row_number() over( partition by distance order by distance) as [id]
from travel_routes t
)
select source, destination, distance from CTE where id=1
May be the CTE is overkill, but for someone who might find this useful.
if distance is same for all records then this will not work
A self anti semi join sounds easier to me:
select *
from src_dest_distance a
where not exists
(select 1 from src_dest_distance a where a.source=b.destination and a.destination=b.source and b.source
I think we can simply write the query using least , Greatest funtions (in Oracle Database)
SELECT DISTINCT
least(SOURCE, DESTINATION) SOURCE,
greatest(SOURCE, DESTINATION) DESTINATION,
DISTANCE
FROM table1;
It will give the desired results
Thanks for sharing these functions. I'm trying to think through the logic of the code to see what the output would be. I'm not sure but it looks like your code will perform 6 sets of comparisons returning 6 rows of data. How do you reduce it down to the 3 rows desired?
@@olu0mg
Least function will always select the first word based on alphabetical order
You may watch some videos on these functions, it is quite easy.
And As we are using 'Distinct' in the query above, it will remove the duplicate records and return only 3 rows
@@Anonymous-le2zr Thank you! Somehow missed DISTINCT when I first read your post.
desired output is hyd, delhi, pune in destination column
above query gives hyd, mumbai, Pune as output
That was also the first thing that I taught about
This has to be one of the best IT channels out there.
Actually @techTFQ it would be better if you add t1.dest = t2.source bcz in the sample data we don't have repetition of source but that is mostly possible to uniquely identify we could add this condition
Also creation of ID is not necessary as text can also be compared directly, here source or destination
@@yaminurrahmantopiwala6207 it is required, query isn’t giving required output without id column instead it’s giving whole data
@@aakriti_100 I still disagree. You share your sample data and I will give you perfect query without redundancies
Hey
My solution for the same is
;with cte as (
select *,
case when source destination then source else destination end as destination1
from src_dest_distance)
select source1,destination1,max(distance)
from cte
group by source1,destination1
Instead of the group, DISTINCT can be used too.
Great solution, if the source and destination are not strict, we can do simple trick-
select * from src_dest_distance where source
U can't use comparison operator in character column
@@madavsaravanan8844 why?
@@protapnandi9729 beacuse
How can you compare
Two names
@@jacksparrow3595 You can certainly compare strings in SQL. Protap’s suggestion is the best solution to the stated problem. It’s extremely simple, gives correct results, and even has the benefit that the selected rows will always have the two endpoints in alphabetical order, which, if the actual source were quite long, would make the output easier to use.
For this data set it's fine but it would be cool to incorporate the possibility of the same city showing up in a different pair.
I guess you could just also add the condition that t2.source = t1.destination
with cte as (
select (case when sourcedestination then source else destination end) as destination,distance
from src_dest_distance ) ,
cte2 as (
select source,destination,distance, row_number()
over(partition by source order by distance) as rnk from cte )
select * from cte2
where rnk = 1
I think my solution is much simpler and therefore more understandable:
select source c1, dest c2, dist from input where sourcedest
select distinct s,d,distance from
(select *, case when source >destination then source else destination end s,
case when source < destination then source else destination end d
from src_dest_distance)x
Again an awesome video with great explanation. Just one suggestion to the join condition. Should we include T1.source = T2.destination and T1.destination = T2.source and T.id < T2.id so that we match only the required rows. The condition T1.source = T2.destination and T.id < T2.id may join non required rows as well which is not present in your data example. Like Bangalore -> Hyderabad and Chennai -> Bangalore. Let me know your thoughts. Thanks
Yes you need that condition T1.destination=t2.source and also you can replace t1.id
@@mudassirasaipillai6584 it is not working
agree, the best answer sees beyond the data given to the broader nature of the data and request, and provides a solution that also is robust for handling future edge cases. This is "find all the unique trips" not just "return these 3 specific rows"...
I'd add on that doing an inner join is less ideal for this same reason. It assumes that a "return trip" record will always be present to pair against.
Instead, using a left join with the filter condition placed in the WHERE clause (WHERE T2.id is null) would better handle the potential situation of an unpaired entry down the road. Retain a record when no "return trip" match is found is more robust, assuming that "find all the unique trips" is the mission.
@@mudassirasaipillai6584 That does not avoid duplicates. That will gives the same duplicate values.
For any city pair, you will have 2 rows, they will have ids of id1 and id2 which will be different, so one will be larger than the other; and they will have a different source.
Then for the join in one case you will have T1.id < T2.id, and in the other case you have T1.id > T2.id.
By having one of those as the condition you will only get one of the pair.
But having t1.source t2.source, the condition will be true in both cases and both directions of the journey will be returned.
with cte as
(select *,
case
when source < dest then concat(source,dest)
end as source_dest
from city
where source_dest is not null)
select source,dest,distance from cte;
Can't we write source> destination instead of complex join,row_ num
you have one of the best channels for practicing SQL queries...great going
with cte as(
select *,
row_number() over(order by distance) as rn
from src_dest_distance)
select source,destination,distance from cte where (rn%2)0
approach for specific dataset
select * from src_dest_distance
where source
Another Simpler way to solve this :
with xyz as (
select * , lead(destination) over() as LD
from src_dest_distance )
Select source , destination , distance from xyz where source = LD ;
great explanation ..
We can also run a simple query
select * from src_dest_distance
where source > destination
I would prefer source < destination, that way the source comes first alphabetically.
But neither will return the desired data, where it seems to want to use the first trip in the table.
Very nicely explained
Taufiq, what if the entry is like:
bangalore hyderabad 400
delhi bangalore 1200
will this condition not treat these as duplicate records ?
Can we use the condition: T1.source=T2.destination and T1.destination=T2.source ?
Then this record delhi bangalore will be filtered out
we can solve it through not exists operator also right.
with cte as
(select *, row_number() over as id from src_dest_distance)
select * from cte AS t1 where not exists (select * from cte AS t2 where t1.source = t2.destination and t1.destination = t2.source and t1.id>t2.id);
Thank you TFQ....
This question is asked by me.
Thank you for your reply 🙏 it helps a lot to others as well.
This would do the job
SELECT DISTINCT IIF(source > destination, source, destination) AS source, IIF(source < destination, source, destination) AS destination, distance
FROM src_dest_distance
Explanation in Excel did the whole trick with focus on ID column, thanks a lot sir
You're assuming you're getting sorted pairs, can you do it with unsorted source and no ID?
Why not filter the original input for SOURCE < DESTINATION? That would eliminate one of the 2 records, and would be much more efficient than a self join...
yep, although it wouldn't change the results, just make sure to use lower function on both sorce and destination to avoid headaches down the line when someone introduces "delhi" and someone's script breaks and nobody knows why
This is an easier solution @techTFQ
with cte as (select *, lead(destination,1,destination) over() as sc1 from src_dest_distance)
select source, destination, distance from cte where source = sc1;
excellent solution
Amazing 👏 keep it up 😇😇
I wouldn't have done it like that but it is a quite neat solution. Thanks.
Thanks Taufiq, I think visualization of output is necessary in such scenarios
Select source , destination, distance from table
Union
Select destination as souce, source as destination, distance from table
? Is it possible ?
Easier:
select
source as destination,
destination as source,
distance
from input
union
select
destination,
source,
distance
from input
Explanation: union will automatically remove duplicate rows ;)
it is not working
with cte as (
select * , lead(destination) over(order by distance) as Lead_dest
from src_dest_distance )
Select source , destination , distance from cte where source = Lead_dest ;
Hi,
My approach to this Question
With cte as
(Select *,ROW_NUMBER() over (order by Distance) as RN from Distance)
Select Source,Destination,Distance from cte where RN %2 =1
I got it output....but is the approach right??
But it won't match if the input table is scrambled. Like delhi to mumbai is not just below mumbai to delhi
select s.source,s.destination,s.distance from src_dest_distance as s
join
src_dest_distance as s1
on s.source=s1.source and ascii(s1.source)>ascii(s.destination)
Looking at the output, it is a alternative row of the input table. In this instance, I believe we can divide the row number by 2. Where ever the value is not equal to zero, the result is the output table.
Select *
From (Select *, row_number() over () as row
from input_table) t1
where row % 2 0
Select hash(destination) + hash(source) unique, first(destination), first(source), max(distance)
From table
Group by hash(destination) + hash(source);
We can concatenate both the columns and on that column we can get the duplicates rt?
Hi I think this can be done by using row number function and selecting the odd rows will give the desired output?
I think this will only solve this particular data.
well laid out..thanks
Thank you for the Video. This SQL only works if there are always two sets of records (A->B, B->A). If there is only a single source - destination pair (S->D) it will not be part of the result.
select distinct greatest(source,destination),least(source,destination),distance from src_dest_distance
Table name assumed - Location
WITH cte as
(SELECT * , LAG(Source,1,0) OVER() as comp
FROM Location)
SELECT Source,Destination,Distance
FROM cte
WHERE Source NOT IN (SELECT Source
FROM Location
WHERE Destination=comp)
So clear ! Thank you Thoufiq. Keep it coming man. More to watch and learn.
thanks for your efforts Toufiq! Jazakallohu hairon.
I really enjoyed how you explain the win func and make easy for us to understand
Great work and delivery 🎉❤
I think wasy way to remove the duplicate records in the table is to use the union set operator.
Sir i hv one question. Syntax kaise likhe ..i mean basics may kuch sikhate hay or yaha pe or kuch likhte hay
Very clean and crisp explanation..Thanks Taufiq
Fabulous
Nice explanation Taufiq 👌 👍 👏
select * from src_dest_distance where source < destination;
We can also use lead function in Table and then comparing in with the 1st Column. That will result the Unique output.
Very Nice explanation 👌 thank you 😊
This is what I came up with before I watched your solution:
select src, dest, dist
from cities c1
where (
select count(*)
from cities c2
where c2.src = c1.dest
and c2.src < c1.src) = 0;
Helpful.
select
t1.source, t1.destination, t1.distance
from src_dest_distance t1 left join src_dest_distance t2 on t1.destination = t2.source and t1.distance = t2.distance
where t1.source
with cte as
(select row_number()over (order by distance ) as rowid,SOURCE,DISTINATION,DISTANCE from TRAVEL)
select SOURCE,DISTINATION,DISTANCE from cte where (rowid%2)0
if here is one another column mumbai to bangalore 400 km after first row then how we can tackle it
select source,destination,distance From (select *,
LEAD(Source,1,Source) OVER() as Source1,
LEAD(Destination,1,Destination) OVER() as Destination1
FROM src_dest_distance) as a
WHERE source=destination1 AND destination=source1;
Is that correct please check?
hi everything is excellent, but you mentioned the concept used in the title, so anyone who saw this video didn't get the opportunity to think on their own coz as the concept used is already mentioned, but you are doing a great job, to data community
very valid point. Hence I have just renamed the title to remove it so it help the future viewers.
IT IS NOT WORKING IN ORACLE WITH OUT OVER() CLAUSE
is below query work ?
select * from src_dest_distance
where source > destination
I don't understand why we cannot just do:
SELECT * FROM table1 WHERE Source IN('Bangalore', 'Mumbai', 'Chenai')
??? Maybe the problem definition needs to be a bit more precise? Cool video either way :-)
The solution has to be dynamic in case there are other cities added to the table. It shouldn't be hardcoded.
Write a sql query on input table is like Item, no_items Apple, 8 Potato, 4 Banana,6 Tomato,2 And Output table should be like Item, sum_item Vegitables, 8 Fruits, 14
with cte as(
select source,destination,distance,row_number() over(order by distance) as row from src_dest_distance)
select source,destination,distance from cte where row in (1,3,5)
This Query also gave the same result. Is that correct?
No, as in the example the distances are all the same, the order is more or less random. With real distances it would be still a problem, because it is possible that two different pairs of cities have the same distance.
Could you not in this case also do a bitwise OR of the two fields and select distinct?
My senior @jinal used to explain the data in the same way how you do here...loved your presentation.
I think we simply do like below.
Select * from tablename
Where source > destination
yes..much easier and better.
What if you have two paths to the same destination from 1 city?
Here is my alternate solution: use append to combine start and destination cities, then use append to combine destination and start cities. Now you have a pair of "unique" (for a given pair of cities) identifiers. Using the same self join and row number, check if start-destination in destination-start before current row.
Best of the Best Techtfq, Awesome delivery consistency. great job ciao
Suppose there are another data also bangalore to mumbai and mumbai to bangalore. Then how join will be helofull because bangalore will be maped with two data.
MYSQL Solution for Freshers
With CTE as
(Select Source,Destination,Distances,
Case When Source>Destination then concat(Source,Destination) else concat(Destination,Source) end as Batch
from src_dest_distances)
Select Source,Destination,Distances from
(Select *,row_number() over (Partition by Batch) as Rn from CTE) N
Where Rn=1;
thank you bro.........can you do more videos on interview point of view
with cte as(
select src_dest_distance.*,ROW_NUMBER()OVER() as "x" FROM src_dest_distance
)
select source,destination,distance from cte where x%2!=0;
with cte as(
Select *,row_number() over(order by distance ) as rn
from src_dest_distance),
cte2 as (Select *,row_number() over(order by distance ) as rn
from src_dest_distance)
select c1.source,c1.destination,c1.distance,c1.rn
from cte c1,cte2 c2
where c1.rn
Wha if there is another row which is chennai, bangalore, 400. Will it works?
Hi Sir ,
Please let me know me this will work or not
with cte as (select *,least(source,destination) as lst,greatest(source,destination) gr from distance)
select source,destination,distance from (select *,row_number() over(partition by lst,gr) rnk from cte) sal where rnk =1;
please share another way of doing that?
Is there a reason for using a question like this on an interview? And if you don't get it do you still get the job?
Nice Presentation.. Keep it up..
You are a true gem for data community ❤️
Hi sir,
Can you please make a video on how to update data of one table from another table using the merge concept.
It will be very helpful for me.
Can we represent or operator with || in SQL
for this particular output my mind came up with this solution. 💀
is it valid for others test cases as well ?
with cte as (
select * ,
row_number() over() as rn
from citydistances
)
select `source`,Destination,Distance from cte
where rn %2!=0;
Take a shot every time he says OK, challenge.
Again we insert two rows
7. Bangalore to Hyderabad 400
8.Hyderabad to Bangalore 400
We have to fetch again one row is Bangalore to Hyderabad 400 , in that situation which condition we are apply.
In that case what we can do is first select distinct rows and take that as table so you won't find any issuse
Assuming that your table1 always has both directions, then SELECT * FROM TABLE1 WHERE source < destination will suffice otherwise SELECT DISTINCT CASE WHEN source > destination THEN destination ELSE source END source, CASE WHEN source > destination THEN source ELSE destination END destination, distance FROM table1 would work but be slower but should be significantly faster than a self-join.
We can solve it by where function too and much more easier and faster
Hi Toufik,
Can you make a video on "case-insensitive pattern matching in PostgreSQL". I recently faced this issue when using wildcard, unlike mySQL Postgre isn't case-insensitive.
Thanks for the resources & study and practice materials you provide, they are very helpful.
in postgres, you can use 'ilike' for a case-insentive version of 'like'
What if we just take the odd rows.
Select * from (Select *,row_number() over() as id from table) where (id&1)=1;
I would not hire you if you gave that answer. It solves the problem one time only, based on assumption about the input data. What if the order of input rows was randomized?
@@PhilAndOr lol
Hi sir,
I am trying to solve in sql server but its getting following error. Then which col should i consider for order by clause
Msg 4112, Level 15, State 1, Line 147, The function 'row_number' must have an OVER clause with ORDER BY.
You can order by any column, the solution would still work
@@techTFQ Thankyou sir🙏
I also got the error even though after i gave order by clause i am still facing that error.
This solution is not working if we have more than 1 destination with same value
Ex:
Source Destination. Distance Id
"Pune" "Delhi" 1000. 1
"Delhi" "Pune" 1000 2
"BOM" "Delhi" 2000 3
"Delhi" "BOM" 2000 4
Here we have Delhi 2 times in destination (record 1 and 3)
Output as per suggested query:
"Pune" "Delhi" 1000
"Delhi" "Pune" 1000
"BOM" "Delhi" 2000
We got 2nd record as well bcz id 2< id 3 and source(Delhi)=destination(Delhi)
one doubt. what happens if data is not on this order