Watched all the ads without even skipping, that's how much I am grateful for your work and time you put into this.
Yes , it's true .. I'm grateful for this course , that too for free
will start doing this too
Ma'am, I'm from India. Did you get a job? I'm currently looking for one; if you got a job, please tell me how you got it, it means a lot to me...
@@chandrabose867 no job for you
Timestamps:
Removing Duplicates: 8:00
Standardizing Data: 17:32
Null/Blank Values: 33:30
Remove Unnecessary Columns/Rows: 46:12
Great video!
I converted the 'date' column from text to Date format after importing. And when I ran the duplicate_cte, I only got 5 rows in output.
Note: I used date instead of 'date' in the partition by section.
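Side note for anyone comparing results: in MySQL, backticks quote an identifier while single quotes make a string literal, so PARTITION BY 'date' would partition on a constant string rather than the column. A minimal sketch of the window function with the column quoted safely (column list assumed from the video's dataset):
ROW_NUMBER() OVER (
PARTITION BY company, location, industry, total_laid_off,
percentage_laid_off, `date`, stage, country, funds_raised_millions
) AS row_num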
Life saver I was just searching for this
nice video
@@obeliskphaeton Yeah, if you change it before importing, for some reason it does not work. Thanks!!
Thanks for the kind words! I made it to the end and learned a lot while working on the project simultaneously.
3:15 am South African time, I managed to finish the project. It was tough; I had to start over and missed some stuff, but I did it. I just completed my Meta Data Analyst Professional Course and decided to do this bootcamp. As someone coming from the beauty industry, I was so lost. Thank you @AlextheAnalyst for helping people like me who want to learn and excel but cannot afford tuition for a university or most institutions.
how did you download the datasets from his github??
Timestamp: 18:59: after unchecking the Safe updates mode , you don't have to restart MySQL; go to Query on the top left side of your window then select reconnect to server and then run your query.
Thank You!
just click the bar under Tools
What do I do if that didn’t work? 🙃
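If reconnecting doesn't do it either, a session-level alternative is to toggle safe updates with a query; this is just a sketch and only affects the current session:
SET SQL_SAFE_UPDATES = 0;
-- run your UPDATE/DELETE here, then optionally turn it back on:
SET SQL_SAFE_UPDATES = 1;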
Alex! This is why I subscribed, thank you so much for doing this in MySQL!!
Hi Alex, I'm from Argentina and I'm watching the whole Bootcamp with subtitles. You are a wonderful person when it comes to explaining things; I feel as if a friend is explaining the topics to me. Thank you very much for what you've done.
Hey Alex! I'm from India. I have been following you for months but really couldn't make any project. From my first encounter with your content, though, I knew I was going to walk in your footsteps. I had also been looking for data like this for the last 2 weeks; I tried most of the US and Indian data bureaus and whatnot. Yesterday I decided to take on this project, AND WOW, it was exactly the data I was looking for. Thank you so much. This is my second project ever in SQL, and I loved it from beginning to end. I had so much fun doing it, literally: I was laughing at the funny parts and even got overwhelmed at the industry-populating part near the end; I cheered, yayy. I made a mistake too: I forgot to run the query that deletes the duplicates, and I had to rerun from the start to find out where I went wrong. I love your energy and how you take things so calmly and carry the learner with you the whole way. I have probably written too much now. I am so excited to follow this project through to the visualization part. And here's one more thing I tried that I want to show you: in the populating part, we can do it without first setting the blanks to NULL. This is the query I tried:
UPDATE layoffs_staging2 t1
JOIN layoffs_staging2 t2
ON (t1.company = t2.company AND t1.location = t2.location)
SET t1.industry = t2.industry
WHERE (t1.industry IS NULL OR t1.industry = '') AND (t2.industry IS NOT NULL AND t2.industry != '');
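A quick sanity check after an update like the one above (assuming the same staging table from the video) is to confirm no blank or NULL industries remain:
SELECT *
FROM layoffs_staging2
WHERE industry IS NULL OR industry = '';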
Hey, I'm working on the latest layoffs dataset. I matched some unknown values in the stage column by doing a self join on the company column; should I update those unknown values?
I'm so proud to have made it this far!
All made very possible by Alex's natural teaching skills.
Thanks Alex! This is absolutely going on my portfolio! 👍
I love the way you teach. It is so important to see the errors when learning something new.
For anyone who struggled with the date error code: I spent countless hours troubleshooting and found out what was causing the issue.
In my case the missing dates were not showing as NULL but as 'None' or as a blank string ''.
To fix it, update them to NULL, matching whichever value your data is showing:
UPDATE layoffs_staging2
SET `date` = NULL
WHERE `date` = 'None' OR `date` = '';  -- use 'None' or '' depending on what yours shows
P.S.
Thank you Alex for this project.
It was really helpful!
I was having issues too. For some reason the dates were not matching up. I had to run this code to get it to work:
UPDATE layoffs_staging2
SET `date` = CASE
WHEN `date` LIKE '%/%/%' THEN STR_TO_DATE(`date`, '%m/%d/%Y')
ELSE `date`
END;
Finally completed this after a second trial and now I have my first project on MySQL
I have started my journey of learning data analytics, and with the help of your content I understand so many things quickly and easily. Hopefully I will get a job at the end of this course. Thanks a lot, Alex. From Pakistan.
Hey, me too. Btw, would you like to share the name of the city you are from?
Patiently waiting for the exploratory aspect of the clean data.
Thanks very much
Thanks Alex. Great video. The best tip so far was from the data cleaning vid. I didn’t realize that I could check the results before executing changes on the database. Like start_time, length(start_time), SUBSTRING(start_time,1,19) to check truncating that string prior to the real deal.
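For instance, a preview along those lines might look like this (my_table and start_time are placeholder names, following the pattern in the comment above):
SELECT start_time,
LENGTH(start_time),            -- current length of the raw string
SUBSTRING(start_time, 1, 19)   -- what the UPDATE would keep
FROM my_table;
-- and only once the preview looks right:
UPDATE my_table
SET start_time = SUBSTRING(start_time, 1, 19);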
I don't know why, but I'm so thankful for you. I really hope to see people as great as you in my daily life. Thank you, Alex; I hope plenty of positive things come your way, given the number of people who have benefited from your videos.
I must commend you, Alex. I had a little bit of trouble populating the null values following the instructions in this video and had to go check the code on your GitHub, which was super helpful and straightforward. Thanks for your lecture; I am surely going to complete this bootcamp playlist❤
Man, the course you have developed is just wooow, amazing. Thank you so, so much.
Thank you, Alex. I can't believe I did my first Data Cleaning project. I have been following the playlist. I did some cleaning by myself. My dataset is more refined though. Lots of love and respect!
watch till the end it's awesome Alex ! thanks so much
Alex, You're so natural. The Best yet!
absolutely enjoyed the tutorial and learnt loads of stuff!! When you said at the end that not everyone is able to make till the end, I felt really good that I could follow along and made it till the end. :)
thank you very much Alex for putting out this tutorial for all of us newbies. Really looking forward to begin Exploratory data analysis tutorial!! Cheers..!
Thanks for this video, Alex! Really need it
Alex's videos are always so real, authentic, and relevant!
Thank you Alex for the kind words at the end of this video. It made me feel good 😊
In a previous video I remember you saying that sometimes it is just easier to import data into Excel to clean it up before running the analysis in SQL. After watching your Excel series I COMPLETELY SEE WHY! Excel is definitely more point-and-click, but I can't imagine the lag of trying to work with a million rows in Excel.
Thanks Alex, Can't wait to Start this project
Dear Alex,
I wanted to take a moment to express my heartfelt thanks for the incredibly useful lesson you provided on data cleaning. Your generosity in sharing your knowledge for free has been invaluable, and the techniques you've taught me will undoubtedly help me handle data more effectively in my work. I truly appreciate your time, effort, and willingness to help others grow in this essential skill. Thank you once again for your guidance and support!
Warm regards,
Saheel Mowlana
...and another lesson taken... 47:35 "I can't trust that data, I really can't!" We should get to this level before confidently deleting 'useless' rows! As always, Alex you're the best! Thank you very much for all your contribution!
Has anybody told you that you are not just good but you are AWESOME !👑
Alex congratulating me for making it to the end like I have a choice; I have made it to the end like 3 times now 😂. Great work, man. I bless the day I discovered your channel.
Alex The Analyst, You are a Blessing to this Generation...
Thanks for doing a project in SQL. Waited for long.
Great work, Alex. we love you
Awesome! I can't believe I followed through to the end. You're an incredible tutor.
Made it to the end!! Thank you Alex. Much love from Tanzania:)
twinsies
16:49 If anyone is having trouble here: I had error code 1175. You just need to go to Edit -> Preferences -> SQL Editor and deselect the "Safe Updates" option. You then have to close and reopen the connection (save before you exit) and run the query again.
Edit: Alex explains this at 18:52
"Deleting data is a very interesting thing to do, you have to be confident. Am i 100 confident? not really", LOL 😂
exactly what i was looking for! thanks a lot 🙌🏻
Amazing Alex, this is exactly what I'm looking for my project too. Thank you so much
Thanks for sharing this helpful content, Alex! We need more of this.
The only thing I am wondering is how you can make MySQL this interesting; these videos are so great that I am watching them all the way through from the very beginning. Thanks Alex 👍
Timestamp 24:27 This is how I resolved all of those, just the way you taught us. It is usually caused by differences in character encoding.
SELECT location
FROM layoffs_staging2;
SELECT DISTINCT location, country
FROM layoffs_staging2;
SELECT *
FROM layoffs_staging2
WHERE location LIKE 'Mal%';
UPDATE layoffs_staging2
SET location = 'Malmo'
WHERE location LIKE 'Mal%';
SELECT *
FROM layoffs_staging2
WHERE location LIKE 'Flor%';
UPDATE layoffs_staging2
SET location = 'Florianopolis'
WHERE location LIKE 'Flor%';
SELECT *
FROM layoffs_staging2
WHERE location LIKE '%sseldorf';
UPDATE layoffs_staging2
SET location = 'Dusseldorf'
WHERE location LIKE '%sseldorf';
Hope this is okay?
I did the same; it's easier, but I think he tries to teach us every way.
Hey! You can also use a CASE statement to map the accented names to their plain-ASCII versions in one pass. I did it like this:
UPDATE layoffs_staging2
SET location = CASE
WHEN location = 'Düsseldorf' THEN 'Dusseldorf'
WHEN location = 'Florianópolis' THEN 'Florianopolis'
WHEN location = 'Malmö' THEN 'Malmo'
ELSE location
END
WHERE location IN ('Düsseldorf', 'Florianópolis', 'Malmö');
Thank you Alex! I have learnt so much from this incredible Data cleaning project.
✅thank you! I finally made it to the end
To be honest, I found it very challenging, and some of my results are different. For example, when changing things, Alex had 2 rows returned and I had around 4. I'm not sure where I went wrong, but I did follow all the steps. I might have to revisit this again; for now I'm just happy I finally completed it!
Excellent Alex!!! You read my mind, man! This is just what I needed to put in my portfolio. THANK YOU
A great tutorial, Thanks Alex the Analyst. It also helps me connect with SQL coding which I have learned in the tutorials. 😃😃
I shouldn't be having this much fun doing homework. Thank you Alex!
Just finished doing this one, really fun and practical. Now heading to the EDA part.
A question, I am not sure how to post this projects in a portfolio. I normally publish projects in my github page when it's about web or app development but I've never done SQL projects before, how is it supposed to be published to make it properly visible for recruiters for example.
Thank Alex for all the value you share
if you got any answers regarding this, please share that with us too.
Hi,
Your comments are appreciated on the following SQL statement, which removes duplicates in a very easy way (substitute your own table name for "test"):
CREATE TABLE test
SELECT DISTINCT * FROM layoffs;
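One caveat on the suggestion above: SELECT DISTINCT only drops rows that are identical in every column, so it cannot keep "the first of each group" the way the ROW_NUMBER() approach in the video can. If the distinct copy does look right, a possible follow-up (table names assumed from the comment) is to swap it in:
RENAME TABLE layoffs TO layoffs_old, test TO layoffs;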
39:40 I'm learning a lot from this tutorial, so thank you!
I'm looking at the SQL command and I can't help thinking it would look less confusing to a co-worker if it were a FOR LOOP to get a similar result.
Thanks a lot for simplifying MYSQL Alex!
awesome work Alex! and thanks for providing this kind of contents
Absolutely phenomenal work. Thank you very very much for this. Cleared some of the concepts and at the same time created a project. Absolutely love it.
Took me all day, but yayyy, I'm done with my first SQL project!!
you are amazing person!! thanks for this course 🌟
If you get a "safe update" error while deleting or updating, you can either run "SET SQL_SAFE_UPDATES = 0;" or watch 19:07 first.
thank you alex! this work is very much appreciated! more power!
Thank you ,Alex. Now that I can see how sql works in a real project. Although I've finished all of your sql videos I have no idea how to utilize it in a real project. But thanks to this video I feel like I can move forward. Again , thank you. You are the best teacher for a self learner like me.
I appreciate your work, Alex. Well done.
Learned a lot following along through this, excited to follow the EDA next!
Thank you so much for sharing your expertise. I learned and I laughed (your comments 😄) throughout the tutorial. You're awesome!
Haha glad to hear it! Learning should be fun :D
@Alex The Analyst,
Thank you so much for sharing genuine content.
I have learned a lot from your SQL tutorials so far (there is one issue I'm stuck on: converting the Text type to Date). I hope when you find this message you can help me out.
Hey Alex, thanks for the video.
Please cover data cleaning in stata or R as well.
just on time 🤩
learned a lot from this project. Special Thanks to Alex
Great, Alex. Thank you so much. I have learned a lot, from the basics through cleaning data.
Thank you once again.
This was very insightful, thank you so much for this Alex.
Alex u r the best! Thank you so very much... Plzzz do more videos on data cleaning.
Thanks @alex, I converted the CSV to JSON and it worked.
Amazing video, thank you !
Incredible tutorial Alex. Thank you!
thanks alex for this data cleaning practice
Thanks a lot . First time watching your video and u were great 👍
That was very informative and helpful; it's real-world data. I guess there are a lot of datasets out there that are messier and would need a different approach.
This was an amazing tutorial!! Thank you
About the duplicates: I used a subquery to reference the table in the 'WITH' statement, that way I get to delete duplicates without creating another table. Just wanna know if it's a valid procedure ☺. Love this course, learning a lot.❤
can u help me with the query
Excellent my friend, I will come back soon and support the channel.
Hey Alex, thank you for this awesome bootcamp, can you make a video about how to use this project on our portofolio?
Thank you Alex
Dear Alex!
Firstly, I wanted to thank you for this series; you're truly amazing, and it's been so much fun following along with your videos.
I have faced a small problem in #4 when removing duplicates in the companies. I have followed every step to a T, but instead of having 2 rows returned, I have 6!
I retraced my steps, but it gives the same output again, and I'm quite frustrated. Is there any reason for this?
I also welcome any comments from the community if you can help out!
Thank you so much in advance :)
Great work! Love how it was all broken down
I understand the ease with which the upload tool lets you create a table and import the data all at once, but I find it cumbersome in this case since there is no unique column. Is there a reason you wouldn't create the schema for the table yourself and add an auto-incrementing id column as the primary key, assigning a unique id to every row, and then use the ROW_NUMBER function to search for duplicate rows across all the columns except the id column? This would save you from having to create a duplicate table to store row_num, since you could just use the id column to delete the duplicate records. It also seems like it would make your database easier to deal with, since it would have a proper primary key. Sure, it is a meaningless primary key, but it would certainly make updating the data easier and faster in many cases.
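A rough sketch of that idea, assuming the staging table and column list from the video (the extra derived table is needed because MySQL will not let a DELETE read from its own target table directly):
ALTER TABLE layoffs_staging
ADD COLUMN id INT AUTO_INCREMENT PRIMARY KEY FIRST;
DELETE FROM layoffs_staging
WHERE id IN (
    SELECT id FROM (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY company, location, industry, total_laid_off,
                                percentage_laid_off, `date`, stage, country,
                                funds_raised_millions
                   ORDER BY id) AS row_num
        FROM layoffs_staging
    ) ranked
    WHERE row_num > 1   -- keeps the lowest id in each duplicate group
);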
While you were updating the date format there were null values, and you could still go ahead and run the update query. But when I tried to do the same it would not let me and threw an error: `Error Code: 1411. Incorrect datetime value: 'NULL' for function str_to_date`.
hello, were you able to fix this? im facing the same error..please help
@@arnavchopra7708 Hello, I am facing the same problem. How did you fix this issue?
@@okodedefaith6582 confirm if you are putting % before the Y
same bruh
@@arnavchopra7708 hey it's actually changed check the correct table
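For anyone still stuck in this thread: error 1411 quoting the value 'NULL' usually means the column holds the literal string 'NULL' rather than a real NULL. A hedged guess at a fix (assuming the staging table from the video) is to clear those strings first so STR_TO_DATE never sees them:
UPDATE layoffs_staging2
SET `date` = NULL
WHERE `date` = 'NULL';   -- turn the literal string into a real NULL
UPDATE layoffs_staging2
SET `date` = STR_TO_DATE(`date`, '%m/%d/%Y')
WHERE `date` IS NOT NULL;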
I have a question. I did the exact same thing, but nothing shows up when I do the 'removing duplicates' part. I went through all the processes again and just realized that mine has only 564 records. I don't know why. Can you explain how to fix it?
I'm having this issue, did you figure it out?
@@ericawelch4218 unfortunately, I have not found the solution. but just keep doing it and ignore those 'differences'.
@@ericawelch4218 have you found the solution?
Reload the file and redo the query.
@@Chinmayimanvi hi can you help me with that please?
Thanks for this Alex! I learned a lot from all your videos. Watching this on 2025.
Good evening. I have a problem with loading data into MySQL. The file layoffs.csv in the description has 2,361 rows, and so does the copy downloaded to my PC. But after importing the data into MySQL using the method in this episode (the Table Data Import Wizard), only the first 564 rows end up in my table. Has anyone encountered this problem before? I'll also take this opportunity to thank Alex for the work and time spent.
copied this from @nandikran
"If you are facing only 500+ rows of data getting imported where as there are 2000+ rows of data, here is how to fix it:
Step 1: Convert your csv file to JSON using any online CSV to JSON converter.
Step 2: You will have 'None' instead of NULL in your data now. Run this query
UPDATE layoffs
SET
company = CASE company WHEN 'None' THEN NULL ELSE company END,
location = CASE location WHEN 'None' THEN NULL ELSE location END,
industry = CASE industry WHEN 'None' THEN NULL ELSE industry END,
total_laid_off = CASE total_laid_off WHEN 'None' THEN NULL ELSE total_laid_off END,
percentage_laid_off = CASE percentage_laid_off WHEN 'None' THEN NULL ELSE percentage_laid_off END,
`date` = CASE `date` WHEN 'None' THEN NULL ELSE `date` END,
country = CASE country WHEN 'None' THEN NULL ELSE country END,
funds_raised_millions = CASE funds_raised_millions WHEN 'None' THEN NULL ELSE funds_raised_millions END;
Step 3: You will have all fields as Text, to convert the wrong ones into Int, run this query
ALTER TABLE world_layoffs.layoffs
CHANGE COLUMN total_laid_off total_laid_off INT NULL DEFAULT NULL ,
CHANGE COLUMN funds_raised_millions funds_raised_millions INT NULL DEFAULT NULL ;
Now your data is identical to Alex's"
Hey did you resolve this?? I am facing the same problem. It just imports 7 records from the Excel file.
saw a comment below and I got it. Converted the CSV to a JSON file
You may try to import it again: right-click on the layoffs table that was created (or imported) before, choose Table Data Import Wizard, and repeat the same process as before; the only difference is to use the existing table instead of creating a new one. This fixed it for me without converting the raw data from CSV to JSON.
@@jessrodrigues6252 Thank you. Very helpful.
Truly a blessing for us.
Also, I wanted to add something about the null values in total_laid_off and percentage_laid_off. For companies that had multiple rows, where some rows had nulls but at least one row had both values, we could have derived the missing numbers. E.g., if total_laid_off = 20 and percentage_laid_off = 0.5, we can assume that company had a base of 40 employees, and we could use that base to populate the company's other rows whenever they have one of the two values. That only works if no hiring happened in between, so that a row with a different date could still share the same base population as the one we found.
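A hedged sketch of that imputation (hypothetical, and it leans on the comment's own assumption that a company's base headcount stays constant across its rows):
UPDATE layoffs_staging2 t1
JOIN layoffs_staging2 t2
  ON t1.company = t2.company
SET t1.total_laid_off = ROUND(t1.percentage_laid_off
    * (t2.total_laid_off / t2.percentage_laid_off))  -- base = total / percentage
WHERE t1.total_laid_off IS NULL
  AND t1.percentage_laid_off IS NOT NULL
  AND t2.total_laid_off IS NOT NULL
  AND t2.percentage_laid_off IS NOT NULL;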
Hi Alex, I can't thank you enough; it's my 19th video, and I can feel I'm getting to understand the idea behind the language. I want to ask: concerning values like percentage laid off, is that where measures of central tendency come into play to fill in the nulls? Just thinking out loud. Thank you once again, Alex.
Thanks Alex for this amazing video
Looking forward to the next project
I'm having a problem importing the data: the original data has 2000+ rows, but when I import it, it only has 564. Does anyone know how to fix this issue?
Same, if anyone knows a video I can watch to fix it just let me know. 👍🏽
@@Pato-rt1vh convert that .csv into a json!
I am facing the same issue. I just came to the comment section to see if anyone can bring some light to it. I really want to practice and do the project, it's frustrating to be stuck right before starting.
I’m having the same issue
I faced the same issue. I don't have a solution but a suggestion. Just follow through the video with whatever data u could import. Won't be perfect, but try to get what we're trying to do here. Then just try to practice data cleaning on some raw data which you can find on kaggle.
Thank you, Data guy, brilliant work)
Thank you for your help. I appreciate your videos a lot
Thank you very much Alex!!!
Great tutorial, thanks. One thing though: you started by deleting duplicates and only later standardized the data, but standardizing can create new duplicates, so I suggest swapping those two steps.
Hi Alex, this is indeed a great project to add to my portfolio. Thanks for this project and for taking the time to go through the tasks step by step. I wanted to find out whether the order of executing the tasks matters. For instance, I would have handled the project in the order: standardization, handling missing and null values, removing duplicates, etc. What is your take on this?
Dear Alex,
First and foremost, thank you very much for this bootcamp, which is just mind-blowing (and also for making it free). Secondly, I just downloaded the GitHub file, but the moment I import the CSV file it shows "564 records imported" instead of more than 2000. How come? What am I missing? Thank you in advance, boss!
Hey @blankninja5158, I had this exact same problem! It took me ages to figure out lol, but basically I think MySQL has a problem reading non-ASCII characters, which are the weird-looking characters. What I did was open the CSV file in an Excel sheet and edit the data file there. In the company column there was a company called Ualá, which I changed to Uala, and in the location column there were Düsseldorf, Florianópolis and Malmö, which I changed to Dusseldorf, Florianopolis and Malmo (in the real names of those places the vowels have accents, which I think caused the problem in the first place). I then saved the Excel file as a CSV file again, and it imported all the data no problem. Hope this helps!
I have no idea why Alex didn't have this problem in the video, though; I'm using a Mac, so maybe it had something to do with that?
@@AviShandilya Mac user as well.. I don't know but no one is replying and I am literally going mad. Found no solution and cannot continue or end the course.. so annoying
Hello, great video. However, I'm having trouble transferring all the data from the file to MySQL. Do you have any suggestions?
Bro, same here. I only get around 500 imported.
@@nicolekanigina Did u end up finding a solution?
also having this problem