Netflix Data Cleaning and Analysis Project | End to End Data Engineering Project (SQL + Python)
- Published: 26 Jul 2024
- In this video we will implement an end-to-end ELT project. ELT stands for Extract, Load and Transform. We will use the Netflix dataset to clean and analyze the data using SQL and Python.
LinkedIn: / ankitbansal6
High quality Data Analytics affordable courses: www.namastesql.com/
End to End ETL project : • End to End Data Analyt...
Netflix dataset: www.kaggle.com/datasets/shiva...
GitHub Project Link: github.com/ankitbansal6/netfl...
Zero to Hero (Advanced) SQL Aggregation:
• All About SQL Aggregat...
Most Asked Join Based Interview Question:
• Most Asked SQL JOIN ba...
Solving 4 Tricky SQL Problems:
• Solving 4 Tricky SQL P...
Data Analyst Spotify Case Study:
• Data Analyst Spotify C...
Top 10 SQL interview Questions:
• Top 10 SQL interview Q...
Interview Question based on FULL OUTER JOIN:
• SQL Interview Question...
Playlist to master SQL :
• Complex SQL Questions ...
Rank, Dense_Rank and Row_Number:
• RANK, DENSE_RANK, ROW_...
#sql #dataengineering #projects
Please like the video as it takes a lot of effort to record a video of more than 1 hour. It will motivate me to create more long form videos.
GitHub and all related links are in the description box. Thanks for watching !!!
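For readers skimming the description: the Extract and Load steps of the project boil down to reading the Kaggle CSV with pandas and pushing it into a database with SQLAlchemy, with the Transform done afterwards in SQL. A minimal sketch of that shape (an in-memory SQLite engine stands in for SQL Server here so the snippet runs anywhere; the inline frame and table name are illustrative):

```python
import pandas as pd
from sqlalchemy import create_engine

# Extract: in the project this is pd.read_csv("netflix_titles.csv");
# a tiny inline frame keeps the sketch self-contained.
df = pd.DataFrame({
    "show_id": ["s1", "s2"],
    "title": ["Movie A", "Movie B"],
})

# Load: write the raw data into the database as-is.
# Against SQL Server the URL would be an "mssql+pyodbc://..." string
# instead of the SQLite URL below.
engine = create_engine("sqlite://")
df.to_sql("netflix_raw", con=engine, index=False, if_exists="replace")

# Transform then happens in SQL, querying netflix_raw.
loaded = pd.read_sql("SELECT * FROM netflix_raw", con=engine)
print(len(loaded))  # 2
```

The same `read_csv` / `to_sql` pair is what the video's own loading script uses; only the engine URL changes per database.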
Thank you Ankit, this is really amazing .. started and finished it in one go...
10:22 Tried many times but could not get the Korean names into the SQL database.
I created a table and did the insert too, but it only shows ????
now I surrender 😒
Use nvarchar
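To expand on the nvarchar suggestion when loading from pandas: you can force Unicode column types by passing a `dtype` mapping to `to_sql`. A sketch (SQLite is used here so the snippet runs anywhere; against SQL Server, the `NVARCHAR` type is what prevents the `????` output):

```python
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import NVARCHAR

# A Korean title, as in the Netflix dataset.
df = pd.DataFrame({"show_id": ["s1"], "title": ["기생충"]})

engine = create_engine("sqlite://")
# dtype tells SQLAlchemy to create the column as NVARCHAR; on SQL
# Server that stores Unicode instead of mangling it to question marks.
df.to_sql("netflix_raw", con=engine, index=False,
          if_exists="replace",
          dtype={"title": NVARCHAR(length=200)})

back = pd.read_sql("SELECT title FROM netflix_raw", con=engine)
print(back["title"][0])
```

If the table is created by hand instead, the same idea applies: declare the text columns as `nvarchar` in the `CREATE TABLE`.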
love it Ankit... what an effort
really good work, easy to understand.
Great efforts sir
Great effort in putting the whole project together 🤟🏻
Thanks a ton!
well-done.
Thank you so much Sir 🙏 . Thank you for giving hope for people like me . Keep inspiring ✨
It's my pleasure
It was a really nice project. Had a good hands-on with SQL.
Great 😊
Thanks for this project
My pleasure
Thanks so much @Ankit, this is a valuable video for me. I have an interview with Red Hat in the upcoming 3 days as an associate data analyst. I learned a lot from your videos. You are literally the SQL king because you write in a very simple manner so that everyone can understand. You are my mentor; with your videos I am able to solve questions like you. Salute you @Ankit 😎😎😎
@rahulrachhoya2716 I have seen the career portal, there is no such DA role. Can you help me with the same, as we are in the same boat? Thanks in advance.
Thank You Very Much Sir
Most welcome
thanks Ankit, the effort you put in your lectures is admirable, learned a lot of new things today from this video 💌
My pleasure 😊
Thanks a lot sir for this valuable project. Can you please make a video on CROSS APPLY? I watched your SQL course but didn't find it.
Awesome, can we have a series on Python from basics that would be useful for analysis?
Hi Ankit
there is a column duration in the netflix_raw table having values with min and season. If we need to find the avg duration for seasons as well, how do we get the details? I believe we need to populate the values like the other tables we did. Can you guide how we can do it?
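One way to approach the question above in pandas: split `duration` into a numeric value and a unit, then average per unit. The `"90 min"` / `"2 Seasons"` value format is assumed from the Kaggle dataset; the sample rows are illustrative:

```python
import pandas as pd

# Sample rows shaped like the Netflix duration column.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show"],
    "duration": ["90 min", "120 min", "2 Seasons", "1 Season"],
})

# Split "90 min" -> value=90, unit="min"; "2 Seasons" -> value=2, unit="Seasons".
parts = df["duration"].str.extract(r"(?P<value>\d+)\s+(?P<unit>\w+)")
df["duration_value"] = parts["value"].astype(int)
df["duration_unit"] = parts["unit"].str.rstrip("s")  # normalise Season/Seasons

# Average duration per unit: minutes for movies, seasons for TV shows.
avg = df.groupby("duration_unit")["duration_value"].mean()
print(avg)
```

The same split could be done in SQL with string functions before aggregating, much like the other columns were normalised in the video.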
Thank you very much sirji.... 🙏🙏🙏
Most welcome
Thanks a lot. Could you please show how to download data from API?
Great content btw.
Check the first part of this video
ruclips.net/video/uL0-6kfiH3g/видео.html
I am facing issues in the Jupyter notebook, like "path does not exist".
Please do a detailed video on how to connect from Jupyter to SQL and explain the engine, conn, sqlalchemy etc.
Watch previous project video
Even after giving the data type as nvarchar, I cannot see any characters other than English in my database.
This is wonderful, can I use this same method for postgresql? Please help me...
Yes
Please point me to any video on how to set up the Kaggle directory on a local machine.
Hi @Ankit Bansal, are there any additional settings that need to be done in SQL Server Management Studio for the special characters to be visible? I have followed the steps twice, but it is still showing question marks for me.
Is the data type nvarchar?
@ankitbansal6 yes, I am giving nvarchar only
@ankitbansal6 even after using nvarchar, special characters are still showing as ????.
check reply to @VaibhaviSuresh-bw8hq
Hi Ankit, I can't see the Japanese characters in the title after changing the dtype to nvarchar; it's showing question marks. I've been searching for what the reason could be. Need your suggestion to resolve this.
Did you solve it? I have the same problem after changing the data type.
Thanks a lot, sir.
I have a suggestion: at the end of the video or in the description, please show how someone can mention this project in a resume, with the project description in bullet points. I am a fresher, so it would help me a lot.
Thank you so much sir ❤
In the Netflix table, why is it 8807? It should be 8804 after removing the 3 duplicates. Was the WHERE clause removed by mistake?
You are right, I missed the WHERE clause to retain unique rows. My bad.
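For anyone reproducing that fix: the idea is to keep exactly one row per duplicate key. In SQL this is the `ROW_NUMBER() OVER (PARTITION BY ...) ... WHERE rn = 1` pattern; a pandas equivalent is sketched below (the data and the choice of `show_id` as the duplicate key are illustrative; the video's exact key may differ):

```python
import pandas as pd

# Illustrative data: s1 appears twice.
df = pd.DataFrame({
    "show_id": ["s1", "s1", "s2", "s3"],
    "title": ["A", "A", "B", "C"],
})

# Equivalent of ROW_NUMBER() OVER (PARTITION BY show_id) ... WHERE rn = 1:
# drop_duplicates keeps the first row seen for each show_id.
deduped = df.drop_duplicates(subset="show_id", keep="first")
print(len(deduped))  # 3
```

Forgetting the `WHERE rn = 1` filter (the mistake acknowledged above) is exactly what leaves all 8807 rows instead of 8804.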
at 28:40, you said that we can't see null because of string split.. Just my thought, isn't it because you removed null at 8:44?
I didn't remove it. That was just checking the max length, and at that point it was removed in the analysis only, not in the actual data.
@@ankitbansal6 oh yes, you're right, my bad.
Your content is really helpful and very easy to follow. keep uploading such videos. Thank you!
Hi @Ankit Bansal, I have tried a lot, creating the table using nvarchar, but it still shows the ??? question marks. I have seen all the replies in the comment box but couldn't find a solution. Please help me out so that I can proceed with the project.
You can leave it as it is and proceed to the next tasks.
@@ankitbansal6 Okay, Thanks
For question number 2 of the SQL analysis:
in your inner join with the netflix table, how are you joining on ng.show_id = nc.show_id? Shouldn't it be ng.show_id = n.show_id?
Please clarify my doubt 🙋♂️ 🙏.
And if it's wrong, how did it still give output for the mapping below?
ng.show_id = nc.show_id
Please help me to buy this combo course; I want to learn SQL in Hindi and Python in English.
Send email to sql.namaste@gmail.com
How can we remove special signs like the ₹ symbol in MS SQL Server? I am not able to do it.
Use replace function
@adilmajeed8439 not working for the ₹ sign.
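A likely reason `REPLACE` appears to do nothing for ₹ on SQL Server is that the literal needs the `N` prefix, i.e. `REPLACE(col, N'₹', '')`, so the character is treated as Unicode rather than mapped to `?` by the database's code page. If the cleanup is done on the pandas side before loading instead, a sketch (column name is illustrative):

```python
import pandas as pd

# Illustrative price strings containing the rupee sign.
df = pd.DataFrame({"price": ["₹100", "₹250", "300"]})

# Strip the rupee sign and cast the remainder to a number.
df["price_clean"] = (df["price"]
                     .str.replace("₹", "", regex=False)
                     .astype(int))
print(df["price_clean"].tolist())  # [100, 250, 300]
```

Doing this before `to_sql` sidesteps the varchar/nvarchar issue entirely for that column.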
Is it an end to end data engineering project? Looks like ETL only, right?
It's an end to end ETL project, which comes under data engineering. Hope you got it.
How can I use PostgreSQL here if MS SQL is not available? Any reference video would be helpful.
You can just Google it. It's a simple change.
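Concretely, the change is mainly the SQLAlchemy connection URL; the pandas `to_sql` workflow stays the same. A hedged sketch of the PostgreSQL URL format (the `psycopg2` driver is assumed, and the credentials are placeholders):

```python
username = "your_username"   # placeholders; fill in your own details
password = "your_password"
host = "localhost"
port = 5432
database = "netflix"

# Only the dialect+driver prefix differs from the SQL Server / MySQL
# URLs used elsewhere in the project. Building the URL only here;
# create_engine(url) would be the next step and needs the psycopg2
# package installed.
url = f"postgresql+psycopg2://{username}:{password}@{host}:{port}/{database}"
print(url)
```

With that engine in place, `df.to_sql('netflix_raw', con=engine, ...)` works unchanged.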
I have figured out how to import into PostgreSQL. Thanks for the amazing project video.
Hello Sir!
Really appreciate your efforts, sir. I want to ask: can I add this project to my resume for a data analyst role?
Someone who sees this, please reply....
Yes, you can
@@ankitbansal6 Okay Sir😊
Hi @ankitbansal6, thanks for making this video, it's really helpful and informative. I am also trying to implement the same but am encountering one small issue: I am not able to get the special characters to load, even after changing the table definition to nvarchar I am still getting the value as '????'. Can anyone help me with this? I have also tried to load the data using encoding='utf-8' in my pyspark script.
There seems to be a problem with collation; along with nvarchar, we need to change the collation for the database as well.
You can fix that by running this code:
ALTER DATABASE [Database_Name] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
ALTER DATABASE [Database_Name] COLLATE Latin1_General_100_CS_AS_KS_WS_SC_UTF8;
GO
ALTER DATABASE [Database_Name] SET MULTI_USER;
GO
Just replace [Database_Name] above with your database name and it should work fine!
[Edit: there is a slight change in the collation name]
Okay Thankyou!😊 I will try
@vaibhavisuresh04 Hi, could you please let me know if the issue got resolved or not?
Sir, how can I connect with MySQL?
import pandas as pd
from sqlalchemy import create_engine

# Database connection details
username = 'your username'
password = 'your password'
host = 'host'
port = 'port number'
database = 'your database name'

df = pd.read_csv('netflix_titles.csv')

# The driver in the URL must match the installed package:
# pymysql here, so "mysql+pymysql" (not "mysql+mysqlconnector").
# A separate pymysql.connect() is not needed; the engine handles it.
connection_string = f"mysql+pymysql://{username}:{password}@{host}:{port}/{database}"
engine = create_engine(connection_string)

try:
    df.to_sql('netflix_raw', con=engine, index=False, if_exists='append')
    print("DataFrame written to MySQL table 'netflix_raw' successfully.")
except Exception as e:
    print(f"Error: {e}")
Just Google. It's a simple change
sir, my titles still don't show the foreign-language characters even after I changed the data type to nvarchar. How do I fix this problem? Thank you, sir.
Leave it. Proceed to next steps
@ankitbansal6 alright sir 👍
Brother, your SQL is really great.
I am still getting question marks for the title even though it is nvarchar. How can I resolve it? 😒
@ankithbansal6