Netflix Data Cleaning and Analysis Project | End to End Data Engineering Project (SQL + Python)

  • Published: Jul 26, 2024
  • In this video we will implement an end-to-end ELT project. ELT stands for Extract, Load and Transform. We will use the Netflix dataset to clean and analyze the data using SQL and Python.
    LinkedIn: / ankitbansal6
    High quality Data Analytics affordable courses: www.namastesql.com/
    End to End ETL project : • End to End Data Analyt...
    Netflix dataset: www.kaggle.com/datasets/shiva...
    GitHub Project Link: github.com/ankitbansal6/netfl...
    Zero to hero(Advance) SQL Aggregation:
    • All About SQL Aggregat...
    Most Asked Join Based Interview Question:
    • Most Asked SQL JOIN ba...
    Solving 4 Tricky SQL problems:
    • Solving 4 Tricky SQL P...
    Data Analyst Spotify Case Study:
    • Data Analyst Spotify C...
    Top 10 SQL interview Questions:
    • Top 10 SQL interview Q...
    Interview Question based on FULL OUTER JOIN:
    • SQL Interview Question...
    Playlist to master SQL :
    • Complex SQL Questions ...
    Rank, Dense_Rank and Row_Number:
    • RANK, DENSE_RANK, ROW_...
    #sql #dataengineering #projects

Comments • 86

  • @ankitbansal6
    @ankitbansal6 2 months ago +12

    Please like the video, as it takes a lot of effort to record a video of more than 1 hour. It will motivate me to create more long-form videos.
    GitHub and all related links are in the description box. Thanks for watching!!!

    • @simplytech4u898
      @simplytech4u898 2 months ago +1

      Thank you Ankit, this is really amazing... once I started, I finished it in one go...

    • @Hope-xb5jv
      @Hope-xb5jv 2 months ago +1

      10:22 I tried many times but can't get the Korean names into the SQL database.
      I created a table and ran the insert too, but it shows only ????
      Now I surrender😒

    • @kumarsumit6117
      @kumarsumit6117 1 month ago

      Use nvarchar

  • @vijayakanthanannamlai
    @vijayakanthanannamlai 2 months ago +1

    love it Ankit... what an effort

  • @ishmeenkaur8299
    @ishmeenkaur8299 1 month ago

    really good work, easy to understand.

  • @ritu-pf1jy
    @ritu-pf1jy 2 months ago +1

    Great efforts sir

  • @pavitrashailaja850
    @pavitrashailaja850 2 months ago +2

    Great effort in putting the whole project together 🤟🏻

  • @neeraj_dama
    @neeraj_dama 2 months ago +1

    well-done.

  • @livelovelaugh4050
    @livelovelaugh4050 2 months ago +1

    Thank you so much Sir 🙏. Thank you for giving hope to people like me. Keep inspiring ✨

  • @msk-pl3hw
    @msk-pl3hw 1 month ago +1

    It was a really nice project. I got good hands-on practice in SQL.

  • @Random_World_
    @Random_World_ 2 months ago +1

    Thanks for this project

  • @rahulrachhoya2716
    @rahulrachhoya2716 2 months ago +2

    Thanks so much @Ankit, this is a valuable video for me. I have an interview with Red Hat in the coming 3 days for an associate data analyst role. I learn a lot from your videos. You are literally the SQL king because you write in a very simple manner so that everyone can understand. You are my mentor; with your videos I am able to solve questions like you. Salute you @Ankit 😎😎😎

    • @saikanth447
      @saikanth447 2 months ago

      @rahulrachhoya2716 I have checked the career portal and there is no such DA role. Can you help me with the same, as we are in the same boat? Thanks in advance.

  • @pavanmadamset
    @pavanmadamset 2 months ago

    Thank You Very Much Sir

  • @manishpal2937
    @manishpal2937 1 month ago

    thanks Ankit, the effort you put in your lectures is admirable, learned a lot of new things today from this video 💌

  • @saikatofficial420
    @saikatofficial420 2 months ago +1

    Thanks a lot sir for this valuable project. Can you please make a video on CROSS APPLY? I have watched your SQL course and didn't find it.

  • @MiteshYadav
    @MiteshYadav 1 month ago

    Awesome. Can we have a series on Python from the basics that would be useful for analysis?

  • @simplytech4u898
    @simplytech4u898 2 months ago +1

    Hi Ankit,
    The duration column in the netflix_raw table has values in both min and seasons. If we need the average duration for seasons as well, how do we get it? I believe we need to populate the values as we did for the other tables. Can you guide us on how to do it?
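
A minimal sketch of one way to approach the question above, assuming the `duration` values look like `90 min` or `2 Seasons` (the sample list and the helper names `parse_duration` and `average_by_unit` are hypothetical, not from the video). It splits each value into a number and a unit, then averages each unit separately so minutes and seasons are never mixed:

```python
import re
from collections import defaultdict

# Hypothetical sample of the `duration` column; real values would come from netflix_raw.
durations = ["90 min", "2 Seasons", "1 Season", "125 min", None]

def parse_duration(value):
    """Split '90 min' or '2 Seasons' into (number, unit); return None if it does not parse."""
    if not value:
        return None
    match = re.fullmatch(r"\s*(\d+)\s+(min|Seasons?)\s*", value)
    if match is None:
        return None
    unit = "min" if match.group(2) == "min" else "season"
    return int(match.group(1)), unit

def average_by_unit(values):
    """Average the numeric part separately for each unit, skipping unparseable values."""
    grouped = defaultdict(list)
    for value in values:
        parsed = parse_duration(value)
        if parsed is not None:
            number, unit = parsed
            grouped[unit].append(number)
    return {unit: sum(nums) / len(nums) for unit, nums in grouped.items()}

averages = average_by_unit(durations)
print(averages)
```

The same split could be done in SQL by separating the numeric prefix from the unit suffix; the Python version is only meant to show the idea.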

  • @user-xl4zd8yu1e
    @user-xl4zd8yu1e 2 months ago +1

    Thank you very much sirji.... 🙏🙏🙏

  • @eemayo5889
    @eemayo5889 2 months ago +1

    Thanks a lot. Could you please show how to download data from an API?
    Great content btw.

    • @ankitbansal6
      @ankitbansal6 2 months ago +2

      Check the first part of this video
      ruclips.net/video/uL0-6kfiH3g/видео.html

  • @sakshiawadhiya7267
    @sakshiawadhiya7267 2 months ago +1

    I am facing issues in Jupyter Notebook, like "path does not exist".

  • @MayankGadiya-uq1el
    @MayankGadiya-uq1el 2 months ago

    Please do a detailed video on how to connect from Jupyter to SQL, and explain the engine, conn, SQLAlchemy, etc.

    • @ankitbansal6
      @ankitbansal6 2 months ago

      Watch the previous project video.

  • @niravshah5038
    @niravshah5038 2 months ago

    Even after setting the data type to nvarchar, I cannot see characters other than English ones in my database.

  • @austinmkruahsr.615
    @austinmkruahsr.615 1 month ago

    This is wonderful. Can I use this same method for PostgreSQL? Please help me...

  • @adityajoshi2797
    @adityajoshi2797 1 month ago

    Please point me to any video that shows how to create the Kaggle directory on a local machine.

  • @tanyachugh1640
    @tanyachugh1640 2 months ago +2

    Hi @Ankit Bansal, are there any additional settings that need to be changed in SQL Server Management Studio for the special characters to be visible? I have followed the steps twice, but it is still showing question marks for me.

    • @ankitbansal6
      @ankitbansal6 2 months ago

      Is the data type nvarchar? It should be.

    • @tanyachugh1640
      @tanyachugh1640 2 months ago

      @ankitbansal6 Yes, I am using nvarchar only.

    • @TarunDhimanOfficial
      @TarunDhimanOfficial 2 months ago

      @ankitbansal6 Even after using nvarchar, special characters are still showing as ????.

    • @piyushsharma8294
      @piyushsharma8294 2 months ago

      Check the reply to @VaibhaviSuresh-bw8hq
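
One possible mechanism behind the question marks reported in this thread can be illustrated in plain Python. This is a sketch under an assumption: that somewhere on the way into the table the text passes through a single-byte code page (here `cp1252`, chosen only for illustration), where characters with no mapping are replaced by `?`. The helper name `as_single_byte` and the sample title are hypothetical. The thread does not settle whether the loss actually happens in the column type, a string literal, the driver, or the collation.

```python
def as_single_byte(text, codepage="cp1252"):
    """Simulate storing text in a single-byte code page: unmappable characters become '?'."""
    return text.encode(codepage, errors="replace").decode(codepage)

korean_title = "기생충"  # hypothetical sample title, three Hangul characters

print(as_single_byte(korean_title))  # prints ???
print(as_single_byte("Café"))        # prints Café, because é exists in cp1252
```

If the stored value is already `???`, the original characters are gone, so changing the column type afterwards cannot bring them back; the data has to be reloaded through a path that stays Unicode end to end.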

  • @mansinayak3360
    @mansinayak3360 24 days ago +3

    Hi Ankit, I can't see the Japanese characters in title. After changing the dtype to nvarchar, it's still showing question marks. I've been searching for what the reason could be. I need your suggestion to resolve this.

    • @Kelvin2568
      @Kelvin2568 17 days ago +1

      Did you solve it? I have the same problem after changing the data type.

    • @aminfaisalla
      @aminfaisalla 5 days ago

      park

  • @mohammadfurquan241
    @mohammadfurquan241 2 months ago

    Thanks a lot sir.
    I have a suggestion: at the end of the video or in the description, please show how someone can mention this project in a resume, with the project description in bullet points. I am a fresher, so it will help me a lot.
    Thank you so much sir ❤

  • @roopesh3837
    @roopesh3837 2 months ago +1

    Why does the netflix table have 8807 rows? It should have 8804 after removing the 3 duplicates. Was the where clause removed by mistake?

    • @ankitbansal6
      @ankitbansal6 2 months ago +1

      You are right, I missed the where clause that retains only the unique rows. My bad.
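
The exchange above can be sketched with an in-memory SQLite database. The assumptions here are loud ones: the sample rows are made up, and duplicates are taken to mean rows sharing the same upper-cased title and type, which the thread does not confirm. The point is only that `ROW_NUMBER()` numbers the rows and it is the `WHERE rn = 1` filter that actually drops the duplicates; without it, every row is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix_raw (show_id TEXT, type TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO netflix_raw VALUES (?, ?, ?)",
    [
        ("s1", "Movie", "Love"),
        ("s2", "Movie", "LOVE"),      # duplicate of s1 under the assumed key
        ("s3", "TV Show", "Love"),    # same title, different type: kept
        ("s4", "Movie", "Feb 09"),
    ],
)

ranked_cte = """
WITH ranked AS (
    SELECT show_id,
           ROW_NUMBER() OVER (PARTITION BY UPPER(title), type ORDER BY show_id) AS rn
    FROM netflix_raw
)
"""

# Without the filter, every row survives, duplicates included.
unfiltered = [r[0] for r in conn.execute(ranked_cte + "SELECT show_id FROM ranked ORDER BY show_id")]

# With the filter, only the first row of each (title, type) group survives.
deduplicated = [r[0] for r in conn.execute(ranked_cte + "SELECT show_id FROM ranked WHERE rn = 1 ORDER BY show_id")]

print(len(unfiltered), len(deduplicated))
conn.close()
```

Window functions need SQLite 3.25 or newer, which recent Python builds bundle.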

  • @manishasaxena9829
    @manishasaxena9829 2 months ago

    At 28:40, you said that we can't see nulls because of the string split. Just my thought: isn't it because you removed the nulls at 8:44?

    • @ankitbansal6
      @ankitbansal6 2 months ago

      I didn't remove them. I was just checking the max length, and at that time I excluded them in the analysis only, not in the actual data.

    • @manishasaxena9829
      @manishasaxena9829 2 months ago

      @ankitbansal6 Oh yes, you're right, my bad.
      Your content is really helpful and very easy to follow. Keep uploading such videos. Thank you!
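
The behaviour discussed above, nulls disappearing after a string split, can be mimicked in a few lines of Python. This is a sketch, not the query from the video: the sample rows and the helper name `explode_directors` are hypothetical. In SQL Server, `STRING_SPLIT` returns an empty table for a NULL input, so a row joined to it with `CROSS APPLY` yields no output rows; the `continue` below plays that role.

```python
# Hypothetical (show_id, director) rows; None stands in for SQL NULL.
rows = [("s1", "A, B"), ("s2", None), ("s3", "C")]

def explode_directors(rows):
    """Turn one comma-separated row into one row per name, dropping NULL inputs."""
    exploded = []
    for show_id, directors in rows:
        if directors is None:
            continue  # mirrors a NULL input producing zero split rows
        for name in directors.split(","):
            exploded.append((show_id, name.strip()))
    return exploded

exploded = explode_directors(rows)
print(exploded)
```

Show `s2` is absent from the result even though nothing deleted it from the source rows.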

  • @shubhamravikar6029
    @shubhamravikar6029 2 months ago +1

    Hi @Ankit Bansal, I have tried many times to create the table using nvarchar, but it still shows ??? question marks. I have read all the replies in the comment box but couldn't find a solution. Please help so that I can proceed with the project.

    • @ankitbansal6
      @ankitbansal6 2 months ago

      You can leave it as it is and proceed to the next tasks.

    • @shubhamravikar6029
      @shubhamravikar6029 2 months ago +1

      @ankitbansal6 Okay, thanks.

  • @abhinavumrao8453
    @abhinavumrao8453 2 months ago

    For question number 2 of the SQL analysis:
    In your inner join with the netflix table, why are you joining on ng.show_id = nc.show_id? Shouldn't it be ng.show_id = n.show_id?
    Please clarify my doubt 🙋‍♂️ 🙏.

    • @abhinavumrao8453
      @abhinavumrao8453 2 months ago

      And if it's wrong, how did it still give output for the mapping below?
      ng.show_id = nc.show_id

  • @rachitkeelpur
    @rachitkeelpur 2 months ago

    Please help me buy this combo course. I want to learn SQL in Hindi and Python in English.

    • @ankitbansal6
      @ankitbansal6 2 months ago

      Send an email to sql.namaste@gmail.com

  • @itsyogijangir
    @itsyogijangir 2 months ago

    How can we remove a special symbol like the ₹ sign in MS SQL Server? I am not able to do it.

    • @adilmajeed8439
      @adilmajeed8439 2 months ago

      Use the REPLACE function.

    • @itsyogijangir
      @itsyogijangir 2 months ago

      @adilmajeed8439 It's not working for the ₹ sign.
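
The thread does not show the failing statement, so the cause stays open. One thing worth checking in T-SQL is whether the literal carries the `N` prefix (for example `N'₹'`), since a literal without it is not treated as Unicode. As an alternative, the symbol can be stripped in Python before the data is loaded. The sketch below is an assumption-laden illustration: the helper name `strip_currency` and the sample values are hypothetical, and it removes every character in the Unicode "currency symbol" category, not only ₹.

```python
import unicodedata

def strip_currency(text):
    """Remove all Unicode currency symbols (category 'Sc') and trim surrounding spaces."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Sc").strip()

print(strip_currency("₹1,200"))
print(strip_currency("$ 5"))
```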

  • @BhakthiYoutube
    @BhakthiYoutube 2 months ago

    Is it an end-to-end data engineering project? It looks like ETL only, right?

    • @mohammadfurquan241
      @mohammadfurquan241 2 months ago +1

      It's an end-to-end ETL project, which comes under data engineering. Hope you got it.

  • @simplytech4u898
    @simplytech4u898 2 months ago

    How can I use PostgreSQL here if MS SQL is not available? Any reference video would be helpful.

    • @ankitbansal6
      @ankitbansal6 2 months ago

      You can just Google it. It's a simple change.

    • @simplytech4u898
      @simplytech4u898 2 months ago

      I have figured out how to import into PostgreSQL. Thanks for the amazing project video.
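
For readers asking the same question, the "simple change" is mostly the SQLAlchemy connection URL. The sketch below only builds that URL string, using the standard library, so it runs without a database. The helper name `build_url`, the `DRIVERS` mapping, and all credentials are hypothetical placeholders; the dialect names (`mssql+pyodbc`, `postgresql+psycopg2`, `mysql+pymysql`) are the usual SQLAlchemy ones and each needs its driver package installed.

```python
from urllib.parse import quote_plus

# SQLAlchemy "dialect+driver" prefixes for the databases mentioned in the comments.
DRIVERS = {
    "sqlserver": "mssql+pyodbc",
    "postgresql": "postgresql+psycopg2",
    "mysql": "mysql+pymysql",
}

def build_url(db, user, password, host, port, database):
    """Build a SQLAlchemy-style URL, escaping special characters in the credentials."""
    return (
        f"{DRIVERS[db]}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )

url = build_url("postgresql", "analyst", "p@ss word", "localhost", 5432, "netflix")
print(url)
```

The resulting string would be passed to `sqlalchemy.create_engine`. For SQL Server through pyodbc, an ODBC driver name usually has to be appended as a `?driver=...` query parameter, which this sketch leaves out.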

  • @LaxmiNarayan-pd8qn
    @LaxmiNarayan-pd8qn 7 days ago

    Hello Sir!
    I really appreciate your efforts. I want to ask: can I add this project to my resume for a data analyst role?
    If anyone sees this, please reply...

  • @VaibhaviSuresh-bw8hq
    @VaibhaviSuresh-bw8hq 2 months ago

    Hi @ankitbansal6, thanks for making this video, it's really helpful and informative. I am trying to implement the same but am running into one small issue: the special characters are not stored correctly even after changing the table definition to nvarchar, and I am still getting the value as '????'. Can anyone help me with this? I have also tried loading the data with encoding='utf-8' in my PySpark script.

    • @piyushsharma8294
      @piyushsharma8294 2 months ago

      There seems to be a problem with the collation: along with 'nvarchar', we need to change the collation of the database as well.
      You can fix that with this code:
      ALTER DATABASE [Database_Name] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
      GO
      ALTER DATABASE [Database_Name] COLLATE Latin1_General_100_CS_AS_KS_WS_SC_UTF8;
      GO
      ALTER DATABASE [Database_Name] SET MULTI_USER;
      GO
      Just replace [Database_Name] with your database name and it should work fine!
      [Edit: there is a slight change in the collation name]

    • @vaibhavisuresh04
      @vaibhavisuresh04 2 months ago

      Okay, thank you!😊 I will try.

    • @tanyachugh1640
      @tanyachugh1640 2 months ago

      @vaibhavisuresh04 Hi, could you please let me know whether the issue got resolved?

  • @gamingfun5309
    @gamingfun5309 2 months ago +1

    Sir, how can I connect with MySQL?

    • @LearnDataSceince
      @LearnDataSceince 2 months ago

      import pandas as pd
      from sqlalchemy import create_engine

      # Database connection details (placeholders; replace with your own)
      username = 'your username'
      password = 'your password'
      host = 'host'
      port = 3306  # must be an integer; 3306 is the MySQL default
      database = 'your database name'

      # SQLAlchemy loads the PyMySQL driver through the mysql+pymysql dialect,
      # so a separate pymysql.connect() call is not needed
      connection_string = f"mysql+pymysql://{username}:{password}@{host}:{port}/{database}"
      engine = create_engine(connection_string)

      # Load the CSV and append it to the netflix_raw table
      df = pd.read_csv('netflix_titles.csv')
      try:
          df.to_sql('netflix_raw', con=engine, index=False, if_exists='append')
          print("DataFrame written to MySQL table 'netflix_raw' successfully.")
      except Exception as e:
          print(f"Error: {e}")

    • @ankitbansal6
      @ankitbansal6 2 months ago

      Just Google it. It's a simple change.

  • @aminfaisalla
    @aminfaisalla 5 days ago

    Sir, my foreign-language titles still do not display correctly even after I changed the data type to nvarchar. How do I fix this problem? Thank you, sir.

  • @sukhwinder101
    @sukhwinder101 2 months ago

    Brother, your SQL is really great.

  • @ladiashrith5230
    @ladiashrith5230 2 months ago

    I am still getting question marks for title even though it is nvarchar. How can I resolve it?😒
    @ankithbansal6