How I Work With MILLIONS OF ROWS DATA using PYTHON | PYSPARK & BIG DATA

  • Published: 21 Sep 2024

Comments • 47

  • @mo-chen
    @mo-chen  1 year ago +4

    🎉 Check out Bright Data ➡︎ brdta.com/datawithmo

  • @hamedalcherif9064
    @hamedalcherif9064 1 year ago +7

    Big Mo is back.

  • @lorenzreparip4525
    @lorenzreparip4525 1 year ago +1

    Now that I've seen this, I am more motivated to learn Python!! Thanks!

    • @mo-chen
      @mo-chen  1 year ago

      I’m glad to hear that! Thanks for watching 😃

  • @sawdawyah
    @sawdawyah 8 months ago +1

    If I could press this 👍 more than once, I would press it MILLIONS of times [ Thanks! 😁 ] And I love your videos a lot

    • @mo-chen
      @mo-chen  8 months ago

      That's very kind of you. Thanks a lot for watching!

  • @nhimallansupramaniam2626
    @nhimallansupramaniam2626 1 year ago

    Mo, I love your videos. Please do a data analytics tutorial that covers Python and PySpark.

    • @mo-chen
      @mo-chen  1 year ago

      Thanks for the kind words! I have a couple other videos on data analysis with Python on the channel already, feel free to check them out. And I'll try my best to make more content with PySpark 😁

  • @nikjojo
    @nikjojo 1 year ago +2

    Do you visualise your findings more in Python or Tableau?

    • @mo-chen
      @mo-chen  1 year ago

      Tableau just because it’s so much easier and looks way nicer 😃

  • @TheRhinorock
    @TheRhinorock 5 months ago

    What would be the preferred method for reading billions of records, aggregating over 400 fields (sum, min, max), and writing the results to a file or DB?

  • @erickajanee
    @erickajanee 1 year ago

    Would you recommend any MBA in analytics schools? In person and online? If so maybe a video idea!

    • @mo-chen
      @mo-chen  1 year ago +1

      An MBA is something that more experienced workers tend to do later in their careers. If you're starting out, you should focus on your core data analyst skills first. Thanks a lot for watching 😄

  • @mrtayyab3101
    @mrtayyab3101 1 year ago

    Please make another Streamlit tutorial

    • @mo-chen
      @mo-chen  1 year ago

      Great video idea, I'll see what I can do in the future!

  • @airmen_fresh
    @airmen_fresh 1 year ago

    Out of curiosity, what are some pros and cons of becoming a data analyst? I'm currently in an entry-level IT position (help desk), I'm looking to move up in the IT field, and I have an interest in this area.

    • @mo-chen
      @mo-chen  1 year ago

      Pros for me are that I really enjoy my work and get paid well for it. No cons in general. If I really didn't like what I did on a daily basis, I'd just do something different 😃

    • @airmen_fresh
      @airmen_fresh 1 year ago

      @@mo-chen I'm happy you found something you enjoy and get paid for, but what would you say are the most common complaints or negatives you've heard about your job?

  • @irfanali8106
    @irfanali8106 1 year ago +1

    Hi Mo Chen,
    I'm a big fan of your work! I've been learning data science for the past 3 years, but I'm not sure how to start my career. Your YouTube channel has been a great resource for me, and I'm grateful for your kindness and loyalty.
    How can I access Bright Data or the sample data? The site gives an error (DNS_PROBE_FINISHED_NXDOMAIN).
    I have one question. I'm not able to access the "brightdata" site because it's blocked in my country. Would you be able to share some data samples with us so that I can practice on this project? I would be very grateful for your support.
    Thanks!

    • @mo-chen
      @mo-chen  1 year ago +1

      Thanks so much for the kind words 😁 Using a VPN would be the best way. The sample data doesn't contain many rows so I wouldn't build the project on that. Thanks a lot for watching!

  • @sharath6346
    @sharath6346 1 year ago +1

    The code just looks like a SQL query... Is PySpark similar to SQL?

    • @mo-chen
      @mo-chen  1 year ago +1

      Yes, it has lots of SQL and Python syntax as well. Thanks a lot for watching 😄

  • @welcometomathy
    @welcometomathy 1 year ago

    I always thought big data was about datasets with multiple types of data, like images, videos, sound, 3D objects and other things stored in a database. Doesn't my 100 GB Power BI dataset count as big data?

    • @mo-chen
      @mo-chen  1 year ago

      There is no clear definition of big data in terms of size. What you're mentioning is unstructured data. Your 100 GB Power BI dataset can safely be considered big data. Thanks a lot for watching 😁

  • @CaribouDataScience
    @CaribouDataScience 1 year ago +1

    I vote for upgrading the computer.

    • @mo-chen
      @mo-chen  1 year ago +1

      If money is no issue, of course 😃

  • @karl2477
    @karl2477 1 year ago +1

    Where is your jumper from?

    • @mo-chen
      @mo-chen  1 year ago

      It’s a Massimo Dutti jumper

  • @samikshabhosale6634
    @samikshabhosale6634 9 months ago

    Has this data changed on the website now?

  • @balixong9704
    @balixong9704 1 year ago

    Would you use Google Sheets over Microsoft Excel? If yes, then why?

    • @mo-chen
      @mo-chen  1 year ago

      I wouldn't. Google Sheets is free, which is why most people use it.

  • @samira_pmn6488
    @samira_pmn6488 1 year ago

    Hello there, can I use Spark for my HDF5 dataset too? It is so big that, exactly as you said, I can't work with it even with chunking :(

    • @mo-chen
      @mo-chen  1 year ago

      Yes, absolutely!

  • @amanpoojary5782
    @amanpoojary5782 1 year ago

    Hi, where can I learn Excel for data analytics? I am completely confused.

    • @mo-chen
      @mo-chen  1 year ago

      Please see the website link I put in my other answer to your other comment 😄

  • @IlhamRhamadan-mf8yx
    @IlhamRhamadan-mf8yx 1 year ago +1

    here we go☕

    • @mo-chen
      @mo-chen  1 year ago +1

      Thanks for watching 😃

  • @ajinkyapantode5100
    @ajinkyapantode5100 1 year ago

    Do you provide 1-on-1 mentorship?

    • @mo-chen
      @mo-chen  1 year ago

      1-to-1 mentoring is not something I do right now unfortunately 😅

  • @theav.1313
    @theav.1313 1 year ago

    Ok that focus cursor is not helping. It’s distracting. Great content though.

    • @mo-chen
      @mo-chen  1 year ago +1

      I'm glad you liked the video! Most people really like the cursor highlighter so I'll keep it for now. Thanks a lot for watching 😁

  • @aayushdedhia5781
    @aayushdedhia5781 1 year ago

    Is the dataset free?

    • @mo-chen
      @mo-chen  1 year ago

      The sample is 😁