Designing a Data Pipeline | What is Data Pipeline | Big Data | Data Engineering | SCALER

  • Published: 23 Jan 2025

Comments • 60

  • @SCALER
    @SCALER  2 years ago +3

    Check out our FREE masterclasses by leading industry experts now: bit.ly/3Apojjv

    • @ankitKumar-js1ow
      @ankitKumar-js1ow 2 years ago +2

      I think Scaler should have a separate, industry-level course for data engineering, with DSA and system design, as most people work as data engineers rather than in data science.
      Waiting for such a quality course to move into a product-based company.

    • @sandeepdash5652
      @sandeepdash5652 10 months ago

      @ankitKumar-js1ow So far they do not have a plan/module for Data Engineering. They are simply not interested. And what they do have for DE is just not digestible.

  • @akhilcoder
    @akhilcoder 2 years ago +29

    Regular content. It can easily be found on the internet.

    • @coding3438
      @coding3438 1 year ago

      Haha

    • @sandeepdash5652
      @sandeepdash5652 11 months ago

      Paid content is terrible.

    • @sanilkumarbarik9151
      @sanilkumarbarik9151 3 months ago

      @sandeepdash5652 Is it?

    • @sandeepdash5652
      @sandeepdash5652 2 months ago

      @sanilkumarbarik9151 For data engineers it's horrible. Not worth the time and money if you join to learn DE.

  • @ArunSingh-rk7mm
    @ArunSingh-rk7mm 2 years ago +3

    Thank you for talking about a demo pipeline, this could come in handy in interviews.

  • @NasimKhan-vu8oi
    @NasimKhan-vu8oi 6 months ago +1

    Excellent presentation. Presented very nicely, concisely, and to the point.

  • @AkashKumar-kx9vj
    @AkashKumar-kx9vj 2 years ago +1

    Shashank just makes everything so easy to understand

  • @NehaSingh-wp4mf
    @NehaSingh-wp4mf 11 months ago

    Very well explained, and all the important topics were covered. Thank you for your efforts. Very helpful.

    • @SCALER
      @SCALER  11 months ago

      Thanks! Glad this was helpful! 😃

  • @arunsundar3739
    @arunsundar3739 9 months ago +1

    helps to see the big picture, thank you very much :)

  • @madhivanandurai452
    @madhivanandurai452 28 days ago +1

    Good one thanks

  • @shaistaqureshi8408
    @shaistaqureshi8408 2 years ago +1

    I just wanna say thank you for this video

  • @StartDataLate
    @StartDataLate 8 months ago

    Here is a summary:
    00:57 - Understand the data domain (example: finance terminology, relationships, primary keys, foreign keys; give the business side a clear picture of what data engineers can provide)
    02:57 - Choose the data sources (examples: SQL database, distributed file system, API, sensor data, web-application-generated data)
    04:43 - Determine the data ingestion strategy (full load or incremental load)
    08:37 - Design the data processing plan (real-time or batch processing)
    11:11 - Set up storage for the pipeline output (Amazon S3 or HDFS for a data lake; Amazon Redshift or Hive for a data warehouse; or load back into transactional databases)
    13:19 - Plan the data workflow (schedulers: Apache Airflow, Apache NiFi, Azkaban)
    14:42 - Set up monitoring and governance tools (alerts for pipeline failures; tools: Kibana, Grafana, Datadog, PagerDuty)
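
    The full-load versus incremental-load choice at 04:43 can be sketched roughly as follows. This is only an illustration, not the pipeline from the video: the `orders` table, its `updated_at` column, the `incremental_load` helper, and the use of in-memory SQLite for both source and target are all assumptions made for the example.

```python
import sqlite3


def make_db():
    # Hypothetical schema used only for this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    return conn


def incremental_load(source, target, watermark):
    """Copy only rows changed after `watermark`; return the new watermark."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert so that re-running the load does not duplicate rows.
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Keep the old watermark if nothing new arrived.
    return rows[-1][2] if rows else watermark


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02")],
)

# An empty watermark makes the first run behave like a full load.
watermark = incremental_load(source, target, "")

# A later run picks up only the row added since the last watermark.
source.execute("INSERT INTO orders VALUES (3, 30.0, '2024-01-03')")
watermark = incremental_load(source, target, watermark)
```

    A full load would simply re-copy every row on each run; the watermark trades that simplicity for less data movement, at the cost of needing a reliable change timestamp in the source.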

  • @AmitSharma-xv6sh
    @AmitSharma-xv6sh 1 year ago

    This is a really detailed and great explanation of end-to-end data pipeline architecture. Hats off to your hard work in putting this video out there for us, brother. It will definitely clear up doubts and give a picture of how pipelines work in data migration/ingestion/integration projects.
    Thanks a lot. 🙏

    • @SCALER
      @SCALER  1 year ago

      Thanks! Glad this was helpful! 😃

  • @TheSoumyakole
    @TheSoumyakole 1 year ago +2

    How can NoSQL (specifically Cassandra, MongoDB) be good for ad-hoc analytical queries, as mentioned at 12:05?

  • @daniyaqureshi6201
    @daniyaqureshi6201 2 years ago +1

    Thank you for the brilliant video

  • @Rk-mv8sz
    @Rk-mv8sz 2 years ago +1

    Good content. Thank you 🙏

  • @healthificteam8465
    @healthificteam8465 2 years ago

    Can't wait!

  • @MarkyGoldstein
    @MarkyGoldstein 5 months ago

    Well presented, thanks

  • @endpermia
    @endpermia 1 year ago

    Thank you! This was really helpful and well-explained.

    • @SCALER
      @SCALER  1 year ago

      Happy to hear that! 🙌🏼

  • @krishnasaksena2364
    @krishnasaksena2364 2 years ago

    Thanks scaler! 🔥

  • @umakantyadav9972
    @umakantyadav9972 2 years ago +1

    Thanks, Shashank, for explaining in a very understandable manner.
    But I have one question: why have you not discussed the staging area?

  • @panktikhurana8906
    @panktikhurana8906 2 years ago +1

    Awesome content 🙂

  • @shrutiikarla1055
    @shrutiikarla1055 2 years ago

    Thank you scaler

  • @asishjoshi5774
    @asishjoshi5774 2 years ago

    very nice.. thanks a ton!

  • @saniyasharif9861
    @saniyasharif9861 2 years ago

    Brilliant video again

  • @avshekraj
    @avshekraj 1 year ago

    Thank you for the nice explanation

    • @SCALER
      @SCALER  1 year ago +1

      Happy to hear that! 🙌🏼

  • @marksun6420
    @marksun6420 1 year ago +1

    Thanks

  • @tamannamam3563
    @tamannamam3563 2 years ago

    I easily understand this video

  • @divyanshtayal5077
    @divyanshtayal5077 2 years ago

    Make more videos, Gurudev. Thank you very much.

  • @abhisekchowdhury8584
    @abhisekchowdhury8584 2 years ago

    Awesome Video

  • @ruthmk
    @ruthmk 10 months ago

    Double like 👍🏽
    Thank you

  • @ramangupta6159
    @ramangupta6159 2 years ago +1

    Grafana is a really good monitoring tool

  • @obiradaniel
    @obiradaniel 2 years ago

    Thank you.

  • @cutipy433
    @cutipy433 2 years ago

    Very nice content

  • @saibabatelagamsetty2538
    @saibabatelagamsetty2538 2 years ago

    Really good Content

  • @FaizanKhan-ct7pc
    @FaizanKhan-ct7pc 2 years ago +1

    As a data engineer, should you know all of these technologies before getting a job, or are they acquired on the job?

    • @Watson22j
      @Watson22j 1 year ago

      You can easily get an entry-level job in data engineering if you know SQL well, plus basic Python, basic cloud, and Hadoop architecture.

  • @saniyapoetry8386
    @saniyapoetry8386 2 years ago

    Very nice 🙂

  • @justdataengineer3138
    @justdataengineer3138 2 years ago

    When will a complete Data Engineering course be launched by Scaler?

  • @it3374
    @it3374 1 year ago +2

    Please demonstrate one pipeline hands-on... There is not a single video on YouTube that shows a big data pipeline actually being built...

  • @shanayakhan839
    @shanayakhan839 2 years ago

    Redshift is already set up on the cloud; what about Hive?

  • @Sameerkhan-kt5jj
    @Sameerkhan-kt5jj 2 years ago

    More Data engineering related content please

  • @parisreview4651
    @parisreview4651 2 years ago

    You guys did a great job.

  • @nandlaljaiswal7217
    @nandlaljaiswal7217 2 years ago +1

    Need a full course for data engineers

  • @prachiipandeyy
    @prachiipandeyy 2 years ago

    🔥🔥🔥

  • @piyushjain419
    @piyushjain419 2 years ago +1

    Scaler knows what us students are searching for on google before an exam lol

  • @PankajKumar-vv5db
    @PankajKumar-vv5db 2 years ago

    Here the data source is MySQL. What if data were coming in from multiple sources?

  • @bangalibangalore2404
    @bangalibangalore2404 2 years ago

    The data modelling part was missed, I guess

  • @ashutoshrai5342
    @ashutoshrai5342 2 years ago +1

    Bumb explanation. What he is explaining is based on his experience. It's not at all generic. He himself needs to improve.

  • @nemodbuniversity
    @nemodbuniversity 1 year ago

    Half-baked, incomplete knowledge

  • @sheenagupta896
    @sheenagupta896 2 years ago +1

    Thank you for talking about a demo pipeline, this could come in handy in interviews.

  • @fazaila2047
    @fazaila2047 2 years ago

    Grafana is a really good monitoring tool