Master piece tutorial for data engineering
Thanks Siva
hey! don't hesitate to follow us and to take a look at our videos which deal with the same topics :)
@indexima6517 I guess the videos on your channel deal more with what we do after receiving the data (analytics, if I understand correctly).
Here, it's more about pumping the data from one place to a common place and making it available to interested people down the line.
@ITkFunde It's truly one of the finest and easiest videos to follow and relate to. Many thanks. Will check other videos.
Thank you for breaking down concepts that are difficult to understand!
Something I’ve noticed is that Indians are good teachers and give great illustrations. Good work. Greetings from the US.
Thanks Nathan for making me feel even more proud of being an Indian thank you for the compliment means a lot brother 🙏😊
Yes Indians like to make difficult concept easy
I don’t use my real name online, but I do give real compliments.
@nathancarranza9860 Plot Twist: His real name was not Nathan. It was always Vladimir Putin.
Can't believe Putin is from US
That was great! As a data engineer in the making, this is the first time I have understood the concept of data pipelines so clearly. Thank you very much
Hello Eric, I'd love to know how it's going for you at the moment with the DE track
Loved this video, probably the best explanation on advanced data pipeline out there. If in your next videos, maybe create a playlist which can show each of the section of this pipeline in detail with little examples using Python or any language etc. Just an idea, brilliant work!
Great Teacher!
Used your Video for my teaching in switzerland
Thanks Mate, so happy to see it's helping 😊😊
Great video - it seems while technology has advanced, the concepts of batch loads and real-time data are actually decades old. Back in the early 2000s we controlled all ETL and real-time loads with Unix or DOS or SQL scripts that provided return codes for success/failure which triggered alert emails, and we had KPIs for data quality, backing-out jobs for failed loads, and many other control systems. It just seems there is more 'out-of-the-box' software to handle these now as opposed to custom-built solutions. Great presentation!
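The return-code control described in the comment above can be sketched roughly as follows. This is only an illustration in Python, not what the commenter actually ran: the step names, the stand-in commands, and the `run_step` / `run_pipeline` helpers are all hypothetical, and the "alert email" is replaced by appending a message to a list.

```python
import subprocess
import sys


def run_step(name, command, alerts):
    """Run one load step as a child process and check its return code."""
    result = subprocess.run(command)
    ok = result.returncode == 0
    if not ok:
        # A real setup would send an alert email here (assumption);
        # this sketch only records the message.
        alerts.append(f"ALERT: step '{name}' failed with return code {result.returncode}")
    return ok


def run_pipeline(steps):
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed and any alerts raised.
    """
    alerts = []
    completed = []
    for name, command in steps:
        if not run_step(name, command, alerts):
            break  # later steps are skipped; a back-out job would start here
        completed.append(name)
    return completed, alerts


# Stand-in commands (hypothetical): child Python processes that exit with a fixed code.
OK_CMD = [sys.executable, "-c", "import sys; sys.exit(0)"]
FAIL_CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]

if __name__ == "__main__":
    done, raised = run_pipeline(
        [("extract", OK_CMD), ("load", FAIL_CMD), ("publish", OK_CMD)]
    )
    print(done)
    print(raised)
```

The design point is the same one the comment makes: each step reports success or failure through its exit status, and the controlling script decides whether to continue, alert, or back out.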
Nice
Thanks!
thanks
Please continue to create videos like these! So easy to understand. Love your visual teaching style and the examples you give.
Thank you MrBignate...The aim is to simplify these techie jargons for everyone to correlate and enjoy learning.
Thank you! I had read a lot of papers about Data Pipeline, but I couldn't get the main idea. However, your video was so easy to understand!! Now I have a better picture of the complete process. Thanks again.
Thank you Alexander !!!
Simply one of the best videos on data pipeline on YouTube. Deserves so much more attention.
This topic is so complex as a beginner, but I understand this explanation so well. I didn't even have to go back in the video or rewatch it to understand. This is beautiful.
Thank you so much for your kind words and support 🙏🙏♥♥
And again, another easy-to-digest video. Thumbs up!
Thank you 🙏🙏☺️
Great intro, just what I needed. I learned the distinction between ETL and general pipe lines, and Kafka's place in the architecture.
Thanks Ronnie☺️
This would be the Best start for the Data Engineers.. A clear precise and short pictorial representation of Data Pipeline (Basics). Best video so far I had seen.. 😊 Thanks.. Much Appreciated.. 👍
Thanks Prabu 👍☺️🙏
Do data analysts also use data pipeline creation in their jobs ? Or are they expected to know it ?
Asking as some companies write knowledge of ETL in JDs.
@vivekjoshi3769 Knowing any of the ETL tools would help in constructing pipelines, since they let you visualize the data flow from source to target. Yes, it is mostly used.
Would love to learn more about how to choose the right frameworks/technologies for data pipelines and data warehouses/lakes for differing requirements. It would be nice to see a playlist of you designing or comparing solutions for an analytic stack.
Thanks MrBignate I have created various playlists one of which is " Crunching Data Series "...I will surely make more videos on similar topic. It is because of encouragement from audience like you which helps me move forward so thanks and really grateful for your positive feedback.
Thank you. The best way of explanation. I was looking for this kind of video for long time. As a traditional ETL developer, I questioned my self, why people are using a term called 'Data pipeline' though we have ETL process and what is the exact difference between them. Thanks again.
Thanks Rama for your positive feedback !!
This guy really explain everything clearly and simple!
Good job brother, keep sharing and contributing! You're a great teacher :)
Thanks Julian 😊❤️🙏
Thank you for the video, I learnt what data lake hydration projects are, my previous company had no proper KT, I struggled to grasp what I was doing. This was very nicely explained and cleared the doubts that I had.
Thanks♥️
Great job explaining the difference between Data Pipelines and ETL.
Thanks Ken 🙏☺️
☺️I’m new in Data Engineering and man you created a clear picture of what I’ve been learning and trying to understand 🙂love this… definitely subscribing 🤩
Your way of explaining these concepts is excellent, thank you!
Thanks a lot
Great explanation for introduction to data pipelines. Thanks for clarifying the distinction between ETL and data Pipelines.
Love your way of teaching with simple, understandable concepts. I'm mad about you..!
Thanks Kalyan for your feedback, it helps a lot..
Thank you very much, very elaborate and concise. This is important for everyone in the technical data cycle: data engineer, analyst, administrator and data scientist.
Simplified and clear explanation of the concepts. Great diction and presentation. Well done!
thanks Kolawale
Very elegant way to explain data pipelining and ETL approach. I appreciate the examples given especially the master data management. Well done.
Very informative, especially for a non-computer science guy like myself. Thanks!
Thanks Brent that is the essence of this channel - Making I.T. interesting for everyone.
This is meant to be a compliment. I appreciate how articulate your English is with each word you speak! Easy to listen to!
I got more out of your video than reading 5 articles on the matter! Your content is great!
It is good that u explain the concept of data pipeline by referring to water pipeline. So much easier to understand and remember. Thank you for your video!!
Hi Anshul, your video was helpful. I have experience with ETL but didn't know that it was a specific type of data pipeline. Thanks for showing the different type of systems and technologies used for the concept visual that you explained with.
Thank you Kyle coming from an experienced guy means a lot. Hoping for continued support !!
Wov, I think I just watched one of the best explanation video in my life. You did an amazing job! The structure you explain the details and use cases, the examples you give in real world applications made a lot of sense to me. Thank you so much!
Thanks Elif for your kind words means a lot ☺️🙏
This is superb! I am a complete stranger to Data Engineering, and this video gave me a super insight! Keep up the good work
Thanks Ravindu ☺️
This is excellent. Really interesting and easy to follow. I am just starting training with IBM to be a Data Engineer. Leaving healthcare for good!
Thanks a lot ☺️☺️🙏
This is a very good explanation and the best I have seen so far in my quest to understand this concept. Thank you very much. Now I can confidently visualize and explain the same concept with ease and a great understanding of it.
Thanks Jibril glad it helped 🙏☺️
I am a newbie to this ETL process, confused with all jargons! This definitely helped to get the picture of it. Keep up the good work
Great visual layout. Would love to see this applied to an ELT model with Snowflake and its advantages/disadvantages. Possibly a suggestion on ML complementary tools like Looker and Kraken.
A very clear explanation of the differences between the two methods. Often I see everything lumped under an ETL umbrella, when it may not be accurate.
Thanks 🙏
best productive 10 minutes of my life.
Thanks Dhritiman for this super comment you made my day 🙏☺️
Your teaching technique is amazing. Thank you for sharing the knowledge on data pipeline. All my doubts related to data pipelines are clear now.
Thank you so much for a great and easy to understand data pipeline introduction. I love how you focus on the concepts and not jargons, as it allows for people to understand the essence of data pipeline.
Thank you for this simple and clear explanation of data pipeline. Now I have a clear picture of how data flows from producer to consumer
Excellent high level overview Anshul, I appreciate that you differentiated between batch data and real time data with the Lambda Architecture as it seems most applicable to modern organizations. Your explanation of dashboards as consumers was also very realistic. Your video helped me better understand the general steps in the process. +1 Subscriber.
Thanks Matthew for supporting ❤️
It’s eye-opening and fits the pieces in my head together logically, really thankful !
Great explanation and examples used. Thanks a ton !!
Thanks Manny
Wonderful Explanation and you really hit the point straight and clear about data pipe lines in a short and precise manner. ThanQ very much.
Thanks 🙏☺️
Excellent Explanation. Keep making more videos regarding Data Engineering, AI, and Data Science.
Thanks a lot mate for your feedback and suggestion!!
Very effective lecture in introducing the data pipeline and promote to adopt in improving the Business /egovernance services and advisories
Thanks Anjani
Thanks for a great overview of how the Lambda architecture can expedite the delivery of data to data consumers. For future videos, it would be helpful to map this to the roles, responsibilities, and skill requirements needed to manage this environment.
Thanks Mike for suggestion will try to add this
You are a real "Data Pipeline Spiderman".... fantastic instructor..please share more videos....thanks
Thanks Rama ☺️☺️
Really content. Bravo from France 👏👏👏
Thank you Mael 😊
Yes I must say this is very concise and how he names the commercial vendors as examples really augments the value further.
Thank you so much brother, for clarifying some of the concepts.. Truly appreciate it. Can you suggest - Which way is the Tech Heading now - Data Warehouse Vs. Data Lake? Are DWH a thing of past?
Thanks Sourabh, DWH is here to stay, it's not going anywhere. Today the data world has become enormous, and there is space for DWH and DL to coexist; also, a data lake cannot solve every business problem. There is a hybrid approach coming up wherein you have your DWH on top of your data lake
Data Mesh
I am prepping for an interview and preparing how to talk about this topic. You explain this very simple and easy to follow. Thank you.
Very neat and simplified approach. Cleared my doubts about the need for data pipeline vs ETL. Thanks for sharing!
Great! The part that I liked the most was the one in which he explained the difference between ETL and data pipeline
Anshul: Thanks a lot for this great video, you not only explained clearly the concepts, but also gave us the name of useful products for doing each step of the process. Thank you very much.
Very crisp and clear explanation of data pipeline. Thank you very much for explaining in detail. Much helpful.
Thanks Smita
One of the best tutorials in youtube so far which gives an overview of data engineering process and that too within 10 minutes. Really appreciate your effort and time you put into making this video. Thank you so much. Please keep doing more such tutorials.
Excellent Video. In simple Diagram explained very neatly about Batch and Realtime pipeline along with Data Pipeline architecture. Kudos!!
Great video. Thanks for the brief explanation of data pipeline. I‘m not a technical guy more from the business side and I could easily follow your concept. So, kudos for you and a big thank you!
Thanks Konard 🙏☺️
Brother, nobody has explained it in such a lucid style. Great work!!
Third time I think I am saying this on your channel comments. Your tutorials are useful for professionals like me - I am a marketer in a SaaS product firm. I come here regularly so that I can be better informed when interacting with developers. Well done. And more importantly thanks!
Thanks Karthik for adding your comments regularly, I read it and I am grateful for your support 🙏❤
Couldn't have asked for more. Very well explained, Thank you mate.
in my life this is the best explanation I ever heard ,PERFECT .keep doing that good luck sir.🙏
this was such a well detailed explanation of data pipelines and more. I am so enlightened!! Structured so well. Thank you!
One of the best presentation to know more about data pipeline. thanks.
this is a very good explanation. one of the best technical videos I've ever watched on YT. thank you!
Finally understood the pipeline in 10 minutes... thank you
Such an awesome explanation, short, crisp and to the point. Great!
Thanks so much! Just subscribed! My knowledge just increased 100 fold.
Thanks a lot
Superb! In 10 minutes, you have put such a clear picture of data pipeline in my mind that I will never forget! Many thanks for your time and sharing this valuable piece!
Very clear and detailed explanation. You put the parts I know in context and helped expand my current knowledge. Thank you
Your video helped me to better understand the steps in the process. I really love the way you explain the process. Thanks Master!
So much valuable content in such short duration video... with so much clarity. Awesome !! Thank you !!
Fantastic!!! Thanks for your time and explaining the basics!!!
My pleasure!
Thank you.. quick, simple and informative. It allows me to understand more difficult information on data pipelines. Also I’m subscribed.
Thanks Tristan for support 🙏
Hats off Anshul Sir ji... thank you for sharing all the knowledge in so simple way.
Thanks Ashish hope you are doing well
Big concepts explained very quickly in an easy to understand manner. Thanks!
A simple and superb explanation about Data pipeline structure. Thanks a lot. Really appreciate!
Too good explained thanks a lot for the info. I subscribed expecting more and more such videos which motivates us too for Data engineering career
WoW....Your lecture on Data pipeline is so simple and lucid to understand unlike other youtubers. I loved the way you have explained the concepts....Please do more videos on Azure and AWS...Count on me as a new subscriber has been added!!
Very well explained session on Data Pipeline and comparison with traditional ETL. Thanks so much!
You've made it simple and clear, thanks for sharing!
Thanks Andre :)
Great job man! Very straight to the point and very informative. Thank you so much!
best video ever to learn from, it really helps me to understand this topic. just loved it. just go for it without a second thought... I can assure you.
So glad that I've found your channel on YouTube. Thanks a lot!
Very good tutorial with valuable explanations. Thanks.
Thanks Othman
Good one to understand Data Pipeline. Thanks!
Thanks🙏
This was just an amazing video, especially for someone new to the subject. Thank you!
Thank you so much for this video! Very informative and gives holistic idea about the data pipelines. We use data pipelines in our project too. This is very useful. Keep up your great work
Thanks dear 🙏🙏
wonderful. God bless you. Very detailed high level explanation . Boss!!!!!
Thanks Ife for your support and wishes 🙏🙏☺️
Amazing analogy. Amazing explanation of data pipeline. This is just awesome.
Thanks so much for this explanation! I've been looking for a couple of days for a video to understand this subject, and this one was without any doubt the best! Short, sharp, with visualisation that made it really approachable, amazing🙏🏻
thanks dear I am truly grateful for your kind words 🙏😊♥
This is such a clear and useful explanation. Thank you!
Very nice architecture in a simple hand drawn picture and presentation also. Awesome job
Really Helpful, Succinct and easily digestible video - great overview in just 10 minutes. Thanks!
Hello Anshul.. Thanks a lot for the video. This one in particular demystified a lot of concepts for me. Loved the examples and analogy.
Thanks Karthik
Great video about the overview. Explained in very easy and holistic way. Exactly what I needed.
Thank you for your high-quality videos! In our use case, we ingest daily a .zip file containing 3 .csv’s related to sales, inventory and orders from different shops (20-30) and CRMs (4-5 ; each one with its own naming convention, dtypes, …).
How would you improve the following pipeline?
- Raw zip files are uploaded to a GCP bucket
- The upload triggers a Python GCP Cloud function that transforms the data to create single naming/dtypes conventions and brief new columns (e.g. timestamp by merging date + time)
- Transformed data is uploaded to MongoDB - 3 separate collection for sales, inventory and orders - and raw .csv’s to a separate GCP bucket as parquet files (1 folder for each CRM and PoS as subfolder)
- A PubSub message posted by the function triggers a GCP Function that loads processed data from MongoDB, applies ML models and stores results in separate collections (1 for each analysis type; e.g. forecast, anomaly detection, …)
- A Python web app directly reads ML output data from MongoDB
Thank you so much and love your videos; 🤗
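For readers following the pipeline in the comment above, the transformation step (single naming/dtype conventions plus a timestamp built from date + time) might look roughly like the sketch below. It is a minimal illustration using only the Python standard library, not the commenter's actual Cloud Function: the source names `crm_a` / `crm_b`, the column names, the date format, and the `normalize_row` helper are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical per-source column mappings; each real CRM/PoS export will differ.
COLUMN_MAPS = {
    "crm_a": {"Order ID": "order_id", "Qty": "quantity", "Date": "date", "Time": "time"},
    "crm_b": {"order_no": "order_id", "units": "quantity", "day": "date", "hour": "time"},
}

# Target types for the canonical columns (assumed for illustration).
CASTS = {"order_id": str, "quantity": int}


def normalize_row(source, row):
    """Map one raw CSV row (all string values) onto the shared schema."""
    mapping = COLUMN_MAPS[source]
    out = {}
    for src_name, value in row.items():
        canonical = mapping.get(src_name)
        if canonical is None:
            continue  # columns without a mapping are dropped
        out[canonical] = value.strip()

    # Apply one dtype convention across all sources.
    for column, cast in CASTS.items():
        out[column] = cast(out[column])

    # Merge the separate date and time columns into one timestamp.
    date_part = out.pop("date")
    time_part = out.pop("time")
    out["timestamp"] = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")

    out["source"] = source  # keep lineage so rows can be traced back to their origin
    return out


if __name__ == "__main__":
    raw = {"order_no": "A-17", "units": "2", "day": "2023-01-05", "hour": "09:15:00"}
    print(normalize_row("crm_b", raw))
```

Keeping the mappings in plain data structures, rather than in per-source code branches, means that adding another shop or CRM is a configuration change; whether that fits the setup described above depends on how different the exports really are.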