Woah woah
I know nothing about AWS
Then why the heck did I totally, totally understand this video?
It was crystal clear.
I usually don't subscribe to channels, but this time it was not even a question 👍🏾
Man, I wish you could teach me all about data engineering.
Thanks for the wonderful feedback! I'm glad the way I explained it was helpful. Thanks for subscribing! More AWS related content to come.
@@DataEngUncomplicated thank you
Eagerly waiting for more content 😁
Thanks, I try to strike a balance between overview videos and technical tutorials on AWS. Fun fact, I think you are my 5,000th subscriber!
@@DataEngUncomplicated 👏🥳🎉🎊
You will understand it, but you will fail to answer interview questions unless you actually use these services. Theory is always easy to understand.
As a self-taught data engineering student, figuring out which AWS services to start with is very hard - this indeed uncomplicates everything!
Thank you for the kind words Renz! I'm glad it was helpful.
Hey there, could you give a link to a resource for data engineering? I'm about to start a job in DE and I'm kind of intimidated by the various skills needed for the job. I already know Python and SQL (which is why I was hired, or so I'm told), but I know nothing about DE. I'm about to start this Udemy course on Python, SQL, and PySpark, but I'm afraid it might not be enough. Any help would be appreciated, thanks!
@@francismagnusson378 Hi, how did you proceed? How is your job going?
This video is really great. As an ETL developer, I aspire to become a data engineer in the next few years. Your explanation is very clear!
Glad it was helpful!
Your channel is a godsend. Data engineering channels are rare on YouTube, and those that do exist are tailored towards Indian students. Thank you for the content, and you've got a new subscriber.
You're welcome! Thanks for subscribing!
Hands down the number 1 video for beginner Data Engineers
Thanks for your kind words!
The concepts in this video went inside my brain like a hot knife going in butter. Great video for someone like me who comes from a functional background. Great work...really appreciated.
This is an excellent video for someone who has a database background and is willing to move into the AWS side; this is something I was looking for. Wow, full appreciation from me. It may look simple, but for me it was great, because it gave me direction on what to pick up in AWS when there are 1000 services. Thanks a lot.
Thanks for your kind words! I was hoping this video would be a good starting point for folks into data engineering and AWS!
I am starting as a Data Engineer with a company that uses AWS (I come from an Azure background); this video has been really helpful with the architecture and services.
Thanks Tiisetso! I'm glad this video was helpful. Thank you for leaving me a comment
Great video. Something to add here: S3 Select can be used for quick, ad hoc querying of a single S3 file. Athena can also work directly with S3 files if you just need some quick data understanding and investigation. EMR Serverless can address the headache of managing an EMR cluster and in the meantime gives you more power for ML.
I'm currently working on my AWS certification and will be referencing the diagram from this video often. Thanks for the clear and concise walk through of the context of each of these services!
You're welcome Chris, I'm glad it was helpful. Good luck on your AWS certification!
One of the best AWS explanations I've seen so far
Thanks Shrey, much appreciated!
So far it is the most helpful video I have seen about AWS services for DE. I hope there are more like it. Thanks a lot for sharing!
Thanks for the comment! Yes, new videos related to data engineering and AWS every week!
As someone else commented, I'm learning to be a data engineer, and learning what each application is used for has been a struggle. I'm learning the Azure system, but seeing this visual helped. New sub.
Thanks Obie! Much appreciated
Quickly subscribed. Currently an AWS Cloud Engineer for an AI company, so I've been upskilling in Data Engineering. Planning to take the DEA-C01 exam. Great information and your presentation style is perfect!
Thanks so much for the kind words! I'm glad it was helpful. Good luck on the exam!
Love your clarity on the topic. Subscribed! Can't wait to explore all your videos👀
Thanks for the feedback and subscribing!
Still a great overview. Makes everything a lot clearer. Thank you.
Glad it was helpful!
Thanks for creating this video. You explained the concepts very clearly.
You're welcome. I'm glad you found the video helpful
I found this video very useful as a learner. Thank you!
Thanks Sushila, I'm glad you found it helpful!
@DataEng Uncomplicated. This has to be one of the best explanations of how I can use AWS for my data analytics engineering workloads. Thank you for the detailed summary of the various services.
Thanks Victor, much appreciated!
Great video Adriano! It helped me understand all the AWS services better.
Thank you! I'm glad the video was helpful.
This is pure gold. Thanks!
Great overview and I think your method of slowly explaining the diagram section by section is brilliant! A follow up video of a real use case would be even better. Subbed!
Hi Adrian, Thanks for the feedback and subscribing! Can you elaborate on your suggestion? Are you thinking of an actual hands on tutorial or overview use case type video?
Amazing Man!! Good one
Thanks!
I had watched so many other videos on the same topic. This is the one I was looking for, even though I didn't know exactly what I was looking for, as everything was new.
Thanks sail! I'm glad it was what you were looking for. What were you searching for on YouTube exactly?
@@DataEngUncomplicated I am familiar with the Hadoop environment. I wanted to know how to do all of it in AWS. Now I know! Thanks
Amazing Video!
Thanks Dan!
This was perfect! Exactly what I was looking for lol.
Thanks Andraya!
I am glad I found this video. Brilliant overview. cheers !!
Thank u so much. Your tutorial helps me a lot.
Such a great video! Summarized basic AWS services for data engineering very nicely! One of the best! Thanks!
Thank you very much!
Super solid. Any chance of an updated 2025 version? You rock, and I love the flow and viz.
A great video to link up all the AWS components. I guess AWS likes open source 😂
You sir, are the main man. Thank you.
Haha. You're welcome!
fantastic video. Thanks for this.
I am looking for hands on experience. This video helps me understand concepts better
Thanks Senthil, I'm glad it was helpful.
Excellent video.. the sequence you have covered this in is seamless.
I am surely having this for quick reference.
Thanks so much for your kind words Saidulu. I really appreciate it.
Great video. Super well summarised
Thanks Nic!
Great video, thanks!
Thank you!
Awesome job!
Clear and concise
simply well articulated
Thanks for the content
This is such a great video. Any chance you will be doing a full-fledged video on implementing these tools together? And I love your teaching style; I would love to know if you offer any courses that I can take.
Hi Nazz, thank you for your kind words! Yes, I plan on making a playlist that has technical tutorials on implementing each component, so if you subscribe to my channel, you will get notified when those videos are released! Unfortunately I don't offer any courses at this time; I'm just focusing on making YouTube videos to help data engineers on AWS!
Very informative, thanks!
Great video!! Thanks for sharing, it really helped me better understand AWS tools
You're welcome. I'm glad it was helpful!
Thank you so much for this video! I have an interview tomorrow and this boosted my understanding and confidence. Great explanations!
Thanks, Best of luck with your interview!
God bless the works of your hand....great job
Thank you Oluwatobi!
Channel name checks out
You are a legend sir
Thanks for your kind words John!
Why is there an arrow from the AWS Glue Catalog to the data warehouse (Redshift)?
The Glue Catalog works on databases as well as data lakes, so you can define your Redshift datasets in AWS Glue to keep track of them
It works! Thanks a lot.
love this
If we clean the data after loading it into S3, this would be ELT right?
Yea you got it!
Great! Can you do complete end-to-end AWS coding videos covering ETL and analytics?
wow this is awesome!
Great video thanks
You're welcome Sergio, thanks for leaving a comment.
Great video! AWS AppFlow is missing, though. Thank you
Great point, I know this service is being used more recently
Question: If you are pulling data from external APIs, would you use Glue to do this, or would you use something else to get this information and store it in S3 first, and then use Glue to transform the data in S3?
Great question Dave! I would recommend using Lambda functions to ingest the data into S3. Glue is for processing large amounts of data and has a bit of a startup time. You will probably want a Lambda function pulling data from your API frequently, so the data load size would probably be relatively low.
Thanks, that's the direction I went.
I have a meta Lambda, a datasource Lambda (one for each data source) and an S3 upload Lambda, all orchestrated with Step Functions.
This way the meta Lambda gets the customer information required and spawns parallel datasource Lambdas, which all pass data to S3 upload Lambdas.
My only concern is how to best structure it in S3 for Glue.
End goal here is Athena / QuickSight for BI purposes.
I looked at AppFlow for some of this but hated it, since I couldn't get all the objects at one time and had to build an object per flow. So if a single data source has a lot of objects, that's a lot of flows, which seems annoying.
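The fan-out described above (a meta Lambda, parallel datasource Lambdas, then an S3 upload Lambda) could be expressed as a Step Functions state machine roughly like this. It is only a sketch, not the commenter's actual definition: the state names, the `$.meta.data_sources` path, the concurrency limit, and the ARNs are all hypothetical placeholders.

```python
import json


def build_state_machine(meta_arn: str, datasource_arn: str, upload_arn: str) -> dict:
    """Build an Amazon States Language definition for the fan-out pattern."""
    return {
        "Comment": "Sketch: metadata lookup, then one branch per data source",
        "StartAt": "GetCustomerMetadata",
        "States": {
            # Step 1: the "meta" Lambda looks up the customer information.
            "GetCustomerMetadata": {
                "Type": "Task",
                "Resource": meta_arn,
                "ResultPath": "$.meta",
                "Next": "FanOutDataSources",
            },
            # Step 2: a Map state runs one iteration per data source.
            # Assumes the meta Lambda returns a list under "data_sources".
            "FanOutDataSources": {
                "Type": "Map",
                "ItemsPath": "$.meta.data_sources",
                "MaxConcurrency": 5,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "PullDataSource",
                    "States": {
                        "PullDataSource": {
                            "Type": "Task",
                            "Resource": datasource_arn,
                            "Next": "UploadToS3",
                        },
                        "UploadToS3": {
                            "Type": "Task",
                            "Resource": upload_arn,
                            "End": True,
                        },
                    },
                },
                "End": True,
            },
        },
    }


definition = build_state_machine(
    "arn:aws:lambda:us-east-1:123456789012:function:meta",        # placeholder
    "arn:aws:lambda:us-east-1:123456789012:function:datasource",  # placeholder
    "arn:aws:lambda:us-east-1:123456789012:function:s3-upload",   # placeholder
)
print(json.dumps(definition, indent=2))
```

A Map state is used here instead of a Parallel state because the number of data sources can vary per customer; with a fixed set of sources, a Parallel state would work too.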
@@DataEngUncomplicated Thanks. I went with Step Functions.
A master function gets the customer metadata and spawns functions for the different data sources, which all end up calling an S3 upload function.
Now my only concern is whether I am storing the data properly in S3 for Glue to make use of.
Something like
# Format the file name based on the current date and data type
file_name = f"{data_type}_{year}-{month}-{day}.json"
# Update the S3 key (path) to use the 'year=YYYY/month=MM/day=DD' partitioning convention
s3_key = f"{customer_id}/year={year}/month={month}/day={day}/{data_type}/{file_name}"
@@DaveThomson I hope you figured this structure out by now, but if you want to use Athena, you need to have your datasets separated into different prefixes (folders) in S3. I would add a partitioning strategy as well, which will save you query costs if you know how your data will be queried.
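One way to read the advice above is to put the dataset name first in the key, so that each dataset lives under its own prefix, and to use Hive-style `name=value` partition folders beneath it. The sketch below is an assumption about what that could look like; the `build_s3_key` helper and the `customer_id=` partition are illustrative and not something the thread prescribes.

```python
from datetime import date


def build_s3_key(data_type: str, customer_id: str, day: date) -> str:
    """Return an S3 key with one top-level prefix per dataset and Hive-style partitions."""
    file_name = f"{data_type}_{day:%Y-%m-%d}.json"
    return (
        f"{data_type}/"                                   # one prefix (table) per dataset
        f"customer_id={customer_id}/"                     # partition by customer
        f"year={day:%Y}/month={day:%m}/day={day:%d}/"     # partition by date
        f"{file_name}"
    )


print(build_s3_key("orders", "c123", date(2024, 3, 7)))
# orders/customer_id=c123/year=2024/month=03/day=07/orders_2024-03-07.json
```

Which columns to partition on depends on how the data will be filtered in queries, so the choice of customer and date here is only a guess at a typical BI access pattern.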
@@DataEngUncomplicated Thanks!
This is pure perfection. I read a book with similar content, and it was pure perfection. "Mastering AWS: A Software Engineers Guide" by Nathan Vale
Thanks Larry!
What's the alternative for loading data from the curated zone to Redshift and Athena? Lambda + Glue, or is that not necessary?
When you say alternatives, do you mean AWS-native alternatives? AWS announced an auto-load feature from S3 to Redshift, I guess assuming that the schemas are the same.
@@DataEngUncomplicated Yes. For example: Lambda and EMR before the curated layer. After that, from curated to Redshift and Athena, do you recommend any AWS service to load the data?
Your work is truly impressive; it reminds me of a book I read that had a similar impact. "AWS Unleashed: Mastering Amazon Web Services for Software Engineers" by Harrison Quill
You are awesome
Great introduction to these services. Is there a specific data integration service to get data from a Salesforce (CRM) source?
AWS Glue appears to have Salesforce connectors, so that would be an option. I'm sure you could do it in Lambda functions as well if your data is small enough
Thanks for sharing
You're welcome!
Wowww excellent video. Thank you very much. Is there any course that you could recommend to learn these specific tools?
I'm preparing for a data engineer interview. The company is looking for someone good at creating pipelines in AWS. I'm going to use your videos. I have read so many different definitions of "ingestion". Ingestion comes right after extraction in the ETL process, right?
Thanks! Glad the videos are helpful. I hope your interview went well!
Can you do the same but for Azure services?
Hi Suleiman, sorry, I'm not as familiar with Azure services. AWS is what I am currently focused on.
Hi there, thanks for such a wonderful explanation of a complex topic. Can you share the diagram picture through a link please?
I have one doubt. Can we host multiple kafka producers in one ec2 instance?
Are you talking about using Amazon Managed Streaming for Apache Kafka?
@@DataEngUncomplicated yes!
Hi, Great video. Can you give suggestions on how to start learning these services?
Sure, I can provide some suggestions on how to start learning AWS services:
1. Start with the AWS Free Tier: AWS offers a free tier for many of its services, which allows you to explore and experiment with them without incurring any charges. This is a great way to get started and familiarize yourself with the AWS platform.
2. Take online courses and tutorials: AWS provides a wealth of resources for learning, including online courses, tutorials, and documentation. You can start with the AWS Training and Certification website, which provides a range of free and paid courses on various AWS services.
3. Join AWS user groups and forums: Joining user groups and forums can be a great way to learn from other AWS users and get answers to your questions. AWS provides an official forum, as well as many user groups around the world.
4. Practice with real-world scenarios: Once you have a basic understanding of AWS services, try to apply what you have learned to real-world scenarios. This will help you understand how the services work together and how they can be used to solve real-world problems.
5. Get certified: AWS offers a range of certifications for different roles and levels of expertise. Getting certified can be a great way to demonstrate your skills and knowledge to potential employers.
Is there a pdf file to print out the diagrams
No sorry, unfortunately I don't have one.
Hi, What Services should I use if I have a source which sends CSV files and the schema changes every week? The column names are different and new columns were added each time. Ideally need to expose the data from these files into tables. Any suggestions as to which services should I use?
Hi Draco, it sounds like the Glue Catalog would be a good choice to register your data in, as it handles drifting schemas. You can use a crawler to automatically scan the files and identify the changes in the schema
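As a rough sketch of the crawler suggestion above, the snippet below builds the arguments for boto3's Glue `create_crawler` call with a schema change policy that updates the catalog table when the schema drifts. The crawler name, role ARN, database, and S3 path are hypothetical, and the `crawler_params` helper exists only for this illustration.

```python
def crawler_params(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build keyword arguments for glue.create_crawler()."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "SchemaChangePolicy": {
            # Update the catalog table definition when the crawler sees a new schema.
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            # Only log when previously seen objects disappear.
            "DeleteBehavior": "LOG",
        },
    }


def create_and_start_crawler(params: dict) -> None:
    """Create the crawler and run it once (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    glue = boto3.client("glue")
    glue.create_crawler(**params)
    glue.start_crawler(Name=params["Name"])


params = crawler_params(
    name="weekly-csv-crawler",                                # hypothetical
    role_arn="arn:aws:iam::123456789012:role/GlueCrawler",    # hypothetical
    database="raw",                                           # hypothetical
    s3_path="s3://my-raw-bucket/weekly-csv/",                 # hypothetical
)
```

How well this copes with renamed columns, as opposed to newly added ones, is worth testing on real files, since a renamed column may simply show up as a new column in the table.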
Good content! But how about AWS Managed Workflows for Apache Airflow for orchestration? Wouldn’t it be better to orchestrate lambdas and glue jobs with MWAA?
Hi Gabriel, thank you! Yeah, this is a great point; this could have been a service added to the orchestration component of the diagram. It's a good option, but I don't think it's necessarily "better", since it's another server you have to pay to keep running 24/7, whereas Step Functions and Glue orchestration are serverless and you only pay per x # of invocations.
@@DataEngUncomplicated Thanks for the answer and the great insight there. I guess going serverless is always the best option. But are the execution logs from both Glue orchestration and Step Functions accessible in CloudTrail?
The logs for Glue and Step Functions are actually accessible in CloudWatch Logs.
Bruh, thank you SO MUCH!
Thanks Andrew!
Very informative video, thank you. I am trying to learn data engineering and do some real-world projects. Could you create a few videos on end-to-end data engineering projects, and also share some real-world project ideas to try?
Hi Sandeep, yes, this is high on my video list! Thanks for the suggestion!
How would I incorporate AWS into my project if I am using a website's API as the source of my data?
It's not much data. A max of a couple hundred lines, but I still want to be able to show an employer I can use different services
I think you're asking how to ingest data from an API into AWS? There are many ways to do this, but for your purpose you can write a Lambda function that uses the requests library to read data from the API and use the Python library AWS Data Wrangler to write the data to S3.
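A minimal sketch of such a Lambda function is shown below, under a few assumptions: the API returns a JSON list of flat objects, `API_URL` and `BUCKET` are placeholders, and the `pandas` and `awswrangler` packages are available to the function (for example through a Lambda layer). The standard-library `urllib` is swapped in for `requests` here so the sketch has no extra dependency for the HTTP call.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_URL = "https://example.com/api/items"  # hypothetical endpoint
BUCKET = "my-raw-bucket"                   # hypothetical bucket name


def fetch_records(url: str) -> list:
    """Call the API and parse the JSON body (assumed to be a list of objects)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read())


def target_path(bucket: str, dataset: str, now: datetime) -> str:
    """Build a date-partitioned S3 prefix for this load."""
    return f"s3://{bucket}/{dataset}/year={now:%Y}/month={now:%m}/day={now:%d}/"


def lambda_handler(event, context):
    # Imported inside the handler so the helpers above run without these packages.
    import awswrangler as wr
    import pandas as pd

    records = fetch_records(API_URL)
    df = pd.DataFrame(records)
    path = target_path(BUCKET, "items", datetime.now(timezone.utc))
    # dataset=True treats the path as a prefix and writes Parquet files under it.
    wr.s3.to_parquet(df=df, path=path, dataset=True)
    return {"rows": len(df), "path": path}
```

For a couple hundred rows per call this stays well within Lambda's limits; the function could be run on a schedule to pull the API regularly.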
@@DataEngUncomplicated That's exactly what I was asking, thanks. Can you make sure this sounds correct though?
1. AWS lambda to ingest data from API call and write that data to an s3 bucket
2. Read data from s3 using Python notebook file (that is using PySpark package) or read data from s3 using AWS EMR
@@rememberthename911g yup this works, you might want to define your data source in a glue catalog table so it will be more easily ingested into a glue job or pyspark job.
thanks man
You're welcome!
Thank you for the great video.
I have one question. Wouldn't it be very costly to use all of the AWS services? I store lots of data in S3 and it costs $100-150 a month.
Hi, it all depends on your use case for your data and your access patterns. For example, although you have 10 TB of data, it doesn't mean you are querying all 10 TB in every query; more likely you are only running queries on subsets of your data.
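To put rough numbers on that point, here is a small worked example. It assumes a query engine that bills by data scanned (Athena works this way) at an assumed rate of $5 per TB, and it assumes the 10 TB is spread evenly over 365 daily partitions; check current pricing and your own data layout before relying on these figures.

```python
PRICE_PER_TB_SCANNED = 5.00  # assumed rate in USD; verify against current pricing


def query_cost(tb_scanned: float, price_per_tb: float = PRICE_PER_TB_SCANNED) -> float:
    """Cost of a single query that scans `tb_scanned` terabytes."""
    return tb_scanned * price_per_tb


total_tb = 10
full_scan = query_cost(total_tb)        # query touches every byte
one_day = query_cost(total_tb / 365)    # query filtered to a single day's partition

print(f"Full scan: ${full_scan:.2f}")
print(f"One day:   ${one_day:.2f}")
# Full scan: $50.00
# One day:   $0.14
```

Storage and query costs are billed separately, so the monthly S3 bill says little about what querying a well-partitioned subset will cost.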
Interesting! Great job, but the author did not speak on security. I think we need security too.
Thanks Dan, you are right, I left out security. I would add KMS to the security section for individuals who want to encrypt their data with KMS keys
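As an illustration of the KMS point, the sketch below builds the arguments for an S3 `put_object` call that asks S3 to encrypt the object with a specific KMS key (SSE-KMS). The bucket, key, and key alias are placeholders, and the helper names are made up for this example.

```python
def encrypted_put_kwargs(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build keyword arguments for s3.put_object() with SSE-KMS enabled."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # encrypt at rest with a KMS key
        "SSEKMSKeyId": kms_key_id,          # which KMS key to use
    }


def upload_encrypted(kwargs: dict) -> None:
    """Upload the object (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    boto3.client("s3").put_object(**kwargs)


kwargs = encrypted_put_kwargs(
    bucket="my-curated-bucket",           # hypothetical
    key="orders/2024/03/07/orders.json",  # hypothetical
    body=b'{"order_id": 1}',
    kms_key_id="alias/my-data-key",       # hypothetical key alias
)
```

Setting default encryption on the bucket is an alternative that avoids passing these arguments on every upload.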
AWS is not a career but just a cloud platform, right? Somewhere we can put our skills to use and start working in a cloud-based environment, right or not? Please clear this up for me: if I come directly from a non-tech background with no data analytics experience and pursue the AWS data analytics certification, preparing through the learning material provided by AWS plus hands-on practice, would I get a job easily? Or do I need to specialize in all 200 services, and also other things like Python? Please guide me, I'm not getting an answer to this anywhere.
Hello, these are good questions that lots of people starting with AWS might have! Yes, AWS is just a cloud platform. I would say you should still have the foundational data analytics skill set in order to be successful. You definitely don't need to specialize in 200 services to get a job. I would focus on learning the services that are relevant for a particular role. Nobody knows every single AWS service; there are just too many. As for whether an AWS certification is enough to get a job, it all depends on the role, the employer and what they are looking for. I would say it can't hurt your chances of getting a job if you are looking for a role that involves AWS.
can you do entire project for this?
Yes, I have done projects using most of these services in the past.
Very well done! Don't forget the importance of data lineage though. Big time clients always want the capability to visually track data lineage.
Thanks Jamison! Yes! This is important, but there isn't a dedicated data lineage or governance service released yet in AWS... DataZone was announced at re:Invent, which should hopefully fill this gap
Do you do any consulting?
Hey David, I'm actually a full-time AWS D&A consultant for a company that is an AWS partner. Let me know if you want to chat.
@@DataEngUncomplicated I would like to chat. I too work full time for a partner.
@@DaveThomson Great, feel free to contact me through the email I have posted on my channel.
@@DataEngUncomplicated sent you an email.
thanks for this. do you have a course on Udemy on Data Engineering?
Thanks Dare, unfortunately I don't yet
please for data scientist
Subscribed
Great video! 2 questions:
1) Any reason you didn’t mention DMS?
2) What services help you out with database changes (deltas)?
Hey Ben, good callout on DMS. DMS is a good service for data engineers to learn if their focus is on data migration. For database changes, I have used both AWS Glue and Lambda functions, depending on the size of the data, building the delta logic in Python.
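The reply does not say what that delta logic looks like, so the following is only one simple interpretation: compare the previous and current snapshots of a table by primary key and classify rows as inserts, updates, or deletes. The `compute_delta` name and the `id` key column are assumptions made for this sketch.

```python
def compute_delta(previous: list, current: list, key: str = "id") -> dict:
    """Compare two snapshots (lists of row dicts) and classify the changes."""
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}
    return {
        # Keys that only exist in the new snapshot.
        "inserts": [curr[k] for k in curr if k not in prev],
        # Keys in both snapshots whose row contents differ.
        "updates": [curr[k] for k in curr if k in prev and curr[k] != prev[k]],
        # Keys that disappeared from the new snapshot.
        "deletes": [prev[k] for k in prev if k not in curr],
    }


yesterday = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
today = [{"id": 1, "name": "a"}, {"id": 2, "name": "B"}, {"id": 4, "name": "d"}]
delta = compute_delta(yesterday, today)
```

This in-memory comparison suits the small volumes a Lambda function would handle; for larger tables the same idea would more likely be written as a join in a Glue (Spark) job.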
Use AWS Batch to Batch Data Ingestion
hopefully zero ETL is going to change a major chunk of dependencies when managing the data within the aws ecosystem.
Great video. I have been working as an AWS data engineer for the past two years; my overall experience is 11 years.
Could you recommend which certification I should do as a data engineer? I'm confused because different types of AWS certifications exist.
Hi Naveen! Yeah, it is confusing because there isn't really a specific data engineering certification. The Developer Associate and the AWS Data Analytics Specialty are the best ones. I would also go after the Database Specialty if you think you will be working a lot with databases
As the channel name says you're making things uncomplicated. 🎉😅
interesting
PRO cess. LMAO
you know it's funny...I didn't even realize I say that :D
Millions or billions...
I have no context for what this means, but I'm going to respond with: we can process millions or billions of records in data engineering with AWS 😉
Hi sir, I am currently learning SQL and Python. Should I start learning Big Data? I don't know the proper way to start. The way you explained things was so good that I felt like asking. I would be really grateful if you could help me through this. How can I contact you, sir?
Hello, I know there are a lot of concepts and technologies to learn! You can reach me at dataenguncomplicated@gmail.com