Idk which country you're in, but in Europe, unless you want to be a senior data engineer, probably only 5% of companies even ask about your schooling. Job-specific knowledge matters more than the highest diploma in some major, whether that's head chef, plastic surgery, or web. That makes sense.
Comparing data engineer to a backend developer. Which one has the better work life balance? Which work is more tiring in your opinion? Also, which one is more prone to be on call? Which one is more suitable for remote work? What about the pay? Also, can I transition to backend dev from DE if I like to after a few years, any transferable skills? I see significantly more backend positions, but is DE still a secure job if I am planning to work remote in future and live in different countries? Thanks.
A common sentiment is that Data Engineers get paid considerably less than Software Developers/Engineers in general. Do you think that is true, and if so, why do you think that is true? And if it is true, why be a Data Engineer if you could easily transition to be a standard SWE and get paid considerably more?
It can be true. It does depend where you work. For example, Netflix and Lyft pay their data engineers like SWEs from what I hear. But yes, at Facebook data engineers make less.
Have 10+ years as an ETL developer and made the switch last year to a data engineer working with dbt and git on a Snowflake database. This year I want to start making content but find it hard to actually start. Do you have suggestions?
The biggest tip is don't worry about being perfect. Just try things. If you want to make videos, do it. Don't worry if your mic is perfect or your camera is perfect. First, work on practicing actually outputting content.
I don't know if it's just me, but many companies believe that somehow a data engineer is also a backend one. Same thing between data engineer and database admin/analyst.
Great content as always! For career-changers without much of a data/tech background, do you recommend aiming for a Data Analyst position as a first step and transitioning to Data Engineering later? Or is going straight for Data Engineering doable as well?
It would probably be easier to go from analyst to DE. If you saw, the Booz job required 1 year of analyst experience, which does make sense to a degree. So I think that can be a natural shift.
I think getting some form of CS degree is great. That way it's easier to switch jobs in case you don't like data engineering. Never know what you like until you start doing the work.
I don't really think there are too many cons. Other than a lot of what you will learn won't provide value...but that can be said about a lot of degrees. Even my MIS degree...other than the database and programming courses..there wasn't a ton of value.
Great video. Currently focused on learning Python and SQL and I hope by end of 2022 to have solid understanding of all the mentioned subjects in your video (SQL, Python, Data Modeling, ETL, Data Warehousing, etc). However, I don't have any Bachelor's Degree whatsoever. I have been working in an IT Helpdesk environment (5 years as L1/L2, currently 6 years as Helpdesk Manager so all in all 10+ years) though. You think that will count for something and will allow me to land a junior/mid DE job?
It could help. It is hard without a bachelor's degree. But maybe try applying? You never know. I have worked with people with all different backgrounds.
Hey, I work as a Client Partner in a technology company. We have had plenty of data engineers, developers, and DevOps engineers interviewed by our customers, and from what I have seen in the past 4 years, they are more or less focused on skillset and experience and less keen on degrees. We have over 250 developers and engineers working with customers across the globe. Keep going, I believe you are on the right path, and give me a shout if you need help with your resume or if you're interested in working with us. Cheers
@@SeattleDataGuy what if I have a bachelor's degree in industrial engineering? At my university they taught me Python and RStudio, all focused on data and how to understand stats and work with them, but when I search for a job I don't know if it will help.
Hey thanks for sharing, again. I got a question, is it only me or I can't find too many channels related to data engineering, as much as other software areas? Do you know why is that?
Ben, I want to pursue data engineering. I have been an analyst in a few different industries the past three years. I am wondering if moving to a mid-level data engineering position makes sense given my experience?
It may or may not. It depends on if you were doing DE style work like ETL development, Data modeling, and so on. If you have, maybe there is a chance. But, generally speaking 3 years is enough to just break you into mid-level (at most faangs). So it might be hard to make such a technical jump to mid-level without a solid tech foundation.
Hey Ben, I was wondering if you think the O'Reilly book 'Learning SQL' would be sufficient preparation in terms of the SQL knowledge needed for working on a project and for cracking interviews?
I haven't checked it out. I think overall, when it comes to learning SQL, you need the basic SQL clauses, window functions, and some understanding of data modeling; everything else can likely be googled. Spending time figuring out how to format dates is a huge time suck. Every DB has a different way to format a date, and you could lose a lot of time trying to remember every date formatting function.
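For anyone following along, here is a minimal sketch of the two things mentioned above, a window function and a date-formatting call, using Python's built-in `sqlite3` module. The `orders` table and its rows are made up for illustration, and window functions assume the bundled SQLite is version 3.25 or newer. The `strftime` call is SQLite-specific, which is exactly the point about dates: PostgreSQL uses `TO_CHAR`, MySQL uses `DATE_FORMAT`, and SQL Server uses `FORMAT`.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('ann', '2022-01-05', 20.0),
        ('ann', '2022-02-10', 35.0),
        ('bob', '2022-01-20', 15.0);
""")

query = """
    SELECT customer,
           strftime('%Y-%m', order_date) AS order_month,  -- SQLite-only date formatting
           amount,
           SUM(amount) OVER (
               PARTITION BY customer   -- restart the total for each customer
               ORDER BY order_date     -- accumulate in date order
           ) AS running_total
    FROM orders
    ORDER BY customer, order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The window function keeps every row and adds a per-customer running total next to it, which a plain `GROUP BY` cannot do without collapsing the rows.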
It's a hard sell, but not impossible. I think most companies technically want you to have a bachelor's. But what that is in doesn't have to be tech. My manager at Meta who also worked at Netflix had an English degree. Experience does tend to trump everything. But it depends where the experience is at. You can see if you can try to pivot your current job into a larger company.
The company I work for is offering a data engineer bootcamp. They don't require any previous skills or knowledge. I have been teaching myself front end development for quite some time now and thought this bootcamp could be a great way to go. Any advice on standing out in this sort of situation? It is a 16-week bootcamp that, if passed, leads on to an entry level position.
You will likely want a solid foundation in SQL, python and databases. If you can come into that bootcamp with that knowledge they will hopefully get you some of the rest of the way there
@@sashajosephs1892 it was in-house only. I didn't even get an interview. Passed the aptitude test, did the technical test and didn't get any further. Found out my scores were 5/10 on the aptitude and 90/100 on the technical.
Hey Ben, I know that this is considered an "old" video lol, but how much SQL should I know before moving on to Python? I am good with querying, creating tables, databases, joining tables/query results. Should I know about SSIS, SSRS? Thanks!
For someone with a SQL background and 10+ years of exp with SQL and only relational DBs, Java programming, and no exp with cloud technologies or NoSQL DBs, if they want to switch to the data engineering space, what's the best road map?
If you can still apply for an internship, because generally you need to be a student to do so, do it! It's much easier to get a job as an intern because the bar is lower. And if you get a return offer, that's amazing. The one downside is you might have a little less experience with more difficult technical interviews.
Currently getting my masters in Data Science and it's a little upsetting that there aren't any courses offered around data engineering. I guess my best bet would be to use supplemental online materials?
Yes, you will have to look into courses on data modeling/data warehousing, SQL(if your ds courses don't cover it well), and programming outside of pandas and numpy. This is a baseline.
I am a former software tester who is very interested in data engineering. I already know SQL and am learning Python on my own. What's a good way to learn data modeling?
@@cateclism316 Here are links to them aatinegar.com/wp-content/uploads/2016/05/Kimball_The-Data-Warehouse-Toolkit-3rd-Edition.pdf www.wiley.com/en-us/Building+the+Data+Warehouse%2C+4th+Edition-p-9780764599446
Hi Ben, I am a long time admirer of your content. This is a solid video. Great info. I am a data engineer, currently pursuing my masters in Computer Science. Is it possible to get my resume reviewed by you, as I am currently looking for DE internships? Let me know. Thanks.
@Seattle Data Guy Hello mate. One honest request here. Would you be kind enough to create a video on how people with close to 10-12 years of IT experience can prepare to be a Data Engineer if they wish to transition into the field? Thank you very much in advance.
I’m trying to find a data engineer position; however, it seems these days they expect you to know Python or similar, which, being in a Microsoft environment for 8 years now, I have not really dabbled much in. What would you say one should do to be more relevant in the data engineer space today? Also, what would you say are things one needs to know about data modeling?
I think I'm doing something wrong! I started at a company and I build all the pipelines with Python and use some cloud tools (GCP), like Airflow (Composer) and serverless (Cloud Functions). That's my whole day job! I write a lot of Python code to create the workflows. Should I change the way I develop pipelines to really advance my data engineering career?
@@SeattleDataGuy Most of the data comes from GCS (zipped, which is already a problem: unzipping a 9 GB file in a cloud environment), but the sources all differ from each other. For example, one is positional, others are separated by ; or |, the sizes vary (from 1 GB up to 60 GB), and some have headers while others don't. I've been trying to choose one tool that fits all file sources, but I couldn't. So I've created a Compute Engine VM and written one Python script for each file source, but it seems unproductive. Compute Engine + Python code solves the problem of huge zips (and huge files), but Cloud Functions solve the problem of common files, maybe Dataproc for standard pipelines, etc. I feel kind of lost choosing the right tools (because each tool seems to solve a specific problem) at the beginning of my data engineer journey, and I end up developing my own code. For sure, I spend a lot of time coding, and I think there is no standardization in my environment.
I have dealt with a lot of similar issues. Have you tried any low-code solutions? Perhaps Fivetran. I don't recall if Fivetran does positional files. If you do go the code route, generally speaking you do want to split files first when they are too big. So create a script that breaks down the files and then maybe pushes them into another bucket. Whether they have headers or not, or how they are split, can be managed by some form of meta database or config file. All of which get fed to some main program. That program then has different routes in terms of how to treat each file. Is that what you are doing?
@@SeattleDataGuy Thanks for the tips! I've created a Python program where each class is responsible for one step of ingestion: download (from storage, FTP, API, etc.) is one, unzip is another, then transform and load. Some of them are easy to abstract for all pipelines; others, like transform, are hard, since one file detail is enough to need another function in the class. I'm afraid the code will get too complex. I think I have to decide on/create a design pattern for my code! All workflows separated, for example, or together as it is... I'm trying to put them on the same "route" and just add some "if-elses" to decide which path to transform...
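For anyone following this thread, here is a minimal sketch of the config-driven routing idea discussed above, as an alternative to a growing chain of if-elses. Everything named here is hypothetical: the `vendor_a`/`vendor_b`/`vendor_c` sources, the config keys, and the helper functions are made up for illustration, and the config is an in-code dict where a real setup might use a metadata table or a config file. It only covers parsing; download, unzip, and load are left out.

```python
import csv
import io

# Hypothetical per-source config (assumption: one entry per file source).
SOURCE_CONFIG = {
    "vendor_a": {"format": "delimited", "delimiter": ";", "has_header": True},
    "vendor_b": {"format": "delimited", "delimiter": "|", "has_header": False,
                 "columns": ["id", "name"]},
    # Positional files: (column name, start offset, end offset).
    "vendor_c": {"format": "positional", "has_header": False,
                 "fields": [("id", 0, 3), ("name", 3, 8)]},
}


def split_into_chunks(lines, chunk_size):
    """Yield lists of at most chunk_size lines so big files can be handled piecewise."""
    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def parse_delimited(text, cfg):
    """Parse ;- or |-separated text, taking column names from the header or the config."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=cfg["delimiter"]) if r]
    if cfg["has_header"]:
        header, rows = rows[0], rows[1:]
    else:
        header = cfg["columns"]
    return [dict(zip(header, r)) for r in rows]


def parse_positional(text, cfg):
    """Parse fixed-width text by slicing each line at the configured offsets."""
    lines = text.splitlines()
    if cfg["has_header"]:
        lines = lines[1:]
    return [
        {name: line[start:end].strip() for name, start, end in cfg["fields"]}
        for line in lines if line.strip()
    ]


# The "routes": adding a new file format means adding one parser and one entry here.
PARSERS = {"delimited": parse_delimited, "positional": parse_positional}


def parse_source(source_name, text):
    """Main entry point: look up the source's config and dispatch to the right parser."""
    cfg = SOURCE_CONFIG[source_name]
    return PARSERS[cfg["format"]](text, cfg)
```

With this layout a new source that uses an existing format only needs a new config entry, so the per-source Python scripts collapse into data rather than code.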
Good work man... seriously... good work. My question: how easy is getting a DW job remotely and freelancing for 1) junior and 2) mid-level folks? Thanks
Hello Ben, thanks for your videos. I am a software engineer with 8 years of experience, mostly web, SQL, PHP, Golang and so on. Please tell me, what is a good way to transition my existing knowledge to the data engineering field? Thank you.
That's a great base. If you can apply your SQL to analytics/data modeling and add in some data warehousing development, then it can be an easyish switch.
Hey Ben, thanks for the amazing content, it has been really helpful for me this past year. I do have a question for you regarding this whole Finding DE jobs. I have 4+ years of experience as a Data Scientist (purely modeling and some deployment) and I've been looking to transition into DE to cover the full spectrum of any data project. Given this, would you consider it "worth it" or "wise" to take a substantial reduction in salary simply to get into DE as an entry-level? It's true, I don't have the full ETL + Cloud practical experience, but going from a Senior to an entry-level role just to break into DE seems like a big change/risk. How would you bridge the gap of experience? Thanks again for your videos, looking forward to your reply. Cheers!
I would see if you can break into at least a mid-level career. You have other valuable skills that aren't technical that could drive value as well. I would say see if you can get some ETL development at your job, maybe try moving laterally into a DE position at your company. I don't know if it's "worth it". Truly, only you can answer that question. If you completely hate DS work, then yeah. But if you're just looking for something new, then I would attempt to take on an ETL project at work and see if you like it.
It just depends on the tools you are looking to use. You could use airflow to orchestrate some pieces, you could use SSIS, it just depends on the tool.
Why not both? We both have very different perspectives! And if anything, we can both recommend the other persons video at the end. People love hearing a lot of different perspectives!
Do you think a guy with an International Relations bachelor's and an HR background, but with experience with SQL, Python, Power BI and automation of reports, can become a data analyst or engineer?
You're right. I was trying to work a full-time job, consult, put out a newsletter and way too many other things. I recently quit facebook and am more focused. At some point I do want to do a new data engineering project. But the lesson from that first attempt was first finish the project then make the video series.
I'd say more than 50% of titles labelled "junior data engineer" are requiring min 2 years of experience & pay less than data analysts. Lol...these recruiters and managers need to get their heads out their ass and get their shit together. Lol
@@SeattleDataGuy Btw, I just missed the $59 sale on Datacamp for annual fee. Gonna wait until next month to see if they have it again. Lol...still got Dataquest. And based on what you mentioned in your vid...going through Part 5 & 7 before my subscription ends. Are these intros enough to start the projects (beginner/mid) that you mentioned in your other vid?
@@SeattleDataGuy I had an opportunity to interview with them, but I foolishly didn’t take it and took an offer I should have refused so I will forever remember them.
@@SeattleDataGuy they never wanted me to get promoted, because they only wanted me to keep fixing their problems. Then they hired other people. When I decided to leave they offered a few things but not much, and the majority of the people they hired to keep me in my place either left, transferred, or got fired.
If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
seattledataguy.substack.com/
Some other resources you might find helpful
Udemy's Basic Data Warehousing Course. Here is a link
bit.ly/3wg94E2
Kimball's Data Warehousing Guide
aatinegar.com/wp-content/uploads/2016/05/Kimball_The-Data-Warehouse-Toolkit-3rd-Edition.pdf
Hey Ben, I am a Data Engineer (Spark/ETL) and really love your content. Thanks for being a coach/mentor/data buddy to all of us. Best wishes for 2022
Thank you so much! I am so glad I can provide any help!
I've been waiting all year for this one!! 🙌🏼
You're too kind as always! And if anyone prefers the way of the data analyst, check out Luke's videos!!!
Also loved the Joshua Fluke reference 🤣😂
Got to throw back to the originals. I remember the old days. When the techlead just returned, Joma was just a data scientist and Joshua Fluke hadn't even passed 100k subs.
@@SeattleDataGuy The good ole days!! 🙌🏼 And.. might add.. when Casey Neistat used to make vlogs 🤣
Oh yeah, he is a must watch.
Hey, thanks to you I'm giving data engineering a shot. I thought it was going to be boring tbh, but it's not, and it's way easier for a junior dev like me to pick up in comparison to DS, which has a lot of complicated disciplines and math. Also, that part about being done, unlike data science, is awesome.
Glad you're enjoying your DE journey!
@@SeattleDataGuy Very ^^. Would not have started without ya. Thanks again :D
Hi, may I ask the roadmap you are currently taking for DE? I already am familiar with Python and SQL, and now I'm lost haha
@@parkuuu I am going through DataCamp, but I had the pleasure of taking one freelance job to apply the knowledge, which was a lot of using StackOverflow hahah, ya know, like any Junior hahaha ;).
Great information-packed video, taking notes :)
Glad you found it informative. I am trying to provide a little more in terms of research...trying
Interviewed with Amazon for a DE position a little while back and I'd say you really hit the nail on the head with some of the things to look out for. Ultimately I ended up passing as I wasn't 100% sold on moving out to Seattle.
I mean I think Seattle is great, if you don't mind the grey. But congrats on being able to pass on it!
@@SeattleDataGuy Thanks! I really like the city but looking to close on a house within the next year and the Seattle market is unfortunately a little too expensive for me.
I concur. It's way too expensive
"the truth is you don't exist". That's the most important advice for data engineers like me because we are technical people and we don't like selling to others and we just want to stay on the cloud rather than presenting the analysis results. Thank you so much for that.
Yeah selling ourselves is hard. I am always working on this!
Just wrapping up a data science Bootcamp at BrainStation. One of my instructors mentioned the importance of data engineers within a data ecosystem. Data engineering wasn't something on my radar until recently. I really appreciate your insights, Ben! I'll continue following this lead as there might be some value for a career transition-er to take on more of a generalist role between data engineering/analysis/insights, in case it makes me more valuable for start-up companies looking for breadth over depth.
Glad you enjoyed the content! Hopefully some of my current as well as older content can help provide perspective on data engineering!
+10000 for the last part ! Content creation even as part of your learning journey is such a good mental exercise no matter what you expect from the output 💚
Yes! Even if you don't share it. It's a great practice
Appreciate the further insight on an entry-level role for Data Engineers. Despite having completed a data analytics bootcamp from Thinkful and 3 career tracks from DataCamp (Data Analyst with SQL, Python Programmer, and Data Engineering with Python), I feel like I'm no different from when I first started this journey. However it's good to know I'm probably well on my way to being job-ready soon.
Good luck! Have you built any fun projects?
@@SeattleDataGuy couple data analytics projects. Looking forward to doing data engineer projects soon
Awesome! Have you put together any posts?
@@SeattleDataGuy I've done like posts of my progress in my learning on LinkedIn, if that's what you mean
hi @BJ Tan can you please tell me if doing the Data Engineering career track by DataCamp is worth it or if I should try something else??
Great Stuff ! Loved the part on "how to stand out in 2022", specifically.
Yeah! Good luck. I now don't recall if i said 2021 or 2022.
@@SeattleDataGuy Ah! It was 2022 ! However, no worries, it stands true for any year I guess 😀
Mr. Fluke has a point though. Lots of job openings in tech treat "junior" and "entry level" as 2+ years of experience (I've seen jobs in programming for five!). HR departments are also notorious for over bloating job descriptions in order to get the most qualified candidate for the cheapest salary.
Yes HR departments really are
I love your content
Thank you!
Thank you Ben for putting your content out here. How to stand-out is the part I needed to hear the most and enjoyed the most. Sharing the learning is always enjoyable to do, right?
What do you think of the coursera Meta Data Engineer cert? I have a bachelors in IT (recent grad). Was thinking of taking this course to polish my skills and get a junior data engineering role
Haven't checked it out just yet. It looked mostly like a database engineer cert with a little DE stuff at the end. But I need to go through it.
Hey Ben, your videos have been extremely eye opening/helpful and I can't thank you enough for your contributions to the DE community.
Do you have any plans to make a video around ETL testing and what your experience with that has been in the past?
That has come up a lot for me for many Senior DE interviews and I mention making sure nulls are handled/row counts match/ transformation failures & surrogate key lookup failures are logged, but I never feel confident in my answer. I'd love to hear from someone like you on the best approach/answer to this question if you feel it'd help other DE's in the community.
Glad you liked it. I might need to consider that at some point.
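For anyone curious what the checks listed in the question above might look like in code, here is a minimal sketch, not a definitive testing approach. The function names, the `surrogate_key` column, and the `-1` "unknown member" default are all assumptions made for illustration; rows are plain dicts rather than a real warehouse table.

```python
import logging

logger = logging.getLogger("etl_checks")


def check_row_counts(source_rows, target_rows):
    """Return True when the load kept every source row."""
    return len(source_rows) == len(target_rows)


def find_null_violations(rows, required_columns):
    """Return (row_index, column) pairs where a required column is missing or None."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col in required_columns
        if row.get(col) is None
    ]


def lookup_surrogate_keys(rows, natural_key, key_map,
                          sk_column="surrogate_key", unknown_key=-1):
    """Attach a surrogate key to each row; log and collect natural keys with no match.

    Assumption: unmatched rows get a placeholder key (unknown_key) instead of
    being dropped, so row counts still reconcile after the lookup.
    """
    resolved, failures = [], []
    for row in rows:
        sk = key_map.get(row[natural_key])
        if sk is None:
            logger.warning("No surrogate key for %s=%r", natural_key, row[natural_key])
            failures.append(row[natural_key])
            sk = unknown_key
        resolved.append({**row, sk_column: sk})
    return resolved, failures


# Usage with made-up data: one customer is missing from the dimension key map.
staged = [
    {"customer_id": "C1", "amount": 10},
    {"customer_id": "C9", "amount": None},
]
loaded, lookup_failures = lookup_surrogate_keys(staged, "customer_id", {"C1": 101})
```

Each check returns data rather than raising, so a pipeline can decide whether a given failure should stop the load or just be logged.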
Breaking into a Data Engineer position, would you recommend doing this through a Data Analyst role, or working on really building the fundamentals required of the role for a junior position? Would you consider certifications related to SQL or Snowflake necessary? Thanks for all the content you share to help others out!
08:00 - I can explain that: Junior Payment, Senior Work :D
Great Video as always! A Mid Level -> Senior video would be interesting. What really makes you senior on a (team-)communication and on a tech-skill level?
Thanks! Yeah in general sr. positions are a combination of better system level thinking, the ability to communicate and drive projects forward, guiding JRs, and helping prioritize projects. I think, in an ideal state it would be like.
Jrs. - Doing A lot of hands on work and Learning And Making Mistakes
Mid - Doing most of the hands on work and learning how to be impactful outside of tech
Srs - Doing some hands on work for examples, making big technical decisions, setting standards, and communicating with the business to make sure expectations are set.
How much coding is usually required for Data Engineering roles? I like coding, but in my company the software engineering positions are mainly web dev projects, which I'm not really a fan of.
Been exploring data science field these days and this role in particular seems to have piqued my interest
I think understanding the basics of DS and A as well as OOP is solid. You don't necessarily need to know how to reverse a linked list, but you should know how to work with arrays, hashmaps, queues, etc.
@@SeattleDataGuy so like how much coding does a DE do on a day to day basis ? is it comparable to a Software Engineer
I would like to mention that there are actually Master's degree based completely on data engineering. Surprisingly they are job-oriented and their curriculum is also updated on yearly basis.
You remind me of someone so strongly but I can’t figure it out. Great vid btw
Interesting! Not sure who!
Great man
Hey Benjamin, that was a great video !!
But I do have few questions-
- How do you sell yourself when you're not sure if the skills you possess are actually that useful in the market ?
- Let's take my example: if I mainly have SQL and Tableau skills, I don't know how I can sell my skills when 1000s of people out there have the same set of skills?
- Standing out probably sounds really cool, but how to become one ?
Looking forward for your reply.
Have a great day ahead !!
I just transitioned my career into data. I'm going for data science, but after watching your videos I kinda want to become a data engineer now. lol
Haha! Don't stress too much. Most of us start going down the data science route but eventually find we might like other things.
Great content as always! I'd like to ask a few questions: what Python libraries are commonly used in Data Engineering jobs? Which is better for Data Engineering, Scala or Java? Thank youuuu
I would say, worry less about libraries and make sure your coding skills are solid: OOP, DS&A, etc. Sure, you can learn some PySpark, but I think understanding how to write good code in general is better. If you want to work for Airbnb or Netflix, then Scala; otherwise you can probably pick one or the other.
I've seen Darshil's video on getting caught up in the "certification loop", but what I need to understand is how to attain that vital information without going through bootcamps. For example, in my university, CS final-year students have "Data Warehousing and Data Mining" courses offered to them, but even though I'm a computer engineer, there are no such courses on offer for me. My question: can you make a detailed guideline of which courses I should be doing? I feel that your engineering roadmap was good in the sense of providing valid information in a hierarchy, but "outdated" in terms of the courses. (For example, can you look into Udemy courses like Data Engineering Essential Hands on- SQL, Python, and Spark, or even high-end specializations by IBM on Coursera, or Udacity's Data Engineering program?)
Much love from Pakistan.
I am working on an updated version for this since I know some of the courses need some refreshing.
Please do look extensively into Udemy courses like Data Engineering Essential Hands on- SQL, Python, and Spark, or even high-end specializations by IBM on Coursera, or Udacity's Data Engineering program, because most of us newbies don't even know where to start in terms of ETLs, Big Data, or even cloud platforms.
Idk which country you are in, but in Europe, if you don't want to be a senior data engineer, probably only 5% of companies even ask about your schooling. What matters most is job-specific knowledge, more than the highest diploma, whether for head chefs, plastic surgery, or web development. That makes sense.
Comparing a data engineer to a backend developer: which one has the better work-life balance? Which work is more tiring in your opinion? Which one is more prone to be on call? Which one is more suitable for remote work? What about the pay? Also, can I transition to backend dev from DE after a few years if I'd like to? Are there any transferable skills? I see significantly more backend positions, but is DE still a secure job if I am planning to work remotely in the future and live in different countries?
Thanks.
A common sentiment is that Data Engineers get paid considerably less than Software Developers/Engineers in general. Do you think that is true, and if so, why do you think that is true? And if it is true, why be a Data Engineer if you could easily transition to be a standard SWE and get paid considerably more?
It can be true. It does depend on where you work. For example, Netflix and Lyft pay their data engineers like SWEs, from what I hear. But yes, at Facebook data engineers make less.
I have 10+ years as an ETL developer and made the switch last year to data engineer, working with dbt and git on a Snowflake database. This year I want to start making content but find it hard to actually start. Do you have suggestions?
The biggest tip is don't worry about being perfect. Just try things. If you want to make videos, do it. Don't worry if your mic is perfect or your camera is perfect. First, work on practicing actually outputting content.
I don't know if it's just me, but many companies believe that a data engineer is somehow also a backend engineer. The same goes for data engineer and database admin/analyst.
Yeah, there is a mix depending on the company. At some companies data engineers are more like SWEs; at other places they are more like analytics engineers.
Great content as always! For career changers without much of a data/tech background, do you recommend aiming for a Data Analyst position as a first step and transitioning to Data Engineering later? Or is going straight for Data Engineering doable as well?
It would probably be easier to go from analyst to DE. As you saw, the Booz job required 1 year of analyst experience, which does make sense to a degree. So I think that can be a natural shift.
Hey, I love your video. Currently I only have two available choices, IT or computer engineering. Which is the best for data engineering? Thanks in advance.
I think getting some form of CS degree is great. That way it's easier to switch jobs in case you don't like data engineering. You never know what you like until you start doing the work.
@@SeattleDataGuy What about the pros and cons of computer engineering for data engineering?
I don't really think there are too many cons, other than that a lot of what you will learn won't provide value... but that can be said about a lot of degrees. Even in my MIS degree, other than the database and programming courses, there wasn't a ton of value.
Great video. Currently focused on learning Python and SQL and I hope by end of 2022 to have solid understanding of all the mentioned subjects in your video (SQL, Python, Data Modeling, ETL, Data Warehousing, etc). However, I don't have any Bachelor's Degree whatsoever. I have been working in an IT Helpdesk environment (5 years as L1/L2, currently 6 years as Helpdesk Manager so all in all 10+ years) though. You think that will count for something and will allow me to land a junior/mid DE job?
It could help. It is hard without a bachelor's degree. But maybe try applying? You never know. I have worked with people from all different backgrounds.
Hey, I work as a Client Partner in a technology company. We have had plenty of data engineers, developers, and DevOps engineers interviewed by our customers, and from what I have seen in the past 4 years, they are more or less focused on skillset and experience and less keen on degrees. We have over 250 developers and engineers working with customers across the globe. Keep going, I believe you are on the right path, and give me a shout if you need help with your resume or if you're interested in working with us. Cheers
@@SeattleDataGuy What if I got a bachelor's degree in industrial engineering? At my university they teach Python and RStudio, all focused on data and how to understand stats and work with them, but I don't know if it would help when I'm searching for a job.
Bachelor's degrees are just a tie breaker. If you are really GOOD and DEDICATED and a willing and QUICK learner, Elon Musk will HIRE you.
Hey, thanks for sharing, again. I've got a question: is it only me, or are there not as many channels related to data engineering as for other software areas? Do you know why that is?
Data engineering is just starting to pick up steam. Plus, we are camera shy.
Ben, I want to pursue data engineering. I have been an analyst in a few different industries the past three years. I am wondering if moving to a mid-level data engineering position makes sense given my experience?
It may or may not. It depends on if you were doing DE style work like ETL development, Data modeling, and so on. If you have, maybe there is a chance. But, generally speaking 3 years is enough to just break you into mid-level (at most faangs). So it might be hard to make such a technical jump to mid-level without a solid tech foundation.
Hey Ben, I was wondering if you think the O'Reilly book 'Learning SQL' would be sufficient preparation, in terms of SQL knowledge, for working on a project and cracking an interview?
I haven't checked it out. I think overall, when it comes to learning SQL, you need the basic SQL clauses, window functions, and some understanding of data modeling; everything else can likely be googled. Figuring out how to format dates, for example, is a huge time suck. Every DB has a different way to format a date, and you could lose a lot of time trying to remember every date formatting function.
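To make the "window functions" and "every DB formats dates differently" points concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table and its rows are made up for illustration, and it assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The strftime call is SQLite's own date formatter; other databases use different functions for the same job, which is exactly the part worth googling rather than memorizing.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
      ('ann', '2022-01-05', 20.0),
      ('ann', '2022-02-10', 35.0),
      ('bob', '2022-01-20', 15.0);
    """
)

query = """
SELECT customer,
       order_date,
       amount,
       -- Window function: number each customer's orders by date.
       ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date) AS order_seq,
       -- Window function: running total of spend per customer.
       SUM(amount) OVER (PARTITION BY customer ORDER BY order_date) AS running_total,
       -- SQLite-specific date formatting; other engines use other functions.
       strftime('%m/%d/%Y', order_date) AS us_date
FROM orders
ORDER BY customer, order_date;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The PARTITION BY / ORDER BY pattern carries over to most SQL engines more or less unchanged; the date formatting line is the part that usually needs rewriting per database.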
What would you say to someone with no bachelor's (just an associate's in music 🙃) but more than a year of experience at a small company in BI?
It's a hard sell, but not impossible. I think most companies technically want you to have a bachelor's. But what it's in doesn't have to be tech. My manager at Meta, who also worked at Netflix, had an English degree. Experience does tend to trump everything. But it depends where the experience is from. You can see if you can pivot from your current job into a larger company.
The company I work for is offering a data engineering bootcamp. They don't require any previous skills or knowledge.
I have been teaching myself front end development for quite some time now and thought this bootcamp could be a great way to go.
Any advice on standing out in this sort of situation? It is for a 16 week boot camp that if passed leads onto an entry level position.
You will likely want a solid foundation in SQL, python and databases. If you can come into that bootcamp with that knowledge they will hopefully get you some of the rest of the way there
@@SeattleDataGuy That's awesome. Thanks.
What company is offering this?
@@sashajosephs1892 It was in-house only. I didn't even get an interview. I passed the aptitude test, did the technical test, and didn't get any further. I found out my scores were 5/10 on the aptitude and 90/100 on the technical.
Hey Ben, I know that this is considered an "old" video lol, but how much SQL should I know before moving on to Python? I am good with querying, creating tables and databases, and joining tables/query results. Should I know about SSIS and SSRS? Thanks!
What is the website you're using to find positions?
For someone with 10+ years of experience with SQL, relational DBs only, and Java programming, but no experience with cloud technologies or NoSQL DBs, what's the best roadmap for switching into the data engineering space?
Most likely focus on cloud. NoSQL plays some role, but in general classic SQL is usually enough to get the job.
Do you think it is better to do an internship before you apply for a permanent position instead of applying directly for a permanent position?
If you can still apply for an internship (generally you need to be a student to do so), do it! It's much easier to get a job as an intern because the bar is lower. And if you get a return offer, that's amazing. The one downside is you might get a little less practice with the more difficult technical interviews.
I'm currently getting my master's in Data Science, and it's a little upsetting that there aren't any courses offered around data engineering. I guess my best bet would be to use supplemental online materials?
Yes, you will have to look into courses on data modeling/data warehousing, SQL (if your DS courses don't cover it well), and programming outside of pandas and numpy. This is a baseline.
Can you share the resources like where to learn spark, AWS , and all other skills?
Hi, what advice do you have for someone changing careers and trying to move into data engineering?
Learn SQL, Python and data warehousing to start and then build from there.
I am a former software tester who is very interested in data engineering. I already know SQL and am learning Python on my own. What's a good way to learn data modeling?
You can read kimball and inmon's books. Have you heard of these two?
@@SeattleDataGuy I have not heard of them...what are the titles?
@@cateclism316 Here are links to them
aatinegar.com/wp-content/uploads/2016/05/Kimball_The-Data-Warehouse-Toolkit-3rd-Edition.pdf
www.wiley.com/en-us/Building+the+Data+Warehouse%2C+4th+Edition-p-9780764599446
Hi Ben, I am a long-time admirer of your content. This is a solid video. Great info. I am a data engineer, currently pursuing my master's in Computer Science. Is it possible to get my resume reviewed by you, as I am currently looking for DE internships? Let me know. Thanks.
Thanks for reaching out! I just put out a post asking for emails. I now need to send out emails asking for resumes.
@Seattle Data Guy Hello mate. One honest request here: would you be kind enough to create a video on how people with close to 10-12 years of IT experience can prepare to be a Data Engineer if they wish to transition into the field? Thank you very much in advance.
I’m trying to find a data engineer position; however, it seems these days they expect you to know Python or similar, which, being in a Microsoft environment for 8 years now, I have not really dabbled in much. What would you say one should do to be more relevant in the data engineering space today?
Also, what would you say are the things one needs to know about data modeling?
I think I'm doing something wrong! I started at a company where I build all the pipelines with Python and use some cloud tools (GCP), like Airflow (Composer) and serverless (Cloud Functions). That is my whole day job! I write a lot of Python code to create the workflows. Should I change the way I develop pipelines to really advance my data engineering career?
Where is your data coming from, can you find solutions to save how much time you spend coding?
@@SeattleDataGuy Most of the data comes from GCS (zipped, which is already a problem: unzipping a 9 GB file in a cloud environment), but each source has different characteristics. For example, one is positional, others are delimited by ; or |, the sizes differ (from 1 GB up to 60 GB), and some have a header while others don't. I've been trying to choose one tool that fits all the file sources, but I couldn't. So I created a Compute Engine VM and wrote one Python script for each file source, but it seems unproductive. Compute Engine plus Python code solves the problem of huge zips (and huge files), but Cloud Functions solve the problem of common files, maybe Dataproc for standard pipelines, etc. I feel kind of lost choosing the right tools (because each tool seems to solve a specific problem) at the beginning of my data engineering journey, and I end up developing my own code. For sure, I spend a lot of time coding, and I think there is no standardization in my environment.
I have dealt with a lot of similar issues. Have you tried any low-code solutions? Perhaps fivetran. I don't recall if fivetran does positional files.
If you do go the code route, generally speaking you want to split files first when they are too big. So create a script that breaks down the files and then maybe pushes them into another bucket. Whether they have headers or not, or how they are delimited, can be managed by some form of metadata database or config file. All of that gets fed to some main program. That program then has different routes in terms of how to treat each file. Is that what you are doing?
@@SeattleDataGuy Thanks for the tips! I've created a Python program where each class is responsible for one step of ingestion: download (from storage, FTP, API, etc.) is one, unzip is another, then transform and load. Some of them are easy to abstract for all pipelines; others, like transform, are hard, since one small file detail is enough to require another function in the class. I'm afraid the code will become too complex. I think I have to decide on (or create) a design pattern for my code! All workflows separated, for example, or together as they are now...
I'm trying to put everything in the same "route" and just add some if-elses to decide which transform path to take...
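One common way to keep a single "route" without a growing chain of if-elses is a small registry that maps a format name from the config to a parser function. The sketch below is only an illustration: the source names, the config keys (format, delimiter, has_header, fields), and the parse helper are all invented here, not taken from anyone's actual pipeline.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, TextIO

Row = List[str]
Parser = Callable[[TextIO, dict], Iterator[Row]]

# Registry: format name -> parser function. New formats get added here,
# not as another if/else branch in the main program.
PARSERS: Dict[str, Parser] = {}


def register(fmt: str):
    """Decorator that records a parser under the given format name."""
    def decorator(func: Parser) -> Parser:
        PARSERS[fmt] = func
        return func
    return decorator


@register("delimited")
def parse_delimited(stream: TextIO, cfg: dict) -> Iterator[Row]:
    reader = csv.reader(stream, delimiter=cfg["delimiter"])
    if cfg.get("has_header", False):
        next(reader, None)  # skip the header row
    yield from reader


@register("positional")
def parse_positional(stream: TextIO, cfg: dict) -> Iterator[Row]:
    if cfg.get("has_header", False):
        next(stream, None)  # skip the header line
    for line in stream:
        line = line.rstrip("\r\n")
        # Each field is a (start, end) character slice of the line.
        yield [line[start:end].strip() for start, end in cfg["fields"]]


# Hypothetical per-source settings; in practice these could live in a
# YAML/JSON config file or a metadata table.
SOURCES = {
    "orders": {"format": "delimited", "delimiter": ";", "has_header": True},
    "customers": {"format": "delimited", "delimiter": "|", "has_header": False},
    "ledger": {"format": "positional", "fields": [(0, 5), (5, 15)], "has_header": False},
}


def parse(source_name: str, stream: TextIO) -> Iterator[Row]:
    """Look up the source's config and hand the stream to the right parser."""
    cfg = SOURCES[source_name]
    return PARSERS[cfg["format"]](stream, cfg)


if __name__ == "__main__":
    sample = io.StringIO("id;amount\n1;9.50\n2;3.00\n")
    print(list(parse("orders", sample)))
```

With this layout, adding a new file source usually means adding a config entry, and adding a new file format means adding one registered function. The download, unzip, and load steps can stay shared across all sources.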
Good work man... seriously....good work.
My question...
How easy is getting a DW job remotely or freelancing... for 1) Junior and 2) Mid-level folks?
Thanks
You know, Darshil has a lot of great content on freelancing which I think would be helpful for you. Check him out ruclips.net/user/DarshilParmarvideos
Can I start as a python developer and then go to a data engineer job?
You will probably just need some SQL and data modeling
@@SeattleDataGuy Well, I have this. To be honest, I just don't like Python that much, I prefer Java. Anyway, I love programming.
Hahaha, maybe you can look for a role at Pinterest, Airbnb, or Netflix.
Hello Ben, thanks for your videos. I am a software engineer with 8 years of experience, mostly web, SQL, PHP, Golang, and so on. Please tell me, what is the best way to transition my existing knowledge to the data engineering field? Thank you.
That's a great base. If you can apply your SQL to analytics/data modeling and add in some data warehousing development, then it can be an easyish switch.
@@SeattleDataGuy Thank you
Hey Ben, thanks for the amazing content, it has been really helpful for me this past year.
I do have a question for you regarding this whole Finding DE jobs.
I have 4+ years of experience as a Data Scientist (purely modeling and some deployment) and I've been looking to transition into DE to cover the full spectrum of any data project. Given this, would you consider it "worth it" or "wise" to take a substantial reduction in salary simply to get into DE at entry level? It's true, I don't have the full ETL + Cloud practical experience, but going from a Senior to an entry-level role just to break into DE seems like a big change/risk. How would you bridge the experience gap?
Thanks again for your videos, looking forward to your reply.
Cheers!
I would see if you can break into at least a mid-level career. You have other valuable skills that aren't technical that could drive value as well.
I would say see if you can get some ETL development at your job, maybe try moving laterally into a DE position at your company.
I don't know if it's "worth it". Truly, only you can answer that question. If you completely hate DS work, then yeah. But if you're just looking for something new, then I would attempt to take on an ETL project at work and see if you like it.
Hey, thank you so much. It would really help me if you could tell me where I can learn about pipelines. Please reply!
It just depends on the tools you are looking to use. You could use airflow to orchestrate some pieces, you could use SSIS, it just depends on the tool.
Can those who don't have an IT background also take this course? I am really interested in data engineering, so please tell me.
It does help. What is your background in?
I was planning to do the same video 🙃 amazing video btw, I will make my Indian version :P
Why not both? We both have very different perspectives! And if anything, we can both recommend the other persons video at the end. People love hearing a lot of different perspectives!
Do you think a guy with an international relations bachelor's and an HR background, but with experience in SQL, Python, Power BI, and report automation, can become a data analyst or engineer?
Analyst for sure! DE might take a little more work but also possible!
Hi, is it possible for someone with an engineering background (not computer science) to transition into the data engineering field?
Yes, I have seen people with a broad set of backgrounds take on DE roles
Would it be a hard transition from having a data science degree to data engineering? Would there be a big difference and a lot more to learn?
You will need to learn about data management, data warehousing, ETLs/data pipelines, SQL, Spark and a few other skills.
@Seattle Data Guy Data Engineering needs Spark skills?
You should learn it. It's a great skill.
As a manual tester with 3+ years of experience, am I eligible for a DE job? If yes, how many months would it take?
Have you been coding a lot or writing SQL? It really depends on your other skills.
Yes, I have basic knowledge of Java, Python, and SQL.
bro you haven’t finished the data engineering project series yet
You're right. I was trying to work a full-time job, consult, put out a newsletter, and way too many other things. I recently quit Facebook and am more focused. At some point I do want to do a new data engineering project. But the lesson from that first attempt was: first finish the project, then make the video series.
You sound a bit like Cal Newport
I am not sure who that is!
Booz is pronounced like Booze 🥃Cheers!
Is it, I felt like that would be wrong 😅
What if you hate SQL but love every other part of this 😂
Boy do I have a video for you
Thanks for the video, Ben! I sent you a note through your website. Please check it out, thanks buddy!
Straight up Database Administrator is Dead! Data Engineer is alive and thriving!
Ooof, yeah, DBA is not a role I see anymore whatsoever. I am sure they exist here and there.
I'd say more than 50% of titles labelled "junior data engineer" are requiring min 2 years of experience & pay less than data analysts. Lol...these recruiters and managers need to get their heads out their ass and get their shit together. Lol
Yeah... it's really frustrating.
@@SeattleDataGuy Btw, I just missed the $59 sale on Datacamp for annual fee. Gonna wait until next month to see if they have it again. Lol...still got Dataquest. And based on what you mentioned in your vid...going through Part 5 & 7 before my subscription ends.
Are these intros enough to start the projects (beginner/mid) that you mentioned in your other vid?
You should consider shaving the neckbeard ! Would clean up your look 100fold man! No hate, just seems like a missed opportunity
I concur :)
IBM with that garbage job description 🤣
I actually have seen some people asking for 5+ years of experience for jr roles
(pronounced Booze = Drink) Booze Allen Hamilton
For some reason that just didn't seem right! But if that's it, then that's it!
@@SeattleDataGuy I had an opportunity to interview with them, but I foolishly didn’t take it and took an offer I should have refused so I will forever remember them.
Are you not happy in your current role?
@@SeattleDataGuy They never wanted me to get promoted, because they only wanted me to keep fixing their problems. Then they hired other people. When I decided to leave they offered a few things, but not much, and the majority of the people they hired to keep me in my place either left, transferred, or got fired.
Data engineer is the new sexy!
Ha! Shh, it's a secret.