Modern Data Engineering Workflows, Explained

  • Published: Jul 27, 2024
  • Modern data engineering isn't all about tools & technologies.
    One area that's often overlooked is the concept of "workflows".
    In particular, data team workflows for continuously building projects.
    This includes environments, naming conventions, automation, and more.
    In this video, you will:
    - Learn high level design of common team workflows
    - See an example implementation
    - Be able to identify whether or not you're following this yourself
    Thank you for watching!
    ►► The Starter Guide for The Modern Data Stack (Free PDF)
    Simplify the “modern” data stack + better understand common tools & components → bit.ly/starter-mds
    Timestamps:
    0:00 Intro
    0:21 Why It's Important
    1:33 Design & Process Review
    4:39 Database Example (Snowflake)
    Title & Tags:
    Modern Data Engineering Workflows, Explained
    #kahandatasolutions #dataengineering #datapipeline

Comments • 12

  • @KahanDataSolutions
    @KahanDataSolutions  8 months ago +2

    ►► The Starter Guide for Modern Data → bit.ly/starter-mds
    Simplify “modern” architectures + better understand common tools & components

  • @jacobukokobili6457
    @jacobukokobili6457 8 months ago +1

    Thanks for this, Kahan. Please make a video implementing this workflow, like you've done with CI/CD. Thanks again.

  • @dataruncoach
    @dataruncoach 8 months ago +1

    Very clear and concise, thank you

  • @marcosoliveira8731
    @marcosoliveira8731 8 months ago

    A lot of good ideas from your videos have inspired me to improve my development flow.

  • @goosetaculous
    @goosetaculous 8 months ago

    I love it. Already doing this, but it's a good reminder.

  • @felipecondore4173
    @felipecondore4173 8 months ago

    It's a very clear explanation.

  • @vishal_uk
    @vishal_uk 1 month ago

    Hi Mike! Could you please clarify the following:
    After a developer makes changes to a model and raises a PR, the changes are reviewed/auto-tested in the QA/CI DB/schema and later merged to the main branch. Is the QA/CI environment a replica of the prod DB (warehouse and marts) that reads data from staging and validates the changes before they're merged to main? Thanks in advance!

  • @NicoWright-ly6en
    @NicoWright-ly6en 3 months ago +1

    Hi Kahan, a question I have after watching many of your videos: what about a client's situation makes you think one tool would fit better than another? For example, Snowflake vs BigQuery.

  • @MrUbbers
    @MrUbbers 8 months ago

    In our setup we have multiple environments (DEV, QA, PROD), all fully separate, including the raw sources and the ETL. This at least doubles our costs. The setup you showed eliminates the extra processing and storage costs by using one environment, right? How do you deal with upgrades and changes in the raw data source layer? For example, a source system whose database schema changes significantly after an upgrade. Would you just add another schema in the raw database?

  • @EMBrown801
    @EMBrown801 8 months ago

    Would you need separate dev schemas for the staging and marts? Let's say I want to develop a new mart. Would I put all of those models in the same dev schema before going to production?

    • @KahanDataSolutions
      @KahanDataSolutions  8 months ago +1

      I typically will do that. I like to keep all tables/views in a single Dev schema (ex. all Staging, Warehouse, Marts) to avoid excessive objects and keep it simple. The way I see it, nobody else is really looking at that schema so perfect separation & organization isn't as important. What's more important is that you can confirm models deploy, check the data, etc. Then once you move to "production", separate things out by specific schemas. Hope that helps!
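
      The single-dev-schema approach described in this reply could be sketched in Snowflake DDL roughly like this (all database and schema names here are hypothetical, not taken from the video):

      ```sql
      -- Development: every layer's models land in one personal dev schema,
      -- e.g. ANALYTICS.DBT_DEV.STG_ORDERS alongside ANALYTICS.DBT_DEV.FCT_ORDERS
      CREATE SCHEMA IF NOT EXISTS ANALYTICS.DBT_DEV;

      -- Production: separate each layer into its own schema
      CREATE SCHEMA IF NOT EXISTS ANALYTICS.STAGING;
      CREATE SCHEMA IF NOT EXISTS ANALYTICS.WAREHOUSE;
      CREATE SCHEMA IF NOT EXISTS ANALYTICS.MARTS;
      ```

      The trade-off: the dev schema stays simple and disposable (only the developer looks at it), while production schemas give downstream consumers a clean, organized namespace per layer.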