Use BigQuery Transfer Service to Optimise a File Based CDC

  • Published: 31 Dec 2024

Comments • 9

  • @abednegosantoso • 2 months ago • +1

    Thank you for this superb explanation. If I understand correctly, this is about how we ingest data into BigQuery, right?
    At the moment, our solution uses Datastream. Which one do you think is better?

    • @practicalgcp2780 • 2 months ago

      Yes, it's about ingesting data into BigQuery. However, it's more focused on situations where your CDC solution's target isn't BigQuery but GCS (Cloud Storage).
      Not everyone ingests into BigQuery directly; some built their pipelines years ago, when the BigQuery connector wasn't supported, or they have issues handling schema changes. For these customers, switching to another solution can be really costly and time consuming.
      Datastream is a solution that does CDC all the way into BigQuery; if you already have it, there's no need to change. A solution like Datastream that ingests data directly into BigQuery offers much better latency (near real time) compared with file-based CDC. (A sketch of a Transfer Service load from GCS follows this thread.)

    • @abednegosantoso • 2 months ago

      @practicalgcp2780 If I'm not mistaken, Datastream also allows GCS as a destination?
      Anyway, have you tried Iceberg on GCS with BigQuery on top of it?
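
A note on the reply above: the video itself is about scheduling loads of CDC files from GCS into BigQuery with the BigQuery Data Transfer Service. Below is a minimal sketch of such a setup using the Python client library; the project, bucket, dataset and table names are placeholders, and the exact params keys and schedule string should be checked against the Cloud Storage transfer documentation.

```python
# Sketch: schedule a Cloud Storage -> BigQuery load for CDC files.
# All resource names below are hypothetical placeholders.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="cdc_landing",             # landing dataset (hypothetical)
    display_name="orders-cdc-file-load",
    data_source_id="google_cloud_storage",
    params={
        # Glob matching the files the CDC tool drops into the bucket.
        "data_path_template": "gs://my-cdc-bucket/orders/*.parquet",
        "destination_table_name_template": "orders_changes",
        "file_format": "PARQUET",
        # APPEND keeps every change record; SCD2/merge logic happens downstream.
        "write_disposition": "APPEND",
    },
    schedule="every 15 minutes",
)

created = client.create_transfer_config(
    parent=client.common_project_path("my-project"),  # hypothetical project
    transfer_config=transfer_config,
)
print(f"Created transfer config: {created.name}")
```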

  • @andrelsjunior • 2 months ago • +1

    Nice tips!
    For that SCD2 type, how do you recommend doing the most optimised soft delete?

    • @practicalgcp2780 • 2 months ago

      Thank you! I don't know what's most optimised, but the model I see most often is to have valid_from and valid_to columns with timestamps.

    • @andrelsjunior • 2 months ago

      But then do you run an update query after ingesting to expire the old records?

    • @practicalgcp2780 • 2 months ago

      @andrelsjunior I haven't done this for a while, but on BigQuery it typically isn't done with a traditional UPDATE but with a DML MERGE (I believe dbt can manage this quite easily without you having to write the DML yourself; see the MERGE sketch after this thread).
      The other way to do it (again, opinions are mixed; some like it, some don't) is simply to add a row with the new data the first time you see it. You then have both rows with their own timestamps, and it works well with incremental workloads without needing to update any pre-existing records. I'm not sure this follows any open standard, but personally I find it easier to deal with.
      Whichever way you do it, it's really important to make sure the whole thing can always be rebuilt from the data lake. This is easy enough to do and manage with tools like dbt, but as a concept it's very important regardless of the tooling choice: if the logic is wrong or needs updating, a full rebuild fixes everything.

    • @andrelsjunior • 2 months ago

      @practicalgcp2780 Appreciate your POV!
      I was thinking about using dbt specifically, but it adds an extra layer haha.
      Seems like this is the best way.
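
To make the SCD2 discussion in this thread concrete, here is a minimal sketch of the MERGE pattern mentioned above, with valid_from / valid_to columns and a soft close of superseded rows, run through the BigQuery Python client. All project, table and column names (dim_customer, customer_changes, change_ts, address) are hypothetical, NULL handling is omitted for brevity, and a tool such as dbt (snapshots) can generate similar DML for you.

```python
# Sketch: SCD2 upsert on BigQuery -- close the old version of a changed row
# and insert the new version in a single MERGE. Names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

merge_sql = """
MERGE `my-project.analytics.dim_customer` AS d
USING (
  -- Change records landed via the file-based CDC + Transfer Service load.
  SELECT customer_id AS merge_key, *
  FROM `my-project.cdc_landing.customer_changes`
  UNION ALL
  -- Changed keys are duplicated with a NULL merge_key so they miss the ON
  -- clause and fall through to the INSERT branch as the new current version.
  SELECT NULL AS merge_key, c.*
  FROM `my-project.cdc_landing.customer_changes` AS c
  JOIN `my-project.analytics.dim_customer` AS cur
    ON c.customer_id = cur.customer_id AND cur.valid_to IS NULL
  WHERE c.address != cur.address
) AS s
ON d.customer_id = s.merge_key AND d.valid_to IS NULL
WHEN MATCHED AND s.address != d.address THEN
  -- Soft-close the superseded version.
  UPDATE SET valid_to = s.change_ts
WHEN NOT MATCHED THEN
  -- Brand-new keys and new versions of changed keys get an open-ended row.
  INSERT (customer_id, address, valid_from, valid_to)
  VALUES (s.customer_id, s.address, s.change_ts, NULL)
"""
client.query(merge_sql).result()
```

The append-only alternative described in the reply skips the UPDATE branch entirely and derives the current version at query time from the latest timestamp per key. For soft deletes specifically, a delete flag on the change record can drive an extra WHEN MATCHED branch that only sets valid_to without inserting a replacement row.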

  • @practicalgcp2780 • 2 months ago

    A good question was asked about schema management; see www.linkedin.com/feed/update/urn:li:activity:7248816174312960000?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7248816174312960000%2C7248834731771858944%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7248816174312960000%2C7248943929612468224%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287248834731771858944%2Curn%3Ali%3Aactivity%3A7248816174312960000%29&dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287248943929612468224%2Curn%3Ali%3Aactivity%3A7248816174312960000%29