Datavault
  • Videos: 77
  • Views: 314,661
Links in Data Vault - The Datavault Podcast E8
Welcome to The Datavault Podcast, your go-to source for everything Data Vault!
In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) explore Links in Data Vault.
For more expert insights, check out our services designed to help organizations like yours unlock the full potential of Data Vault.
👉 Explore our training courses to level up your data management skills: bit.ly/4eNkczq
👉 Download our free Data Vault resources: bit.ly/4f9AYs9
👉 Book a free consultation with our team: bit.ly/3BT6VXC
Subscribe to never miss an episode and let us know in the comments if you have any questions or topics you'd like us to cover in the future!
Views: 48

Videos

Data Mesh & Data Vault on Snowflake
339 views, 1 day ago
Join Patrick Cuba, Senior Solutions Architect at Snowflake, as he shares his expertise on combining Data Mesh, Data Vault, and Domain-Driven Design. With over 20 years of experience, Patrick is a leading expert in Data Vault 2.0 and the author of 'The Data Vault Guru.' In this session, Patrick will explain how Data Mesh decentralises data ownership and promotes a data-driven culture. He'll disc...
PII Data in Data Vault - The Datavault Podcast E7
54 views, 14 days ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) describe how to get started with Data Vault. Whether you're just getting started with Data Vault or are looking to optimize your current implementation, we’ve got you covered. For more expert insights, check o...
Data Vault Hubs Explained - The Datavault Podcast E5
67 views, 14 days ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) describe how to get started with Data Vault. Whether you're just getting started with Data Vault or are looking to optimize your current implementation, we’ve got you covered. For more expert insights, check o...
The Data Vault Conference - The Datavault Podcast E6
38 views, 1 month ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Neil Strange and Alex Higgs debrief from the 2024 Data Vault User Group Conference at Royal Holloway, University of London. For more expert insights, check out our services designed to help organizations like yours unlock the full potential of Data Vault. 👉 Explore our training courses to level up yo...
Data Engineering with dbt - a pragmatic approach
177 views, 1 month ago
Catch up on our recent online meetup where Roberto Zagni introduced the Pragmatic Data Platform (PDP), a practical solution that combines the best of Software Engineering and modern data platform architectures. Learn how to leverage Data Vault with an easier learning curve and make the most of your existing skills. In this dynamic session, Roberto shared invaluable insights on efficient data st...
Understanding Business Keys in Data Vault - The Datavault Podcast E4
119 views, 1 month ago
In this episode of the Data Vault Podcast, Neil Strange and Alex Higgs dive into the essential concept of business keys, the unique identifiers that ensure seamless integration and interoperability within Data Vault hubs. For more insights, check out our services designed to help organizations unlock the full potential of Data Vault: 👉 Explore our training courses: bit.ly/4eNkczq 👉 Download fr...
Perfect Harmony: Modeling Data with Ellie & Haley for the Willibald Team
108 views, 1 month ago
Join Andreas Heitmann, a seasoned Business Consultant from Alligator-Company in Germany, as he shares his expertise in automated Data Vault solutions. With 18 years of consulting experience, Andreas specializes in tools like dbt, Vaultspeed, Data Vault Builder, AutomateDV, Snowflake, and Exasol. In this insightful virtual meetup, Andreas will cover the importance of defining the right scope and...
The Data Vault Q&A Forum - Datavault Podcast E3
39 views, 1 month ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) explore the Data Vault Q&A Forum. What you'll learn in this episode: 0:00 - Introduction 0:43 - What is the Data Vault Q&A Forum? 1:30 - Who is on the forum? 2:18 - Exploring the forum 4:23 - Thanks for watching! ...
Is Data Vault right for you? - The Datavault Podcast E2
99 views, 2 months ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) tackle the key question: Is Data Vault right for your business? They dive into the benefits of Data Vault, from rapid iteration and seamless integration of multiple data sources to enhancing audit, compliance, and...
How to get started with Data Vault - The Datavault Podcast E1
231 views, 2 months ago
Welcome to The Datavault Podcast, your go-to source for everything Data Vault! In this episode, Alex Higgs (AutomateDV Product Manager) and Neil Strange (CEO and Founder of Datavault) describe how to get started with Data Vault. Whether you're just getting started with Data Vault or are looking to optimize your current implementation, we’ve got you covered. What you'll learn in this episode...
Automating your Data Platform for Self-Service
74 views, 2 months ago
This is part 4 of our 4-part webinar series with erwin on practical Data Mesh Implementations. Discover the power of self-service in data management with our final webinar. Learn how Erwin Data Intelligence can automate your Data Vault data platform. The approach not only simplifies access but also ensures that data is reliable and governance is maintained, to unlock the potential of self-service for yo...
McDonald’s Nordics: Enabling improved focus on modelling and the business
85 views, 2 months ago
Christian Ivanoff discusses how McDonald’s Nordics has embraced Data Vault to unify their complex reporting and business intelligence requirements. He will explain how the architecture selected made their technical work easier because of flexibility and automation. This enabled them to focus more on answering the business questions and data modelling. McDonald’s master franchisee Food Folk (McD...
Decentralising - getting the balance right with Federated Governance
69 views, 2 months ago
Removing the Barriers to Delivery, Domain-Orientated Architecture
82 views, 3 months ago
Textual Data - A Brave New World
87 views, 3 months ago
Migration challenges in Data Warehousing
83 views, 3 months ago
Implementing Data as a product
211 views, 3 months ago
Data Vault for Developers Training
294 views, 5 months ago
Data Vault User Group Conference 2024
169 views, 6 months ago
JOINing your data teams with dbt Mesh
308 views, 6 months ago
Model-Driven Data Vault Construction
340 views, 6 months ago
4 Rules for Successful Data Vault Projects
227 views, 7 months ago
How Twine can efficiently move data from Data Vault to Data Mart
333 views, 7 months ago
How supercharged CI/CD & Data Vault ensures data quality and development agility
167 views, 9 months ago
Clearing Skies for Cloud Data Warehousing
197 views, 10 months ago
Data Mesh and Data Vault
801 views, 1 year ago
Fifty First Dates with Data Vault
208 views, 1 year ago
Agile building of Information using Data Vault 2.0
1.2K views, 1 year ago
Data Vault Performance & Constraints on Snowflake
2.1K views, 1 year ago

Comments

  • @devyaninair2793
    @devyaninair2793 28 days ago

    Liked the presentation; it gives insights on practical approaches.

  • @navneetsajwan765
    @navneetsajwan765 1 month ago

    Brand name and booking date together cannot ensure a unique booking key.

  • @Koshur1185
    @Koshur1185 2 months ago

    The voice is a bit low..it's only for me :(

  • @tahabekmez5072
    @tahabekmez5072 2 months ago

    Very informative video!

  • @АртёмМеркулов-ю3к
    @АртёмМеркулов-ю3к 2 months ago

    Thank you!

  • @oldguywholifts
    @oldguywholifts 3 months ago

    Useful and great intro. Covers essentials

  • @Sp00pySparySpeletons
    @Sp00pySparySpeletons 4 months ago

    This was a great presentation. This part in particular is what made data vault click. There is no "right model". The right model is the one that works for the needs of your project.

  • @sobebhaduri1158
    @sobebhaduri1158 4 months ago

    The Booking Details satellite should have the Booking hash key, but the Customer hash was mentioned instead.

  • @JuanHernandez-pf6yg
    @JuanHernandez-pf6yg 5 months ago

    Very useful series. Thank you!

    • @Datavault
      @Datavault 4 months ago

      Glad it was helpful!

  • @dataarq945
    @dataarq945 5 months ago

  • @goldnutter412
    @goldnutter412 6 months ago

    Information does not exist in this "physical realm". Only structured data. Go Datavault. 🤩

  • @rupanag
    @rupanag 9 months ago

    Can you please help me with these? It would be great if you could explain with some examples. 1. Why is a link-to-link relationship not recommended in the RDV? 2. In a BDV bridge table, if we store only hash keys (not natural keys), how do we get the natural keys in the fact/dimension tables?

    • @nglamont7015
      @nglamont7015 6 months ago

      Hi, I also have a question about how a satellite handles the SCD (slowly changing data) case; could you help me clarify it? Thanks! Suppose a satellite stores address information with a hash key, and an address is updated when the incremental data arrives. Will there be two hash keys for the two versions of the address, or only one hash key, with the load_date being the difference?

  • @rupanag
    @rupanag 9 months ago

    Can you please help me with these? It would be great if you could explain with some examples. 1. Why is a link-to-link relationship not recommended in the RDV? 2. In a BDV bridge table, if we store only hash keys (not natural keys), how do we get the natural keys in the fact/dimension tables?

    • @Palontras
      @Palontras 1 month ago

      Well, they are not recommended for several reasons.
      First, loss of modularity: Data Vault's modular design separates hubs, links, and satellites to create a clean architecture. Linking two links directly breaks this modularity, because links are supposed to represent associations between hubs, not between other links. If links connect links, the model becomes harder to maintain and extend.
      Redundancy and complexity: a link-to-link relationship introduces redundancy, since the hub keys held in the original links already capture the necessary relationships. Querying becomes complex, requiring additional joins or custom logic to trace back to the original hubs.
      Second, semantic clarity: links are designed to represent business relationships, not relationships between relationships. If two links have a dependency, it may indicate a missing business concept or hub.
      Best practice is to use a link that directly connects hubs. Instead of linking links, create a new link table that directly connects the relevant hubs. For example, if you have a SalesOrderLink (connecting CustomerHub and OrderHub) and a DeliveryLink (connecting OrderHub and DeliveryHub), and you need to relate customers to deliveries, create a new CustomerDeliveryLink.
      Let's say CustomerHub has a hash key for customers, OrderHub has a hash key for orders, and DeliveryHub has a hash key for deliveries.
      Instead of doing this: SalesOrderLink → DeliveryLink → some indirect connection.
      Do this: create a new link such as CustomerDeliveryLink, which directly connects CustomerHub and DeliveryHub with their hash keys.
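
A minimal Python sketch of the suggestion in the reply above, under stated assumptions: the staged rows, the column names, and the `hash_key` helper (MD5 over upper-cased, `||`-joined business keys) are hypothetical conventions chosen for illustration, not anything shown in the video. The point it illustrates is that the new link row stores only hub hash keys and never references another link.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """MD5 over normalised, delimiter-joined business keys (assumed convention)."""
    normalised = "||".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


# Hypothetical staged rows: each order knows its customer, each delivery its order.
orders = [
    {"order_id": "O-1", "customer_id": "C-1"},
    {"order_id": "O-2", "customer_id": "C-2"},
]
deliveries = [
    {"delivery_id": "D-9", "order_id": "O-1"},
    {"delivery_id": "D-10", "order_id": "O-2"},
]


def build_customer_delivery_link(orders, deliveries):
    """Resolve each delivery's customer via its order, then keep only hub keys."""
    customer_of_order = {o["order_id"]: o["customer_id"] for o in orders}
    link_rows = []
    for delivery in deliveries:
        customer_id = customer_of_order[delivery["order_id"]]
        link_rows.append({
            # Link key derived from the business keys of both hubs.
            "customer_delivery_hk": hash_key(customer_id, delivery["delivery_id"]),
            "customer_hk": hash_key(customer_id),              # -> CustomerHub
            "delivery_hk": hash_key(delivery["delivery_id"]),  # -> DeliveryHub
        })
    return link_rows


customer_delivery_link = build_customer_delivery_link(orders, deliveries)
for row in customer_delivery_link:
    print(row)
```

Whether the customer-to-delivery relationship should be derived this way or loaded from a source that states it directly is a modelling decision; the sketch only shows the shape of the resulting link.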

  • @marcoullasci
    @marcoullasci 1 year ago

    Around 2m37s: for the link table, isn't the natural key the concatenation of the customer hash and the booking hash? If so, shouldn't the "customer booking hash" be calculated as the MD5 of those hashes, instead of the MD5 of the natural keys extracted from the staging table? The MD5 of the concatenated natural keys won't be identical to the MD5 of the concatenated hashes of the natural keys, right?
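
The last point in this question can be checked directly. The sketch below is only an illustration: the key values and the `||` delimiter are assumptions, and it does not reproduce the exact expression used in the video. It computes a link hash key both ways, from the concatenated natural keys and from the concatenated hub hashes.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical natural keys and delimiter, chosen for illustration only.
customer_id = "CUST-001"
booking_id = "BOOK-042"
delimiter = "||"

customer_hk = md5_hex(customer_id)
booking_hk = md5_hex(booking_id)

# Option 1: hash the concatenated natural keys straight from staging.
link_hk_from_natural_keys = md5_hex(customer_id + delimiter + booking_id)

# Option 2: hash the concatenation of the two hub hash keys.
link_hk_from_hub_hashes = md5_hex(customer_hk + delimiter + booking_hk)

print(link_hk_from_natural_keys)
print(link_hk_from_hub_hashes)
print(link_hk_from_natural_keys == link_hk_from_hub_hashes)
```

The two values differ because the hashed inputs differ. Both are deterministic functions of the same natural keys, so either can identify the relationship as long as one convention is applied consistently; hashing the natural keys means the link key can be computed in staging without first looking up the hub hashes.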

  • @Juan-Hdez
    @Juan-Hdez 1 year ago

    Helpful. Thank you.

  • @JohnoScott
    @JohnoScott 1 year ago

    This is a fantastic talk from Patrick going into great detail

    • @Datavault
      @Datavault 1 year ago

      Thanks for watching, we hope you enjoyed!

    • @emanueol
      @emanueol 7 months ago

      Very good content, as always with PCuba 😊

  • @furqan385
    @furqan385 1 year ago

    What's the difference between the hash_diff column and the other hash column at 7:42? Aren't they both the same?

    • @Datavault
      @Datavault 1 year ago

      Both look the same, but they have different purposes. The hashdiff is used to detect changes in the payload of a satellite. Rather than checking each column individually, it combines all the payload columns into one hash, and that single hash is compared. If the hashdiff changes, it can only be because the value of at least one payload column has changed, so we can assume the data has changed. We hope this helps. Why not join the Data Vault User Group Forum? It's full of industry experts who can answer whatever questions you may have: forum.ukdatavaultusergroup.co.uk/
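
The change-detection idea in this reply can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the payload columns, the `||` delimiter, the `customer_hk` column, and the in-memory dictionary standing in for "the latest hashdiff per hash key" in the satellite.

```python
import hashlib

# Hypothetical payload columns of an address satellite.
PAYLOAD_COLUMNS = ["street", "city", "postcode"]


def hashdiff(row: dict, columns: list) -> str:
    """Hash all payload columns together, always in the same column order."""
    parts = ["" if row.get(col) is None else str(row[col]).strip() for col in columns]
    return hashlib.md5("||".join(parts).encode("utf-8")).hexdigest()


def rows_to_insert(latest_hashdiff_by_hk: dict, staged_rows: list) -> list:
    """Keep only staged rows whose payload differs from the latest satellite row."""
    new_rows = []
    for row in staged_rows:
        staged_hashdiff = hashdiff(row, PAYLOAD_COLUMNS)
        if latest_hashdiff_by_hk.get(row["customer_hk"]) != staged_hashdiff:
            new_rows.append({**row, "hashdiff": staged_hashdiff})
    return new_rows


current = {"customer_hk": "abc", "street": "1 High St", "city": "Leeds", "postcode": "LS1 1AA"}
latest = {"abc": hashdiff(current, PAYLOAD_COLUMNS)}

unchanged = dict(current)
moved = {**current, "street": "2 Low Rd"}

print(rows_to_insert(latest, [unchanged]))   # nothing new to load
print(rows_to_insert(latest, [moved]))       # one new satellite row
```

Note that the new row carries the same `customer_hk` as before: the hash key identifies the parent hub record, while the hashdiff (together with the load date) distinguishes versions of its payload.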

  • @Chillos100
    @Chillos100 1 year ago

    Genius!! Thnx a lot!!

  • @yuezhu9961
    @yuezhu9961 1 year ago

    LIKE the technique discussions, but DO NOT like the political examples.

    • @convoluted2348
      @convoluted2348 1 year ago

      You can't avoid reality if your job is to mitigate these factors. You wanted a real-life example, so you will get real-life problems.

  • @Алексей-ъ9е9щ
    @Алексей-ъ9е9щ 1 year ago

    Data Vault: what form of data is it?

  • @isinpresshenning
    @isinpresshenning 1 year ago

    Very good presentation and content on streaming into a data vault. Thank you!

  • @imrankhan-jj6fs
    @imrankhan-jj6fs 2 years ago

    Highly informative; I was latched on to it the whole time. Excellent.

  • @elmehdiouafiq
    @elmehdiouafiq 2 years ago

    Very good explanation!

  • @georgepolskiy8411
    @georgepolskiy8411 2 years ago

    Thank you!

  • @isinpresshenning
    @isinpresshenning 2 years ago

    Can you explain why you do not recommend implementing a DV on SQL Server? Why do you think Snowflake is a better choice?

  • @nikunj150188
    @nikunj150188 2 years ago

    Thank you. I was interested in looking into the satellite loading SQL as well. Which book did you suggest referring to?

  • @adeo5384
    @adeo5384 2 years ago

    Thank you! This slide was the most important slide for me. Showing the architecture enabled me to understand how Data Vault works and how it can be used in my organization.

    • @Datavault
      @Datavault 2 years ago

      We are glad it helped!

  • @russellsearle7366
    @russellsearle7366 2 years ago

    Great presentation, Neil. You display a lovely style of communication. I recommend your videos to those coming to terms with DV for the first time.

    • @Datavault
      @Datavault 2 years ago

      Thank you for your comment! We are glad they are useful!

  • @khanhscorpion6405
    @khanhscorpion6405 2 years ago

    thank you so much

  • @adityat8336
    @adityat8336 2 years ago

    Is there a follow-up session for this video? This is really good 👍

    • @Datavault
      @Datavault 2 years ago

      We are working on other sessions, coming soon!

  • @ThePage91
    @ThePage91 2 years ago

    Can you provide a link to the white paper mentioned in the beginning of the video?

    • @Datavault
      @Datavault 2 years ago

      Hi Pasi, www.data-vault.co.uk/what-is-data-vault/ If you follow this link and scroll halfway down the page, you will find all of our white papers, including the one mentioned at the beginning of this video. I hope this helps :)

  • @NSBMADHU
    @NSBMADHU 2 years ago

    I love it... :) There is always a simple solution to a hard problem. You are giving the data modeler less work to do...

    • @Datavault
      @Datavault 2 years ago

      Thank you very much!

  • @belasec3672
    @belasec3672 2 years ago

    My entire life I have been waiting for the Puppini Bridge.

  • @piotrsz.8115
    @piotrsz.8115 2 years ago

    Hi Neil, I like this training a lot : ) One question though - why does it mention the effective dates in the satellites? I'm quite sure that DV 2.0 is about insert-only architecture, so getting rid of any updates, which means we cannot have the effective / end dates in the raw vault. Please comment : )

    • @Datavault
      @Datavault 2 years ago

      Hi Piotr, thank you for your comment. It's rare that Neil looks at these comments, so it's best to post this question on the Data Vault Q&A Forum (forum.ukdatavaultusergroup.co.uk/), where industry experts, including Neil, will see and answer this or any other questions you may have! :)
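
The thread above leaves the question open, and the following is not the training's answer. It is a sketch of one pattern that is commonly used to reconcile the two ideas: the satellite stays insert-only and stores just a load date, and an end date is derived when the data is read, from the load date of the next row for the same key. The row layout and column names (`hub_hk`, `load_date`, `end_date`) are assumptions made for illustration.

```python
from datetime import datetime


def with_derived_end_dates(sat_rows):
    """Return copies of the rows with end_date taken from the next row's load_date.

    Nothing is updated in place: the stored rows stay insert-only, and the
    open (current) row for each key gets end_date = None.
    """
    rows_by_key = {}
    for row in sorted(sat_rows, key=lambda r: (r["hub_hk"], r["load_date"])):
        rows_by_key.setdefault(row["hub_hk"], []).append(row)

    result = []
    for rows in rows_by_key.values():
        for current, following in zip(rows, rows[1:] + [None]):
            end_date = following["load_date"] if following else None
            result.append({**current, "end_date": end_date})
    return result


# Hypothetical insert-only satellite rows for one customer.
satellite = [
    {"hub_hk": "abc", "load_date": datetime(2024, 1, 1), "city": "Leeds"},
    {"hub_hk": "abc", "load_date": datetime(2024, 3, 1), "city": "York"},
]

for row in with_derived_end_dates(satellite):
    print(row["city"], row["load_date"].date(), row["end_date"])
```

In SQL the same derivation is typically written as a window function over the load date, partitioned by the hash key, inside a view.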

  • @juanlerch
    @juanlerch 2 years ago

    Interesting. Looks like a great solution, and I feel like I need to go deeper... I have a first question: in this model, what is a convenient way to implement more than one relationship to the same table? E.g.: Email -> to:Contact, from:Contact

  • @fb-gu2er
    @fb-gu2er 2 years ago

    My concern here is that this introduces a new ETL layer. You have to maintain the bridge table. What if my fact tables are very large, with several billion rows? That can be expensive to maintain. If I insert into or delete from the fact table, I have to do the same on the bridge. It can get problematic very quickly.

    • @francescopuppini
      @francescopuppini 2 years ago

      Hi Fernando, thanks for your comment. You are right: there is one additional step of maintenance to do. But then, you don't have to maintain the dozens (or hundreds) of ad hoc reports, which sometimes are outdated, but no one takes the responsibility to make obsolete. With the USS, you only maintain a live self service environment, useful to everyone. As for the system effort, I recommend saving the Puppini Bridge as a set of physical tables (one for each stage). Imagine your fact table of Sales has 100 million rows (lucky you, BTW!!). If you have an incremental load of that table, once it is finished, it will have 101 million rows. Then you need to update your table Sales_PBS (Puppini Bridge Stage). That table will also need to have 101 million rows. The Puppini Bridge as a full table, with all the stages, should be virtualized: a query that creates on the fly the UNION ALL of all the _PBS physical tables. I recommend maintaining the script of the view with DBT, because it makes it easier when a brand new table gets added to the USS, and some new key columns need to be added to the UNION ALL. It's hard, I agree. But self service BI, until now, was only a very limited success. With this approach, self service becomes really possible. I hope it helps. You can contact me on LinkedIn and we can talk further! 🙂
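
The "virtualized" bridge described in this reply, a view that is the UNION ALL of the physical `_PBS` stage tables, can be sketched as a small generator. The reply recommends maintaining the view script with dbt; the plain-Python version below is a swapped-in illustration, and the table names, key column names, `stage` column, and view name are all hypothetical.

```python
def puppini_bridge_view_sql(stage_tables: dict, view_name: str = "puppini_bridge") -> str:
    """Build a CREATE VIEW statement that UNION ALLs every stage table.

    stage_tables maps a physical stage table name to the key columns it has.
    Keys a stage does not have are emitted as NULL so that every SELECT in
    the UNION ALL exposes the same column list.
    """
    # Collect the full key list in first-seen order.
    all_keys = []
    for keys in stage_tables.values():
        for key in keys:
            if key not in all_keys:
                all_keys.append(key)

    selects = []
    for table, keys in stage_tables.items():
        columns = [f"'{table}' AS stage"]
        for key in all_keys:
            columns.append(key if key in keys else f"NULL AS {key}")
        selects.append(f"SELECT {', '.join(columns)} FROM {table}")

    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{body}"


# Hypothetical stages; adding a new table to the USS means adding one entry here.
stages = {
    "Sales_PBS": ["_key_sales", "_key_customer"],
    "Shipments_PBS": ["_key_shipments", "_key_customer"],
}
print(puppini_bridge_view_sql(stages))
```

Generating the statement from a single mapping addresses the maintenance point raised in the reply: when a new table joins the USS, its new key column is added to every branch of the UNION ALL automatically.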

  • @jonathanamen9706
    @jonathanamen9706 2 years ago

    40:19 A wild cat appears! Just needed to timestamp that for reference. Great video! Learned a lot!

  • @isinpresshenning
    @isinpresshenning 2 years ago

    Excellent presentation. This guy solves real-world problems. I liked the way he talked about unit of work for the links.

    • @Datavault
      @Datavault 2 years ago

      Thank you for your comment!

  • @isinpresshenning
    @isinpresshenning 2 years ago

    If each stack of colors represents a dataset, then the model is incorrect. The model contains just one link (red, purple and blue), and the link is coloured in yellow. That link is correct only if the values of the yellow fields are non-hubs, making the link peg-legged. I am missing the link from stack2 (purple, light purple, red, green and blue) and the link from stack3 (yellow and blue). Why didn't you model them? This model is not Jedi safe: the data cannot be recreated from the model as it was in the sources.

    • @nasirjamil4808
      @nasirjamil4808 1 year ago

      You got it right, Andreas. I think he was just demonstrating how to draw hubs/links/sats and use them as a star schema. My assumption is that creating a complete DV model wasn't the intention.

  • @SaifKhan-hi4df
    @SaifKhan-hi4df 2 years ago

    Data vault! Seriously! Cheap and fast? Hmmm. Quick and easy? Hmmm.

  • @guptaashok121
    @guptaashok121 2 years ago

    Can we have multiple satellites for a hub? If yes, do we keep the natural key from both satellites in the hub?

    • @Datavault
      @Datavault 2 years ago

      Hi Ashok, thank you for your comment. It's best to ask questions like this on the Data Vault Q&A Forum: forum.ukdatavaultusergroup.co.uk/ Many similar questions are being answered there, and it's a great place to learn about Data Vault.

  • @emanueol
    @emanueol 2 years ago

    Nice presentations. On the last use case, about the snapshots requested on demand by users: would they get loaded into independent tables, or simply be served by something like the Time Travel AT <timestamp> suffix on the SQL command, which Snowflake offers as a feature?

  • @TheYourbox
    @TheYourbox 2 years ago

    Before you continue to publish this blunder any further, I'd recommend reading Sorting and Searching by Donald Knuth. You can't cover anything and everything with hash keys, in particular not in databases. You simply do not know the hash function for arbitrary data. Hash keys as indices in compilers are perfect, but not in databases. You will not get a structured data model. Without defining cardinality you won't get anything useful with hash keys.

  • @davidedwards6898
    @davidedwards6898 2 years ago

    Hi Neil. This was a very interesting session. On the DevOps slide, the diagram showed Kanban flowing into Git, and the Wiki had some icon. What system are you using there?

  • @arghyaghosh4968
    @arghyaghosh4968 2 years ago

    Shouldn't the green hub have two satellites instead of one?

    • @isinpresshenning
      @isinpresshenning 2 years ago

      Yes, but only if the green field in stack2 represents more than just a reference / business key to the green hub.

  • @emanueol
    @emanueol 2 years ago

    I still struggle to accept satellites using the system load_date as part of the PK instead of a more business-oriented date, just because "we don't control" source dates. In my opinion it would be better to use the extract date, which I call the "real data date". I have done this in several data lake ingestion processes that grab one extra column from each table in an RDBMS such as Oracle. Instead of: SELECT * FROM <some table> do: SELECT sysdate extract_ts, t.* FROM <some table> t. Then, even if a batch takes a week to load and ends up loaded after more recent data, we always have a proper SCD2 "real data date". Dates are one of the most important business aspects, so congratulations on making a video that calls attention to this.
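
The out-of-order scenario this comment describes can be illustrated with a small Python sketch. The rows, column names, and dates are invented for the example: an older extract is loaded after a newer one, and the "current" value is then picked either by `load_date` or by the commenter's `extract_ts`.

```python
from datetime import datetime

# Hypothetical satellite rows for one customer. The first row was extracted
# earlier but, because of a slow batch, was loaded after the second row.
rows = [
    {"customer_id": "C-1", "city": "Leeds",
     "extract_ts": datetime(2024, 1, 1), "load_date": datetime(2024, 1, 10)},
    {"customer_id": "C-1", "city": "York",
     "extract_ts": datetime(2024, 1, 5), "load_date": datetime(2024, 1, 6)},
]


def current_city(rows, order_by):
    """Return the city from the row that is latest according to order_by."""
    return max(rows, key=lambda row: row[order_by])["city"]


print(current_city(rows, "load_date"))    # Leeds: the late-loaded, older extract wins
print(current_city(rows, "extract_ts"))   # York: the most recently extracted state wins
```

This only demonstrates the commenter's argument; it does not settle which date belongs in the satellite key. Keeping both columns lets the load date serve auditing while the extract timestamp drives the business timeline.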