Data Vault vs Traditional Data Warehouse Architectures

  • Published: Jul 22, 2024
  • Data Vault example: • Data Vault Model Tutor...
    Let's take a look at an overview of the Data Vault architecture for data warehousing. What are its goals as an alternative to Kimball's and Inmon's approaches? How does it compare to dimensional data modeling, and when should we consider using the Data Vault?
    ⏯RELATED VIDEOS⏯
    Kimball and Inmon: • Let's Compare the Kimb...
    Data Warehousing: • What Exactly is a Data...
    ------------------------------------------------------------------------------
    Data Podcast ►► open.spotify.com/show/4PWmW2g...
    Website ►► www.nullqueries.com/
    ------------------------------------------------------------------------------
    00:00 Intro
    00:28 Goals of the Data Vault
    01:27 Architecture
    02:10 Modeling Terms
    03:16 Modeling Example
    03:39 ETL
    04:08 Reporting
    04:32 Pros and Cons

Comments • 37

  • @nullQueries
    @nullQueries  3 years ago +4

    What do you think of the data vault compared to the dimensional data warehouse? Have you built both?
    For more Data warehouse options: ruclips.net/video/Tff34jj_V-0/видео.html

    • @JimRohn-u8c
      @JimRohn-u8c 2 years ago +1

      I would love to see more videos on how to implement this. Wish there was a Udemy course on how to implement this.

    • @norpriest521
      @norpriest521 2 years ago

      @JimRohn-u8c
      I love how he mentioned at the end that data vault may not be the best option for some scenarios.
      This shows that it's not about which one is better, but about which one is more reasonable to use in a specific scenario.

    • @willi1978
      @willi1978 1 year ago +1

      The idea of Data Vault sounds nice. But with an ETL automation tool like WhereScape, ETLs can be adapted very nicely too, and with less overhead.

    • @AlbertoSimeoni-wi9wj
      @AlbertoSimeoni-wi9wj 7 months ago

      I think the main problem is the computational power and time needed to build all the link tables. The fact that in the end you build a reporting layer that is in fact a dimensional model negates all that effort.
      The clear advantage is having the original keys in a staging area and avoiding changes to the extractors.
      But this was all designed with old row-based, disk-based databases in mind. With an in-memory columnstore database (SAP HANA) the link logic is not necessary; it can all be virtual. We have customers whose entire DWH/BI logic runs on the ERP database, with tables over 100 million rows, all with virtual modeling and no persistence.

  • @srikanthmanduri6429
    @srikanthmanduri6429 1 year ago +2

    One of the best videos out there regarding Data Vault modelling

  • @michaelenriquez_
    @michaelenriquez_ 3 years ago +2

    Thanks for making these kinds of videos. I really appreciate it; they are so useful for people like me who are learning about this.

  • @SjeetjeMineetje
    @SjeetjeMineetje 3 years ago

    Very well explained with good examples, this is very helpful!

  • @CrazySw3de
    @CrazySw3de 3 years ago +25

    I enjoy your videos quite a bit, just a few pieces of constructive criticism:
    I feel like a little bit more space between sentences to let the viewer digest what is being said/shown would help a lot.
    I like the clean look of the visuals, but the text labels etc. help make things easier to visually process.
    I think the visual example you did with the tables in this one was good. More real examples like that, showing what these concepts actually look like in the real world, even just as illustrations, would help drive the points home.
    Looking forward to seeing your channel grow, keep up the good work!

    • @nullQueries
      @nullQueries  3 years ago +9

      Thanks for the feedback. I'm trying to keep these as 5 minute overview videos, which is a challenge with some of these dense topics. Still trying to work out the pacing and how much detail to cram in. I have some ideas for more in depth, slower paced example videos to go along with the overviews. Just need to find the time!

    • @sued12345
      @sued12345 4 months ago

      @nullQueries For me you don't need to change anything. I mean, a short video will not replace proper training, but it helps a lot. Thank you for your effort.

  • @stephanzhechev141
    @stephanzhechev141 1 year ago +4

    This is a wonderful video. Unfortunately for me, I read 450 pages of Dan Linstedt's book introducing the Data Vault 2.0 architecture. This is, hands down, the worst book I have ever read. It is just horrible. However, it does contain about 7 good ideas, and this video captures all of them in a nicely presented, coherent way. Thank you!

  • @christopherbronson3275
    @christopherbronson3275 3 years ago +3

    Can I just say "Dimensional Datamart" is my favorite cyberpunk term

  • @Sam-gj4hf
    @Sam-gj4hf 3 years ago +1

    First time watching your videos and I absolutely love them! Subbed and liked. It'd be even more awesome if you could allow for an extra second to digest what you're saying. It's a lot of useful information. But even if you don't change anything, I'll still be a fan! Thank you for this!

  • @paulheadey265
    @paulheadey265 8 months ago

    My data engineering team has built many data vaults, but could never quite articulate to me, as a business leader, why. This has been very educational for me in explaining the benefits vs. the complexity. The pace at which business is changing and the number of new data sources becoming available make a data vault seem a more obvious choice. The business still gets its Inmon/Kimball model, but the foundational data structures in the vault provide more capability to make changes to it. That's what I inferred from this. I hope I am on the right track.

  • @vidak92
    @vidak92 5 months ago

    Really, the best explanation.

  • @ardee3949
    @ardee3949 3 years ago

    Great videos, very informative. Can you do a quick comparison between Redshift and Vertica? An overall evaluation?

  • @kabirsingh6582
    @kabirsingh6582 2 years ago

    Great content. Subscribed!

  • @danielolaru2496
    @danielolaru2496 2 years ago +7

    I went from the 3NF video to the dimensions one to this one and I feel like the only advantage I see is the dimension/Kimball one. This data vault seems just overkill. The storage will increase exponentially with all the extra keys needed and with very large storage of millions/billions of rows the performance I suppose will be greatly impacted when querying all those keys. Why is this an easier ETL solution? Am I missing something?

    • @TheR0yalBeast
      @TheR0yalBeast 1 year ago +2

      Hi Daniel, I think a key point to understand about the data vault is that it is exceptionally good at showing lineage. In my view it is only a good solution when you are dealing with many different data sources that need to be combined. A good example is a project I helped on that combined 10 different SAP clients at a manufacturing company. Each is customized slightly: the data may be stored in the same fact table, say sales, but with different indicators, flags, etc. modifying it. With the ETL solution you would do a one-off ETL to land it in a standardized table; however, in 4 years you will need to spend weeks of development trying to figure out where the mistakes are and what transformations occurred.

    • @SamuelLees-jv8ji
      @SamuelLees-jv8ji 1 year ago

      I see a lot of advantages with data vault but I just can't see it as an advantage over dimensional warehouse for my business context: e-commerce platform + CRM + billing system + marketing campaign system because all of these sources are quite static. Would be great to get feedback on this.

  • @yogeshbharadwaj6200
    @yogeshbharadwaj6200 2 years ago +1

    Very well explained. Thanks a lot.

  • @pedropradocarvalho
    @pedropradocarvalho 2 years ago +1

    Do you happen to have a transcript of this video, maybe posted in a blog post?

  • @bytedonor
    @bytedonor 5 months ago

    Well explained in pictorial format. But there should be some use case or an example so the newbies can understand more easily.

  • @pb78pb
    @pb78pb 3 years ago

    Hi. Thank you for this overview video. Do you have also a webpage where you can be contacted? Would be happy to get your thoughts about DWH automation (we are the creators of the Datavault Builder tool). Regards

  • @MrCutlash
    @MrCutlash 2 years ago

    Data vault is the curated layer in a data lake, and it has a very specific design... But really it's an Inmon/operational design.

  • @treelo11
    @treelo11 2 years ago

    This video is very good, but I need to clarify the ETL process. Suppose I have a few raw files yet to be stored. They are placed inside the data lake unmodified. From there, I insert the data as hub, link, and satellite tables into the raw vault, creating surrogate keys along the way. Is that right? And what does "since objects in each layer never connect to each other" (4:01) mean?

    • @ivani3237
      @ivani3237 1 year ago

      It means there are no hard foreign keys, but logically they are of course connected.
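
One way to picture the point about missing hard foreign keys is a loader that derives every key from the business key itself, so hubs, links, and satellites can each be filled without looking anything up in another table. The sketch below is only an illustration under assumptions that do not come from the video: the table and column names (`hub_customer`, `customer_id`, `product_id`, `customer_name`) are hypothetical, and the MD5 hash key follows a common Data Vault 2.0 convention rather than anything the video specifies.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys."""
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_raw_vault(rows, source, load_ts):
    """Split flat source rows into hub, link, and satellite records."""
    hub_customer, hub_product, link_purchase = {}, {}, {}
    sat_customer, seen_versions = [], set()
    for row in rows:
        cust_hk = hash_key(row["customer_id"])
        prod_hk = hash_key(row["product_id"])
        meta = {"load_ts": load_ts, "source": source}

        # Hubs: one record per distinct business key.
        hub_customer.setdefault(cust_hk, {"customer_id": row["customer_id"], **meta})
        hub_product.setdefault(prod_hk, {"product_id": row["product_id"], **meta})

        # Link: stores only hash keys; nothing enforces a foreign key here.
        link_hk = hash_key(row["customer_id"], row["product_id"])
        link_purchase.setdefault(
            link_hk, {"customer_hk": cust_hk, "product_hk": prod_hk, **meta}
        )

        # Satellite: descriptive attributes, appended only when they change.
        version = (cust_hk, row["customer_name"])
        if version not in seen_versions:
            seen_versions.add(version)
            sat_customer.append(
                {"customer_hk": cust_hk, "customer_name": row["customer_name"], **meta}
            )
    return hub_customer, hub_product, link_purchase, sat_customer
```

Because each key is computed from the row alone, the three kinds of tables could in principle be loaded in parallel or in any order; the relationship between them is logical, carried by matching hash values rather than by a database constraint.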

  • @galeop
    @galeop 2 years ago

    Really good video! Thank you!
    Quick question: what do you mean by "Business logic"? Do you mean that kind of logic that would be used with an MDM, to control whether new attributes about an entity should be added or ignored (eg if we have conflicting phone numbers for a customer)?

    • @nullQueries
      @nullQueries  2 years ago +2

      I'm using "business logic" to mean any time some sort of business rule alters source data. Sometimes it's explicit (e.g., phone numbers are always stored in a certain format). And sometimes it's just tribal knowledge (e.g., some sources call it a customerID and some a consumerID, but everyone in the office knows it as ClientID, so we'll convert to that naming so it's easy for users to consume). A good MDM should handle this, but it depends on how it's implemented, what it catches, and where in the architecture it makes the changes. For the DV, though, this would happen in the business vault layer, as the raw vault should reflect the sources.
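
The two kinds of rule mentioned in this reply can be sketched as a small transformation applied between the raw vault and the business vault. This is a rough illustration only: `KEY_ALIASES`, `to_business_vault`, and the `phone` column are hypothetical names, and the digits-only phone format is an assumed rule, not one stated in the video.

```python
import re

# Tribal-knowledge rule: different source names map to one agreed name.
KEY_ALIASES = {"customerID": "ClientID", "consumerID": "ClientID"}


def normalize_phone(raw: str) -> str:
    """Explicit rule (assumed): store phone numbers as digits only."""
    return re.sub(r"\D", "", raw)


def to_business_vault(raw_record: dict) -> dict:
    """Return a new record with business rules applied; the input is untouched."""
    result = {}
    for column, value in raw_record.items():
        name = KEY_ALIASES.get(column, column)
        if name == "phone" and value is not None:
            value = normalize_phone(value)
        result[name] = value
    return result
```

Returning a new record rather than editing the input mirrors the idea that the raw vault keeps reflecting the sources, while the renamed and reformatted values live only in the downstream layer.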

    • @galeop
      @galeop 2 years ago

      Thank you!

  • @mosa36
    @mosa36 2 years ago

    Nice video. Where can we learn about the other data warehouse formats?

  • @moverecursus1337
    @moverecursus1337 1 year ago

    a little bit complex

  • @thghtfl
    @thghtfl 1 year ago +3

    All those fancy pictures make zero sense without real live examples, just think about it

  • @juliustuckayo8973
    @juliustuckayo8973 2 years ago

    Great video. I stumbled upon this channel by accident today after reading an opinion piece by Bill Inmon on LinkedIn about why Snowflake isn't a data warehouse. After watching your video on Inmon vs. Kimball I immediately subscribed. Great content! What software do you use for the video animations? Anyway, you've got a new subscriber from Papua New Guinea. Keep it up, and happy Easter.

    • @nullQueries
      @nullQueries  2 years ago

      Thanks for the compliment! I use the Adobe suite for all illustrations and animations.