Dimensional Modeling to Data Vault Evolution

Поделиться
HTML-код
  • Опубликовано: 18 окт 2012
  • Moving from a dimensional model to a data vault model.
    www.datavaultacademy.com

Комментарии • 60

  • @bvenkateshx
    @bvenkateshx 3 года назад +3

    The moment you reverse the business key to reverse the relation.. it is a pragmatic way ti show evolution. Loved this.

  • @paulohsgoes1959
    @paulohsgoes1959 7 лет назад

    Short, straightforward and objective as every great work should be done. Congrats

  • @AttaKru
    @AttaKru 2 года назад

    this is one of the most important video for one who comes from a the kimbal world ...

  • @pamelasprague5814
    @pamelasprague5814 4 года назад +3

    Brilliant! So helpful in helping me to understand *both* modeling techniques.

  • @vieworacle
    @vieworacle 7 лет назад +2

    Crisp, clear and short introductory explanation of the how Data Vault has evolved from Dimensional Modeling. Personal thanks to the engaging and excellent presenter.

    • @TheYourbox
      @TheYourbox 2 года назад

      Dimensional modelling was SAP. A complete company without any idea how data models woks at all.,

  • @AnVinhNguyen
    @AnVinhNguyen 7 лет назад +4

    absolutely stunning explanation. Thank you a lot

  • @DataVaultAcademy
    @DataVaultAcademy  11 лет назад +6

    Data Vault is typically used for the main DW layer. This layer is integrated, non-volatile, time-variant, and subject oriented (@ Inmon) and auditable, agile, atomic etc. The business meaning implied by the "conformed" dimension in this example represents the common business rule transformations. These are intended to be completed within the DW layer as part of the target or BDW area. In the same layer there is also a RAW area (traceability).
    The Data Marts remain Dimensional Models.

  • @harfenueglebarfenuegle8192
    @harfenueglebarfenuegle8192 Год назад

    Thank you. This is the best explanation of the problem that Data Vault actually solves.

  • @MartijnEuyen
    @MartijnEuyen 9 лет назад

    Very good! Very clear story on DV modeling - thanks a lot for sharing.

  • @JeFFiNat0R
    @JeFFiNat0R Год назад +2

    Wow this was so easy to understand. Great story telling and explanation. Thank you!!

  • @habibjanjua
    @habibjanjua 3 года назад

    Best video on adding new source attributes

  • @beemac79
    @beemac79 7 лет назад

    Really well explained, great example!

  • @SolucaoEmAberto
    @SolucaoEmAberto 10 лет назад

    Fantastic! It just solved the DV to DM on the Dimensions!!! The dimension is a plain cross product between the satellites on the hub!

  • @barrybatsma8947
    @barrybatsma8947 6 лет назад

    Great explanation!

  • @haniattar
    @haniattar 4 года назад +1

    very well said.. I really enjoyed this sessions, and learned a lot.

  • @SathishKumar-gj6te
    @SathishKumar-gj6te 7 лет назад +1

    Thanks for the Great Introduction into Data Vault. The videos in the academy are great. Appreciate if you could put a logical structure of Videos that can be read on a sequential basis

  • @SSSoM3
    @SSSoM3 6 лет назад

    Great and easy to watch video, big thanks

  • @anishgopinath6801
    @anishgopinath6801 6 лет назад

    wowow!!! simple and perfect explanation

  • @robwozniak9912
    @robwozniak9912 Год назад

    Thank you. This made it real and practical for me.

  • @Nesikme
    @Nesikme 4 месяца назад

    Brilliant explanation !!!! Thank you very much

  • @chucka6415
    @chucka6415 Год назад

    I love this great presentation.

  • @swaminaidu9454
    @swaminaidu9454 4 месяца назад

    Awesome explanation. Great. Thank you

  • @vickysharma2949
    @vickysharma2949 8 месяцев назад

    wonderfully explained!!

  • @bhanu4j
    @bhanu4j 11 лет назад +10

    Thanks for sharing the concept. My queries are:
    How is this different from Snowflake?
    The detail lies is implementation because of more joins and hence performance for faster read.

  • @nickr2957
    @nickr2957 3 года назад

    Very good. Thank you.

  • @maverickrider4591
    @maverickrider4591 5 лет назад

    Thank You

  • @emilagaev5201
    @emilagaev5201 Год назад

    great explanation, thanks a lot

  • @shubhamg931
    @shubhamg931 3 года назад

    Awesome!

  • @nareshbuchuk
    @nareshbuchuk 5 лет назад

    This is Great quickly digestible

  • @gitanjalikhanna3009
    @gitanjalikhanna3009 6 лет назад +1

    Thanks very much for this clear explanation. Could you please also create a video explaining how FACTS are derived from DV or vice versa? Thanks

  • @amitshrestha2165
    @amitshrestha2165 10 лет назад

    Well done.

  • @aktharhussain6342
    @aktharhussain6342 10 лет назад

    v good. i have just started reading on data vaults
    akthar

  • @alexanderyelich817
    @alexanderyelich817 3 года назад

    THANKS!!!

  • @ranurobin
    @ranurobin 5 лет назад

    Good one

  • @gauravnist438
    @gauravnist438 9 лет назад

    It is really a great lecture

  • @Palthewall
    @Palthewall 6 лет назад

    It's very similar to EAV ( Entity Attribute Value) Model

  • @phaninany
    @phaninany 3 года назад

    How can we design the already connected dimension tables to the fact table in Data vault ? You were just explaining about the extended dimensions with vault

  • @PCoetzee24
    @PCoetzee24 6 лет назад

    Very clear explanation. Thanks! Simple Question: If I aggregate Sales upwards, via the link tables, to a unique customer contained within the hub, how do I know what version of a Satellite record to use. A unique Hub record can now have multiple Satellite records by having different Load Dates.

    • @PCoetzee24
      @PCoetzee24 6 лет назад

      An SCD2 is painful to maintain and use, but very effective at tracking history, as new versions have unique surrogates that can be associated with transactions. The hub-Sat model does not associate different satellite versions for the same hub record to a transaction. Only to the unique hub record.
      So If I use a Sat table to track changes to a specific customer contained in the hub, I cannot see what transactions occurred when during the life of the customer.
      Is my understanding correct??

  • @dsinghr
    @dsinghr 4 года назад

    best 11.23 mins i spent today

  • @aztec9424
    @aztec9424 4 года назад

    Pretty good explanation on DV. Though it talks about Hub and Satellites it doesn't seem to talk about the Links. Is there a separate video on that? Covering Links would make this topic complete in my opinion.

    • @HansHultgren_GA
      @HansHultgren_GA 4 года назад +1

      This particular topic didn't concern links however there are other videos in this channel that do. Also you can find more on the Genesee Academy youtube channel.

  • @rupanag
    @rupanag 4 месяца назад

    Can you please help me with these . Would be great if you can explain with some examples.
    1. Why link to link relationship is not recommended in RDV?
    2. In BDV bridge table, if we are storing only hash keys( not natural keys), then how in fact/dimension we are going to get natural keys?

  • @kimsung2384
    @kimsung2384 3 года назад +2

    I could see this becoming a performance nightmare ...

  • @botrrun9399
    @botrrun9399 4 года назад

    are there any modeling tools that support Data Vault?

  • @farwanqv
    @farwanqv 5 лет назад

    I didn't get if all or any of the smaller customer dimensions are type 2 dimensions?

  • @saravanan2011YOUable
    @saravanan2011YOUable 5 лет назад

    Perfect but It seems 3NF like hybrid approach.

  • @4suc6
    @4suc6 4 года назад +1

    This approach is no way fundamentally different from dimensional modeling. Just a variation. If a new hub comes, you still have to change all the link tables . Small changes may be are easier but Ralph never said you must prop everything in a giant dimension. This is more or less what DV improves, isolating attributes based on the same frequency of changes (satellites) and linking this to a dimension (hub) instead of direct link to the facts tables like classical MDM does. Makes sense, but packaging this like new modeling paradigm and calling it all different 😂😂😂😂. Presentation would benefit from talking about frequency of changes grouping. This is the essential part

  • @kshitijpathak8646
    @kshitijpathak8646 2 года назад +1

    I am not convinced that data vault modeling is the evolution of dimensional modeling. The main purpose for dimensional modeling ( denormalizing) is to reduce the number of joins, make it less expensive and hence get performance gain while querying reports in realtime. With the above data vault approach, you are again increasing the number of joins ( seems like you are again somewhat normalizing). Please let me know if my logic is correct.

  • @terminalrecluse
    @terminalrecluse 4 месяца назад

    Is this really the evolution of star or Snowflake schema’d DWH design? A data vault modeled DWH would require far more joins and likely perform poorly. What about that? Whereas a single dimensional model has fewer tables is easier to partition faster to query.

  • @eugenea8264
    @eugenea8264 3 года назад +4

    No, no, no, no... No on so many levels. And people cannot see the lecturer is making mistakes because he is giving no examples. The final structure you are making is turning dimension table into a fact table and you would have a record there for every customer. So basically if you have a million customers, you would have a million records there. If you have half a million customers from Melbourne you would end up writing "Melbourne" half a million times. Good luck with aggregating data by value.
    But that is only half a problem.
    Say you want to add a "gender" value to your customer. On your initial structure you would need to go through a million records of customers and add a new value to a new field. And you say it is difficult. And you offer an easier way, you create a new table and you add the gender there. So you create a new table with a key for every customer and then and a value of their gender, that would take exactly the same amount of calculation - it would be 1 million records processed. Where is a simplification?

    • @romeoalexander6175
      @romeoalexander6175 2 года назад

      Melbourne would not be written a million times. What this video did not mention is that there would be an intermediate link table. That is, there would be a customer-location link table with a million records linking those customers to the same (integer) id. Then there is a final location SAT table that contains a single record associating that id with “Melbourne”.
      As for your second point, the benefits come from modularity. Instead of having to modify a “dim_customers” file every time you want to add new attributes, you can write brand new “sat_customer_location” and “link_customer_location” files without modifying the original files. In a fast paced agile environment, writing new files is much easier than modifying existing files.

  • @TheYourbox
    @TheYourbox 2 года назад

    Don't you understand how ridiculous dimensional modelling is compared to 3rdNF. Dimensional modelling is simple nonsense. Data Vault is a fair good concept, but very related to 3rdNF. I like Data Vault, because you can do things in parallel, but dimensional modelling is simple pure nonsense. One of those unprecedented failures in IT.