Thanks! I have a question: I have a column chart with the last 3 years of data, but 2021 only has YTD through January. My tooltips work very well for the full-year columns, but on the YTD 2021 column the tooltip result (% variance from the previous year) is incorrect; it is not comparing against last year's YTD. How could I do it?
@@marenrongen6019 Thanks Maren, I know this function but it is not working. The tooltip with % variation is still comparing against the full previous year and not YTD last year. Surely it must be a simple thing! I'll keep working on it... thanks!
So at my work they're thinking of moving towards Power BI exclusively, because we're using the MS platform for many other things. However, the manager is thinking of using a multidimensional cube as the primary data source. Is this a good idea? I voiced my opinion and said that it wouldn't be a good idea, and that Tabular integrates much better with Power BI. Any thoughts?
I think in the end the Tabular model is more flexible to work with within Power BI. Cubes need to be built and maintained outside Power BI. Also, as a user you can be more limited in the way you slice 'n dice your metrics if the cubes are not well thought out; it's difficult to add stuff within Power BI later, as far as I know. With the Tabular model, as a business user you can easily add more data to the model. Anyone else have a different opinion here?
I recently switched companies after working at my previous one for 3 years. I came here only to find that a lot of reports are built without any data-modelling concepts being followed. No normalisation either. People are dealing with many-to-many relationships, joining on the wrong keys, creating summary tables to avoid duplicates, and basically using SWITCH measures for filter propagation. They have built a huge number of measures. The date table has an index mapped to MTD/YTD (again, calculated tables). I don't know how to clean up these reports! People are seeking my help, but I have never worked in this kind of setup, and they are always busy dealing with many-to-many issues! I suggested rebuilding the reports from scratch, but the client deadlines are too hectic. What can be done now? P.S. Thanks for this video
The best thing I can suggest is to use VPAX to help get the data you need in a more usable form, but it's still gonna be a big undertaking. You might consider building a data model from scratch and then rebuilding the reports, if that's possible.
I inherited a solution very similar to this when I was at Anheuser-Busch, though it used the slicer and SWITCH to toggle between volume of beer barrels and $ value of beer sales. I'm trying to figure out a way to use it now to have a slicer like that switch my x-axis from the last 6 weeks to the last 6 months. I'm converting from a Tableau dashboard that is able to use a measure to change the x-axis from a format of weeks to a format of months. I'm thinking I have to create week/month calculated tables and combine them using an index in place of dates, but I'm still not finding a way to dynamically change the formatted text on the x-axis. Right now I'm using buttons to show/hide the relevant view, but it'd be great if there were a way to dynamically change the x-axis. Though as I wrote this I realized maybe I should look into drilldowns, but they probably want a simple way to change every visual at once.
But if you have your data in multiple fact tables, won't you have too many rows in a macro fact table? And if you have a model with 30 tables, it's impossible to build a star schema model.
I have a question that is not related to the video, but I can't help trying! I'm loading data from a FOLDER (as soon as I add new files to that folder, Power BI updates), and I add a new EXCEL TABLE to that folder. For example, the 2020 data:
Name | Status | Value
Luiz | bad | 1
Claudia | good | 2
After a while, I added the 2021 data:
Name | Status | Value
Luiz | Great | 3
Claudia | Lousy | 4
Power Query currently returns it to me like this:
Name | Status | Value
Luiz | bad | 1
Claudia | good | 2
Luiz | Great | 3
Claudia | Lousy | 4
However, I would like to know if it is possible to automatically replace the Status (as soon as a new table containing the same names or IDs is added), prioritizing the data from the most recently added table, while at the same time summing the values, presenting me with a result like this:
Name | Status | Value
Luiz | Great | 4
Claudia | Lousy | 6
I mean, without using an incremental refresh, is it possible to update the text of specific columns, sum the others, and keep the data on a single line without duplicating the names? Sorry for my bad English, I'm Brazilian! HAHAHAHA
You can create a new query that selects the latest file from the folder. Remove the Status field from the appended table, then group by Name and aggregate (sum) the Value. Then you can merge the latest table with the aggregated one and extract the Status.
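The steps above can be sketched in plain Python (a minimal sketch using the names and values from the question; in Power Query this would be a Group By step plus a Merge against the latest-file query):

```python
# Hypothetical yearly extracts, oldest first; the last entry plays the role
# of the most recently added file in the folder.
tables = [
    [("Luiz", "bad", 1), ("Claudia", "good", 2)],     # 2020 data
    [("Luiz", "Great", 3), ("Claudia", "Lousy", 4)],  # 2021 data (latest)
]

# Group by Name and sum Value across every file (the aggregated query).
totals = {}
for table in tables:
    for name, _status, value in table:
        totals[name] = totals.get(name, 0) + value

# Take Status only from the most recently added file (the "latest file" query).
latest_status = {name: status for name, status, _value in tables[-1]}

# Merge the two: latest Status, summed Value, one row per name.
result = {name: (latest_status[name], total) for name, total in totals.items()}
print(result)  # {'Luiz': ('Great', 4), 'Claudia': ('Lousy', 6)}
```

This gives the desired single row per name without an incremental refresh; each new file only changes which table is "latest" and adds to the running sums.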
Great video Patrick! I want to publish my reports to my company's Power BI service through the gateway, and I want them to stay DirectQuery. When I create a star schema over DirectQuery datasets, does the Power BI report become in-memory (Import), or can it still be DirectQuery?
Great video, explained very clearly! Bananas as usual 👍 A nice remark for the flat-file scenario would be "EXCEL is not a database" 😋 (as many people currently use it that way)
As a Data Warehouse person, this is the best video that I have ever seen that explains OLAP and Data Warehouse best practices
The simpler DAX and performance are so true. So often when people ask me for help with a measure that's not working well it comes back to data models that don't use the Star Schema.
Totally agree with you on this!
This channel is so straightforward, fast, easy to understand and always has great examples. Thank you so much!
I love hearing "Kimball" and "Star Schema: The Complete Reference" mentioned in the video.
Do you mean this video is so repetitive and random it looks like a promotion for Ralph Kimball's book? Strangely Patrick published another video a few days ago where he randomly blurted out Kimball's book.
@@peterbogdanovich4043 No, but I don't understand your question. I like to hear about Kimball in a self-service context.
Personally I think the Star Schema book is better than the Kimball book.
I agree with Asen Kripov (the previous commenter). If it wasn't for you and your channel, I probably wouldn't be using Power BI. You have a very friendly way of bringing complex, dry material to life. I've only been using it for two days, and am already making very useful visualizations and reports. In a few days I'll have a portfolio set up to pitch the product to my boss and have him buy all the necessary subscriptions. Thanks for your work (which you don't make it look like work, but rather fun time) !
I love your presentation style. Your changing tone, expressions and hand gestures all contribute to making your videos engaging and entertaining. The content is great too; you're my first port of call when I have a Power BI question.
Great video! I popped this on while driving home and the concept makes perfect sense (coming from zero experience using star schemas).
Thank you. I had been thinking along the same lines until I recently came to learn that star schemas don't scale for very large models (especially dimensions with hundreds of millions of rows), where the advice, for memory-footprint reasons, is to flatten the tables. For many other reasons, though, the star schema definitely works best.
Snowflake exists for your use case. Flat is bad.
I might be getting hired onto a helpdesk that specifically handles Power BI issues; these videos are invaluable, especially for my interviews. Thank you.
Great content. I'm still very much new to PBI and I'm enjoying learning all of these new things.
This is the only video that convinced me to use star schema. Thanks a lot!!!
This is best explanation I've seen on star schema. Thank you!
Thanks Patrick. I am very new to Power BI/Data Modelling, and you made my day by making this tutorial easy to understand. Thank you.
Awesome video as always, Patrick! Maybe it would be useful for many people if you explained how to break one huge table into a star schema.
You can consult this video for your question
ruclips.net/video/vjBprojOCzU/видео.html
This tutorial is dope. Thank you, sir !
Thank you, your channel helps me so much in understanding Power BI
Hi Patrick! If the company gives us one big table, do you recommend turning it into a star schema in Power Query, or just working with the table as-is? Thanks!
Wonderful explanation... my team improved refresh performance from nearly always failing, due to the sheer volume of data, to a 99% pass rate after changing our data model
Thank you for this! I've known that star schema data models are considered best practice in Power BI since I started learning how to use it about 1 year ago. However, I could never fully articulate exactly why. This video helps, giving a great, pithy explanation. Keep up the great work!
This is the single most important thing that turns basic Power BI report design into real, workable models. Partitioning data is also a must for this. Dimensions can be wide, with lots of columns, but they don't tend to be as big as fact tables with their many rows. So no large facts with text-based data; aggregation data only in facts; slice and dice by columns in dimensions.
Hi Patrick, I did not know that the differences were so big. Thanks for the simple explanation.
I usually use the waterfall model, but it’s basically the same thing as star schema just arranged differently with all dimension tables aligned and then a fact table centered below.
Thanks for another awesome video Patrick! Would be awesome if you could also show us the best way to turn a flat file into a star schema model.
See this video ruclips.net/video/vjBprojOCzU/видео.html
As a BI developer, I always use a star schema in both the DWH and the Power BI model, but in fact I always end up adding supporting tables or some aggregated tables that ruin everything and make my reports slower. Anyway, the star schema always has good benefits when it comes to calculations in Power BI, compared to the single-table method.
How do you feel about the Inmon approach? Kimball vs Inmon which is better for Power BI?
Nice, sent this video to my customer.
Thank you, Patrick. Please tell me, how can I kill a refresh of my PBI report on the PBI service?
Excellent content! I had a basic knowledge of star schemas from Databases 101/102 but this really helped put things into perspective. Love your energy!!!
Additional benefit of a star schema (and usually my main argument vs a tool like Tableau): if you have one big table, some measures might give unexpected results. (Unexpected results are interpreted by the business as incorrect results :)) Suppose you have your gigantic sales table with the customer data alongside it. Now you want to calculate the average age of the customers. Whoops, that will be incorrect, since some customers will have made multiple sales and thus their age is repeated as well. So you get a weighted average instead. Whereas if you have the customer data separate in a dimension, calculating the average age is easy-peasy. This ties in with your argument of easier DAX, because you can fix the average-age measure on the one big table, but it's more complex.
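The average-age trap described above is easy to reproduce; here is a minimal sketch using Python's built-in sqlite3 module, with made-up customer names, ages and sales (not data from the video):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical "one big table": customer attributes repeated per sale.
cur.execute("CREATE TABLE big_sales (customer TEXT, age INTEGER, amount REAL)")
cur.executemany("INSERT INTO big_sales VALUES (?, ?, ?)", [
    ("Ann", 30, 10.0),  # Ann made three sales, so her age appears three times
    ("Ann", 30, 20.0),
    ("Ann", 30, 15.0),
    ("Bob", 60, 5.0),   # Bob made one sale
])

# Naive average over the flat table: weighted by the number of sales.
weighted = cur.execute("SELECT AVG(age) FROM big_sales").fetchone()[0]
print(weighted)  # 37.5 -- (30+30+30+60)/4, not the average customer age

# Star schema: each age lives exactly once, in the customer dimension.
cur.execute("CREATE TABLE dim_customer (customer TEXT PRIMARY KEY, age INTEGER)")
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)", [("Ann", 30), ("Bob", 60)])
correct = cur.execute("SELECT AVG(age) FROM dim_customer").fetchone()[0]
print(correct)  # 45.0 -- (30+60)/2, the true average customer age
```

The fix on the flat table would be to de-duplicate customers first, which is exactly the extra complexity the comment mentions; the dimension table makes the simple AVG correct by construction.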
I agree with that. With a big flat table that holds multiple facts, it can be really difficult, or even wrong, to just add up the numbers. You also really need to pay attention when counting unique values, for example. So I always stick to the star schema indeed.
I agree that the business often jumps to conclusions. Power BI is labelled a "bad" tool very quickly, and I often have to explain what happens behind the scenes, even to BI teams.
Great video and great explanation. Thank you!!!
Just simplifying the DAX and improving the performance report are enough to use Star Schema
3:08 hey that's eight 😀. great insight. thank you
Confused about definitions for primary and surrogate keys. Seems like the text displayed calls surrogate key "unique ID for an object in the database," whereas you call surrogate key "primary key defined w/ in the data warehouse," and the alternate key "primary key from the source system."
What's the best practice for multiple date dimensions (such as project start date and close date, etc.)?
Saving this video to show to my manager who told me star schemas would not be needed in Power BI; the same manager who is telling me that she is also going to replace Excel with Power BI 😂
really enjoyed this!
Great video as always! I actually think another main point relates to your recent video on multiple fact tables: if you have one big table, how are you going to include your budget? Most budgets I've worked with don't have the same granularity (months, not days, for instance), so you really need a dimension table to "connect" the two.
Patrick, I'm curious about the DAX in the flat table. I tried it out; however, it doesn't summarize the same as nesting DATESYTD in a CALCULATE using the date table in a star schema. It summarizes the same as just using SUM([field]).
I realize the star schema is the way to go. I just would like to be able to demonstrate this DAX as an example. Thanks in advance!
Awesome Patrick! * Schema is awesome!!!
Does the star schema have to exist in the source (like SQL), or does building it in M also improve query times? Thanks, great video.
Same question: if I have a flat CSV file, is it recommended to turn it into a star schema in Power Query?
You can also create a star schema with power query / M. It might even prolong the query duration if you have to perform some heavy transformations, but once you have the data in your model everything will improve - report performance, maintenance (due to simpler DAX) and ultimately user experience.
@@Alex-ol4bv Yes, I think there will be a tradeoff. You may have longer query times if you use a flat file and do the transformation in Power Query, but the other improvements will still apply with a star schema.
Patrick did a video on this back in 2018: ruclips.net/video/vjBprojOCzU/видео.html&ab_channel=GuyinaCube
Hi Patrick,
Thank you for the video; it gave me a good understanding. I build Power BI reports for my organisation and I am pondering which schema I should use: star or snowflake?
I mostly have dimension tables only: user, user login, points earned, points redeemed and a few others, and for all the tables the users are the focal point (in the sense that I create relationships using usernames). There are just a couple of basic calculations to be done on each dimension, for example unique login counts, total points earned, etc. Nothing complex.
I do have an orders table which gives me order details like product, price, purchase date and so on. But let me put it this way: if there are 100 users, only 80 will have logged in; of those 80, only 70 will have earned points; of those 70, only 50 will have redeemed their points; and only those 50 users who ordered products will have data in the orders table. So in this case, should I create a star schema with the orders table at the center, or a snowflake schema?
Right now I don't think my model mimics either a star or a snowflake schema. My report is mostly used to get the aggregated details of each user. I would like to know your input on this.
Question: I've heard some people say that snowflaking my dimension tables is bad practice. Is this true? For example, should having a product table, then a product sub-category table, then a product category table be avoided? I don't get it. Wouldn't that actually help me reduce my tables?
Although a snowflake model will work, it's best practice to flatten your model into one dimension (Product) instead of multiple product-dimension tables. Downsides of a snowflake model are usability (more tables to go through) and possibly performance (multiple lookups needed instead of one lookup to the fact table). With Power Query you can flatten the dimension tables using the Merge function. Don't forget to switch Enable Load off so the unnecessary tables won't load into the model.
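For illustration, flattening a snowflaked product dimension amounts to walking the lookup chain once per product and storing the result in a single table; a minimal Python sketch, with all IDs and names made up:

```python
# Hypothetical snowflaked dimension tables.
categories = {1: "Bikes"}                                   # category_id -> name
subcategories = {10: ("Road Bikes", 1)}                     # sub_id -> (name, category_id)
products = {100: ("Road-150", 10), 101: ("Road-250", 10)}   # product_id -> (name, sub_id)

# Flatten into a single Product dimension, one lookup chain per product.
# This is conceptually what Merge does in Power Query.
dim_product = {}
for pid, (pname, sub_id) in products.items():
    sub_name, cat_id = subcategories[sub_id]
    dim_product[pid] = (pname, sub_name, categories[cat_id])

print(dim_product[100])  # ('Road-150', 'Road Bikes', 'Bikes')
```

After flattening, the fact table needs only one relationship (to `dim_product`) instead of a chain of three, which is the usability and performance win described above.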
Not a big deal. It might help if you have some facts only at the category level.
Holy cats, thank you! Well-articulated and straightforward.
Glad it was helpful!
You guys are awesome!
Magnificent work, Patrick! The video is awesome and you helped me understand the schema better, thank you! Also, at minute 7:13 you pointed at the total duration measure in seconds, which was 2069; I think that is a mistake, and I am curious how many rows were processed in those seconds.
Hello Patrick, great stuff! It would be nice if you could show us how to do heat maps (5x5 quadrants) one of these days. Thanks!
a superb video!
Is there a video on how to convert (import) large flat files into a star schema? 'Cause that's my issue!
I usually do all the joins in Synapse SQL and then just import the result into Power BI, writing the query in the advanced options. For now this is working fine, but I keep thinking there could be a better way to improve efficiency. Do you have any advice or ideas?
Super helpful!
Newbie question, but how do you create this star schema?
If you're given a kitchen-sink file from the client, should you use the Power Query editor to create the star schema (i.e. create tables using distinct values, add an index, join the index back to the kitchen sink)? Or is it best to just stick with the kitchen sink, given that a refresh having to process all the created tables and joins adds time? Or, alternatively, would you recommend running the kitchen sink through a stored SQL script or a Python script to create the schema, then loading to PBI?
To use star schemas and improve performance, do you need to pull the schema in as a star, or can you define the star from a single "flat" file? I presume if you try the latter and build the star inside Power BI, refresh times will not improve? So using a star schema relies on having a formal data model in a data warehouse?
Do you define your dimension tables (apart from date table) in the power query data model or in the report layer data model?
Hi, Patrick! I'm Diego from Buenos Aires, Argentina. I'd like you to recommend some web pages (or maybe right here) for learning DAX, Power Pivot and Power Query as well. Sound job explaining the star schema benefits! Thank you very much indeed.
My bane is header/detail tables. Usually Transaction to Line, but what if I have many Promotions per Line? :)
Would be great if you could create a video on how to create the ultimate Calendar table.
Great explanation of what is best practice, but I'll play devil's advocate for a minute. Someone could counter your organization reason by saying to just create folders. And your DAX and performance arguments focused on the date dimension. It seems like someone could say: fine, I'll just separate out the date dimension but leave all the other attributes in the one big f___g table. So do you have any performance and DAX example reasons that don't use the date dimension? Thx in advance for reviewing and responding.
Hello everyone, I need some help. My company is purchasing other companies, so I will have multiple data sources. I've done research on star vs snowflake schemas. Which schema should I use when ingesting data from the other companies into a data warehouse, which will be the main repository for all data? Microsoft best practice says star schema, but snowflake allows for multiple fact tables. What do you all think? Thank you.
You forgot one obvious reason: Normalization. Update a single row in a DIM table, and all facts are changed at once. Do the same in OBT (one big table) and you need to update potentially thousands or millions of rows. Not only is OBT much slower, but it's also much more error-prone, as the schema doesn't have consistency/normalization built in.
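The update-cost difference described above is easy to demonstrate with Python's built-in sqlite3; everything below (the table layouts, the customer name, the 1,000 fact rows) is made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Star schema: the customer name is stored exactly once, in the dimension.
cur.execute("CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE fact_sales (customer_id INTEGER, amount REAL)")
cur.execute("INSERT INTO dim_customer VALUES (1, 'Acme Ltd')")
cur.executemany("INSERT INTO fact_sales VALUES (1, ?)",
                [(float(i),) for i in range(1000)])

# One big table: the same name is repeated on every fact row.
cur.execute("CREATE TABLE one_big_table (customer_name TEXT, amount REAL)")
cur.executemany("INSERT INTO one_big_table VALUES ('Acme Ltd', ?)",
                [(float(i),) for i in range(1000)])

# Renaming the customer touches a single row in the star schema...
cur.execute("UPDATE dim_customer SET name = 'Acme Inc' WHERE id = 1")
star_rows_touched = cur.rowcount

# ...but every matching fact row in the one big table.
cur.execute("UPDATE one_big_table SET customer_name = 'Acme Inc' "
            "WHERE customer_name = 'Acme Ltd'")
obt_rows_touched = cur.rowcount

print(star_rows_touched, obt_rows_touched)  # 1 1000
```

The fact rows never change in the star schema; they keep pointing at the same key, which is also why the flat design is more error-prone: nothing stops a partial update from leaving two spellings of the same customer behind.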
Again great!
Does it hold true when using Excel, Data Models and Power Pivot? Thanks!
Absolutely.
Excellent! And when you have to link 2 fact tables (PO and Invoice), what is the ideal schema?
thank you, you are the best
I think that would depend on your report. What is the question that you need answered? From there you can start thinking on how to separate your data into the facts and dimensions.
Is Power BI performance affected on the free version? I wonder if this could be another reason for such long periods of time when updating the data sources and refreshing the power queries.
Got a question for you: what if I have multiple fact tables and I need to relate them to the same dimensions? I can't/don't want to append my fact tables because they contain different data.
So I do a lot of ad hoc JIRA reporting in Excel using Power Query, and I have been looking into ways to increase the speed of the ETL process (I'm running at about an hour per load, and Excel becomes useless during that time). I know I could convert my processes to a star schema, but will I see any gains in performance? (Just wanted to ask before I rebuild anything.)
Hi, I have an Oracle table with 40 million records. If I import the table, it takes more than 30 minutes to load; if I use DirectQuery, I'm unable to use multiple DAX queries. Please let me know what I should do.
Does the common data model follow the star schema?
Awesome comparison! Actually, I don't know why people don't design star schema data models.
Really well made video
Thank you for using dark mode when viewing the MS docs in your video :)
"I'm not that good at drawing" - Draws perfect star
Hi Patrick, I have a case where I already have a SQL view. Can I skip creating a star schema in SQL Server and just create one in Power BI? Or should I still create a star schema from the SQL view and then another star schema in Power BI? Thank you in advance.
Thanks! I have a question: I have a column chart with the last 3 years of data, but 2021 only has YTD through January. My tooltips work very well for the full-year columns, but on the YTD 2021 column the tooltip's % variance from the previous year is incorrect — it isn't comparing against last year's YTD. How can I do that?
You could use the DAX SAMEPERIODLASTYEAR function to compare the current selection with the same period a year ago.
@@marenrongen6019 Thanks Maren, I know this function but it's not working. The tooltip with % variation is still comparing against last year's total, not last year's YTD. Surely it must be a simple thing! I'll keep working on it... thanks!
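For what it's worth, the calculation this thread is after — YTD against the *same period* last year, not against last year's total — can be sketched in plain Python (the sample data is made up; in DAX the analogous building blocks are DATESYTD for the year-to-date window and SAMEPERIODLASTYEAR for the shifted comparison):

```python
from datetime import date

# Made-up daily sales keyed by date.
sales = {
    date(2020, 1, 15): 100, date(2020, 2, 10): 150, date(2020, 12, 5): 200,
    date(2021, 1, 20): 120,
}

def ytd(as_of):
    """Sum of sales from Jan 1 of as_of's year through as_of (inclusive)."""
    return sum(v for d, v in sales.items()
               if d.year == as_of.year and d <= as_of)

as_of = date(2021, 1, 31)
current = ytd(as_of)                                        # 2021 YTD
prior = ytd(date(as_of.year - 1, as_of.month, as_of.day))   # 2020 YTD to the SAME cutoff
variance = (current - prior) / prior                        # YTD vs. YTD, not vs. full year
```

The key line is the `prior` calculation: it truncates last year at the same cutoff date instead of summing the whole year, which is exactly what the tooltip in the thread was failing to do.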
What if you don't have a data warehouse? Can you show how to create a star schema from an excel flat file?
He did a great video on that already in this channel - Power BI Tutorial | From Flat File To Data Model - ruclips.net/video/vjBprojOCzU/видео.html
Honestly, in big projects we can never have a pure star schema. We have more and more facts and dims.
Why does a star schema produce faster refreshes?
So at my work they're thinking of moving towards Power BI exclusively because we're using the MS platform for many other things. However, the manager is thinking of using a multidimensional cube as the primary data source. Is this a good idea? I voiced my opinion and said that it wouldn't be, and that tabular integrates much better with Power BI. Any thoughts?
Oh my. Dreadful idea 😂
I think in the end the tabular model is more flexible to work with within Power BI. Cubes need to be built and maintained outside Power BI. Also, as a user you could be more limited in the way you can slice 'n dice your metrics if the cubes are not well thought out. It's difficult to add stuff within Power BI later, as far as I know. With the tabular model, as a business user you can easily add more data to the model. Anyone else have a different opinion here?
@@WaldemardeHaas I think you’ve raised a separate issue - should business users be allowed to add more data to a model?
I recently switched companies after working for 3 years, only to find that a lot of reports here are built without following any data modelling concepts — no normalisation either. People are dealing with many-to-many relationships, joining on the wrong keys, creating summary tables to avoid duplicates, and basically using SWITCH measures for filter propagation. They have built a huge number of measures, and the date table has indexes mapped to MTD/YTD (again, calculated tables). I don't know how to clean up these reports! People are seeking my help, but I've never worked in this kind of setup, and they're always busy dealing with many-to-many issues! I suggested rebuilding the reports from scratch, but the client deadlines are too hectic. What can be done now?
P.S thanks for this video
The best thing I can suggest is to use VPAX to help get the data you need in a more usable form. But it’s still gonna be a big undertaking. You might consider building a data model from scratch and then rebuilding the reports if that’s possible
@@michaelmays9415 thanks for the suggestion Michael. I felt the same of needing to create a data model from scratch and build a new report.
I inherited a solution very similar to this when I was at Anheuser-Busch, though it used a slicer and SWITCH to toggle between volume of beer barrels and $ value of beer sales. I'm now trying to figure out a way to use a slicer like that to switch my x-axis from the last 6 weeks to the last 6 months. I'm converting from a Tableau dashboard that uses a measure to change the x-axis from a weekly format to a monthly format. I'm thinking I have to create week/month calculated tables and combine them using an index in place of dates, but I'm still not finding a way to dynamically change the formatted text on the x-axis. Right now I'm using buttons to show/hide the relevant view, but it'd be great if there were a way to change the x-axis dynamically. Though as I wrote this, I realized maybe I should look into drilldowns — but they probably want a simple way to change every visual at once.
First time here... Apart from the excellent sharing, does any one notice that Patrick looks like Will Smith? 😁
Nice, now I get it... Hardee's is a star schema.
But if you have your data in multiple fact tables, won't you have too many rows in a combined fact table? And if you have a model with 30 tables, it's impossible to build a star schema model.
He's a Kimball Wizard
I have a question that is not related to the video, but I can't help trying!
I'm capturing data from a FOLDER (as soon as I add new files to that folder, Power BI updates), and I add a new EXCEL TABLE to that folder. For example:
2020 data
Name Status Value
Luiz bad 1
Claudia good 2
After a while, I added the 2021 data that is
Name Status Value
Luiz Great 3
Claudia Lousy 4
Power Query currently returns to me like this:
Name Status Value
Luiz bad 1
Claudia good 2
Luiz Great 3
Claudia Lousy 4
However, I would like to know if it's possible to automatically replace the Status (as soon as a new table containing the same names, or IDs, is added), prioritizing the data from the most recently added table, while at the same time summing the Values, presenting me with a result like this:
Name Status Value
Luiz Great 4
Claudia Lousy 6
I mean, without using incremental refresh, is it possible to update the text of specific columns, sum the others, and keep the data on a single line without duplicating the names?
Sorry for my bad English, I'm a Brazilian guy! HAHAHAHAHAHAHA
You can create a new query that selects the latest file from the folder.
Remove the status field from the appended table, then group by name and aggregate (sum) the value.
Then you can merge the latest table with the aggregated one and extract the status
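The logic described in this thread — latest file wins for Status, Value summed across files — can be sketched in plain Python (the rows copy the example above; in Power Query the equivalent would be a Table.Group plus a merge with the latest file's query):

```python
# Sketch: fold the folder's tables, oldest -> newest. Later tables
# overwrite Status; Value accumulates across all tables.
files = [
    [{"Name": "Luiz",    "Status": "bad",   "Value": 1},
     {"Name": "Claudia", "Status": "good",  "Value": 2}],   # 2020 data
    [{"Name": "Luiz",    "Status": "Great", "Value": 3},
     {"Name": "Claudia", "Status": "Lousy", "Value": 4}],   # 2021 data (newest)
]

result = {}
for table in files:
    for row in table:
        entry = result.setdefault(row["Name"], {"Status": None, "Value": 0})
        entry["Status"] = row["Status"]   # most recently added table wins
        entry["Value"] += row["Value"]    # values are summed
```

This reproduces the desired output from the comment: Luiz / Great / 4 and Claudia / Lousy / 6, one line per name.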
Great video Patrick! I want to publish my reports to my company's Power BI Service gateway, and I want them to stay DirectQuery. When I create a star schema that has DirectQuery datasets, does the Power BI report become in-memory (Import), or can it still be DirectQuery?
Hi Patrick ❤️❤️
Great video, explained very clearly! bananas as usual 👍
A nice remark on the flat file scenario would be "Excel is not a database" 😋 (as many currently use it that way)
Well, the 4 WHYs are true for star schema vs. flat file, but star vs. snowflake is... more complicated.
A very informative video! 😄
Make sure you check out Edu4Sure's video on "Schemas in Power BI" ⬇️
ruclips.net/video/c6C7yEwGCf0/видео.html
I thought Will Smith was explaining Power BI....
No snowflake for you!
If only there were a Chinese-language channel this good. I'm... here now, and working hard on it.