Hi Ruth, another Friday 🙂 There are Excel functions in power bi DAX functions (kept for Power Pivot engine compatibility), which shouldn't be prioritised over native DAX functions. (DAX functions are much more optimised in it's internally logic when compared to same excel function) Another reason I could think of is that because DAX runs on Analysis services, there will be a guid column added in memory when there's no primary key in the table, and count rows is probably programmed to read the unique column count , which is the cardinality of the table.. I read it somewhere, but not able to recall exactly about this.. Anyways, that's the work for weekend.. 🙂👍 Have a great weekend. Thanks
Actually, blank cells are treated as an empty string cell, and real blanks - cells with no data in them, are marked with null in power query. You can try to load some data into an excel worksheet and see that those empty strings contain ' in it and they are also treated as text. :) It's quite confusing though. The same is when you try to get an empty cell in Excel from let's say IF statement. IF (condition, something,"") will return an empty string (empty text field) and not blank if it's false. Hope that makes sense.
SSAS Tabular Model (including PowerBI version), creates a hidden column for each table (not available to the user) in format like: Product Category-RowNumber-2662979B-1795-4F74-8F37-6A1BA8059B61 (it can be seen to connect e.g. in daxStudio to the model and querying DMV. And probably counting is done on it. Which probably affects the speed. In addition counting blank, this may by faster than COUNTBLANK () CALCULATE ( COUNTROWS ( 'Table' ), KEEPFILTERS ( 'Table'[CloumnWithBlank] = "" ) )
Hi Ruth! How are things going? Great video as always! I think that when it says COUNTROWS "doesn't consider BLANKs contained in any column of the table" means that it treats nulls, "" and actual values like 5, 6, "apple", "orange", "BIKE", "CAR", true, false all in the same way. I quickly created a table : #table(type table[#"Column1"=text], {{null}, {null}, {null}, {null}}) and then imported it. measure countrows = COUNTROWS(Query1) returns 4 measure count = COUNT(Query1[Column1]) returns (blank)!! Seems to match what we saw. What do you think? I must check the slack channel! Cheers Ollie
You are correct. Saying it doesn't consider blanks is not the same as saying it doesn't count blanks. It would be nice if it were documented more clearly, but it's saying it counts the row whether it is blank or not.
COUNTROWS maybe employs some form of (internal) index perhaps? Whereas using COUNT on say Order Date, or a non-integer type field would be less efficient? As already hinted at by Aleksander Strömmer, you'd likely have to be working with a sizable data set to see much difference one would think?
I work with large sales tables so most likely COUNTROWS is a 'safer' way because there may be some rows that are left blank. Example is that I have to count the number of rejects/warranty returns and definitely not all sales will have that so COUNT might be dangerous that it counts every row. Maybe that makes sense I think?
But with the name countrows I would expect to count all rows? Maybe they should create a countallrows? It is confusing what all those count functions count or skip... I might do a new video going deeper in the subject. Thanks for your thoughts and happy friday!
Hmmm, I' guess I have tuned to the logic to think COUNTROWS is to "count rows with actual data" while COUNT will count everything :) Happy Friday to you too!
countrows should perform better because it counts the "from/to" tables that vertipak creates..all data that is agregate, generates a small version(compact) that contains the " name: Q1, from: 1, to: 100000 ; name: Q2, from: 100001 , to: 200000 ; etc... " this is what i think, not 100% acc
"Kunskap är förvirring men på en högre nivå än tidigare" I think that the problem is DAX does not recognize NULL in the database context as missing value. Instead you get empty strings which is not the same.
My assumption as to why COUNTROWS return 7 instead of 5, it is an "additive" model. Logically, you would wanna use an OR operator to count through all rows that are not NULL (I think the documentation meant this instead of BLANK). Instead of "multiplicative" model which uses the AND operator, which is to count the row only if all rows in all columns are not null. To me, this is a risky and faulty data modeling assumption, that I believe COUNTROWS aim to avoid. In all 3 columns in the 'Counting' you have at least one value per row. Here's where COUNTROWS would return 7, again since it counts through all rows in the table that doesn't contain NULL. As to why it is being more performant, I would say it goes back to the "additive" model assumption where each row in the table has a logic that checks them in the entire table. Otherwise, the engine has to work "harder" to get into the columns. We gotta understand that DAX work from the following pecking order: Table > Column > Row (which explains why context transitions usually work the hardest). I guess the logic is to answer which is easier to purchase in a store: A. 5 packs of flour weighing 1kg each B. Weighing 5kg off the big sack of flour I know it isn't a great example, but I guess this is where the article is going. Once again, thanks for sharing Ruth. I've never used these formulae but will certainly bear in mind when I need to in future.
@@CurbalEN haha apologies for that confusion. Maybe you can just load one extra row in the table where all 3 columns will be null. I think COUNTROWS would still return 7. If it returns 8 I can eat my own cat food. Hahahaha.
I've tried. You can't even load a fully null row at the end of the table. But the interesting stuff I found out is, if you add a null row in between, it will be counted too! 😅😝 So yeah, suffice to say, it's very confusing to explain.
Hi Ruth, another Friday 🙂
There are Excel functions in power bi DAX functions (kept for Power Pivot engine compatibility), which shouldn't be prioritised over native DAX functions. (DAX functions are much more optimised in it's internally logic when compared to same excel function)
Another reason I could think of is that because DAX runs on Analysis services, there will be a guid column added in memory when there's no primary key in the table, and count rows is probably programmed to read the unique column count , which is the cardinality of the table.. I read it somewhere, but not able to recall exactly about this..
Anyways, that's the work for weekend.. 🙂👍 Have a great weekend. Thanks
Thanks!
Also interested in whether indexing the tables will improve the dax efficiency, like SQL?
Actually, blank cells are treated as an empty string cell, and real blanks - cells with no data in them, are marked with null in power query. You can try to load some data into an excel worksheet and see that those empty strings contain ' in it and they are also treated as text. :)
It's quite confusing though. The same is when you try to get an empty cell in Excel from let's say IF statement. IF (condition, something,"") will return an empty string (empty text field) and not blank if it's false.
Hope that makes sense.
SSAS Tabular Model (including PowerBI version), creates a hidden column for each table (not available to the user) in
format like:
Product Category-RowNumber-2662979B-1795-4F74-8F37-6A1BA8059B61 (it can be seen to connect e.g. in daxStudio to the model and querying DMV.
And probably counting is done on it. Which probably affects the speed.
In addition counting blank, this may by faster than COUNTBLANK ()
CALCULATE (
COUNTROWS ( 'Table' ),
KEEPFILTERS ( 'Table'[CloumnWithBlank] = "" )
)
Thanks!
Hi Ruth!
How are things going?
Great video as always!
I think that when it says COUNTROWS "doesn't consider BLANKs contained in any column of the table" means that it treats nulls, "" and actual values like 5, 6, "apple", "orange", "BIKE", "CAR", true, false all in the same way.
I quickly created a table : #table(type table[#"Column1"=text], {{null}, {null}, {null}, {null}})
and then imported it.
measure countrows = COUNTROWS(Query1) returns 4
measure count = COUNT(Query1[Column1]) returns (blank)!!
Seems to match what we saw.
What do you think? I must check the slack channel!
Cheers
Ollie
You are correct. Saying it doesn't consider blanks is not the same as saying it doesn't count blanks. It would be nice if it were documented more clearly, but it's saying it counts the row whether it is blank or not.
COUNTROWS maybe employs some form of (internal) index perhaps?
Whereas using COUNT on say Order Date, or a non-integer type field would be less efficient? As already hinted at by Aleksander Strömmer, you'd likely have to be working with a sizable data set to see much difference one would think?
I have no idea of why, wish they would have said in the article...
How much faster with diff sizes of data?
Speed depends on a lot of factors, give it a go yourself with your data and let us know :)
I work with large sales tables so most likely COUNTROWS is a 'safer' way because there may be some rows that are left blank.
Example is that I have to count the number of rejects/warranty returns and definitely not all sales will have that so COUNT might be dangerous that it counts every row. Maybe that makes sense I think?
But with the name countrows I would expect to count all rows?
Maybe they should create a countallrows?
It is confusing what all those count functions count or skip...
I might do a new video going deeper in the subject.
Thanks for your thoughts and happy friday!
Hmmm, I' guess I have tuned to the logic to think COUNTROWS is to "count rows with actual data" while COUNT will count everything :) Happy Friday to you too!
Need to do that too:)
countrows should perform better because it counts the "from/to" tables that vertipak creates..all data that is agregate, generates a small version(compact) that contains the " name: Q1, from: 1, to: 100000 ; name: Q2, from: 100001 , to: 200000 ; etc... "
this is what i think, not 100% acc
Thanks!
Hi Ruth ! We always COUNT or COUNTROWS on you ... :-) Very very confusing..... Have a nicw weekend !
Happy Friday!! 🥳🥳
Summary: 1. Empty is not null; 2. COUNT and COUNTA ignore null, but COUNTROW include null.
"Kunskap är förvirring men på en högre nivå än tidigare" I think that the problem is DAX does not recognize NULL in the database context as missing value. Instead you get empty strings which is not the same.
Glömde en länK www.sqlbi.com/articles/blank-handling-in-dax/
Hi curbal, i had a question which visual is used to display 4 continuous variables ad 2 categories of data, can you help me with this
Try scatter chart, with play axis (time) for continuous variables.
My assumption as to why COUNTROWS return 7 instead of 5, it is an "additive" model. Logically, you would wanna use an OR operator to count through all rows that are not NULL (I think the documentation meant this instead of BLANK). Instead of "multiplicative" model which uses the AND operator, which is to count the row only if all rows in all columns are not null. To me, this is a risky and faulty data modeling assumption, that I believe COUNTROWS aim to avoid.
In all 3 columns in the 'Counting' you have at least one value per row. Here's where COUNTROWS would return 7, again since it counts through all rows in the table that doesn't contain NULL.
As to why it is being more performant, I would say it goes back to the "additive" model assumption where each row in the table has a logic that checks them in the entire table. Otherwise, the engine has to work "harder" to get into the columns.
We gotta understand that DAX work from the following pecking order: Table > Column > Row (which explains why context transitions usually work the hardest).
I guess the logic is to answer which is easier to purchase in a store:
A. 5 packs of flour weighing 1kg each
B. Weighing 5kg off the big sack of flour
I know it isn't a great example, but I guess this is where the article is going. Once again, thanks for sharing Ruth. I've never used these formulae but will certainly bear in mind when I need to in future.
I am.afraid it is still confusing....maybe when I am older I will get it ;)
Thanks forntaking the time to explain!
@@CurbalEN haha apologies for that confusion. Maybe you can just load one extra row in the table where all 3 columns will be null. I think COUNTROWS would still return 7. If it returns 8 I can eat my own cat food. Hahahaha.
🤣🤣
I've tried. You can't even load a fully null row at the end of the table. But the interesting stuff I found out is, if you add a null row in between, it will be counted too! 😅😝
So yeah, suffice to say, it's very confusing to explain.
🙈🙈🙈
Amazing...