Why is my Power BI refresh so SLOW?!? 3 Bottlenecks for refresh performance

Поделиться
HTML-код
  • Опубликовано: 21 ноя 2024
  • НаукаНаука

Комментарии • 97

  • @MortenHannibalsenOlsen
    @MortenHannibalsenOlsen 4 года назад +10

    One surprising bottleneck I have discovered recently is 'query caching' in a mixed data model.
    With query caching on the refresh would take about 23 minutes and without only about 3 minutes.
    The root cause is still a bit unclear to me, but my best guess is that the cache has to be refilled each time the refresh runs.
    Another caveat to using query caching, in our case, was that the data model footprint exploded from about 50 MB to 2,4 GB.
    So be careful when using the query cache out there!
    Maybe a video on the inner workings of query caching would be appreciated by the community?

  • @VeganSmasher
    @VeganSmasher Год назад +2

    Great video as always, Adam. The learning curve with Power BI can be intimidating, but you guys break it down into bite-sized chunks and make it manageable. Thank you!

  • @wmfexcel
    @wmfexcel 4 года назад +14

    Hi Adam, thanks for the video. Very information! However i have encountered a "strange" bottleneck that puzzles me for long.
    My Power Query is only getting data (csv and xlsx) from local drive. When I refresh my model, some of the queries get stuck in "Evaluating". It could be stuck there for more than 5 minutes. Weird thing is, when I cancel the query and Refresh again, it gets quicker.
    I searched for solution for this issue in PBI Community, but couldn't find one. Appreciate your advice on this. 🙏

    • @demented1007
      @demented1007 11 месяцев назад +1

      In my experience it happens when u change the query. What I usually do is keep a small truncated version of those files on my PC and design and refresh using those smaller files (like 5000 rows max) and when it refreshes and finishes, I replace the files at the back end with the full size and it runs like cake...no more evaluating....

    • @tacom6
      @tacom6 9 месяцев назад

      @@demented1007lol... that's crazy that one has to do that. gosh I detest Microsoft products half the time lol.

  • @donmcgimpsey1706
    @donmcgimpsey1706 Год назад

    I just gotta say - I get value from any video where I learn something new. Thanks for these tips, and my new idea for today was that the user-supplied SQL does not fold. I don't know why that isn't broadcast from the mountaintop - but I really didn't know that. It makes sense to me once I heard it but it could be the root of many of my issues relating to speed.

  • @PedroCarneiroJr1971
    @PedroCarneiroJr1971 Год назад +1

    Great suggestions! I see here the same and old good practices since Crystal Reports times. The more prepared is the data before it gets in, the better is the performance.

  • @alfredsfutterkiste7534
    @alfredsfutterkiste7534 4 года назад +3

    Honestly the graphics in this video are really good. Good job guys

    • @GuyInACube
      @GuyInACube  4 года назад

      Thank you! First video with our new video editor. Very happy with it so far.

  • @thepurpleenigma
    @thepurpleenigma 3 года назад

    Thanks a lot for this post. I have a large postgresql DB with several million rows. My power query transforms bloat it even more and my rookie mistake #1 was not cleaning things up first on the server side. #2 issue is of course the row size. Not sure what I can possibly do about that but right now it takes over 1 hour to refresh!

  • @SolutionsAbroad
    @SolutionsAbroad 4 года назад +3

    Great handy tips, i was hoping you would touch on DAX studio or performance analyser on this video

    • @GuyInACube
      @GuyInACube  4 года назад

      Yeah wasn't really the point of the video. More about the concepts which people tend to not think about.

  • @SoniaSantillan-Cabrera
    @SoniaSantillan-Cabrera 14 дней назад

    This is great information! I am wondering why my Excel model in my new laptop is 10 times slower than my old laptop. They both have the same amount of memory. I am wondering if there is some setting in Excel that can speed up the refresh of my model?

  • @tubez29
    @tubez29 3 года назад +2

    Well done Adam! I must confess I was distracted by trying to read the shirt. Very funny. Anyhow, these are great tips. After I create my steps, I like to rename them properly. What tool do you suggest we can use to rename without having to wait a few seconds everytime we make changes?

  • @solidoffortitude
    @solidoffortitude 4 года назад +3

    Completely agree - if you use Native Query, there should only be 2 lines: Source, and Change Type

    • @mcnater
      @mcnater 4 года назад +1

      This is not even close to feasible at most organizations and especially the one I work at. So while in theory it sounds great, it's not practical.

    • @MasterCamus
      @MasterCamus 4 года назад

      Why wouldn’t it be feasible? Aren’t you querying using native query?

    • @GuyInACube
      @GuyInACube  4 года назад +1

      Agreed. If you need to do transformations past that, then try to get the native query into a view and then do your Power Query transformations, or update the query to include those transformations.

    • @ITSNev
      @ITSNev 4 года назад

      What would be the correct way to load the native query, and once loaded do your transformation. Would this be to reference that table : query?

    • @juanm555
      @juanm555 11 месяцев назад

      what?🙃@@ITSNev

  • @andyparkerson
    @andyparkerson 4 года назад +2

    We try to get analysts to tell us what native queries they want to use, and then create SQL views from those queries. This allows query folding to occur, and minimizes the amount of SQL that analysts have to write and maintain.

    • @joelguay4452
      @joelguay4452 4 года назад +1

      Then as an analyst, we have to CAPEX each and every modification to those views ! No thanks ;) You don't have to Custom Query in PowerQuery, you can start your query leaving the OPTIONAL query blank on the table, then do query folding supported transformations BEFORE doing any transformation that is not supported :) Step by step, look at the "view native query" option on a right clic of a step to look if that step is breaking query folding. The most evident one are Selected columns and column filtering, Group by, maybe some SQL native text transformations and Math ...

    • @GuyInACube
      @GuyInACube  4 года назад

      Using views are a great way to go about that. Assuming the business has a way of getting that created which doesn't have a lot of red tape around it. If there is friction, I find that the business user will just do it themselves.

    • @juanm555
      @juanm555 11 месяцев назад

      the joys when you have separation of roles... nice, haha

  • @vasilisa6128
    @vasilisa6128 4 года назад +3

    Interesting to know about these bottle necks. Thank you!

    • @GuyInACube
      @GuyInACube  4 года назад

      Agreed! Most people don't think about this. Even more confusing when the database queries are really fast but the refresh is slow. Can be so frustrating.

  • @klaymoon1
    @klaymoon1 Год назад +1

    Great video! I find it super useful. I use CMD commands when connecting to SQL servers. In this case, I suppose I cannot do anything about Query Folding?

  • @jocelynsamson3389
    @jocelynsamson3389 4 года назад

    Good quality video. I still have a long way to go to learn all this things.
    Have same issues with refresh takes forever.
    Still a bit confused about how the "query" can be put to "source" instead of "fact".

  • @Cioopas
    @Cioopas 4 года назад +2

    Now I definitely joined Adam team.
    Thanks for those tips!

  • @rolandszirmai3922
    @rolandszirmai3922 4 года назад +4

    Love the new design and video elements! Great stuff!!

  • @Psyresidente
    @Psyresidente 4 месяца назад

    Is it better to create a table in DAX from the date columns in other tables and then add columns like month number, month name, fiscal period, fiscal quarter, etc., or is it better to create these columns in SQL and then import a comprehensive date table with all these attributes?

  • @koltregaskes8385
    @koltregaskes8385 Год назад

    Good video. What about improving performance when connecting to data sources like Salesforce?

  • @Luggruff
    @Luggruff 2 года назад +2

    Not sure I followed what happened loading the 4 x 2 date table. Although, if you are suggesting to use DAX over Power Query, maybe good to mention that it becomes a question of where you offload your loading times: to yourself for the initial refresh, or to the end user?

    • @juanm555
      @juanm555 11 месяцев назад +1

      THe thing is that since automatic dates are enabled, Power BI will create some tables in order to control de date hierarchies that it creates naturally, but if you saw the values that Adam had in his table, he had something like year 1 AC , and also something like year 9999.... that meands that Power Query will create calendar tables for that date range, 1 row for every day, each of those tables would have like 3.65 million rows, so there you go, auto date tables are bad, Bravo is good, be friends with bravo and deactivate auto dates.

  • @artemkeller2571
    @artemkeller2571 3 месяца назад +1

    So... How to speed up transformation again?

  • @TheDukeNY
    @TheDukeNY 3 года назад

    WFH tips. If you have slow refresh on your laptop at home here are some basic networking helps with download times (1) use your ethernet if possible (2) use a high speed router that exceeds the mbps of your service, preferably not the one that came from your isp (3) if you are wireless be as close to your router, try connecting to your main ("extended" mesh routers or repeaters especially will slow down your download). (4) Check the signal strength if you are on wifi. Make sure you got a good signal. (5) sometimes the issue can be with your VPN, talk to your IT department but maybe do that last before checking on your home internet set up

  • @jameszhou162
    @jameszhou162 4 года назад

    I am dealing large data sources over 2G with 3 million rows. Power Bi definitely not able to load these data or refresh them easily. i found alternative solution by using Alteryx to take data from Data Lake or SQL directly and massage them before load it to PBI and use DAX for Calculated columns or measure exclusively. it seems to be right approach by saving me tons of time.

    • @joelguay4452
      @joelguay4452 4 года назад

      That sounds fishy to me, 3M rows SHOULD NOT take 2GB in a sane star shema model ...
      Got 52 Tables and 253 columns with a load of relationships, facts tables are at ~300 000 rows, I got a words extraction function that takes rich text fields and split the words into rows of Words for each ticket IDs, 2.5M rows for 22K unique words on 300 000k tickets, that is one of the smallest table, 4.5% of my data set. One the other side, i got the "journals" table from wich the words are taken that are kept as is to read the tickets, ~300 000 rows, that take 19.5% of my model ... CARDINALITY IS KEY, huge amount of rows of VERY repeatable values is MUCH better than fewer rows of very High Cadinality (Uniqueness)
      My Whole model is measured at 265MB in DAX Studio. (Date tables and Time tables are duplicated with DAX for each fact table)

  • @shantimaharjan110
    @shantimaharjan110 Год назад

    Thank you for video. Does this solution work for with denodo connectors as a source too.

  • @NarendraSunku
    @NarendraSunku 4 года назад

    Hi Adam - Thanks for the video. It is very informative.
    I have one question different from this video.
    I would like to hide count of slicer items in Filter Pane. How can I achieve this? Please help me.

  • @RomanGWatch
    @RomanGWatch 3 года назад

    Good to know, thank you

  • @fidelp27
    @fidelp27 2 года назад

    What's "Microsoft Mashup Evaluation Container". My PC it's slowly! when I use Power Query Editor. Thanks for your help

  • @chiraggoel3615
    @chiraggoel3615 2 года назад

    what's the solution for refreshing the dummy table that you created? to avoid the huge refresh time?

  • @JamisonAvery-vf7ub
    @JamisonAvery-vf7ub 5 месяцев назад

    Where is the link to the book? Did I miss it?

  • @kat7041
    @kat7041 3 года назад

    Very helpful. Thanks for sharing.

  • @engvictorfarias
    @engvictorfarias 2 года назад

    The book is really amazing. It's my holy bible. I started reading the optimization part today by coincidence.

  • @maheshgupta1440
    @maheshgupta1440 4 года назад +3

    Hey Adam, I happen to work on Excel data of 500 MB (min) on a regular basis and whenever I try to do a refresh in PBI, it takes a lot of time to load. Is there a way to deal with this situation to reduce the time?

    • @joelguay4452
      @joelguay4452 4 года назад +1

      Makes sure you don't jam mixed tables into single tables wich in 1 to * or * to * relationships will cause many dublicates of values in comlumns where values should be unique (IDs). So work to split your data into Star schema tables and reduce your cardinality (uniqueness of rows) making the dictionaries leaner. One way to start is to slit timestamps into Dates and Time columns : instead of 23M seconds in a year cadinality, you will get 365 days and 86400 seconds in a day unique values, HUGE difference.
      LEt say your operational system has Calls in a table, then actions done in those calls in another, then tickets in a third, don't use a join of those 3 tables in a single Excel sheet as a data source.
      Export those 3 tables seperately with the necesseray field to join Calls to actions, then actions to tickets. Make the relationship in the relationship editor in PBI (not in Power Query).
      What you would do tho is split the timestamp of those elements into Date and time, and build a Date and a Time Tables, only oncem, that you will duplicate for those 3 tables (as to not have circular references.

  • @danielmilner7237
    @danielmilner7237 3 года назад

    Great tips - that's really helped me.
    Love the T-shirt too!

  • @aleksanderstrommer7677
    @aleksanderstrommer7677 4 года назад

    Please make a video about Exchange Online to Power BI. It took over one hour to load only about 7000 rows from the Exchange Online via power query. Where is the bottle neck? This case it was shared mail folder and only messages from this year. The purpose was to make month by month visuals only how many (external) messages shared mailbox received. I only selected received date, sender address, title, message body (for possible future text analysis).

  • @gerardotobar6905
    @gerardotobar6905 2 года назад +2

    Hey guys! Just wanted to ask if do you know if Oracle queries do take longer to import than let’s say Azure ones?
    I’ve been experiencing really long waits (18 hours) in order to load 4.6M rows stored in Oracle and I wanted to know if maybe a not-so-great compatibility in between Microsoft and Oracle may cause it

    • @donmcgimpsey1706
      @donmcgimpsey1706 Год назад

      I would also have a follow up question on this: is it possible to increase the buffer size on Oracle queries? In my case, it seems to read about 1000 records at a time. The data source is as fast as it gets, and most times, I can export to CSV in 5 minutes for the same row count, whereas Power BI takes hours. A clumsy work around - and not really sustainable for production queries

  • @workstuff5253
    @workstuff5253 3 года назад

    Great video. Random question - I have the Definitive Guide to DAX first edition. Is there enough of a difference to consider picking up the 2nd Ed?

  • @Bharath_PBI
    @Bharath_PBI 4 года назад

    That's great, I understand this applies to import mode, but when its DQ, it's completely different where source db performance has to be optimum.. following best practice rules analyzer for tabular model. Any other suggestions for DQ reports optimization?

  • @ManafKAMIL
    @ManafKAMIL 4 года назад

    Really useful, under the hood, insights.... Thank you. 👍👍

    • @GuyInACube
      @GuyInACube  4 года назад

      Appreciate that! 👊 Thanks for watching.

  • @lwhieldon1
    @lwhieldon1 3 года назад

    How can you tell if the slowness if the result of a VM that's been throttled versus the performance of the pbix workbook itself?

  • @Rotjeknar
    @Rotjeknar 4 года назад +1

    I am interested in a similar topic on performance but then pulling data from a Sharepoint list. I've been experimenting to get it load faster but it still is way too slow. Using an export connection (in Excel) refreshes in a couple of seconds, using a power query (Sharepoint list API14 or OData) is taking minutes (and Sharepoint list API15 even worse). Making dedicated list views in Sharepoint don't work as the query only takes the full list. Any plans for this as topic?

    • @joelguay4452
      @joelguay4452 4 года назад

      I actually made a function to paginate trough the page limitation of 5000 elements, look at this and see if that helps :
      github.com/Altheran88/PowerBI/blob/master/PowerQueryM/Sharepoint_Online_PageLoop.m

  • @williamstorey5024
    @williamstorey5024 3 года назад

    What is that query add on that you are using in visual studio code?

  • @navyareshu
    @navyareshu 2 года назад

    Why the scheduled refresh takes longer time than on demand refresh

  • @BISAACk
    @BISAACk Год назад

    i am working with a LARGE "SAP" TABLE with 2 million rows + when i load the table it takes a while to load all that data, how to i filter by a flag field so i only get 1/4 of the data i need, i don't need all the data. I want to filter the data before it loads into the data model before all the Power Bi transformations - i want to reduce the data refresh time by not loading 2+ millions rows

  • @Garycarlyle
    @Garycarlyle 4 месяца назад

    Power Query seems slower than just writing formulas the normal way in Excel. Not sure of the point of it.

  • @ivanmorgan958
    @ivanmorgan958 3 года назад

    Coming from Qlik, Im really struggling with PowerBI. My data set takes seconds to refresh on PowerBI desktop, but when i do it on PowerBI Service, it takes centuries. Please Help!

  • @chamilam
    @chamilam 4 года назад

    Thanks .. good tips ..

  • @muriela21
    @muriela21 4 года назад

    Since the september 2020 uptade when I open my pbix file the first refresh takes too long, even hours, the next times it takes around 5 minutes like usually before the update. It only started happening after updating the version of power bi, I have no idea why. I didn't make any changes on the model, anyone facing the same issue? how can I fix it?

  • @surajkhune2808
    @surajkhune2808 Год назад

    Hi
    What can we do if report working find on Desktop but when we see on service that time it becomes slow?

  • @tortugahispana
    @tortugahispana Год назад

    "Fill Up" and "Fill Down" commands takes a lot of time to process. Becarefull with that.

  • @fredbarnes196
    @fredbarnes196 2 года назад

    It’s not realistic to push all your transformations to sql. Besides the Oracle driver is also really slow. I find that developing an app almost half of your time is sitting waiting for the data to refresh. Add to that you have to also refresh the visuals. On the bright side it gives you plenty of time to watch you tube videos

  • @Rice0987
    @Rice0987 Год назад

    Nice violet eyes 😎👌

  • @siddheshamrutkar8684
    @siddheshamrutkar8684 4 года назад

    Publish with the proper solution what would have been your approach to do so? With example would be appreciated..

  • @memairport1
    @memairport1 4 года назад

    Do you know why it takes so long just to open the PowerBI program on it's own without opening a visualization? Have you heard of this issue?

    • @GuyInACube
      @GuyInACube  4 года назад +1

      I've heard some cases of that. I know the Power BI team tries to improve that. I don't have any definitive items for you though.

    • @memairport1
      @memairport1 4 года назад

      Hope they work on that...my groups don't like to use it because it takes so long just to open the program. So, everyone does not have this issue, just some users experience this?

  • @alyssonmachadoborges3373
    @alyssonmachadoborges3373 4 года назад

    O produto precisa evoluir nesta parte. Consultas rápidas no SQL SERVER com processamento lento no PQUERY.

  • @robgreen708
    @robgreen708 9 месяцев назад

    I didn’t find this helpful to be honest. My dashboard refresh has been timing out after around 8 hours. My data sources are in excel files in sharepoint. There is a lot of data manipulation in powerquery. I was hoping to identify which datasets and processes were causing the slow refresh so I could make more efficient.

  • @edhansberry9194
    @edhansberry9194 4 года назад +4

    The ability to add SQL code directly to a SQL connection should be blocked. Causes so many issues downstream. If you really REALLY need your own SQL code, create a view, then connect to the view in Power BI.

    • @PicaPauDiablo1
      @PicaPauDiablo1 4 года назад

      May I ask what sorts of problems its caused you ? I have a bunch of legacy queries against CRM that's being migrated to cloud and had insane deadlines, so I took the SQL from old SSRS on prem and changed it against CDS. Been worried about it and it sounds like I need to. It was only a stop gap but sounds like it's a deal with the 😈

    • @GuyInACube
      @GuyInACube  4 года назад +1

      I've seen a lot of good reasons to use a native query. The problems I see are when folks do continued transformations after that point. If that's the route you have chosen, stick with it. OR, stick it in a view on the data source end and then query folding should be ok.

    • @edhansberry9194
      @edhansberry9194 4 года назад +1

      @@PicaPauDiablo1 The problem is it can potentially cause issues downstream. I have yet to hear a good reason for using the Native Query functionality vs connecting to a View in SQL with your SQL statement. That allows you to do what you want in SQL, but Power Query can still fold the statement. That means:
      You can use Incremental Refresh
      You can continue to do transformations that still fold or don't outright break Direct Query models
      You can have permissions on the gateway that don't allow your users full admin rights. I had a client that was trying to use this in DataFlows and while the SQL works fine in the desktop, in the service they didn't have admin rights to the gateway so didn't have permissions to run a native query.
      If you are using it in a migration, that is fine, but they can cause problems later if you ever want to make changes. I might suggest putting those SQL statements in a view for the migration, the later either do it in Power Query, or just use the view forever and do additional transformations in Power Query. But leaving the SQL code in the advanced query box really says "this will never change and the query is perfect as is, and my users can have admin rights to my gateways."

    • @olemew
      @olemew 3 года назад +1

      Imagine very large tables, and a quick Power BI model to analyze specific business cases or products. Then, maybe you REALLY need your own SQL code because you are dynamically taking input parameters from Excel/Power BI. In this case, creating a view is not feasible.

  • @diamondunbreakable3257
    @diamondunbreakable3257 3 года назад

    my report takes 1hr to load

  • @Fiktage
    @Fiktage 4 года назад

    Hello! Looking for advice how to solve issue by connecting to excel files with Restricted Access. I have all the rights to do changes in files but Power Query i got error Unable to connect, details: "File contains corrupted data". If source file(s) will be open, Power Query can connect easily but its inefficient way :(
    As I see, some Idea already created ideas.powerbi.com/ideas/idea/?ideaid=63445c8b-e13a-4735-b955-5b004f44c9d5
    Appreciate for any feedback!