A question I get a lot in my Power BI trainings, for Business Central and non-Business Central companies alike, is the following:
- How can I implement an incremental refresh, without building a data warehouse and without buying a Power BI Premium license?
And to be honest, the answer is that it's not really possible, but there is a workaround: an incremental refresh using a cutoff date, though this date might need to be updated manually at some point.
Let me explain how this works. But before I do that, I should mention that last week I was very pleasantly surprised by an announcement from the Power BI team for the next Wave 1 release:
To be more precise:
- This capability will be brought to General availability, including removing the restriction that models be hosted in Power BI Premium. Anyone with a Power BI Pro license will be able to use this feature.
How cool is that!!!!!! #IWishItWasApril 😉
So this blog post is actually for those of us who can't wait for the 2020 release wave 1 in May, or who are using a free Power BI version.
Let me explain the ‘problem’ of incremental refresh.
Imagine you have created a Power BI report that imports ledger entries from Business Central (or another data source). Every time you refresh the report, manually or scheduled, Power BI will first delete all entries and then re-import everything. This includes all entries: the new and modified ones, but also the old, unmodified ones. As your ledger entry tables grow with every transaction, every refresh will require more records to be imported and will take longer to complete. At a certain moment it's even possible that you won't be able to refresh anymore because of limitations:
Anyway, independent of any limitations, it’s always a good idea to limit the number of records you import in Power BI.
In this blog article I will presume that you are:
- importing data from Business Central (SaaS or on-premises)
- using OData queries as a data source
- refreshing your report manually or on a schedule
- using a Power BI Free or Pro license
But actually it doesn't really matter, because this technique should work on all kinds of data sources…
The first thing you should of course do is create your query objects.
This means that you will create one or more query objects in AL and publish them as web services in the web services table.
You can find an example of how to do that here:
Once your queries are published as web services in Business Central, you can use the Dynamics 365 Business Central connector in Power BI desktop to import the data:
Let’s look at an example. First I have the following query:
In this query I’m fetching records from the “G/L Entry” table. The query includes a group-by on the following columns:
- “G/L Account No.”
- “G/L Account Name”
- “Posting Date”
- “Dimension Set ID”
and a subtotal, using the SUM function. Looking at it again, the query could still be improved by:
- removing the "G/L Account Name" column
- calculating the subtotal on the "Posting Date" at a higher level (Year, Month or Day)
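The query described above could look roughly like this in AL. This is a sketch: the object number, the element names, and the summed Amount column are my assumptions, not the exact objects from my demo:

```
query 50100 "GL Entries"
{
    QueryType = Normal;

    elements
    {
        dataitem(GLEntry; "G/L Entry")
        {
            // Non-aggregated columns automatically become group-by columns
            column(GLAccountNo; "G/L Account No.") { }
            column(GLAccountName; "G/L Account Name") { }
            column(PostingDate; "Posting Date") { }
            column(DimensionSetID; "Dimension Set ID") { }

            // The totals column: summed per group
            column(Amount; Amount)
            {
                Method = Sum;
            }
        }
    }
}
```

In an AL query, every column without a Method is implicitly part of the group-by as soon as one column aggregates, which is exactly why unnecessary columns hurt so much here.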
But okay, let's leave it as it is for the purpose of the example. I just want to stress that performance tuning a Power BI (or Excel) report starts with performance tuning the underlying queries, and queries should only include the columns that are absolutely necessary, especially when implementing grouping.
As an intermezzo, a quick question: what could be wrong with the following query?
Do you see the error in the above query? Think about it… Why is there an "Entry No." here? By adding this field, the group-by no longer has any effect, because the Entry No. is unique for every row. Kind of stupid, no? Well, to be honest, I have seen this in some queries while investigating performance issues for customers, and as you can imagine, simply removing this field (and others) made the query a lot faster…
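To make the mistake concrete, this is roughly what such a broken query looks like in AL (names are illustrative). Because the table's unique key is included as a column, every "group" contains exactly one row, so the grouping and the SUM effectively do nothing:

```
dataitem(GLEntry; "G/L Entry")
{
    column(EntryNo; "Entry No.") { }          // unique per record: this silently disables the group-by
    column(GLAccountNo; "G/L Account No.") { }
    column(PostingDate; "Posting Date") { }
    column(Amount; Amount)
    {
        Method = Sum;                         // now just "sums" one row at a time
    }
}
```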
Anyway, let me continue with my example.
We now have a query fetching general ledger entries. Let’s import them into Power BI Desktop.
So far so good. But notice that the GL Entry table already contains 1737 rows, and I'm using a simple Cronus demo database in Business Central SaaS, which only contains a few posted transactions. So you can imagine that a real-life company will have many more records to import. And every time you refresh the Power BI report (manually or scheduled), everything is re-imported from scratch…
Let’s try to implement an incremental refresh.
This means that a certain set of rows (typically old rows) does not need to be re-imported again and again. So how do we choose which rows are incremental and which rows are static? Typically we would use a date. You could say: all entries with a posting date older than the cutoff date are static, and all newer entries are incremental. So we will need to split the GL Entry table into two subsets: a static part and an incremental part.
There are several ways to do that. We can do that in Power Query, by splitting the table into two tables using the cutoff-date.
We could also apply a filter in the OData query. In this case I'm using the Dynamics 365 Business Central connector in Power BI Desktop to import the data, but if I were using the OData connector, I could apply a filter directly on the OData URL:
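For the record, such a filtered OData URL would look roughly like this. The tenant, environment, company, and service names are placeholders, and the column name Posting_Date depends on how your web service exposes it:

```
https://api.businesscentral.dynamics.com/v2.0/<tenant>/Production/ODataV4/Company('CRONUS')/GLEntries?$filter=Posting_Date ge 2019-01-01
```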
But that is something I will explain in a different blog post 😉 Also because, at the moment, importing multi-company OData feeds from Business Central hits a bug: everything works fine in the Power Query editor, but when you close and apply, nothing gets imported.
So, let’s go into the query editor and split our queries:
Right click on the GLEntries query and select Duplicate:
This will create an exact copy of the GLEntries query. Then rename the two queries:
Now we have two queries:
Now let’s apply a filter in both queries, using a cutoff-date, for example: 01/01/2019:
Now we have implemented a split, based upon a date: 01/01/2019. Next, let’s change some query properties:
As you can see, in the GLEntries_static query I have disabled the load and also disabled the report refresh. This means this query will not be refreshed anymore, and it will also not be loaded into the report.
Then on the GLEntries_incremental query I will do the following:
This means the GLEntries_incremental query will still be refreshed, but not loaded to the report.
Now comes the tricky part…
Let's now combine the two queries into a new one, using the Append Queries button:
We now have a GLEntries query, which will be loaded into the report, that’s the result of appending GLEntries_static and GLEntries_incremental:
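Behind the scenes, the Append Queries button generates a single Table.Combine step in M, roughly:

```
// The new GLEntries query created by Append Queries
Source = Table.Combine({GLEntries_static, GLEntries_incremental})
```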
But when you actually test it, it does not always seem to work…
Apparently, when you union two queries like we did in this example, the Include in Report Refresh option does not behave as expected. In the GLEntries_static query, Include in Report Refresh is turned off, but it still gets refreshed as part of the unioned GLEntries query. And that is exactly what we do not want to happen…
In that case there’s another workaround.
Go back to the query editor and remove the GLEntries table (the union). Then enable load on both remaining queries, keeping Include in Report Refresh set to false on the first (static) query.
Then Close and Apply, and both queries will be loaded into the data model. Then hide both queries in report view:
When you now refresh the report, only GLEntries_incremental gets refreshed, and no longer GLEntries_static.
And now we will union both tables using DAX:
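As a sketch, the calculated table (Modeling ➜ New Table) is a simple UNION of the two hidden tables, using the table names from this example:

```
GLEntries = UNION ( GLEntries_static, GLEntries_incremental )
```

UNION requires both tables to have the same number of columns; since GLEntries_incremental started life as a duplicate of GLEntries_static, that is guaranteed here.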
And voila, the table is back and with an incremental refresh.
In Power BI Desktop, after loading the queries, the result is exactly the same, but the refresh will be a lot faster, because GLEntries_static is not refreshed anymore…
This is an example of how you can implement an incremental refresh in Power BI Desktop.
There are however some disadvantages:
When you want to move the cutoff date further into the future, you have to do that manually. This means changing the date and applying the refresh properties again in the Power Query editor. The cutoff date is a fixed, hard-coded date. Even if you turned it into a parameter, the Include in Report Refresh property of the query cannot be modified programmatically.
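If you at least want to avoid editing the date in two filter steps, the cutoff can live in a Power Query parameter (a sketch; the refresh properties still have to be reapplied by hand, and the column name is again an assumption):

```
// A date parameter, e.g. created via Manage Parameters ➜ New Parameter (type Date)
CutoffDate = #date(2019, 1, 1)

// Both filter steps then reference the parameter instead of a literal
Table.SelectRows(Source, each [Posting_Date] < CutoffDate)     // static
Table.SelectRows(Source, each [Posting_Date] >= CutoffDate)    // incremental
```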
In this example I have done it for the GL Entry table only. As you can imagine, a real report will contain a lot more tables, and doing this for all of them will complicate your data model in the Power Query editor. That's why I would recommend applying this technique only to the big fact tables in your data model, and not to all tables.
Also, remember that in the near future this might be available out of the box for Pro accounts:
So then you won't even need to implement this workaround…
My blog post was also inspired by these two posts, by Maxim Zelensky: