From Data Archive to Azure Data Lake
Last time a write an article about the newly released extension “Data Archive” (Data Archive in Business Central – Cleanup your data | LinkedIn). There I mentioned that you can better export is to an Cosmos Database. But Microsoft also looked into Azure Data Lake. This is done by the project “bc2adsl” (microsoft/bc2adls: Exporting data from Dynamics 365 Business Central to Azure data lake storage (github.com))
The project bc2adsl is just an Proof of Concept the see if it is working. All the details you can find on the GitHub above. And I must sayd it works perfectly!
But you can also combine those two. Archive all your deleted entries and store it in an Azure Data Lake for your PowerBI dashboard. In this way you can keep your BC database small but still have your data for analysis.
When I have investigated it I came across several isseus. All those isseus are reported to MS.
The first isseu is that the Data Archive extension doesn’t archive the system fields. Those fields are needed for the piplines in “Azure Synapse”. So for this I changed function in GetFieldListFromTable 80103 “Data Archive Provider” implements “Data Archive Provider” into:
(There was a filter on the field no. I have removed it). This will also be removed in a later version of the extension.
The next part we need to change the bc2adsl extension. Here are not many functions set to open. So first we need to do this. For that I have added a lot of functions in codeunit codeunit 82567 ADLSE (the changes I have made you can see here: bc2adsl_archive/ADLSE.Codeunit.al at master · Bertverbeek4PS/bc2adsl_archive (github.com)).
Then we need to fix the export to CSV format. This is because the data that is stored in the Data Archive is stored in a json format.
For that part I have created two functions in codeunit 82564 “ADLSE Util”. The first is two create the headers and the second one it so create the lines from a stream.
The code for that you can find here bc2adsl_archive/ADLSEUtil.Codeunit.al at master · Bertverbeek4PS/bc2adsl_archive (github.com)
And the last part is to add an button to the Data Archive page so you can export the entries that are in the Data Archive. Also when it is processed the Data Archive is deleted.
The source for that you can see on Bertverbeek4PS/DataArchive2adls (github.com)
So is it working?
In this part I don’t show you the setup of the extension bc2adsl. That is perfecly describe in the earlier mentioned GitHub.
I have deleted an Item. That is in the Data Archive:
When I do an export to csv you can see which one I have deleted:
When you push the button “Export 2 ADLS” the files will be exported. And when you start the pipelines in “Azure Synapse” you can see all the data in your storage accounant:
And in PowerBI you can use that data in your dashboards:
Hope this will help to explorer all the possibilities to export data and keep your database clean.
As mentioned before the bug in the Data Archive extension will be picked up by Microsoft. And hopefully there will be a pull request on the bc2adsl repro when I have talked to MS and deliver some clean code. So anybody can use this. When this is done I will inform you.