MS Fabric with bc2adls – Performance

In my last two posts we talked about the bc2adls feature that exports data to Microsoft Fabric:
– BC2ADLS: MS Fabric enabled part 1 – Discover Microsoft Business Central (bertverbeek.nl)
– BC2ADLS: MS Fabric enabled part 2 – Discover Microsoft Business Central (bertverbeek.nl)

Over the past weeks I have tested the performance of the export, and I was really impressed!

For this setup I used an MS Fabric trial workspace. A trial comes with 128 vcores:
Concurrency limits and queueing for Fabric Spark – Microsoft Fabric | Microsoft Learn

For this test I set up 29 tables with the following numbers of fields and records:

Table | # Fields selected | Entity name | Total records
G/L Entry | 59 | GLEntry-17 | 150000
Customer | 101 | Customer-18 | 100000
Cust. Ledger Entry | 74 | CustLedgerEntry-21 | 100000
Vendor | 79 | Vendor-23 | 100000
Vendor Ledger Entry | 68 | VendorLedgerEntry-25 | 100000
Item | 142 | Item-27 | 100000
Item Ledger Entry | 64 | ItemLedgerEntry-32 | 50000
Sales Header | 162 | SalesHeader-36 | 50000
Sales Line | 181 | SalesLine-37 | 50000
Purchase Header | 144 | PurchaseHeader-38 | 50000
Purchase Line | 205 | PurchaseLine-39 | 50000
Purch. Comment Line | 7 | PurchCommentLine-43 | 50000
Sales Comment Line | 7 | SalesCommentLine-44 | 50000
G/L Budget Entry | 16 | GLBudgetEntry-96 | 50000
Sales Shipment Header | 104 | SalesShipmentHeader-110 | 50000
Sales Shipment Line | 97 | SalesShipmentLine-111 | 50000
Sales Invoice Header | 117 | SalesInvoiceHeader-112 | 50000
Sales Invoice Line | 101 | SalesInvoiceLine-113 | 50000
Purch. Rcpt. Header | 92 | PurchRcptHeader-120 | 50000
Purch. Rcpt. Line | 120 | PurchRcptLine-121 | 50000
Purch. Inv. Header | 102 | PurchInvHeader-122 | 50000
Purch. Inv. Line | 121 | PurchInvLine-123 | 50000
Res. Capacity Entry | 5 | ResCapacityEntry-160 | 50000
Job | 71 | Job-167 | 50000
Job Ledger Entry | 76 | JobLedgerEntry-169 | 50000
G/L Entry – VAT Entry Link | 2 | GLEntryVATEntryLink-253 | 50000
VAT Entry | 70 | VATEntry-254 | 50000
Detailed Cust. Ledg. Entry | 38 | DetailedCustLedgEntry-379 | 50000
Detailed Vendor Ledg. Entry | 38 | DetailedVendorLedgEntry-380 | 50000

Initial run

When we do an export from Business Central to Microsoft Fabric, it uses the OneLake API (a small illustration of that call follows the table below). The initial export gave the following numbers:

Table | Records | BC export time
G/L Entry | 150000 | 2 minutes 50 seconds 900 milliseconds
Customer | 100000 | 3 minutes 43 seconds 400 milliseconds
Cust. Ledger Entry | 100000 | 1 second 500 milliseconds
Vendor | 100000 | 17 minutes 31 seconds 700 milliseconds
Vendor Ledger Entry | 100000 | 1 minute 3 seconds 200 milliseconds
Item | 100000 | 5 minutes 29 seconds 400 milliseconds
Item Ledger Entry | 50000 | 57 seconds 900 milliseconds
Sales Header | 50000 | 3 seconds 600 milliseconds
Sales Line | 50000 | 1 second 900 milliseconds
Purchase Header | 50000 | 2 minutes 44 seconds 500 milliseconds
Purchase Line | 50000 | 1 minute 10 seconds 200 milliseconds
Purch. Comment Line | 50000 | 34 seconds 600 milliseconds
Sales Comment Line | 50000 | 32 seconds 600 milliseconds
G/L Budget Entry | 50000 | 41 seconds 900 milliseconds
Sales Shipment Header | 50000 | 1 minute 28 seconds 600 milliseconds
Sales Shipment Line | 50000 | 2 minutes 9 seconds 400 milliseconds
Sales Invoice Header | 50000 | 7 minutes 8 seconds 700 milliseconds
Sales Invoice Line | 50000 | 42 seconds 700 milliseconds
Purch. Rcpt. Header | 50000 | 48 seconds 900 milliseconds
Purch. Rcpt. Line | 50000 | 54 seconds 900 milliseconds
Purch. Inv. Header | 50000 | 1 minute 31 seconds 200 milliseconds
Purch. Inv. Line | 50000 | 59 seconds 400 milliseconds
Res. Capacity Entry | 50000 | 27 seconds 400 milliseconds
Job | 50000 | 45 seconds 900 milliseconds
Job Ledger Entry | 50000 | 44 seconds 600 milliseconds
G/L Entry – VAT Entry Link | 50000 | 36 seconds 300 milliseconds
VAT Entry | 50000 | 36 seconds 400 milliseconds
Detailed Cust. Ledg. Entry | 50000 | 1 minute 2 seconds 900 milliseconds
Detailed Vendor Ledg. Entry | 50000 | 36 seconds 100 milliseconds
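
Purely as an illustration of the OneLake API mentioned above: bc2adls performs the upload from AL code inside Business Central, but the same idea can be sketched in Python against OneLake's ADLS Gen2 compatible endpoint. The workspace, lakehouse and file names below are assumptions, not the actual bc2adls setup.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# OneLake exposes an ADLS Gen2 compatible endpoint
service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)

# In OneLake the "file system" is the Fabric workspace (assumed name)
file_system = service.get_file_system_client("MyWorkspace")

# Upload one exported CSV into the lakehouse Files area (assumed path)
file_client = file_system.get_file_client(
    "MyLakehouse.Lakehouse/Files/deltas/GLEntry-17/export.csv"
)
with open("export.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)
```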

Only the Vendor table took quite some time, but this was probably caused by a server issue. In this case I ran Business Central on an on-premises VM (Standard E4as v5, 4 vCPUs, 32 GiB memory). Other partners did not see this behaviour on a Business Central online environment.

Once the files are in Microsoft Fabric, the Notebook transforms the .csv files into lakehouse delta tables.
In that first initial run, when there are no tables yet, the Notebook took only 4 minutes and 54 seconds to run.
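
A minimal sketch of what that first transformation could look like in a Fabric PySpark notebook is shown below; the staging folder and table name are illustrative assumptions, not the exact bc2adls layout.

```python
from pyspark.sql import SparkSession

# In a Fabric notebook the session already exists; getOrCreate() just picks it up
spark = SparkSession.builder.getOrCreate()

# Read the staged CSV files from the lakehouse Files area (assumed folder)
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("Files/deltas/GLEntry-17/*.csv"))

# On the initial run there is no delta table yet, so a plain write is enough
df.write.format("delta").mode("overwrite").saveAsTable("GLEntry_17")
```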

Second export

The incremental runs usually take quite a bit more time, because the notebook has to sort the data and remove the duplicates (a sketch of that step follows the table below). In the second run I exported the following records:

Table | New records | Modified records
G/L Entry | 2000 |
Customer | 2000 | 1000
Cust. Ledger Entry | 2000 |
Vendor | 2000 | 1000
Vendor Ledger Entry | 2000 |
Item | 2000 | 1000
Item Ledger Entry | 2000 |
Sales Header | 2000 | 1000
Sales Line | 2000 |
Purchase Header | 2000 | 1000
Purchase Line | 2000 |
Purch. Comment Line | 2000 |
Sales Comment Line | 2000 |
G/L Budget Entry | 2000 |
Sales Shipment Header | 2000 |
Sales Shipment Line | 2000 |
Sales Invoice Header | 2000 |
Sales Invoice Line | 2000 |
Purch. Rcpt. Header | 2000 |
Purch. Rcpt. Line | 2000 |
Purch. Inv. Header | 2000 |
Purch. Inv. Line | 2000 |
Res. Capacity Entry | 2000 |
Job | 2000 | 1000
Job Ledger Entry | 2000 |
G/L Entry – VAT Entry Link | 2000 |
VAT Entry | 2000 |
Detailed Cust. Ledg. Entry | 2000 |
Detailed Vendor Ledg. Entry | 2000 |
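
As mentioned above, the incremental runs have to sort and deduplicate the new batch before it lands in the delta table. Below is a hedged sketch of that idea, assuming each exported row carries a system id and a modified timestamp; the column, folder and table names are assumptions, not the exact bc2adls implementation.

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Read the newly exported CSV batch (assumed staging folder)
incoming = (spark.read
            .option("header", "true")
            .option("inferSchema", "true")
            .csv("Files/deltas/GLEntry-17/*.csv"))

# Keep only the newest version of each record in the batch
w = Window.partitionBy("systemId").orderBy(F.col("SystemModifiedAt").desc())
latest = (incoming
          .withColumn("rn", F.row_number().over(w))
          .filter(F.col("rn") == 1)
          .drop("rn"))

# Upsert into the existing delta table: update modified records, insert new ones
target = DeltaTable.forName(spark, "GLEntry_17")
(target.alias("t")
    .merge(latest.alias("s"), "t.systemId = s.systemId")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())
```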

When I ran the Notebook again, this second run took 9 minutes and 25 seconds. After that I ran another export from Business Central with the same amount of new and modified records, and the third Notebook run took 11 minutes and 11 seconds to complete.

Very impressive, because with Azure Synapse it took almost 7 minutes per entity to complete the same task.

So in short:

Tables exported | New records | Modified records | Notebook run time
29 | 1,800,000 | | 4 min 54 sec
29 | 58,000 | 6,000 | 9 min 25 sec
29 | 58,000 | 6,000 | 11 min 11 sec

Parallelism

A Spark dataframe is divided into partitions because of how large a dataframe can get (usually the number of partitions equals the number of CPU cores).
This allows the job to complete faster. But when data has to be moved from one partition to another (operations like sorting), that shuffling can slow your job down.
In that case it is better to tune how many partitions you want in your Spark pool.

That is why there is now a built-in parameter. With it you can define the number of partitions yourself and get the fastest run for your workspace:
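
Purely as an illustration, this is the kind of tuning such a parameter drives; the value below is an assumption, and the real parameter name in the bc2adls notebook may differ.

```python
# Assumed value: match it roughly to the cores available in your Spark session
partitions = 8

# Shuffle-heavy steps (sorting, deduplication, merges) will then produce this
# many partitions instead of the default
spark.conf.set("spark.sql.shuffle.partitions", str(partitions))

# A dataframe can also be redistributed explicitly before a wide operation
df = spark.read.table("GLEntry_17").repartition(partitions)
```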
