Data

Helping a global leader in nondestructive testing solutions drive their data to better efficiency

The Virtual Forge was approached by Magnaflux to make recommendations around their Power BI reporting on global financial data. As the global leader in non-destructive testing, Magnaflux have been a part of the world’s progress helping develop safer parts for cars, jet liners and infrastructures.

Brief

Magnaflux is a global organization with over 800,000 records in their sales database. To enhance their operations, they were seeking a global report which would provide a front to back view of their organization’s global sales, allowing data to be added on a monthly basis. Their current report was very large and processing was slowed down by the high volume of data.

The Challenge

Multiple different developers had been working on the data sets feeding the existing Magnaflux Power BI sales reports without standardization or implementation of best practices. Internally, there was a concern that the data sets were not streamlined and had redundancy. Our challenge was to first investigate their business requirements, data sets and dataflows, then design organized and streamlined dataflows.

Magnaflux’s existing Power BI report was very large and processing was slowed down by the high volume of data, approx 800,000 records, in giant data tables 92 columns wide. If, and when, there was an error, it was hard to fix because identifying the source was very difficult. Best practice within Power BI states that as the reports run on aggregated values, tables should always be slim and long, with millions of transactions but should only be a few columns wide.

The Virtual Forge identified these two main challenges;

Data Model Complexity - the model in its current state took excessive time (20-30 minutes) to refresh reports.
Data Model Structure - there was an interest to determine if the model was appropriately structured to meet business needs and performance needs.

The Goals

Magnaflux were seeking a service partner to assist with improving their PowerBI data model to achieve a responsive and well-structured “Magnaflux Data Model 3.0”.

Removing redundant and obsolete sources without compromising reporting
Determining the source of burden of the existing model.
Identifying opportunities for optimization

The Technical Solution

The Virtual Forge are experts in reimagining business intelligence. Our Power BI Consultants can quickly get a feel for the reporting requirements, get Microsoft Power BI up and running for an organization if they don’t already have it, and get it fully integrated with existing data sources such as cloud based systems like Microsoft Dynamics CRM, Salesforce, Google Analytics, or Azure SQL server.

Magnaflux came to The Virtual Forge with a well defined problem. The data behind their Power BI reports was high volume and in a complicated series of reports using different data sets. The first step was to analyze the data sets and compare against Power BI best practices.

The Virtual Forge carried out a review of data sources and linkages in consultation with Magnaflux experts to identify sources of burden. This involved a comprehensive review of system inputs, data collection, and data quality. A deep dive into the reporting needs and the data that was feeding into the reports was conducted. Through careful data analysis, and a strong understanding of the business needs, it was possible to create a transparent data model.

I used a lot of tools in ways I haven’t used them before. Dax studio let me run a trace against each table as it refreshed so I could find the things that were slowing the reports down. Tabular editor allowed me to load code into it that performs a Power BI best practice analysis against the report. This was incredibly useful.

says Tim Volz, Business Intelligence Analyst at The Virtual Forge.

A Data Led Approach

Magnaflux wanted their report to be a front to back, global view of sales in the organization. It ran on a huge amount of data with many reports and specialized tables within the data sets. This bloated the data model and made it very difficult to work out the cause of any issues. Data was required to be fed into the report monthly from six financial units, dropping in every file relating to a customer transaction. A template was held in an archive, data was dropped in and then the old data was archived within SharePoint. A second data flow and data pipelines were built inside Power BI to separate the data sets as a preprocessing step, this was then taken together in one data flow that was taking 30 minutes to refresh as all the individual transaction and item sales are processed together!

Our challenge was to provide a solution to reduce the volume of data stored and simplify the data flows as the current situation meant that every time new data was needed to be added they would simply bolt on something else to the data model.

The Virtual Forge was able to use DAX analyser, an issue analyser tool to sort through the existing data sets and find errors and isolate them. PBI cleaner was also used within Power BI to analyze the reports produced.

The Solution

The Virtual Forge was tasked with rebuilding the overly complicated data model that was used by Magnaflux, formulating a renewed and transparent solution. For example, a file from an UK financial unit comes in, if there’s an error in that data, it’s contained, this then becomes a separate data flow and any bad data is easy to identify at its origin. Our first step was to identify what is not being used in the data table, keep it in a source file but carve it out of the report, this column can then be dropped. The Tabular Editor tool was used to dive into the data set in excel and discover exactly what wasn’t being used in the current version of the report, identifying precisely which fields were not being utilized in reporting.

After a careful data model analysis The Virtual Forge were able to make a series of core recommendations. For the next iteration of the data model, we recommended the transition from the current SharePoint methodology to a data warehouse / lake. With a prevalence of significantly large tables in the model, both in terms of row count and column count, as well as seeming unused tables, we recommended the removal of all extraneous elements, whether in the dataset, dataflows, or at source (which will likely yield the greatest performance gains).

Merging queries based on non-foldable sources, where the operation cannot be pushed back to the source to be carried out, forces Power Query to hold tables in-memory. Over the course of several merges, this can greatly increase table refresh time (see below image). To mitigate this, aside from column reduction, revise queries to push merge operations back into the dataflow layer, or leverage merge optimizing m functions.

To aid with troubleshooting in the current model, given that the files are generated via manual manipulation, which can introduce errors, we recommended the deployment of dataflows by business unit. All business unit-specific files should be ingested in the new location, which will allow for much more targeted error identification and mitigation.

For additional troubleshooting in respect to the current model, we also recommended the exploration of the following for data quality monitoring via SharePoint:

Use a combination of Power Query and Power Automate to detect possible inconsistencies in the business unit-level files, notify users when an error is detected, and notify again once issues are resolved. We suggested a method similar to that described here.
Create staging sub-folders by business unit to serve as the initial destination for files. Files could be scanned via the above process, and then moved for Power Automate to the proper locations for import, reducing “hands-on” time

The Benefits

A full data model analysis was carried out, reviewing data flows, the data model and Power BI usage. Short and long term recommendations were made in line with a suggested action plan for the current and next data model. Magnaflux now had clarity and an action plan relating to the following recommendations.

Unused table/column clean up - Yielding significant performance gains.
Reduce/optimize query merge operations
Deploy dataflows by business unit, equalling a reduction in the time required for troubleshooting
SharePoint data quality monitoring - hands-on time reduced.

Get In Touch

Have a project in mind? No need to be shy, drop us a note and tell us how we can help realise your vision.

Thank you.

We've received your message and we'll get back to you as soon as possible.

Sorry, something went wrong while sending the form.
Please try again.