Multi-parent Tree-Structures in Business with Power BI / DAX

Bob was not in the best mood. He felt this was the low point of his career. The last meeting with the CEO Bill was bad.

For 4 months he was now head of controlling at Lost Bonus Agreements Corp. and this was the first time Bill was yelling at him. “We have simplified all our bonus agreements I have no f… clue, what our total bonus spend will be!”.

The fiscal year end was 1 month away and indeed the bonus situation were unclear. A year ago, the company had drastically simplified the bonus agreements and only the revenue of the fiscal year is the basis for the gratification. Then there is an individual percentage applied to this revenue and voilà the amount is calculated. What seemed so easy had a pitfall: in the bonus discussions completely individual revenue base lines have been agreed.

During the fiscal year two types of information were available:

  • The bonus relevant entities per employee stemming from the bonus sheets.
  • One big list containing the profit center, the child organization, and the turnover forecast for the fiscal year.

ID Profit Center Children Turnover
1 NA Sales NA Operations 6000
2 NA Corporate NA Operations 500
3 NA Operations Toys,Bikes 2000
4 Toys Puppets,Games 10000
5 Bikes 7000
6 Games 3000
7 Puppets 5000
8 APAC Sales Appliances,Singapore Operations
9 Appliances 5000
10 Singapore Operations 10000

Bob was lost with these sheets. He had now 5000 bonus agreements and one big sheet with all business divisions and their children and no clue how to get the total bonus relevant revenue sum. So, Bob called his favorite business analyst Ken (just returning from the vacation with his wife Barbie) for help.

The Solution

Ken received all documents and below we will describe Ken’s thought process: Ken opened his Power BI Desktop and stared at the ceiling of his window-less office.

Ken made several observations:

  • The above table represents a multi-parent tree structure. E.g., if an employee had NA Sales and NA Corporate on the bonus sheet, they both had the direct child NA Operations. So, this was a multi-parent structure in the graph.
  • It would have been easy if the relevant legal entity descendants had been extracted per employee. But there was only on big sheet with all parent child relationship of the legal entities. It was out of the question to isolate the part of the legal entity sheet being relevant per employee for 5000 employees.
  • The build in PATH function did not support several parents for one child.
  • DAX is not recursive. This means that the implementation of graph exploration strategies starting with only e.g., an adjacency list was not possible in DAX.

He identified three fundamental problems with the data input he had received:

  1. Revenue of the organization on the bonus sheet: the bonus sheet showed only the top organizations. The ERP system did not aggregate the revenue per legal entity before the end of the fiscal year based on the forecast for an entire fiscal year. So, the revenue of all descendants would have to be summed up manually.
  2. Double counting of profit centers: it was possible that the revenue of a profit center was assigned to both, sales organization A and sales organization B. If an employee had a b bonus agreement comprising sales organization A and B and both had a profit center as descendant, it would be counted double when separately regarding sales organization A and B.
  3. Redundant information: as mentioned above the parent child relationships of all legal entities have been given in one big sheet. So relevant descendants with their company IDs could be found, but a lot of entries were not relevant for the bonus agreement in scope.

Before we show Kens solution and implementation, we should address the fundamental algorithmic question. Why is this a graph and why do we need recursiveness in our programming language?

If we look at the turnover sheet above, NA Sales and NA Corporate are not children of any other profit center and therefore roots of our tree (APAC Sales is root, too. We come to this later.). If we then draw all parent child relationships, we come to the directed graph below:

Directed graph for bonus agreement 1

Let’s assume that NA Sales and NA Corporate are explicitly mentioned on the bonus sheet. Then these two vertices are the rightful roots of the graph; no parents of these vertices are relevant for the turnover sum to be calculated. Furthermore, we remember the fact that all relationships of all entities are in one big sheet. So, the above table is only an extract of the entire sheet. The relevant roots are marked, and our program should find all relevant profit centers automatically. If we say that all the above entities are connected, then the rest of the big sheet only contains entities that are not connected to the two roots above. This exactly the case for the tree with APAC Sales as root:

Directed graph for bonus agreement 2

Again, we assume that the rest of the big sheet is not connected to the above tree. In summary, or toy example contains two disjoint trees, and we know that the above tree represents a bonus sheet with only NA Sales explicitly written on it. The recursiveness of the problem is given by the fact that in automated solution we have no clue how many descendants our roots have and how many generations of descendants are connected to that root. In terms of our practical example this means that we are not isolating the connected entries of the big sheet for one bonus agreement, and we do not draw the hierarchy beforehand like above.

Implementation

At the core of the implementation is a standard breadth-first search algorithm (BFS):

Breadth-First-Search in Power Query

The algorithm takes an adjacency list, a queue and a visited list as input:

  • The adjacency list is created from the big sheet with all entities. The relevant part for the first tree from above is {{1,{3}},{2,{3}},{3,{4,5}},{4,{7}},{5,{}},{6,{4}},{7,{}}, {8,{9,10}, {9, {}},{10,{}}}.
  • The queue consists of the regarded roots of the trees. In our first example these are 1 and 2 that represent the entries on the bonus sheet
  • The visited is list is empty. The target is to visit all vertices of the graph that are connected to the roots.

As output the visited list will be filled with the visited vertices during the recursion. It is worth mentioning again that although NA Operations has two parents (NA Sales and NA Corporate) NA Operations will be counted only once as intended. In summary, with the right roots as input the algorithm will output all vertices relevant for the specific bonus sheet.

Power Query Tree Search Result

For bonus agreement 1 we get the right result:

Vertex
6
7
5
4
3
2
1

Now we can go over to DAX. All recursive work has been done and we only must match the turnovers of the vertices and are done for this specific bonus agreement.

DAX Entity Column DAX Turnover Column

This is a view of the end result:

Vertex Entity Turnover
6 Games 3000
7 Puppets 5000
5 Bikes 7000
4 Toys 10000
3 NA Operations 2000
2 NA Corporate 500
1 NA Sales 6000
Total 33500

Key Takeaways

Power Query is recursive and DAX is not:
if possible shift your recursive work to Power Query and pass the result to DAX for further work.
The graph search algorithm used above is a basic form of AI:
you do not know the depth and descendants of your graph beforehand and let the computer figure it out.
Cookie Consent with Real Cookie Banner