How to structure cubes with slow moving hierarchy

1. Abstract

A common challenge in a Board Application is to fulfill the request of tracking changes in hierarchies, which by definition are the current snapshot of a relationship.

This article describes how to keep track of the historical situation using a cube and dedicated entities.

2. Context

Since a hierarchy in Board typically represents the most recent status[LS1] [HH2] of a relationship you would lose the information [LS3] [HH4] on how that relationship looked like at the time a transaction was made.

A good example of this would be a sales rep being responsible for the customer at transaction time who subsequently has moved to another area of responsibility. The current status of the hierarchy, however, would place another agent in this role. The current information therefore would no longer reflect the situation at the time of the transaction.

This change in relationship is commonly known as “slowly moving dimension” or “slowly changing hierarchy” (abbreviated SCD).

Dealing with these issues involves SCD management methodologies referred to as Type 0 through 6. Type 6 SCDs are also sometimes called Hybrid SCDs.

Here, we want to highlight the possible solution of type 3 (adding a new attribute).

Slowly Changing (or Slowly Moving) Dimensions, are a common database concept, where the requirement is to keep track of changes over time on a hierarchical relation that is supposedly 1 to 1.

Since Board relationship, by definition, gives us the current snapshot, the easiest way to achieve such kind of historical reporting and keep track of hierarchical changes, is to create additional historicized entities as part of the Data Model. This entity should then be detached from the main hierarchy and be added to all of the cubes, as an additional sparse element to the current structure, which might benefit from this "additional" dimension of analysis, assuming the cubes are also structured by a time dimension. This has of course a wide impact on the cube’s structure, size and overall performance since we are enhancing the dataset granularity.

In case this historicization is required just for a few dimensions of analysis it could be handled. Otherwise, if multiple elements need to be considered as a slowly changing dimension, it might be worth foreseeing a dedicated reporting Data Model or concatenating them in a unique key that then has the separate single dimensions as fathers. We should also consider that to be able to achieve it, the data source you are loading data from must be accordingly structured.

3. Content

3.1 Simple model

Let’s assume a simple use case for demonstration purposes: Customers are related to Sales Reps and those to Division Managers.

Now, consider the following scenario of one Sales Rep being promoted up the hierarchy and becoming a Division Manager.

Mr Fuller's promotion to divisional manager naturally meant that the responsibility he had previously borne for clients was now transferred to his subordinate staff. So now he is only indirectly responsible for his former clients.

If one were to look at the turnover of Mr Fuller, one would see in the current hierarchy (in which he is no longer a representative) "his" turnover merged with that of his subordinates and it would no longer be recognizable that he was responsible for this turnover. This in turn would mean that no comparisons between current and historical sales figures per Sales Rep would be possible.

3.2 Adding a new attribute

To enable us to distinguish between Mr Fuller's turnovers and those of his former colleagues in the past, we now introduce another additional dimension indicating who was the current responsible agent at the time of the transaction.

This new attribute is a standalone entity containing all Sales Reps independent of their current status, and we’ll call it “Sales Rep as was” or “Sales Rep historical”. This entity will be a dimension of the sales cube and it will be populated with the Sales Rep responsible for the customer at the time of transaction.

3.3 Comparing the data

If Mr Fuller has been responsible in the past for customers who purchased beverages and dairy products, the corresponding sales figures would be incorrectly assigned to his now subordinate employees when using the current hierarchy.

If, on the other hand, we look at the data according to the former representatives, we now get the correct picture.

3.4 Conclusion

Board hierarchies always describe the current status, as they have no temporal validity. Historically different relationships, on the other hand, can be mapped if one uses type 3 of the SCD - and that means introducing new entities. This makes it possible to represent the historical facts correctly.

This was only a very simple example; of course, one can think this even further, for example, by making complete hierarchies quasi-time-dependent via a technical key.