On-the-fly Calculations: Pros and Cons

Center of Excellence
edited April 2023 in Best Practices

1.   Abstract

The term “on-the-fly calculation” refers to all calculations executed directly at the layout level, meaning that the result is not physically stored in a Cube. This directly affects drill-down functionality and has advantages in terms of reducing the overall database size.

2.   Context

During the development phase, the developer chooses whether to leverage on-the-fly calculations or to create ad hoc structures in the data model, with complex dataflows, to achieve the same result.

3.   Content

On-the-fly calculations are an alternative to dataflows with a fundamental difference in how data are calculated and stored in the application.

On-the-fly calculations are executed directly at the layout level: the result is visible in the application without being stored in a Cube. Dataflows, on the other hand, work on and store their result in a Cube, so the data are physically stored in the data model.

The direct implication is in how the data, whether the result of a dataflow or of an on-the-fly calculation, can be analyzed, sliced, drilled, and used in other parts of the application.

On-the-fly calculations can be executed leveraging Aggregation Functions, Algorithms, and Rules.

  • The aggregation functions are predefined formulas that can be recalled from a layout and applied to a data block.
  • The algorithm (also called column algorithm) is a data block whose values are calculated with a formula based on other data blocks.
  • The rule is a set of formulas defined by members of the same Entity. A rule is always associated with a single Entity and can be used with (or applied to) all Cubes with this Entity as a dimension in their structure.

3.1 Aggregation Functions

Aggregation functions are predefined formulas that can be recalled from a layout and applied to a data block, allowing a calculation result to be shown directly in a report. This avoids creating specific dataflow procedures and dedicated Cubes to achieve the same result.

3.1.1   Why and when to use an AGGREGATION FUNCTION instead of a dataflow?

The suggestion is always to choose aggregation functions and to use a dataflow only if the result needs to be stored in a Cube or is used as a source for other calculations.

The aggregation functions are configured in the front-end layer, so they only require a Power User license.

3.1.2   Drill-down functionality for aggregation functions

Aggregation functions also support drill-down functionality, as you’d expect.

When you drill down using the Entity that is the aggregation function’s driver, the result is retrieved from the raw Cube data. See the example below:

Figure 1

  • Layout setup:

    • ACT Quantity
    • Aggregation Function: Distinct count on Item
    • Aggregation Function: Distinct count on Customer
  • Drill down by item:

    • The column Distinct count on Item returns the ACT Quantity raw data
    • The column Distinct count on Customer returns the number of distinct customers by Item
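Outside Board, the two drill-down behaviors can be sketched in plain Python (the row data and names below are invented for illustration):

```python
# Hypothetical raw Cube rows: (item, customer, quantity) -- invented data
rows = [
    ("Item A", "Cust 1", 10),
    ("Item A", "Cust 2", 5),
    ("Item B", "Cust 1", 7),
]

# Top-level layout: each aggregation function is evaluated over all rows
act_quantity = sum(qty for _, _, qty in rows)            # 22
distinct_items = len({item for item, _, _ in rows})      # 2
distinct_customers = len({cust for _, cust, _ in rows})  # 2

# Drill down by Item: each function is re-evaluated against the raw rows,
# so the Item column falls back to the raw quantities while the Customer
# column still returns a distinct count, now per Item
drill = {}
for item in {i for i, _, _ in rows}:
    subset = [r for r in rows if r[0] == item]
    drill[item] = {
        "quantity": sum(q for _, _, q in subset),
        "distinct_customers": len({c for _, c, _ in subset}),
    }
```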

3.2 Algorithms

The algorithm is a data block whose values are calculated with a formula based on other data blocks. Algorithms don’t store data in the database and won’t increase the number of Cubes nor, consequently, the database size. This helps improve program efficiency.

3.2.1   Why and when to use an ALGORITHM instead of a dataflow?

The main difference between the algorithm and the dataflow is the way the calculation is executed, in particular:

  • The algorithm performs the calculation at the level of the dimensions configured in the dataview, which may be more aggregated than the granular level of the Cubes used. For example, a Cube dimensioned by product, customer, and month can be displayed in the dataview aggregated by month and product category. Whatever formulas are configured in the column algorithm will be executed at the month and product category level.
  • The dataflow always executes the calculation at the granular level of the Cubes involved in the calculation. The result can then be displayed aggregated at any level that makes sense, based on data structure and hierarchy.

For this reason, it becomes clear that some calculations do not make sense if executed at an aggregated level. See the example below:

Example: To calculate the Total Sales as Price * Quantity (both detailed by Item).

Case 1 (Red): The sales amount calculated by the algorithm is correct only if the layout is configured with the item’s granular information (Figure 2). As soon as the item dimension is removed and the layout is aggregated by item category, the price is aggregated, and the sales amount calculation is wrong (Figure 3).

Case 2 (Green): The sales amount calculated by the dataflow is executed at the lowest level of detail of the Cubes involved (i.e., the Item) and then aggregated in the dataview; hence the calculated sum is correct (Figure 3).
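The two cases can be reproduced in a few lines of plain Python (prices and quantities are invented for illustration):

```python
# Hypothetical items in one category: (price, quantity)
items = [(10.0, 3), (20.0, 5)]

# Case 1 (algorithm at aggregated level): price and quantity are summed
# first, then multiplied -- the aggregated price makes the result wrong
algorithm_sales = sum(p for p, _ in items) * sum(q for _, q in items)  # 30 * 8 = 240

# Case 2 (dataflow at granular level): the product is computed per item
# first, then aggregated -- the result is correct
dataflow_sales = sum(p * q for p, q in items)  # 30 + 100 = 130
```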

This is the primary driver for the decision.

Then, if the calculation remains valid at the layout’s aggregation level and the result doesn’t need to be stored in a Cube, it’s always preferable to use column algorithms. They’re executed at the layout level every time the Screen is refreshed; they don’t require the user to run a procedure or the developer to create the data model structures otherwise needed.

As with aggregation functions, the algorithm is managed in the front end, so it only requires a Power User license to be maintained.

3.2.2   Drill-down functionality for algorithms

In the case of algorithms, the drill-down functionality works as expected, allowing analysis at a deeper level of detail than the one shown in the dataview.

Example: a report with the variation between sales of the current and previous months, aggregated by Item Category. We need to analyze the details by Item.

By drilling down on ITEM_CAT01 by Item, the algorithm correctly recalculates the Variation Percentage (Figure 4).

Figure 4
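The recalculation can be sketched in plain Python (sales figures invented): the same formula is simply re-applied at the drilled level rather than derived from the category-level figure.

```python
# Hypothetical sales by item within ITEM_CAT01: (current month, previous month)
sales = {"Item 1": (120.0, 100.0), "Item 2": (80.0, 100.0)}

# Category level: the algorithm runs on the aggregated totals
cur = sum(c for c, _ in sales.values())   # 200
prev = sum(p for _, p in sales.values())  # 200
category_variation = (cur - prev) / prev  # 0.0

# Drill down by Item: the same formula is re-evaluated per row
item_variation = {item: (c - p) / p for item, (c, p) in sales.items()}
# Item 1: +0.20, Item 2: -0.20
```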

3.3 Rules

Rules are sets of formulas defined by members of the same Entity. A rule is always associated with a single Entity and can be applied to any data block with that entity as a dimension in its structure.

3.3.1   Why and when to use a RULE instead of a dataflow?

It’s always recommended that you use rules when the target and the factors used for the calculation are members of the same Entity. Achieving the same result through a dataflow requires combining different functionalities (such as selections and refer-to options) and applying them in a specific sequence across multiple dataflow steps.

Example: the Gross Profit is calculated as the Revenues minus the Cost of Goods Sold. These three members belong to the same Entity, P&L Account, and a rule is created based on this Entity.

Figure 5

Two Cubes share the P&L Account dimension: the Actual and the Budget.

The P&L rule can be applied to both Cubes, and the Gross Profit is then automatically calculated without creating additional data model structures and related dataflow steps.

Figure 6
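Conceptually, an entity-based rule is one set of formulas reusable on any Cube carrying that dimension. A minimal Python sketch, with invented account values:

```python
def apply_pnl_rule(accounts):
    """Hypothetical rule on the P&L Account entity: a formula per member."""
    result = dict(accounts)
    result["Gross Profit"] = accounts["Revenues"] - accounts["Cost of Goods Sold"]
    return result

# The same rule applies to every Cube sharing the P&L Account dimension
actual = {"Revenues": 1000.0, "Cost of Goods Sold": 600.0}
budget = {"Revenues": 1200.0, "Cost of Goods Sold": 700.0}

actual_pnl = apply_pnl_rule(actual)  # Gross Profit = 400.0
budget_pnl = apply_pnl_rule(budget)  # Gross Profit = 500.0
```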

3.3.2   Drill-down functionality for rules

The drill down can’t be applied to cells calculated through a rule. It works as expected for all the other lines that are not the result of a rule.

Figure 7

To allow the drill down in these cases, it’s necessary to consolidate the rule in a dataflow procedure. In this way, the rule is applied during the execution of the dataflow, the result is physically stored, and the drill down becomes applicable.

Consolidation through a dataflow requires high resource consumption, which can lead to low performance. It’s advised to consider rule consolidation only in the case described above or when the result is needed as a source for other calculations. In all other cases, it is sufficient to apply the rule directly in the dataview.

4.   Conclusion

Except for the cases described above, where there is no alternative to the dataflow, it is always advisable to use on-the-fly calculations:

  • They are automatically calculated and refreshed at the layout level.
  • There is no need for the user to press any button or run procedures to refresh or trigger the calculation.
  • There is no need for the developer to create ad hoc structures and complex dataflows to achieve the same result. This also keeps the data model size down.
  • They are predefined formulas, so they can easily be configured, managed, and maintained.