Reporting Layer: Optimize a report

1. Abstract

Users are more and more demanding and are looking for the right balance between the correctness of the result and the scalability of the solution at the same time.

Improving response time has always been a top priority, therefore the purpose of this document would be to explain the main best practices to apply and common pitfalls to avoid when defining any report/chart configuration.

2. Content

Here followings the main topics to review and pay attention to while you are approaching the layout (report/dashboard) creation:

  • Cubes’ structure
  • Cubes’ versioning
  • Axes’ configuration
  • Refer to block
  • Layout Filters
  • Load only visible tab
  • Drill To screen versus Go-To screen
  • Drill Through

2.1 Cubes’ structure

The golden rule is to have a layout in which the Info-Cubes (blocks) are sharing as similar structures as possible.

If in a report you have Info-Cubes with different structures, Board will need to navigate through all the different structures before prompting the data view.

Technically what Board does before prompting the data view is:

  1. Scan the entities by axis (row/column) and understand the level of details required to display the analysis requested;
  2. Navigate through all the cubes’ blocks detecting the structure (dimensions) for each of them;
  3. Virtually ‘align’ the Info-Cubes’ content by aggregating or spreading down the Info-Cubes’ content according to the level of detail set by axis.

In other words, Board virtually reconciles the several Info-Cubes’ structures to get the most adherent structure to the axes’ configuration.

The more different the cubes structures the more background processing we are requesting the Board Engine to execute.

This is also the reason why the number of dimensions of an Info-Cube should in general not exceed 7 or 8. An Info-Cube with more than 8 dimensions can be difficult for end-users to understand and use in a layout. Before creating an Info-Cube with more than 8 dimensions, consider revising your data model to reduce the number of dimensions.

2.2 Cubes’ versioning

Info-Cubes by default only have a single version and additional versions can be created to boost performance. Adding other versions is suggested only when neededsince they increase the cube size and maintenance required.

Correct cube versioning is the way to provide faster access to aggregated information.

Versioning allows to keep information at the level of aggregation needed on most of the reports without querying the lowest level of the cubes, thus drastically improving the performances.

Some very simple principles in versioning a cube the developer must deal with:

  • The first version must always be the most detailed, i.e., the one that contains the data at the highest level of granularity (highest level of detail). In other words, from a business perspective, it is the minimum level of information required by the business case.
  • Any additional version must be defined as an aggregation for one or multiple dimensions set in the first version. Therefore, each physical version of the cube contains data at different aggregation levels. Since Board natively sums up the information in a bottom-up manner, a correct level of aggregation will ease the alignment of the version – process to get consistent data by any version.
  • All versions of an Info-Cube must contain the same data. This grants that reports return consistent data independently from the version used to create the report. Aligning a version consists of feeding the version using data from another more detailed version of the same Info-Cube. When data is loaded through the data reader, the system automatically feeds all versions coherently, but when a new version is created in which the Info-Cube already contains some data, the new version must be aligned. The data flow does not align the new version, since it always works against the first version given in the Info-Cube; an additional procedure’s step of alignment must be executed.
  • Do not exceed 3 versions for Info-Cube, if you have more than 3 versions, analyze whether each version is really necessary. A lot of versions may indicate a problem with the way the first version was set up and the dimensionality of the cube. Each time an additional version is created, BOARD will have to navigate through all the different versions available before prompting the most suitable one for the report in scope, so we have a trade-off effect. In other words, Board accordingly to the ‘perimeter’ conditions - which are the axis configuration and the ‘active’ selections, both screen selections and selection objects (pager/selector) - picks up the most suitable and affordable version to host the report.

The side effect of versioning is the size of the cube, which will increase accordingly with the number of versions set, and since Board is a RAM technology, the more versions the more RAM space will be consumed.

We can therefore conclude that the best practice is to set versions only when necessary and keep them to the minimum (usually no more than 3).

The best way to decide the versioning and to which cubes should be applied is to:

  • Analyze reports and dashboards;
  • Identify the ones that have performance issues;
  • List the reporting cubes used in these screens and any selection object used (selectors/pagers);
  • Evaluate the cubes that are driving the performance issues and might benefit from versioning
  • Define the versions in such a way that they cover most of the reports (avoid creating any “ad-hoc” version).

2.3 Axes configuration

The Axis area plays a crucial role in the layout execution.

From there can be defined the aggregation level of the Layout query and set the entities that will be displayed in rows and columns.

Some very simple principles in Axis configuration the developer must deal with:

  • It is advisable to not overcomplicate the layout structure by including more than 3 entities by row or more than 2 entities by column. The layout should provide at first glance, a pretty good level of information with the most representative and exhaustive aggregation level.
  • Any additional level of information (entity) must be queried with the drill-down feature. In this way, the drilled layout will be lighter and faster in prompting the necessary level of information. In Board, the “drill down” (or any drill-related feature) is a key concept. Board’s best use case is to indeed generate aggregated analysis and “manage by exception” drilling down on elements of interest (as opposed to large/detailed/flat spreadsheets to scroll).
  • Display by axis the most shared entities between the cubes shown in the layout. This best practice is also connected to the fact that reports in which the Info-Cubes (blocks) are sharing the same structure returns better performance (2.1).
  • Always pay attention to the order of the entities by row in the layout. The way to display the entities affects the layout execution (rendering): a “not-well” sorting by axe is more likely to return worse performance. The order in which entities are displayed must be evaluated according to their numerosity, so first the entity with the smallest cardinality (a few members) is displayed and then add the next most numerous entity (more members) is added. Repeat the same for any additional dimension you might need to add. This because if we use the entity with the highest cardinality as first we will request Board to create many more “Groups by” in the layout and therefore worsen the performance and readability of the table.
  • Not place ‘user-navigation’ information by axis. Additional information such as Brand, Company, Division, Function, Legal Entity, Scenario, Version is mainly adopted to aid user navigation. Some of these entities (Scenario, Version) are usually centralized and managed by the administrator user, while others (Brand, Company, Division, Function, LE) come true with the user log-in. This information must be managed out of the Axis area, either through the security section of the database or at screen level within the ‘Selection’ area or the pager or the selector. This will help the user to narrow the data navigation with fewer interactions. In practice, if we have entities always selected on one member and this is enforced because of the application logic, it is not necessary and worsening the performances to place this entity also by row. It is much better to display the information through labels.

2.4 Refer to Block

Sometimes it may be necessary to apply the ‘block references’ to some data blocks within the same report.

This might be the case of analysis by scenario, where information is shown as Actual Vs Budget, Actual Vs Forecast, Budget Vs Forecast.

Under the ‘block references’ menu it is possible to apply the "Refer to" function to a Data Block with an Info-Cube, to alter its aggregation or detail level. 

The ‘refer to’ allows referring the Info-Cube to a specific entity occurrence, overriding the screen selection (Select) and the axes settings.

With the ‘block references’ option enabled, the system will apply the injection of the selection on the data block and thus will take extra time to do it.

Therefore, the recommendation is to not abuse this feature, as the layout takes longer to be queried.

In case of poor performance, the suggestion is to analyze whether each ‘refer to’ is needed and may be shifted to the selected objects at screen level to light the layout.

2.5 Layout Filters

The filters in the Data area optimize the response time of the layout by narrowing the visible amount of data.

The ‘Filters’ button allows to apply additional filter settings to data based on the block values.

Multiple filtering conditions can be defined, they can be combined using the logical operators:

  • AND. Requires that all filter conditions are true.
  • OR. Requires that at least one condition is true.

In case of poor performance, the suggestion is to analyze whether filtering conditions can be applied to Data.

Be aware that filters are only affecting the “rendering” of the object, so the query requested to the Board engine remains the same while the displayed rows are reduced. This means that the filtering will result beneficial only if the performance decrease is happening at the rendering stage done by the Web Engine.

2.6 Load only visible tab

In case of multiple layouts (data view, chart) per screen, the response time increases since by default Board will query sequentially the datasets necessary for all these objects.

The more data views a screen has, the more time is required to fully render the requested data.

A solution for this type of crowded screen can be the tab container object with the new option ‘Load only visible tab’.

Thanks to this useful configuration setting, the complexity of the screen is mitigated with the splitting by tab of the information.

Each tab is then asynchronously executed, meaning ‘get only what you see’.

In this way, the user can speed up the response time instead of waiting for all tabs to be updated and trigger a load of other objects only when needed.

The correct balance between a complete analysis and a too-crowded screen must be always considered. In these cases, it is always suggested to evaluate a multiple-screen approach , especially if the different objects are not always used in combination by all users.

2.7 Drill to Screen versus Go-To Screen

If a screen shows poor performance, the suggestion is to evaluate a different analysis approach. Rather than opening the full-screen selection and then restrict it, you could introduce a drill-to-screen link in another part of the application which will open the target screen directly with the relevant selection.

Instead of the Go-To Screen procedure, sometimes the Drill-to Screen might be preferred.

This feature allows to nest analysis in a more powerful way than the standard drill-down which fundamentally only changes the level of granularity of one report.

It is possible to drill down from one Data View row to another screen within the same Capsule.

This feature is “dynamic”, i.e., in a drill-to-screen navigation the data represented (values, indexes) and types of objects (charts, dashboards) may vary as the user drills from a top-level Data View to another more detailed screen.

The drill to screen can be a very easy and intuitive way to merge navigation and analysis needs. It is suggested to use it when applicable since it will reduce the need for selections in procedures and guide the user to a “manage by exception” type of analysis.

2.8 Drill Through

When approaching the reporting requirements, it is necessary to understand what type and level of detail of data should be stored in Board, and what type of data can be kept outside Board and invoked on demand in the data source.

Companies often require information to be stored at the highest level of granularity, which means that a very large data set has to be processed (i.e. SKU, Order Number, Transaction ID,..). However, we always need to evaluate carefully whether this granularity is necessary since it can have severe consequences on performance and maintenance.

Due to the level of detail of data (i.e. millions of records) and the large impact on the database memory, it is always preferable not to host such detailed entities in Board, but it is strongly recommended to keep externally it in the data source.

In particular, you can avoid the creation of highly-granular entities in Board

  • if this level of information is mainly used for data reporting and no calculations are performed against it; i.e. for example we just need to see the transactions related to a specific product;
  • if these very detailed analyses are often necessary for only few users or to back-up business analysis with raw data details; i.e. we need to see transactional details only on 1% of products which have refunds higher than X.

In these cases, Drill-Through features comes to help in retrieving the level of detail required directly from the data source.

Dill-Through is very intuitive since it can be attached to any layout (dataview or chart) and based on the selection/drill element will generate an ad hoc query on the source system and allow the users to visualize and export the detailed info. Of course, multiple drill-throughs can be configured for different purposes.

Of course, the best practice is to evaluate the applicability of this feature at the beginning of a project since it can heavily affect the design of DB.

3. Conclusion

In summary, we can suggest the following checklist in case you have performance issues with your reporting application:

  1. Cubes’ structure → review the layouts and screens set up so that they contain cubes with structures as similar as possible
  2. Cubes’ versioning → evaluate if the creation of cube versions can be beneficial; apply only if necessary, only to relevant cubes and usually no more than 3 versions per cube should be necessary
  3. Axes’ configuration → keep it within 3 entities by row and 2 entities by column; extra details should be retrieved with a Drill down
  4. Refer to block → review the layouts and screens set up so that the need for ReferTo is minimized
  5. Layout Filters → evaluate if layout filters can improve performances
  6. Load only visible tabs → this is always suggested since it will benefit almost every type of setup, and can be introduced in existing screens to avoid loading all objects on opening
  7. Drill To screen versus Go-To screen → evaluate if this can improve performances in combination with a different analysis approach, using drill-to screen rather than opening the full-screen selection and then restricting
  8. Drill Through → evaluate (ideally at the beginning of the project) the use of this feature to avoid loading in the database high-granularity entities which can affect performances

Comments