Temporary Cube vs Staging Cubes
1. Abstract
When developers create a simple or complex procedure, they need to know that they have an advanced feature to ease their job by keeping the database simple and lean. This feature is the temporary Cubes.
2. Context
Since B11’s introduction, developers can leverage the temporary Cube to store data temporarily and improve the concurrency of the BOARD application. The purpose of this article is to explain when a temporary Cube is applicable, and it is best practice to use it.
3. Content
3.1 How to configure temporary Cubes
Let’s start with a quick refresh on how to create this object:
Initially, the developers need to open a new procedure (either Capsule or database) and select the “Add Temporary Cube” button.
Once done, the temporary Cube is ready to be defined in “Name,” “Data Model,” “Data Type,” and “Dimensions,” also, an existing physical or temporary Cube structure can be leveraged and copied.
3.2 Temporary Cubes
The best practice is to use temporary Cubes as much as possible for procedure calculations. Using them will have the effect of :
1. Reducing the data redundancy in the data model and, therefore, using less disk space
2. Simplifying the data model because you won’t need to create Cubes to temporarily store data or prepare data for a calculation
3. Improving the concurrency because the temporary Cubes are created at the start of a procedure and deleted at the end of it
Let’s explain each of the advantages listed above:
1. If temporary Cubes are used, the data stored in these Cubes is not physically stored in the data model but only kept in RAM. This means that developers don’t need to take an extra step to clarify that temporary data or, as it happens more often, no useless data is left in the data model;
2. Simplifying the data is an obvious benefit. Before temporary Cubes, each temporary procedure calculation required a Cube to be created in the data model. This led to data models with 30-40% of Cubes created just for temporary calculations and affected maintainability, and impacted the analysis of the application;
3. Improved concurrency: this is probably the less evident benefit of temporary Cubes, yet the most powerful. Temporary Cubes are created and used only locally within the procedure—they’re created with a unique identification code. Even multiple iterations of the same procedure generate different temporary Cubes. This means that the overlap of calculations is prevented in high concurrency usage. Even further, the temporary Cubes are created with their dedicated sparsity structure, which doesn’t overlap with any other iteration of the procedure.
3.3 Temporary Cubes vs. physical Cubes
There are certain situations where it’s not possible, or not recommended, to use temporary Cubes.
Some examples include:
- The two steps, “If then else” and “select based on,” can’t use temporary Cubes; they only accept physical Cubes. These steps are often used to direct the flow of a procedure and drive the selection range of calculations.
- In practice, storing these datasets in physical Cubes is recommended. These actions are often based on settings (i.e., admin settings, security settings, calculation parameters, etc.). These data types shouldn’t be stored in temporary tables, but it is already necessary to keep them stored in the application, therefore in physical Cubes;
- It’s not a best practice to base “if then else” or “select based on” steps on “volatile” information; what this means is that if we base these crucial procedures steps on Cubes that will be often recalculated, we decrease the reliability of the procedure and increase the possibility of unexpected behavior (i.e., the Cube is recalculated while another procedure tries to use it for a selection or an “if-then-else.”)
- Stage or reporting info. When it’s required to show data that results from calculations or data loads in procedures, temporary Cubes are not applicable. In these cases, data must be stored in a physical Cube so that it can be displayed in a Screen. Other than the most obvious cases, we can think of processes like a driver-based allocation where we could decide to use physical Cubes in key calculation steps to be able to show how each step of the allocation was performed and report that to the end user;
- Export of data through data view object cannot use temporary Cubes because the step is based on the object data view, which can contain only physical Cubes. However, since the introduction of the “Export dataset” and “Bulk insert” features, this limitation is less relevant because we can rely on these two new steps to export data through a file or directly into the source system. These two steps are based on the “dataflow” object and, for this reason, allow the use of temporary Cubes. This allows developers to perform all the ETL required for exports into temporary Cubes and then execute the export without storing irrelevant info in the Board data model.
Comments
-
Thanks very much for this explanation, it´s very useful! I´ll be able to take into account the last point (data export) as I wasn´t aware the temporary cube export layout limitation can be overcome by just using the Export dataset function.
3