Moving Data

Options
Center of Excellence
edited March 2023 in Best Practices

1. Abstract

Often customers’ requirements are to copy data from one element to another. Typical examples are copy processes from one version to another or to copy data from an existing reference product to a newly created product as an initialization. The calculation logic in those examples is the same and can be applied for most requests of shifting data in one Cube.

2. Content

Board stores data in Cubes with certain dimensions and structures. We call things a “slice” if we describe a specific subset by selecting a single value for one of its dimensions.

In most cases, we shift data from one Cube to another. However, in some cases, we need to shift data from one Cube slice to another slice of the same Cube.

3. Contexts

In our situation, we have a simple 4-dimensional Cube (PLA CLC 0001 Plan Version):

Version 1 (V1) contains data that we want to shift to another slice (another version element) of the same Cube.

3.1 Static source slice

In the following, we describe the process of copying data from a static selected source slice (our case: version v1) to a dynamic selected target slice.

We start from the front-end side, where we have a dataview showing our Cube with the versions in rows. Next to the Cube is a text algorithm block calling a procedure to execute our shifting process (1001_PLA_Copy Version FIX).

Here the procedure needs to be executed directly from the target version (we want to copy data onto version V2).

The user gets a message asking if they want to proceed or abort the procedure to avoid an unintended execution.

Using a dataflow, we will copy from our source slice (using block reference on version V1) to our target slice. Note that references and time functions are not allowed in target blocks! So, the target slice will be determined from our selection provided by the data view row. Our dataflow calculates b = a (no extensions necessary).

For the user, the executed process might look like this:

(calculations run, then the Screen refreshes)

Please note that this approach is not recommended for a large amount of data because the performance of the calculation might suffer due to the reference of a target block. Please follow the next approach for a more dynamic and efficient calculation.

3.2 Dynamic source slice

Below is the process of copying data from a dynamic slice (i.e., selected version from dataview) to an interactive (dynamic) selected target slice.

We start from the front-end side, where we have a dataview showing our Cube with the versions in rows. Next to the Cube is a text algorithm block calling a procedure to execute our shifting process (1000_PLA_Copy Version).

By executing the procedure from the dataview, we take the selection of the source version with us. The user is asked to confirm before proceeding to avoid unintended copy processes.

Since the source version may vary and is not fixed, we can’t work with a single dataflow and references since they are static.

Therefore, we need to save the source slice of data by copying the data into a temporary Cube. The temporary Cube mustn’t contain the dimension to which you want to shift the data. The dataflow calculates b = a. No extensions are needed.

  • Block a: n dimensions
  • Block b: n-1 dimensions

Why do we get rid of the version dimension?

In the next step, we remove the selection on the dimension and let the user choose interactively from another element (allow one member only). It doesn’t matter whether the selection is made interactively, by a fixed selection, or by a selection based on Cube, etc.

Since dataflows are affected by selections, both source and target Cube are restricted to the selected version. So, if we had written the data onto a temporary Cube with the version, we won’t be able to access the data once the selection had changed. That’s why we need to get rid of the version dimension and be able to access the data in the following steps.

After the target slice respective version has been chosen, we can invert our dataflow and copy the data from the temporary Cube onto our target slice of the initial Cube. Again, the dataflow calculates b = a. No extensions are needed since Board knows automatically to extend by the version dimension.

Using this process, we can avoid references on source blocks and optimize the performance of the calculation. Therefore, this approach is recommended when moving a massive amount of data or when the source needs to be dynamic.

For the user, the executed process might look like this:

(calculations run, then Screen refreshes)

Comments

  • Center of Excellence
    edited March 2023
    Options

    Special thanks to @Bettina Clausen for all the insights and the experience shared on this topic!

  • Lorenzo Aiani
    Lorenzo Aiani Employee
    First Anniversary 5 Likes First Answer Name Dropper
    edited August 2023
    Options

    Hi,

    I’d like to add one attention point that a Developer should consider when developing a procedure like this.

    A brief introduction: when the “interactive selection” step is executed the user always has the option of pressing the “Cancel” button. If the user presses “Cancel” the procedure is closed.

    Lets assume that the requirement is to move data from Version x to Version y (according to the examples given in the article) and, in addition, delete data from Version x. The procedure will be very close to the one shown, with the additional “clear cube (use current selection)” step to clear out Version x data.

    The easiest way to develop the procedure would be, step by step:

    1. Data Flow to move data from original cube to temporary cube
    2. Clear original cube
    3. Remove selection on Version
    4. Interactive selection
    5. Data Flow to move data from temporary cube to original one on the user selected Version
    6. ….

    If built in this way, issues can occur (with data loss) if the user presses “Cancel” when prompted for interactive selection, because the Clear Cube has already been performed.

    In this case, it is safer if the procedure executes all steps that requires user interaction at the beginning. All data transformation should be performed only after. The sequence of steps should be:

    1. Save Current Selection – “Source”
    2. Remove selection on Version
    3. Interactive Selection
    4. Save Current Selection – “Target”
    5. Restore Selection – “Source”
    6. Data Flow to move data from original cube to temporary cube
    7. Clear Cube (use current selection) [the selection is the Source]
    8. Restore Selection – “Target”
    9. Data Flow to move data from temporary cube to original one on the user selected Version
    10. ….

    In general, if the procedure requires data transformation and also the user interaction, be sure that the data transformation will happen only after completing the Interactive selection step(s). Otherwise the user may abort the procedure without completing the transformation and without the ability to revert back the changes. The result is possible data loss and/or data inconsistency due to partial transformation.