Synchronize multiple data model

Options

1. Abstract

In the case of an architecture with several data models, the synchronization process may be required to update metadata and data. Defining which data model should be the master and how to set up synchronization is of primary importance.

2. Context

In complex and cross-functional applications, having a single database architecture could not be the best solution since this, in many cases, is reflected in a huge number of entities and cubes that will increase the complexity and expose the entire application and all the users to a failure in case of unexpected behavior in one application module.

3. Content

Often, dealing with a huge complex database is neither efficient nor scalable since one single change can impact the whole database operation and affect the already consolidated database performance.

In a growing environment, it is important to consider the future architecture full set and plan accordingly splitting the database into different, more efficient, and manageable small databases.

Multiple databases need to talk to each other and be aligned in terms of metadata’s content (entities) and consistent in terms of information displayed (Info Cubes), so it is important to align them as frequently as possible.

Let’s see how to synchronize two or more databases.

It is a good practice for the Solution Architect to define a DB Master where metadata and data will be created and updated, such as:

  • Adding members to entities.
  • Adjusting/managing changes in relationship(s).
  • Feeding cubes.

Then, it is mandatory to create a synchronized process to push all the above amendments to the other database(s).

In case you need to remove some old entity members, please follow for each DB (Master and Satellites) the “Member Deletion” Best practice article at the following link:

As we mentioned above, not only the entities and relationships need to be synchronized, but in some cases also cubes must follow the synchronization flow, since some data from a data source or a calculation process may only exist in the Master DB but will also be needed in the satellite database(s).

In the case of updating Info Cubes, the available actions are:

  • Adding data (new combinations)
  • Modifying data (existing combinations)
  • Deleting data (deletion of existing combinations)

The synchronization process consists in a procedure that will extract the metadata and data from the master database and reload them in the satellite database(s) through data reader protocols:

Each Board Cloud instance, once deployed, comes with shared cloud storage by default.

Azure’s Cloud Storage (technically called Z:\) is shared between all instances of Board (e.g., s-1 is usually intended for development, p-1 is usually intended for user acceptance tests, and c-1 is live production) and is therefore perfect for exchanging data between instances.

It is best practice to always use system variables to ensure consistency throughout the data exchange process, which is:

%bss%, which is the variable referring to the storage cloud Z:\

%computername%, which is the variable referring to the board instance

Always type %bss%%computername% first for any extraction step (both metadata and data) to uniquely define the extraction path, therefore in the presence of “extract all entities” and “extract all info cubes” steps type the following syntax:

%bss%%computername%\dbname

The expected result will be the creation of the main folder All and the subfolders Cubes and Trees below the database name path like the following:

Z:\boardhq-s1\db name\All\Cubes

Z:\boardhq-s1\db name\All\Trees

Since the data reader does not automatically resolve system variables, the entire path must be defined without the use of system variables.

In the procedure, when the data reader is invoked, the file path must be changed using system variables.

Please make sure to:

  • Set the Reload Data Reader with the “Add” flag when you are working with Entities.

While

  • Set the Reload Data Reader with the “Replace” flag when you are working with Cubes.

It is also good to have a multi databases screen check where to review the information and be sure that, every time, the information is synchronized.

An important note is related to the synchronization frequency that must be scheduled in a period where the users are not interacting with the application, better if even not connected to the database (a suggestion is to perform this action during the night).

We strongly recommend having a sanity check at the end of the synchronization procedure (can also be a triggered email) to be sure that the flow from the Master to the satellites has been completed.

We also recommend reserving a place in the Admin Capsule from where this procedure can be manually triggered only by the Administrator user if needed.

In case of manual triggering, we suggest performing this action in a service window, “simulating” the absence of users interacting with the database like when the procedure is running during the night.

We strongly recommended to create just one Master DB and interact with the entities just in this Master DB, since all the processes will be more understandable and easier to work with, but, if you need to have more than one Master, you need to be sure that all the architecture databases are synchronized.