Dataflows : c=a*b versus c=join(a*b)
Hi everyone,
In an application developed by consultants here, I see that some dataflows use a c=join(a*b) formula, but I can't find any difference from "c=a*b" when I test it on my own dataflows.
In some cases "JOIN" appears in the db log as the algorithm, and at other times it's HBMP.
Does anyone know what the intended use of the join algorithm is? What's the impact on the result? On performance?
Thanks,
Etienne
Answers
-
Hi Etienne,
the JOIN algorithm is a new kind of dataflow syntax that has not yet been added to the official documentation.
You should not use it as a standard development method, since it is not officially enabled yet: it is not active by default, and certain options need to be enabled for it to work. It will likely be available in future BOARD versions as a standard dataflow method.
Used in a standard a*b dataflow with heterogeneous cube structures, it won't differ from a standard HBMP execution.
To give a bit more context, the main use case of the JOIN dataflow is when you need to expand a cube by adding an entity to it through a matrix.
The most common scenario is a cube whose sparsity has 2+ elements, where you need to expand the cube by adding one entity to the existing sparsity. The JOIN dataflow expands the cube more efficiently than the standard HBMP dataflow, and it properly creates the new sparsity in the process.
Example:
a) Source cube: Month (D), Entity 1 (S), Entity 2 (S), Entity 3 (D)
b) Matrix cube: Entity 2 (S), Entity 4 (S)
c) Target cube: Month (D), Entity 1 (S), Entity 2 (S), Entity 4 (S), Entity 3 (D)
c = JOIN(a*b)
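To make the expansion concrete, here is a minimal conceptual sketch in Python of what the example describes. It is not Board's engine and says nothing about how JOIN is implemented internally: cubes are modelled as plain dictionaries keyed by member tuples, and all member names (`E1-A`, `E4-1`, ...) and the function `join_product` are hypothetical, chosen only for illustration.

```python
# Conceptual model only: each cube is a dict {tuple of members: value}.
# Member names and values are made up for illustration.

# a) Source cube, keyed by (Month, Entity 1, Entity 2, Entity 3)
a = {
    ("Jan", "E1-A", "E2-X", "E3-P"): 10.0,
    ("Jan", "E1-B", "E2-Y", "E3-P"): 4.0,
}

# b) Matrix cube, keyed by (Entity 2, Entity 4)
b = {
    ("E2-X", "E4-1"): 0.25,
    ("E2-X", "E4-2"): 0.75,
    ("E2-Y", "E4-1"): 1.0,
}


def join_product(source, matrix):
    """Expand `source` by Entity 4 through `matrix`, multiplying the values."""
    # Index the matrix by Entity 2 so each source cell finds its matches quickly.
    by_entity2 = {}
    for (e2, e4), weight in matrix.items():
        by_entity2.setdefault(e2, []).append((e4, weight))

    target = {}
    for (month, e1, e2, e3), value in source.items():
        # Only combinations that exist in the matrix create new target cells,
        # so the target sparsity is built as a by-product of the calculation.
        for e4, weight in by_entity2.get(e2, []):
            target[(month, e1, e2, e4, e3)] = value * weight
    return target


# c) Target cube, keyed by (Month, Entity 1, Entity 2, Entity 4, Entity 3)
c = join_product(a, b)
for key, value in sorted(c.items()):
    print(key, value)
```

In this toy model the target only receives the three combinations that exist in both the source and the matrix, which is the "creates the new sparsity in the process" behaviour described above.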
11 -
Hi Michele Roscelli,
Thanks for your reply, this is very clear.
In fact, the use case you describe is exactly what I need. I was curious to see this syntax, but indeed I can't see any difference from a standard HBMP matrix product when I use it.
For now I'll keep a standard product with a common sparse structure, followed by a copy dataflow with the "open sparsity" option to end up with the target sparse structure.
If I understand correctly, this should be equivalent (but slower).
Etienne
1 -
Hi Michele Roscelli,
Which version of Board has the new JOIN algorithm? (i.e. Board 10.1.X)
Thanks,
Tom
0 -
Hi Michele Roscelli,
thanks for the good explanation!
A colleague from Board Germany already told me about the JOIN a few months ago and I tried it successfully, but since it is not official yet, I didn't implement it anywhere so far. Is it going to be part of 10.5 or rather in a later version?
Thanks & Kind regards,
Gerrit
0 -
The functionality has been included in the dataflow logics presentation from Antonio Speca at BOARDVille, so it will be made officially available very soon!
Michele
0 -
Hi Michele Roscelli,
yes, so I heard, and I am really looking forward to it! Great accomplishment!
Kind regards,
Gerrit
0 -
Hi, has this been officially released, and if so, in which version?
Thanks
Yugank
1 -
Hi,
@Board team: could you please confirm whether we still have to use JOIN(a*b) in v12? The documentation still lacks an explanation of this function.
0 -
Hi Julien,
the usage of the JOIN syntax has been deprecated with the adoption of the B12 version.
To manage the dataflow domain correctly by including the additional entity, it is necessary to use the option "Extend calculation on new tuples for all members of" under the dataflow domain.
You will be prompted with the list of entities for which the dataflow domain can be extended. Select the entity for which you wish to manage the calculation.
Hope this is clear.
Regards,
Tommaso
1 -
Hi @Etienne CAUSSE,
be aware that the Board Engine may still use the JOIN algorithm to calculate a Dataflow step under certain conditions related to the formula and the structures of the Cubes involved. This calculation algorithm is only used when the calculation domain options are in the default configuration and the Layout contains three Cubes whose structures meet the following conditions:
- One factor Cube has n dimensions in its Structure, where n is any number
- The second factor Cube has one dimension fewer than the target Cube and shares at least one dimension with the other factor Cube
- The target Cube has in its Structure all the dimensions of the factor Cubes and exactly one more than the first factor Cube (n+1)
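As a reading aid, the three structural conditions can be written as a small Python check. This is only a literal transcription of the list above, not Board's actual selection logic: the function name `join_conditions_met` and the entity names are hypothetical, and the sketch does not model the requirement that the calculation domain options are in their default configuration.

```python
def join_conditions_met(first, second, target):
    """Literal check of the three structural conditions listed above.

    Each argument is the list of dimensions (entities) in a Cube's Structure.
    """
    first, second, target = set(first), set(second), set(target)

    # Second factor: one dimension fewer than the target ...
    second_one_fewer = len(second) == len(target) - 1
    # ... and at least one dimension in common with the other factor.
    shares_dimension = bool(first & second)
    # Target: contains every dimension of both factors ...
    covers_factors = (first | second) <= target
    # ... and has exactly one more dimension than the first factor (n + 1).
    n_plus_one = len(target) == len(first) + 1

    return second_one_fewer and shares_dimension and covers_factors and n_plus_one


# Hypothetical structures, for illustration only.
print(join_conditions_met(
    ["Month", "Product"],
    ["Month", "Customer"],
    ["Month", "Product", "Customer"],
))
print(join_conditions_met(
    ["Month", "Product"],
    ["Customer"],
    ["Month", "Product", "Customer"],
))
```

The first call satisfies all conditions; the second fails because the second factor has two dimensions fewer than the target and shares none with the first factor.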