Dataflows : c=a*b versus c=join(a*b)

Hi everyone,

In an application developed by consultants here, I see that some dataflows use a c=join(a*b) formula, and I can't find any difference from "c=a*b" when I test it on my own dataflows.

In some cases "JOIN" appears in the db log as the algorithm, and at other times it's HBMP.


Does anyone know what the intended use of the JOIN algorithm is? What is its impact on the result, and on performance?

 

Thanks,
Etienne 


Answers

  • Hi Michele Roscelli,

    Thanks for your reply, this is very clear.

    In fact, the use case you describe is exactly what I need. I was curious about this syntax, but indeed I can't see any difference from a standard HBMP matrix product when I use it.

    For now I'll keep a standard product with the common sparse structure, followed by a copy dataflow with the "open sparsity" option to end up with the target sparse structure.

    If I understand correctly, this should be equivalent (but slower).

     

    Etienne

  • Thomas Ironstone

    Hi Michele Roscelli,

     

    Which version of Board has the new JOIN algorithm? (e.g. Board 10.1.X)

     

    Thanks,

    Tom

  • Gerrit Kohrs

    Hi Michele Roscelli,

     

    Thanks for the good explanation!

    A colleague from Board Germany already told me about the JOIN a few months ago and I tried it successfully, but since it is not official yet, I didn't implement it anywhere so far. Is it going to be part of 10.5 or rather in a later version?

     

    Thanks & Kind regards,

    Gerrit

     


  • Hi Gerrit Kohrs  @SDG Group,

     

    The functionality has been included in the dataflow logics presentation from Antonio Speca at BOARDVille, so it will be made officially available very soon!

     

    Michele

  • Gerrit Kohrs

    Hi Michele Roscelli,

     

    Yes, so I heard, and I am really looking forward to it! Great accomplishment!

     

    Kind regards,

    Gerrit

  • Hi, has this been officially released, and if so, in which version?

     

    Thanks

    Yugank

  • Hi,

    @Board team: could you please confirm whether we still have to use JOIN(a*b) in v12? The documentation still lacks an explanation of this function.

  • Hi Julien,

    The JOIN syntax has been deprecated with the adoption of the B12 version.

    To manage the dataflow domain correctly by including the additional entity, it is necessary to use the option "Extend calculation on new tuples for all members of" under the dataflow domain.

    You will be prompted with the list of entities for which the dataflow domain can be extended. Select the entity for which you wish to manage the calculation.

    Hope this is clear.

    Regards,

    Tommaso

  • Hi @Etienne CAUSSE,

    Be aware that the Board Engine may still use the JOIN algorithm to calculate a Dataflow step under certain conditions related to the structures of the Cubes involved and the formula. This algorithm is only used when the calculation domain options are in their default configuration and the Layout contains three Cubes whose structures meet the following conditions:

    • One factor Cube has n dimensions in its Structure, where n is any number
    • The second factor Cube has one dimension less than the target Cube and at least one dimension in common with the Structure of the other factor Cube
    • The target Cube has in its Structure all the dimensions of the factor Cubes and exactly one more than the first factor Cube (n+1)
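
For what it's worth, the three structure conditions above can be written down as a small check. This is only an illustrative Python sketch, not Board code or a Board API: the function names and the entity names (Month, Product, Customer) are made up for the example, each Cube structure is modelled as a plain set of dimension names, and the check covers only the structure conditions as listed, not the requirement that the calculation domain options are in their default configuration.

```python
def join_structure_matches(first, second, target):
    """Check the three listed structure conditions for one ordering of the factors.

    Each argument is an iterable of dimension (entity) names. Hypothetical helper
    for illustration only.
    """
    first, second, target = set(first), set(second), set(target)

    # Condition 1: the first factor has n dimensions, for any n (nothing to check).

    # Condition 2: the second factor has one dimension less than the target
    # and shares at least one dimension with the first factor.
    second_ok = len(second) == len(target) - 1 and bool(first & second)

    # Condition 3: the target contains every dimension of both factors
    # and has exactly one more dimension than the first factor (n + 1).
    target_ok = (first | second) <= target and len(target) == len(first) + 1

    return second_ok and target_ok


def may_use_join(factor_a, factor_b, target):
    """Try both orderings, since either factor could play the 'first' role."""
    return (join_structure_matches(factor_a, factor_b, target)
            or join_structure_matches(factor_b, factor_a, target))


a = {"Month", "Product"}
b = {"Product", "Customer"}
c = {"Month", "Product", "Customer"}

print(may_use_join(a, b, c))  # True: c has all dimensions of a and b, one more than a
print(may_use_join(a, a, a))  # False: the target has no extra dimension
```

In the first call the factors share Product and the target adds Customer on top of the first factor's structure, so all three conditions hold. In the second call all three Cubes have the same structure, so the "exactly one more dimension" condition fails.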