Data Flow not working!!!


Dear All,

 

I have three cubes, as below.

- a: cube 1: 3 dimensions: month, base article (sparse), new article (sparse); it already has data.

- b: cube 2: 3 dimensions: month, base article (sparse), customer (sparse); it already has data.

- c: cube 3: 4 dimensions: month, base article (sparse), new article (sparse), customer (sparse); it doesn't have data yet.

I tried to use a data flow to feed cube 3 with the algorithm c = a*b, hoping it would run in High Performance Mode, but it is not working. Does anyone know the reason? Thank you very much.

 

Thanks and best regards,

 

William le

Answers

  • Previous Member
    edited March 2020

    Hi,

    A sparsity is always defined over a given set of entities.

    In your example, you have three distinct sparsities:

    1. S1 {base article}{new article}
    2. S2 {base article}{customer}
    3. S3 {base article}{new article}{customer}

     

    You should be able to check this in your DB under Entities. In the "Sparsity" tab you can see all the distinct sparsities.

     

    Regarding your data flow: it can only write into previously existing sparse crossings. Unfortunately, the sparsity S3 is empty in your DB, so the data flow cannot write any data into it.

    To "open" crossings for S3, you could consider:

    1. A datareader which opens the crossings. The prerequisites would be (a) extracting cubes 1 and 2 and (b) loading them within one single SQL datareader, joining them on "base article". After this, the sparsity for {base article}{new article}{customer} should be created and you could run the data flow.
    2. The "open sparse" option of a data flow, which I think is not the proper one in your case.
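As a mental model of why the flow writes nothing, here is a minimal Python sketch (hypothetical data and names, not Board syntax): a data flow can only fill crossings that already exist in the target sparsity, and joining the extracts of cubes 1 and 2 on month and base article yields exactly the S3 crossings.

```python
# Minimal sketch (hypothetical data, not Board syntax) of why the
# data flow c = a*b writes nothing when the target sparsity S3 has
# no crossings, and how joining the two source cubes on the shared
# dimensions "opens" the S3 crossings first.

# Cube 1: (month, base article, new article) -> value
a = {("Jan", "B1", "N1"): 10.0, ("Jan", "B1", "N2"): 5.0}
# Cube 2: (month, base article, customer) -> value
b = {("Jan", "B1", "C1"): 2.0, ("Jan", "B2", "C1"): 3.0}

def dataflow(target_crossings, a, b):
    """A data flow only writes into crossings that already exist."""
    return {
        (m, base, new, cust):
            a.get((m, base, new), 0.0) * b.get((m, base, cust), 0.0)
        for (m, base, new, cust) in target_crossings
    }

# S3 is empty, so nothing at all is written:
print(dataflow(set(), a, b))  # -> {}

# "Datareader" step: join cubes 1 and 2 on month and base article
# to derive (and open) the S3 crossings.
s3 = {
    (m1, base1, new, cust)
    for (m1, base1, new) in a
    for (m2, base2, cust) in b
    if (m1, base1) == (m2, base2)
}
print(dataflow(s3, a, b))  # now c = a*b has crossings to write into
```

The same idea covers point 2: "open sparse" would create those target crossings inside the data flow itself instead of via a datareader.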
  • William Le

    Dear interested typ,

     

    Thank you very much for your comment. Each dimension has around 15,000 to 25,000 members. Do you have any suggestion to make this easier? Should I not use sparsity in this case? Thank you very much.

     

    Thanks and best regards

     

    William Le

  • Alexander Kappes

    Dear William Le,

     

    The main question is whether you have active combinations in the target sparsity of Cube 3.

     

    If you have, the DataFlow will work for those combinations.

     

    The Open Sparsity function can also be your solution, but to use it you have to split your DataFlow into three steps.

     

    1. Cube 3 = Cube 1 (with the Open Sparsity function)

    2. Cube 3a (a temporary cube with the same dimensions as Cube 3) = Cube 2 (with the Open Sparsity function)

    3. Cube 3 = Cube 3 * Cube 3a
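Assuming Open Sparsity combines each existing source crossing with every member of the extra sparse entity of the target, the three steps above can be sketched in Python like this (hypothetical data, not Board syntax):

```python
# Step-by-step pseudomodel of the three DataFlows (hypothetical data).
customers = ["C1", "C2"]     # extra sparse entity for step 1
new_articles = ["N1"]        # extra sparse entity for step 2

cube1 = {("Jan", "B1", "N1"): 10.0}  # (month, base, new) -> value
cube2 = {("Jan", "B1", "C1"): 2.0}   # (month, base, customer) -> value

# Step 1: Cube 3 = Cube 1, opening a crossing for every customer.
cube3 = {(m, b, n, c): v
         for (m, b, n), v in cube1.items()
         for c in customers}

# Step 2: Cube 3a = Cube 2, opening a crossing for every new article.
cube3a = {(m, b, n, c): v
          for (m, b, c), v in cube2.items()
          for n in new_articles}

# Step 3: Cube 3 = Cube 3 * Cube 3a, crossing by crossing.
cube3 = {k: v * cube3a.get(k, 0.0) for k, v in cube3.items()}
print(cube3)  # note the zero-valued crossing opened for C2
```

Note the zero-valued crossing opened for customer C2: with large dimensions, opening all such combinations is what makes this approach expensive.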

     

    Hope it helps

     

    regards

     

    Alexander Kappes

  • William Le
    edited March 2020

    Dear Alexander Kappes,

     

    Thank you very much for your comment. I tried to apply your approach, but I always get the problem below; it takes a very long time to complete, maybe 2-3 hours. Do you know any possible reason? Thank you very much.

     

    Thanks and best regards

     

    William Le

     

    [screenshot]

  • Alexander Kappes

    Dear William Le,

     

    This can be due to the combinations which must be inserted.

     

    In your case, the Open Sparsity function opens a combination for each customer against the existing combinations from the source sparsity.

     

    Due to the Max Item Numbers and the number of possible combinations, this can take a long time. Perhaps High Performance Mode can help you.

     

    regards

     

    Alexander Kappes

  • Hi

    Due to the Max Item Numbers and the number of possible combinations, this can take a long time

    For which entities exactly (among base article, new article, customer) is the Max Item Number relevant to the performance of the "open sparsity" data flow?

     

    I thought the Max Item Number had no relevance in this matter, but maybe I haven't understood the point.

    The help says:

    The sparse combinations of the target will be the same as those of the source multiplied by the selected members of the additional entity part of the sparse structure of the target.

    Because of this phrase, I was thinking that the real number of members (and not the Max Item Number) of the additional sparse entity determines the scope of data of the "open sparsity" data flow.

    This would also give a hint to improve performance for the "open sparsity" dataflow, e.g. adding a select step prior to it as suggested in the help.
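A back-of-the-envelope check (with made-up sizes) illustrates this reading: the opened sparsity grows with the number of selected members of the additional entity, which is why a prior Select step can shrink the work dramatically:

```python
# Made-up sizes, for illustration only: the target sparsity opened by
# "open sparsity" is source crossings x selected members of the
# additional sparse entity.
source_crossings = 500_000      # existing crossings in the source sparsity
customers_total = 20_000        # all members of the additional sparse entity
customers_selected = 300        # members remaining after a Select step

print(source_crossings * customers_total)     # 10,000,000,000 target crossings
print(source_crossings * customers_selected)  # 150,000,000 target crossings
```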

  • It might help us if you post a screenshot of the "Sparsity" tab within the Entities window of the DB

  • William Le
    edited March 2020

    Hi Alexander Kappes,

     

    Now I think the big sparsity will impact my system's performance. Please take a look at the picture below. Do you have any suggestions? Thank you very much.

     

    Thanks and best regards

    William Le

     

    [screenshot]

  • Alexander Kappes

    Dear William Le,

     

    I think the only working solution will be to apply a preselection to your data flows. This will reduce the amount of data to be calculated.

     

    Concerning the question from interested typ: the Max Item Number has no direct impact, but depending on it the sparsity becomes 64- or 128-bit. DataFlows on 64-bit sparse cubes are faster.

     

    William Le, perhaps it would make sense to get in contact with your consulting partner. They know the data model in more detail than I do and perhaps have other ideas to solve your request.

     

    hope it helps

     

    regards

     

    Alexander Kappes

  • Dear William,

     

    I think you should take a look at the JOIN function for data flows: c = a*b versus c = join(a*b), because your case falls exactly into that situation. The performance improvement can be extraordinary.
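As a rough illustration of the difference (hypothetical data, not Board syntax): a plain c = a*b evaluates every crossing of the opened target domain, while a join-style flow only visits combinations that can be built from non-empty source cells sharing the base article:

```python
# Work done by c = a*b over a fully opened sparsity versus a
# join-style c = join(a*b), which only touches crossings derivable
# from non-empty source cells. Month is omitted for brevity.

a = {("B1", "N1"): 1.0, ("B1", "N2"): 2.0, ("B2", "N1"): 3.0}  # (base, new)
b = {("B1", "C1"): 4.0}                                        # (base, customer)

bases = {k[0] for k in a} | {k[0] for k in b}
news = {k[1] for k in a}
custs = {k[1] for k in b}

# Plain dataflow over the opened sparsity: every combination is evaluated.
full = len(bases) * len(news) * len(custs)

# Join-style dataflow: only source cells sharing the same base article.
joined = {
    (ba, n, c): va * vb
    for (ba, n), va in a.items()
    for (bb, c), vb in b.items()
    if ba == bb
}

print(full, len(joined))  # full domain vs. crossings actually visited
```

Even in this toy case the join visits fewer crossings; with dimensions of 15,000-25,000 members each, the gap between the two figures becomes enormous.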

     

    I hope it helps
    Davide