Execution time in database logs

Hi everyone,

We noticed a few months back, while analyzing the performance of our data loading processes, that the information available in the logs was not always accurate.

Yesterday I noticed a strange behavior: in the DB log, the "time elapsed" info is half of what it should be.

| Action Code | Date | Time | UserName | DbName | Operation-Title | D.Flow Mode | Target | Elapsed | File | RecordNr | Validated | Rejected | RAM Status | ErrCode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FR | 20180129 | 19:15 | eca | FAST_001 | Standard Prices | | | 00h18m43s | FAST_-9541 | 938043 | 938043 | 0 | [0/0]Mb | |
| FR | 20180129 | 19:58 | eca | FAST_001 | Standard Prices - Detailed Costs | | | 00h21m13s | FAST_-9540 | 823410 | 823410 | 0 | [0/0]Mb | |
| FR | 20180129 | 20:06 | eca | FAST_001 | Standard Prices - Calculation Dates | | | 00h03m27s | FAST_-9533 | 397962 | 397344 | 618 | [0/0]Mb | |

 

The procedure was launched around 18:40. Therefore I know the first data reader took around 35 minutes to load, which is confirmed by the data reader screen:

[screenshot of the data reader screen]
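For reference, here is the same arithmetic as a quick Python check (a minimal sketch; the 18:40 launch time is approximate):

```python
from datetime import datetime, timedelta
import re

def parse_elapsed(s: str) -> timedelta:
    """Parse a Board-style elapsed string such as '00h18m43s'."""
    h, m, sec = map(int, re.fullmatch(r"(\d+)h(\d+)m(\d+)s", s).groups())
    return timedelta(hours=h, minutes=m, seconds=sec)

launch = datetime(2018, 1, 29, 18, 40)   # approximate procedure start
end = datetime(2018, 1, 29, 19, 15)      # 'Time' column of the first log row
elapsed = parse_elapsed("00h18m43s")     # 'Elapsed' column of the first log row

wall_clock = end - launch                # ~35 minutes of real time
print(f"wall clock: {wall_clock}, logged elapsed: {elapsed}")
print(f"ratio: {elapsed / wall_clock:.2f}")  # ~0.53, i.e. roughly half
```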

 

This data reader uses a SAP connector. When I look at the connector log, I see that each "extractor" has been launched twice. I suspect this is due to the use of the "replace" option, which needs to scan the whole cube to obtain the time entities used (meaning it loads 1M lines just to learn that only 2018 is present, instead of simply reading the time field!).

 

So a few questions:

- Does anybody have the same issue?

- How can we obtain the correct timings in the DB logs?

- Is the "replace" option a good choice in this case, or should I first clear part of the cube and then load in normal mode? Do you have any suggestions to make sure the procedure clears exactly what is needed (no more, no less) before loading?

 

Thanks!

Etienne

Answers

  • Helmut Heimann (Employee, Community Captain)

    Hi Etienne CAUSSE,

    first of all, I would like to know what kind of extractor you are using in your connected SAP system - is it a simple table/join or are you using the Delta Queue? Their behaviour is slightly different.

    The next thing is that I can't quite figure out how the elapsed time in case 1 compares to case 2. The table you've provided just says "Time", so I'd assume "starting time" - but that's just an assumption (and then I can't see any column stating the "end time").

     

    Kind Regards,

    Helmut

  • Hi Helmut Heimann,

    In this case, our SAP extractor uses a BWCube-type extraction. Regarding the "Time" column of the DB log, I am myself not quite sure what it is supposed to represent; in this case it looks more like the end time of the data reader.

  • Hi Etienne CAUSSE

     

    The "Time" column indicates the end of the data reader, i.e. when the data reader has finished loading data.
    The "Elapsed" time shows the execution time for loading data into the Board database.

    More details here (BOARD LOGs in detail).

     

    A data reader is always executed twice when entities and cubes are loaded at the same time.
    This behaviour is by design and occurs with any source (ODBC/OLE DB/file/SAP, etc.).

     

    If you load data using the Board connector from SAP, you can set some parameters to save the result set in the buffer.

    In this way, the second time, the data will be extracted from the cache and the SAP query will not be executed again. From the connector log file, you can check whether the data are extracted from SAP or from the cache.
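    As an illustration only, something like this could summarize the connector log file (a hypothetical sketch: the "extractor"/"cache" markers are assumptions, so adapt them to the wording your connector log actually uses):

    ```python
    from collections import Counter

    def summarize_extractions(log_path: str) -> Counter:
        """Count extractor runs served from SAP versus from the cache."""
        counts = Counter()
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                if "extractor" in line.lower():
                    counts["cache" if "cache" in line.lower() else "sap"] += 1
        return counts

    # e.g. Counter({'sap': 1, 'cache': 1}) when the second pass hits the buffer
    print(summarize_extractions("sap_connector.log"))
    ```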

     

    Regards

    Fabio

  • Hi Fabio Donatellis, and thanks for your reply.

    Two remaining questions then:

    - we currently disable the buffer by default on our SAP extractors, because very often we have to reload the source and then reload into Board after a data loading issue. If I set "Buffer Extraction Count" = 2 and "Buffer Lifespan Minutes" = 60 for this particular extractor, the buffer will be used only for the second data reading of that same source, right?

    - In the log I showed, the "Elapsed" time seems to cover only the last execution of the data reader (not the total of the "double reading"). Can you correct that in future versions?

     

    Thanks,

    Etienne

  • Hi Etienne CAUSSE,

     

    Yes, that's correct, as long as you do not have more extractions running at the same time.

    To set up the cache, there are 2 parameters in the Board connector server:

    - BufferLifespanMinutes: defines the maximum age, in minutes, of an extraction in the buffer.

    - BufferExtractionCount: defines the maximum number of results of different extractions held in the buffer.

     

    Moreover, for each extractor there is an option that allows you to disable the buffer.

     

    So, you have to set the right configuration for your application, using these options.
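    To illustrate how the two parameters could interact, here is a toy model in Python (my own interpretation of the behaviour described above, not Board's actual implementation):

    ```python
    from datetime import datetime, timedelta

    class ExtractionBuffer:
        """Toy model of BufferLifespanMinutes + BufferExtractionCount."""

        def __init__(self, lifespan_minutes: int, extraction_count: int):
            self.lifespan = timedelta(minutes=lifespan_minutes)
            self.max_entries = extraction_count
            self.entries = {}  # extraction key -> (stored_at, result set)

        def get(self, key, now):
            """Return a buffered result set, or None if absent or expired."""
            hit = self.entries.get(key)
            if hit and now - hit[0] <= self.lifespan:
                return hit[1]
            return None

        def put(self, key, result, now):
            if len(self.entries) >= self.max_entries:
                # evict the oldest extraction to respect the count limit
                oldest = min(self.entries, key=lambda k: self.entries[k][0])
                del self.entries[oldest]
            self.entries[key] = (now, result)

    buf = ExtractionBuffer(lifespan_minutes=60, extraction_count=2)
    t0 = datetime(2018, 1, 29, 18, 40)
    buf.put("standard_prices", "result-set", t0)
    print(buf.get("standard_prices", t0 + timedelta(minutes=20)))  # hit
    print(buf.get("standard_prices", t0 + timedelta(minutes=90)))  # None: expired
    ```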

     

    Regards

    Fabio

  • Hi Fabio Donatellis,

    I tested the buffer during last night's loading procedure, and a strange thing happened: the first execution of the connector took twice the usual time (~40 min instead of 20), and the second execution was almost immediate (10 seconds). In the Board log we can see "elapsed time 00h00m10s", but from the ending time we can see it actually took 40 min.

    So apparently this technique does not save us any time: loading the buffer and Board for the first data reader seems to take as much time as loading Board twice.

    Etienne

  • Hi Etienne

    In the log files you have all the information needed to understand what happened.
    You just need to merge all the log files, from the SAP connector to the Board server.
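    For example, a minimal merge could look like this (a sketch that assumes each log line starts with a 'YYYY-MM-DD HH:MM:SS' timestamp; adjust the parsing to your actual log formats):

    ```python
    from datetime import datetime
    import heapq

    def read_events(path: str, label: str):
        """Yield (timestamp, label, line) for every timestamped log line."""
        with open(path, encoding="utf-8") as f:
            for line in f:
                try:
                    ts = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
                except ValueError:
                    continue  # skip lines without a leading timestamp
                yield ts, label, line.rstrip()

    # log files are chronological, so heapq.merge interleaves by timestamp
    merged = heapq.merge(read_events("sap_connector.log", "CONNECTOR"),
                         read_events("board_server.log", "BOARD"))
    for ts, label, line in merged:
        print(f"[{label}] {line}")
    ```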

     

    Of course, the cache is just one of the techniques that can be used to optimize performance.

     

    Fabio

  • Well noted.

    I was wondering if this could be improved, to give us better follow-up of data reader execution times in the "log management application", which is an important technical subject for us.

    Another solution, if you can suggest it to the lab, would be to integrate the import of SAP Connector logs into the next version of the LMA application.

    Thanks,
    Etienne

  • Hi Fabio Donatellis

    I did some further testing with this data loading procedure to optimize performance:

    - split the connectors so as to obtain separately the list of members having data: one connector per dimension, with the "append" option in the data reader (only one entity in the data reader).

    - the procedure clears the cubes before loading to avoid "replace"

    - modified the data reader for the facts to remove the "append" and "replace" options.

     

    However, even with these modifications, my data reader for the facts is executed twice, even though I'm only loading cubes.

     

    Any idea why?


    Etienne