Full RAM Mode / Benchmarking: how much time does it take to load your DB into RAM?


Hi everyone,

I would like to do a bit of benchmarking to better understand our performance and look into possible infrastructure improvements.

My current issue is that we have large databases (typically 50 GB) that take a long time to load into RAM when used for the first time, usually between 20 and 40 minutes.
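
For a rough sanity check: 50 GB in 20-40 minutes works out to roughly 21-42 MB/s of effective throughput, well below what even a modest local SSD can sustain, so the disk is the first thing I want to rule out. Here is a minimal Python sketch for measuring sequential read speed on the volume holding the database (the path is just a placeholder; loading also builds in-memory structures, so this only bounds the I/O part):

```python
import time

# Measure sequential read throughput of a large file on the same
# volume as the database. Use a file bigger than RAM (or a cold
# cache), otherwise the OS page cache will inflate the numbers.
PATH = "/data/board/some_large_file.bin"  # placeholder path
CHUNK = 64 * 1024 * 1024  # read in 64 MB chunks

total = 0
start = time.monotonic()
with open(PATH, "rb", buffering=0) as f:
    while chunk := f.read(CHUNK):
        total += len(chunk)
elapsed = time.monotonic() - start

print(f"{total / 1e9:.1f} GB in {elapsed:.0f} s "
      f"-> {total / 1e6 / elapsed:.0f} MB/s")
```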

So a few questions:

- Is that normal? What kind of performance do you see on your own databases / servers?

- Did you implement specific infrastructure upgrades to improve performance? Which ones?

- Are you using a virtual infrastructure? If so, are you using memory extensions such as VMware's "overcommitment" feature?

 

Besides, in terms of usability, we very often start loading a DB into memory by selecting it by mistake in the DB list. If it's a small one, no issue, but if it's a big one we're stuck for 20 minutes... Would it be possible to implement a check like "This DB is not in memory, do you want to load it?"

 

Thanks for your input,

Etienne


Answers

  • Unknown (Active Partner)

    Hi Etienne CAUSSE,

       Those are definitely some large databases. We tend to partition data and aggregate so that the active datasets are the ones in memory. There is always a need to go back to historical data, but we tend to leave that in relational databases accessible via drillthrough. For historical years, having summarized numbers in RAM along with drillthrough to the underlying details is sufficient for users.

     

       Given your timings, I wonder whether BOARD loads objects into memory in parallel or sequentially. Does anyone know? A parallel load could significantly reduce the time to get a database into memory.
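
    To illustrate why this matters (I have no insight into BOARD's internals; this is a generic sketch with a hypothetical file layout): on storage that serves concurrent requests well, parallel reads of independent objects can beat a sequential pass by a wide margin.

    ```python
    import time
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    # Generic illustration, not BOARD's actual load path: compare a
    # sequential pass over independent files with a parallel one.
    # SSDs usually handle concurrent reads well; a single spinning
    # disk often will not. Run each variant on a cold cache.
    FILES = list(Path("/data/objects").glob("*.bin"))  # hypothetical layout

    def read_file(path: Path) -> int:
        return len(path.read_bytes())

    start = time.monotonic()
    total = sum(read_file(p) for p in FILES)
    print(f"sequential: {total / 1e9:.1f} GB in {time.monotonic() - start:.1f} s")

    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=8) as pool:
        total = sum(pool.map(read_file, FILES))
    print(f"parallel:   {total / 1e9:.1f} GB in {time.monotonic() - start:.1f} s")
    ```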

     

       If the load time is your only concern, then maybe you could schedule the databases to be loaded into memory whenever the service starts, and ensure there is sufficient RAM for all of them. If loading all BOARD databases at the same time would exceed the available memory on the server, there is some availability risk: once RAM is consumed, loading another database may push an existing one out of memory, or the new database may fail to load. I could certainly be wrong here, though; I don't know exactly how BOARD handles memory allocation near capacity. Adding memory would reduce that risk, but won't help with the timing issue.
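
    As a rough illustration of the kind of guard I have in mind (the path and the expansion factor are assumptions; how much a database grows from disk to RAM depends on its structure), something along these lines could run before any load:

    ```python
    import os
    import psutil  # third-party: pip install psutil

    # Before triggering a load, compare the database's on-disk size
    # (times an assumed disk-to-RAM expansion factor) with free memory.
    DB_PATH = "/data/board/production.db"  # hypothetical path
    EXPANSION = 1.2  # assumed growth factor, tune for your databases

    needed = os.path.getsize(DB_PATH) * EXPANSION
    available = psutil.virtual_memory().available

    if needed > available:
        print(f"refusing to load: ~{needed / 1e9:.0f} GB needed, "
              f"{available / 1e9:.0f} GB free")
    else:
        print("enough headroom, proceeding with load")
    ```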

  • Etienne CAUSSE

    You're right Bob Gill, we do have some large databases, much to my regret. Simplifying is not an easy task in our company... just ask Andrea Masiero or Ibrahima GUEYE!

    My concerns are both the loading time (although we do use preloading at server start for our production environment) and the availability risk: in our sandbox environment we have a lot of large databases, often copies, and if a user loads the wrong one we end up with more databases in memory than the server can handle (even with 256 GB of RAM).

     

    Therefore I think a simple warning like "The database is not in memory, do you want to load it?" would already be useful.
    Thanks,

    Etienne

  • Hi Etienne,

     

    My 2 cents on improving database in-memory loading, given the experience I've had with several customers (I'm also tagging Fabrizio Straccia as an expert on the matter).

     

    What mostly impacts performance when loading databases into memory is, in order of impact (of course, this is what I have found; there is no rule of thumb, as every server and customer architecture is different):

     

    • Disk read speed (flash drives or good SSDs on the host machine are much faster than network disks)
    • Usage of 128-bit addressing in cubes: large cubes with 128-bit sparsities or densities kill in-memory load performance, as the engine has to deal with an address space that is squared in size (2^128 vs. 2^64 possible addresses)
    • RAM frequency (the higher the better)
    • Usage of several sparsities that differ from one another. If you have 5 entities, and different sets of cubes use those 5 entities in different sparsity combinations, in some cases (when it doesn't impact layout performance) you can unify some of the combinations into a single sparsity; see the sketch after this list.
      Say you have a sparsity by A,B,C with 2,000,000 elements, and a sparsity by A,B,C,D with 2,300,000 elements.
      You might want to try setting D as dense, which generates a single shared sparsity on A,B,C: this removes the load of the 2,300,000-element sparsity, and the remaining one will likely not grow, or grow only by a few elements.
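
    A small sketch of that arithmetic, using the figures from the example above (real element counts depend entirely on your data):

    ```python
    # Figures from the example above; actual counts depend on your data.
    abc = 2_000_000    # elements in the A,B,C sparsity
    abcd = 2_300_000   # elements in the A,B,C,D sparsity

    before = abc + abcd  # two sparsity structures to load
    after = abc          # assumes the unified A,B,C sparsity barely grows

    saved = (before - after) / before
    print(f"elements to load: {before:,} -> {after:,} ({saved:.0%} fewer)")
    ```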

     

    For the message, I agree it would be a useful and simple wish, especially given the data sizes we see nowadays - if you create an Idea I will definitely vote for it!

    Hope this helps

     

    Michele

  • Etienne CAUSSE

    Thanks Michele Roscelli for your 2 cents; that will be useful.
    Indeed we have a lot of 128-bit sparsities in our main DB; I'm working on that, given all the feedback I got in my various posts. I'll pass the hardware recommendations on to our infrastructure team.

    I posted a simple Idea for the warning message here: Warning message before loading a DB in memory