How to: Optimize Datareader performance when loading large files

When loading large files or tables (e.g. with several million rows), you can reduce loading time and obtain a better-optimized database by sorting the records within the file or the table.

The file or table must first be sorted on the TIME field, and then by the remaining entities, from the one with the fewest members to the one with the most. This matters because it matches the way BOARD organizes its memory pages: if the file is sorted the same way, BOARD works on a page until it is full and then moves on to the next one. If the file is not sorted correctly, BOARD swaps pages far more frequently, which kills performance.

Example:

We are loading a file with four fields: Date, Customer, Product, and Units. 

Let's assume that in our BOARD database, the Customer entity has more members than the Product entity.

When loading from ASCII files:

  • you must create a sorted file or use a program to sort it (see the sketch after this list)
  • the file should be sorted by Date/Product/Customer, i.e. Product before Customer because Product has fewer members.
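
As an illustration only, here is a minimal Python sketch of such a pre-sort. The file name sales.csv, the column names, and the assumption that the file fits in memory are all hypothetical; for files with several million rows an external sorting tool may be more appropriate. It also assumes the Date field is stored in a lexicographically sortable format such as YYYYMMDD.

    import csv

    # Minimal sketch: sort the file by Date, then Product (the smaller
    # entity), then Customer (the larger one). File and column names
    # are illustrative, not prescribed by BOARD.
    with open("sales.csv", newline="") as src:
        reader = csv.DictReader(src)
        rows = list(reader)  # assumes the whole file fits in memory

    # Date first, then entities from smallest to largest.
    rows.sort(key=lambda r: (r["Date"], r["Product"], r["Customer"]))

    with open("sales_sorted.csv", "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(rows)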

When loading through ODBC:

  • to sort the table, simply create an index on it (see the sketch after this list)
  • indexing the table will make the views sorted by default, without the need to write a SQL statement that sorts the rows
  • considering the above example, the index should be defined on Date/Product/Customer.
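
As a sketch only: the snippet below creates such an index from Python through pyodbc. The DSN SalesDSN, the credentials, and the table name dbo.Sales are placeholders, and the exact CREATE INDEX syntax depends on the source RDBMS.

    import pyodbc

    # Minimal sketch: index on Date first, then Product (the smaller
    # entity), then Customer (the larger one). DSN, credentials and
    # table name are placeholders.
    conn = pyodbc.connect("DSN=SalesDSN;UID=user;PWD=secret")
    cur = conn.cursor()
    cur.execute(
        "CREATE INDEX IX_Sales_Load ON dbo.Sales (Date, Product, Customer)"
    )
    conn.commit()
    conn.close()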

In both cases, if the Datareader is very large, do not activate the ETL unless strictly necessary!

Important!
For better performance of time-cumulative functions (such as Yearly Cumulated Value or Yearly Moving Total), it is recommended to load data in chronological order, starting from the first year of your database's time range: for example, load January, then February, then March, and so on. Loading data for the year 2004 and then for the year 2003 will result in a poorly optimized database.
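
For instance, if the data arrives as one extract file per month, a small Python sketch like the one below ensures the files are processed oldest first; the sales_YYYY_MM.csv naming pattern and the extracts folder are assumptions, not BOARD conventions.

    import re
    from pathlib import Path

    # Minimal sketch: order monthly extract files chronologically before
    # feeding them to the Datareader. The naming pattern is illustrative.
    def year_month(path):
        m = re.search(r"(\d{4})_(\d{2})", path.name)
        if m is None:
            raise ValueError("unexpected file name: " + path.name)
        return int(m.group(1)), int(m.group(2))

    files = sorted(Path("extracts").glob("sales_*.csv"), key=year_month)
    for f in files:
        print(f)  # load each file in this order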
