When loading large files or tables (e.g. with several million rows), it is possible to reduce loading time and improve database optimization by sorting the records within the file or the table.
We are loading a file with four fields: Date, Customer, Product, and Units.
Let's assume that in our BOARD database, the Customer entity has more members than the Product entity.
When loading from ASCII files:
- you must create a sorted file or use a program to sort it
- the file should be sorted by Date/Product/Customer.
When loading through ODBC:
- to sort the table, simply create an index on the table.
- Indexing the table will make the views sorted by default without the need to create a SQL statement that sorts rows.
- Considering the above example, the index should be defined on Date/Product/Customer
In both cases, if the data-reader is very large do not activate the ETL unless strictly necessary!
For better performance of time cumulative functions (such as Yearly Cumulated Value or Yearly Moving Total) it is recommended to load data in chronological order: starting from the first year of your database time range. For example, load January, then February, then March and so on. Loading data for the year 2004 and then loading data for year 2003 will result in a poorly optimized database.