Data Changes
Hello
Currently we are currently using a SQL data reader and loading data once a day. How do we set up board to load only data changes throughout the day.
Thanks Alex
Answers
-
Hi Alex Jones,
You could probably make a copy of your datareader to run a few times through the day. In the SQL statement, you’ll need to add a clause in the WHERE to filter data by records updated since the last time the data reader ran. You should also check that the cubes DO NOT have the REPLACE keyword assigned to them. That ensures the new data is appended to the existing data.
Good luck,
2 -
Hello,
in order to answer you accurately, would you tell us a few circumstances ?
- do you want to "refresh" your data more often than once a day, using the existing SQL Datareader ?
- do you want to add only (append mode) new data at each load during the day, or are you ready to "erase and load again" (replace mode) previous loaded data, e.g. if your current SQL Datareader runs at 9am, you load all data existing until 9am, then you run the datareader on 11am, do you want to load only the data appeared during the time from 9am to 11am, or the data apparead during the whole day until 11am ?
Depending on your answers, the solution may be to duplicate the datareader and customize a WHERE clause, or leave your datareader as is and create a new scheduled task on your server, or...
5 -
Hi Alex Jones,
In order to be able to perform incremental data loads, there are several techniques and methods that can be applied.
The best solution depends on the requirements, what is the data source and how often they should be loaded.
It is very difficult to provide you a solution without having more details.
During the BOARDVille User Conference there will be a technical sessions dedicated to this topic. (Track 3: Academy)
Join us @ BOARDVille Conference 2018!
Check the Agenda and see you at the conference.
Fabio
3 -
Hello,
Thank you for your reply. I want to run the data reader say every 30mins and bring in any changes during that 30min period. I have played around with your suggestions and can amend the data using a where clause. However, this is financial data and the $ values are stored in a Infocube, which only sums or replaces which deletes all data in the cube except any changes made within the last 30mins. How do I amend the data in a info cube?
Thank you, Alex2 -
Alex Jones My understanding (from a long time ago) is that 'Replace' will essentially clear any time periods it is reading data for, so if your cube is structured by day (or month), as soon as it reads data for a time period/day BOARD will clear the data in the cube for that time period/day, and the new data (coming from your SQL reader with the WHERE clause) will be written to it. This would be why you are only seeing data for the last 30 minutes.
What is the reason for only wanting the last 30minutes of transaction/data? Is it to speed up the datareader, or is it because you carry out further actions on this data and you want to limit the amount of data to speed up the procedure?
I have done similar in my early BOARD days to speed up a procedure. There I essentially read into a 'Latest Data' cube, re-dimensioned/mapped this as required (with selects based on the 'Latest Data' cube to limit the processing time of the dataflows) into a 'Latest Mapped' via a procedure, then added this to a 'Previous Mapped' cube and stored it in a 'Final Mapped' cube. (Dataflow: Final Mapped = Latest Mapped + Previous Mapped)
The 'Final Mapped' cube was then dataflown to the 'Previous Mapped' cube at the end ready for the next data reader/procedure (Dataflow: Previous Mapped = Final Mapped).
Having said that though, if you want to do this for the reasons of speeding up actions/dataflows as I did, I would suggest also looking at the structure of the cubes (sparsity) and the Dataflows themselves, as ultimately I was able to modify both of these and with interim cubes/dataflows was able to remove/expand the WHERE on my datareader and sped up the entire process. Generally we only limit our Financial datareaders to all transactions posted in the current FY, with the current FY coming from a system_info type table in the DB/ERP you are reading.
4