Assume Zero for Loading Blanks to a Numeric Cube
Overview
When loading data through an file datareader, I commonly see blanks in a CSV file. If I try to load this field to a numeric cube (Integer, Single, Double), I may see something odd. The datareader will run fine. There will not be any errors thrown. Since these fields are blank, no data is loaded, which is fine. There seems to be an error showing up in subsequent fields to the right of these blank fields. When the datareader encounters a blank, it stops processing the rest of the row. To clarify, the field with the blank behaves correctly. No data for the field with the blank is loaded, thus the numeric value stays zero. The blank in one field causes the datareader to stop processing subsequent fields in the same row.
Workaround
The current workaround is to add a simple IF statement to the ETL dialog to check for blanks and replace them with zeroes in each numeric field. After adding the IF statement, the datareader will see a zero in each of these blank fields and continue processing more fields in the same row.
Example - Without Workaround
Before making any changes, the data and formulae look like this. Notice how the blanks in rows 12 to 16 are blank in both column C and column D. With this configuration, the value of 266.81 in row 17 does not get loaded for any row with blanks encountered first.
Example - With Workaround
After adding the IF statements, the data and formulae look like this. When blanks are encountered, the value of zero is seen by the datareader, and it can continue processing other fields in the same row.
Feature Request
To simplify the steps to build of a data reader, they should assume zero for loading blanks to a numeric cube in a datareader.
Possible Refinement
To further simplify development of data readers, I would like to be able to set which characters should be assumed to be zero. For example, I would like to add a flag to a server config file to flag dashes, spaces, blanks, N/A, TBD as fields assumed to be zero. With these phrases specified, I would not need to add IF statements into the ETL screen to handle these data quality issues.
Thank you to Paola Mason for her help identifying this as a solution to our current challenge.