Some Problems with BEAM

Hello,
I have some questions for anybody who already has some experience with BEAM.

To get some practical experience with BEAM, I am trying to build a model that forecasts a time series of sales data. The results are useless, however, because the model writes the same forecast value to every future time entity. The value seems to be some kind of average, but I cannot identify the formula that produces it. It does not seem to be a naive or a seasonal naive forecast.
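One way to narrow down where a constant forecast comes from is to compute the usual baselines by hand and compare them with the value BEAM writes. The sketch below is only an illustration in Python; the function names and the sample data are made up and are not part of Board or BEAM.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length=12):
    """Repeat the value observed one season earlier."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


def mean_forecast(history, horizon):
    """Repeat the average of the whole history."""
    return [mean(history)] * horizon


# Hypothetical monthly sales: two years with a trend and a December peak.
sales = [100 + 2 * t + (30 if t % 12 == 11 else 0) for t in range(24)]

print(naive_forecast(sales, 3))
print(seasonal_naive_forecast(sales, 3))
print(mean_forecast(sales, 3))
```

A forecast that is identical in every future period can match the naive or the mean baseline, but it can only match the seasonal naive baseline if the last season itself was flat.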

The original time series is somewhat complex: daily data with a trend over the years and two seasonal patterns (a yearly pattern and a weekly pattern shaped by the weekdays).
So I simplified the time series by aggregating it to months, leaving a trend and one yearly seasonality, but the problem stays the same: the forecast shows a single value for every future month.
The time series goes back to 2015, so the model should have enough history to 'learn' the seasonal pattern.
Also, the monthly time series has no zero values, so Board should recognize it as a 'smooth' series. The series has only the month as a dimension, no customers etc.

So my questions are:

  1. As far as I know, an ARIMA model needs a stationary time series. If the time series has a trend and/or seasonality, you have to difference the values, i.e. subtract the
    value of the previous time entity from the value of the current time entity, and use these new values for forecasting.
    Do I have to difference the time series in Board, too, or does BEAM do that implicitly while calculating the model?
  2. Do I have to separate a training data set from a test data set within the time series? Or does BEAM separate the data implicitly by a certain rule of thumb (e.g. 70% or 80% training data and the rest as test data)?
  3. What do I have to do to merge the actual data with the forecast values? Although I chose the option in the scenario settings, the actual values are not written to the target cube.
  4. I tried to use the 'other outputs' to analyze the results of the models, but no values were written to my cubes for the various IdsiARX results or the Holt-Winters results. What do I have to do to get those results?
  5. As I explained, I simplified the target cube to a monthly level. Is BEAM smart enough to use the monthly version of the cube with the actual data, or do I have to produce a static monthly cube with the actual data?
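To make the differencing mentioned in question 1 concrete, here is a minimal sketch in plain Python. The helper name `difference` and the sample numbers are made up for illustration; a lag of 1 removes a linear trend, and a lag equal to the season length (12 for monthly data with a yearly pattern) removes a stable seasonal pattern.

```python
def difference(values, lag=1):
    """Subtract the value `lag` periods earlier from each value."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# A linear trend becomes a constant after one ordinary difference.
trend = [10, 12, 14, 16, 18]
print(difference(trend))  # [2, 2, 2, 2]

# A repeating pattern with period 2 plus a trend: the seasonal difference
# (lag=2) leaves only the growth per season.
seasonal = [1, 5, 2, 6, 3, 7]
print(difference(seasonal, lag=2))  # [1, 1, 1, 1]
```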

If I do not manage to make BEAM work, I will try the R integration instead. I was already able to do some simple calculations with R from within Board, and the R packages for time series and forecasting are at least well documented on the web. There are even some great books about forecasting with R, for example the book by Hyndman and Athanasopoulos.

Answers

  • Hi Matthias, without seeing the data it is difficult to determine the root cause, but I can offer some general pointers on setting up BEAM. Maybe you have already implemented them, in which case please ignore my comment :-)

    • When using BEAM, it is recommended that your source cube (observed flag ticked on) and target cube have the same dimensions (e.g. if your input sales data cube is by week, material and region, your target cube should be too). If you are using covariates, you can add an additional dimension to your cube and use the cube as a “flag”.
    • Set your Observed time range according to the data series that you would like to study or set to first/last loaded period to consider the entire history.
    • To let BEAM choose the best algorithm, set the global method to “competition”. Variability matters: when the system cannot detect recognizable patterns, it will select the naïve forecast in most cases.
    • It is best not to use custom time entities in your source/target cubes, since custom time entity relationships do not always follow the Gregorian calendar (e.g. I have seen instances where one year had 52 fiscal weeks and the following one had 53). I have successfully created forecasts using weekly data; I have not used monthly buckets myself, but that should not be an issue.
    • Use the ‘Merge observed values’ to merge observed and forecasted data.
    • Clear entirely if you want to remove previously saved forecasts.
    • Regarding the Other Outputs, make sure that the cube dimensions follow the same structure as the source and target data.
    • You will not need to differentiate between test and training data.
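The point above about years with 52 versus 53 weeks can be checked for ISO weeks with a few lines of Python. This is only an illustration: a fiscal calendar defined in Board may follow different rules than ISO 8601, and the helper name is made up.

```python
from datetime import date


def iso_weeks_in_year(year):
    """Number of ISO 8601 weeks in a year (52 or 53)."""
    # 28 December always falls in the last ISO week of its year.
    return date(year, 12, 28).isocalendar()[1]


for year in range(2015, 2022):
    print(year, iso_weeks_in_year(year))
```

Under ISO rules, 2015 and 2020 have 53 weeks and the years in between have 52, so a weekly series does not line up year over year without extra care.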

    On the academy website you can find a useful Tips & Tricks series about setting up Board - see link below

  • Hi Matthias,

    Following Stefano's reply, I would like to add some technical information on your specific statistical questions.

    1. BEAM implements the KPSS statistical test as a reliable criterion for deciding whether to difference the series. The test is applied recursively: after the first differencing, the resulting series of differences is analysed again with the KPSS test. If necessary, it is differenced a second time, and this process is repeated until there is no trend left in the observed time series.
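The recursive procedure described in point 1 can be sketched as follows. This is a simplified, self-contained illustration of the textbook KPSS level-stationarity statistic, not BEAM's actual implementation: the lag rule, the 5% critical value of 0.463, the `max_d` limit and all function names are assumptions made for this example.

```python
def kpss_level_statistic(values):
    """KPSS statistic for the null hypothesis of level stationarity."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]

    # Sum of squared partial sums of the residuals.
    partial, sum_sq_partial = 0.0, 0.0
    for e in resid:
        partial += e
        sum_sq_partial += partial * partial

    # Long-run variance with Bartlett weights (Newey-West estimator).
    lags = int(4 * (n / 100) ** 0.25)
    long_run_var = sum(e * e for e in resid) / n
    for s in range(1, lags + 1):
        weight = 1 - s / (lags + 1)
        autocov = sum(resid[t] * resid[t - s] for t in range(s, n)) / n
        long_run_var += 2 * weight * autocov

    if long_run_var <= 0:  # constant series: nothing left to difference
        return 0.0
    return sum_sq_partial / (n * n * long_run_var)


CRITICAL_5_PERCENT = 0.463  # tabulated 5% value for the level-stationary case


def differences_needed(values, max_d=2):
    """Difference repeatedly while the KPSS test rejects stationarity."""
    d = 0
    series = list(values)
    while d < max_d and kpss_level_statistic(series) > CRITICAL_5_PERCENT:
        series = [b - a for a, b in zip(series, series[1:])]
        d += 1
    return d, series


# Deterministic pseudo-noise so the example is reproducible.
noise = [((7 * t) % 11) - 5 for t in range(67)]
trended = [3 * t + e for t, e in enumerate(noise)]

print(differences_needed(noise)[0])    # 0: already stationary
print(differences_needed(trended)[0])  # 1: one difference removes the trend
```

Note that the null hypothesis of the KPSS test is stationarity, so a large statistic is the signal to difference once more.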

    2. BEAM uses cross-validation to handle training and test data. Initially, the entire training data set is broken up into k equal parts. The first part is kept as the holdout (test) set and the remaining k-1 parts are used to train the model. The trained model is then tested on the holdout set. This process is repeated k times, with a different holdout set each time, so every data point gets an equal opportunity to be included in the test set. It is usually one of the best approaches when input data is limited.
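The k-fold scheme described in point 2 can be sketched like this. It is a generic illustration of k-fold cross-validation with contiguous folds, not a description of BEAM's internals; the function names are made up, and the "model" is simply the training mean to keep the example short.

```python
def k_fold_splits(n_points, k):
    """Yield (train_indices, holdout_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_points // k + (1 if i < n_points % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        holdout = list(range(start, start + size))
        train = [i for i in range(n_points) if i < start or i >= start + size]
        yield train, holdout
        start += size


def cross_validated_mae(series, k):
    """Average holdout error of a mean-only model over all k folds."""
    errors = []
    for train, holdout in k_fold_splits(len(series), k):
        level = sum(series[i] for i in train) / len(train)  # "fit" on training part
        fold_error = sum(abs(series[i] - level) for i in holdout) / len(holdout)
        errors.append(fold_error)
    return sum(errors) / k


for train, holdout in k_fold_splits(10, 3):
    print("holdout:", holdout, "train size:", len(train))
```

Every index appears in exactly one holdout set, which is what gives each data point one turn as test data.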

    5. Board is able to pick up the most suitable version of the actuals according to your target granularity. In other words, if your target cube is set by month, it uses the monthly version to predict over the forecast horizon.

    Regards

  • Hello Matthias,

    thank you for reaching out to us about BEAM.

    Here are my answers to your open points:

    1. BEAM implements the KPSS statistical test as a reliable criterion for deciding whether to difference the series. The test is applied recursively: after the first differencing, the resulting series of differences is analysed again with the KPSS test. If necessary, it is differenced a second time, and this process is repeated until there is no trend left in the observed time series.
    2. BEAM uses cross-validation to handle the training and test data. Initially, the entire training data set is broken up into k equal parts. The first part is kept as the holdout (test) set and the remaining k-1 parts are used to train the model. The trained model is then tested on the holdout set. This process is repeated k times, with a different holdout set each time, so every data point gets an equal opportunity to be included in the test set. It is one of the best approaches when input data is limited.
    3. As you said, the default option “Merge observed values” in the scenario settings is meant to do this. If it does not, the problem might lie in how the target cube is set up. Please double-check it.
    4. As soon as you declare and assign a cube to any of the desired output options, BEAM feeds the cube with the expected values. Please note that it is preferable to set up the "output" cubes with the same granularity as the target cube.
    5. Board is able to parse and pick up the most suitable available version to predict the future; in other words, if your target cube is set by month, it chooses the monthly version from the observed cube (if any).

    Please let us know if you have any further questions.

    Thank you and regards.

  • Hello,
    thanks for your interesting and informative comments.

    We found the solution with the help of the Board support.

    In my original model I used a cube structure that contained only a time entity (the month).
    It is always necessary to use at least one non-time entity in your cubes, because otherwise the time intelligence in Board does not work correctly.