Choose Lead Times Based on Theory, Not Just Correlation

1. Abstract

Lead times define how far in advance one variable influences another. Selecting lead times based solely on statistical correlation can result in unrealistic or misleading relationships. Combining correlation analysis with economic and business theory helps ensure models reflect causal, defensible timing.

2. Context

Apply this best practice when building a Workbench, reviewing diagnostics, or refining explanatory indicators that naturally lead business outcomes. This is especially important for macroeconomic, income, price, and demand-related drivers.

3. Content

3.1 Why It Matters

Correlation alone does not imply causation, and it also does not guarantee realistic timing. In time-series analysis, it is common to find statistically strong correlations at implausible lead times, particularly when variables share long-term trends or are influenced by common external forces.

If lead times are chosen without considering theory:

- Models may appear statistically strong but fail to explain real behavior
- Coefficients may be difficult to justify to stakeholders
- Forecasts may break down when conditions change

Choosing lead times that align with known business and economic mechanisms helps ensure that models capture how and when effects actually occur.
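
The point about shared long-term trends can be illustrated with a small sketch. It uses synthetic data only (the series, slopes, and noise levels below are assumptions made up for illustration, not real indicators): two series with independent noise but a common upward trend correlate strongly at every lead tested, including implausibly long ones, while their month-over-month changes show little relationship.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_corr(driver, outcome, lead):
    """Correlate driver at time t with outcome at time t + lead."""
    return pearson(driver[: len(driver) - lead], outcome[lead:])

random.seed(0)
n = 120  # ten years of synthetic monthly observations
leads = (0, 3, 6, 12, 18)

# Two series that share an upward trend; their noise terms are independent,
# so there is no true lead/lag relationship between them.
driver = [0.5 * t + random.gauss(0, 3) for t in range(n)]
outcome = [0.8 * t + random.gauss(0, 5) for t in range(n)]

# Correlation of the raw levels at each lead.
levels = {lead: lagged_corr(driver, outcome, lead) for lead in leads}

# Month-over-month changes remove the shared trend.
d_driver = [b - a for a, b in zip(driver, driver[1:])]
d_outcome = [b - a for a, b in zip(outcome, outcome[1:])]
changes = {lead: lagged_corr(d_driver, d_outcome, lead) for lead in leads}

for lead in leads:
    print(f"lead {lead:>2}: levels r = {levels[lead]:.2f}, changes r = {changes[lead]:.2f}")
```

Because the level correlations stay high regardless of lead, the statistics alone cannot say which lead is right here; that decision has to come from the underlying mechanism.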

3.2 How to Apply

When selecting lead times, use a combination of diagnostics and judgment:

Review the R-value or correlation table for the explanatory indicator across different lead times. There are two ways to do this:
- Use the slider bar beneath the indicator chart.
- Use the Details table, accessible from the three-dot menu in the upper right-hand corner of the chart, to see correlations across all lead times at once.
Identify the lead times with the strongest correlations.
Evaluate plausibility:
- Does the timing align with how this factor realistically affects the outcome?
- Would the business reasonably feel this effect in weeks, months, or quarters?
Select a lead time that balances:
- Strong statistical signal
- Realistic cause-and-effect timing
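
Outside the tool, the steps above can be sketched roughly as follows. This is a minimal illustration, not a description of how the Workbench computes its R-values: the function names `correlation_by_lead` and `pick_lead`, the synthetic series, and the 1-to-6-period plausible window are all assumptions. The sketch builds a correlation-by-lead table, then reports both the strongest lead overall and the strongest lead inside a window the analyst considers realistic.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_by_lead(driver, outcome, max_lead):
    """Return {lead: r}, pairing driver[t] with outcome[t + lead]."""
    table = {}
    for lead in range(max_lead + 1):
        x = driver[: len(driver) - lead]
        y = outcome[lead:]
        table[lead] = pearson(x, y)
    return table

def pick_lead(table, plausible_leads):
    """Return (strongest lead overall, strongest lead within the plausible window)."""
    best_overall = max(table, key=lambda k: abs(table[k]))
    candidates = [k for k in table if k in plausible_leads]
    if not candidates:
        raise ValueError("no tested lead falls inside the plausible window")
    best_plausible = max(candidates, key=lambda k: abs(table[k]))
    return best_overall, best_plausible

# Synthetic example: the outcome is the driver delayed by exactly 3 periods.
driver = [math.sin(t / 5) + 0.3 * math.sin(t / 1.7) for t in range(80)]
outcome = [0.0] * 3 + [2 * v for v in driver[:-3]]

table = correlation_by_lead(driver, outcome, max_lead=12)

# The plausible window comes from business and economic reasoning, not from the data.
plausible = range(1, 7)
best_overall, best_plausible = pick_lead(table, plausible)
print("strongest overall:", best_overall, "| strongest plausible:", best_plausible)
```

When the two leads differ, the gap between their correlations shows how much statistical signal is traded for realistic timing, which is the balance described in the final step.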

3.3 Example

An analyst observes that disposable income shows the highest correlation with retail sales at an 18-month lead. However, economic reasoning suggests consumers typically respond to income changes within a few months. Selecting a 3-month lead produces a slightly lower correlation but results in a far more realistic and explainable model.

3.4 Common Pitfalls

- Automatically choosing the lead with the highest correlation
- Ignoring industry-specific timing (for example, contract cycles or inventory lags)
- Forgetting to revisit lead times after major shocks or structural changes
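
One way to make the last pitfall concrete is to compare the correlation-by-lead profile before and after a suspected break. The sketch below is an assumption-laden illustration on synthetic data: the break position, the lags of 2 and 5 periods, and the helper name `strongest_lead` are all hypothetical. A shift in the strongest lead is a prompt to re-examine the timing against theory, not a rule for choosing the new lead automatically.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def strongest_lead(driver, outcome, max_lead):
    """Lead (0..max_lead) with the largest absolute correlation."""
    scores = {}
    for lead in range(max_lead + 1):
        x = driver[: len(driver) - lead]
        y = outcome[lead:]
        scores[lead] = abs(pearson(x, y))
    return max(scores, key=scores.get)

# Synthetic series: the outcome follows the driver with a 2-period delay
# before the break and a 5-period delay after it.
n, break_at = 120, 60
driver = [math.sin(t / 4) + 0.5 * math.sin(t / 1.3) for t in range(n)]
outcome = []
for t in range(n):
    lag = 2 if t < break_at else 5
    outcome.append(driver[t - lag] if t >= lag else 0.0)

# Scan each sub-period separately and compare the results.
lead_before = strongest_lead(driver[:break_at], outcome[:break_at], max_lead=8)
lead_after = strongest_lead(driver[break_at:], outcome[break_at:], max_lead=8)

if lead_before != lead_after:
    print(f"Strongest lead moved from {lead_before} to {lead_after}; review the timing.")
```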