Kalshi First Research Report: When Predicting CPI, Crowd Wisdom Beats Wall Street Analysts

By: blockbeats|2025/12/24 16:30:03

Original Title: Beyond Consensus: Prediction Markets and the Forecasting of Inflation Shocks

Original Source: Kalshi Research

Original Translation: Azuma, Odaily Planet Daily

Editor's Note: The leading prediction market platform Kalshi announced yesterday the launch of a new research report series, Kalshi Research, aimed at providing Kalshi's internal data to scholars and researchers interested in prediction market-related topics. The first research report of this series has been released, and the following is the original content of the report, translated by Odaily Planet Daily:

Overview

Typically, in the week leading up to the release of important economic data, analysts from large financial institutions and senior economists provide estimates of the expected values. These forecasts, when aggregated, are referred to as "consensus expectations" and are widely regarded as a key reference for insight into market changes and position adjustments.

In this research report, we compare consensus expectations with the pricing implied by Kalshi's prediction markets (referred to below as "market forecasts") in predicting the actual value of the same core macroeconomic signal: the year-over-year headline inflation rate (YOY CPI).

Key Highlights

· Superior Overall Accuracy: Across all market environments (both normal and shock environments), the mean absolute error (MAE) of Kalshi's forecasts is 40.1% lower than that of consensus expectations.

· "Shock Alpha": During major shocks (greater than 0.2 percentage points), Kalshi's forecasts have a 50% lower MAE than consensus expectations at the one-week forecast window, and the gap widens to 60% on the day before the data release. During moderate shocks (between 0.1 and 0.2 percentage points), Kalshi's forecasts likewise have a 50% lower MAE than consensus expectations at the one-week forecast window, widening to 56.2% on the day before the data release.

· Predictive Signal: When market forecasts deviate from consensus expectations by more than 0.1 percentage points, the probability that a shock occurs is about 81.2%, rising to around 82.4% on the day before the data release. When market forecasts and consensus expectations disagree, the market forecast is the more accurate of the two in 75% of cases.

Background

Macroeconomic forecasters face an inherent challenge: the moments that matter most (when markets are in turmoil, policies pivot, and structural breaks occur) are precisely the moments when historical models are most likely to fail. Financial market participants typically publish consensus forecasts a few days before key economic data releases, aggregating expert opinions into a single market expectation. These consensus views are valuable, but the forecasts behind them often share similar methodologies and information sources.

For institutional investors, risk managers, and policymakers, the stakes of forecast accuracy are asymmetric. In calm periods, a slightly better forecast offers only limited value; but in periods of market disarray (when volatility spikes, correlations break down, or historical relationships fail), superior accuracy can deliver significant alpha and mitigate drawdowns.

Therefore, understanding how forecasts behave during market turbulence is crucial. We focus on a key macroeconomic indicator, the year-over-year Consumer Price Index (YOY CPI), which is a core reference for future rate decisions and a vital signal of economic health.

We compare and evaluate forecast accuracy across multiple time windows ahead of the official data release. Our key finding is that "Shock Alpha" does exist: in tail events, market-based forecasts achieve additional predictive precision relative to the consensus benchmark. This outperformance is not merely of academic interest; it significantly improves signal quality at precisely the moments when prediction errors carry the highest economic costs. In this context, the truly important question is not whether markets are "always right" in forecasting, but whether they provide a differentiated signal worth including in a traditional decision-making framework.

Methodology

Data

We analyze daily implied forecasts from prediction markets on the Kalshi platform, covering three time points: one week before data release (aligned with consensus expectation publication), the day before release, and the morning of the release day. Each market used is or has been a real tradable market, reflecting real money positions at various liquidity levels. For consensus expectations, we collect institution-level YoY CPI consensus forecasts, typically released around a week before the official data from the U.S. Bureau of Labor Statistics.

The sample period ranges from February 2023 to mid-2025, covering over 25 monthly CPI release cycles across various macroeconomic environments.

Shock Classification

We classify events into three categories based on the "surprise magnitude" relative to historical levels. A "shock" is defined by the absolute difference between the consensus expectation and the actual reported figure:

· Normal Event: The consensus forecast error for YOY CPI is less than 0.1 percentage points;

· Moderate Shock: The consensus forecast error for YOY CPI is between 0.1 and 0.2 percentage points;

· Major Shock: The consensus forecast error for YOY CPI exceeds 0.2 percentage points.

This classification allows us to examine whether forecast accuracy exhibits systematic differences as forecast difficulty varies.
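
The three categories above can be sketched as a small function. This is a minimal illustration, not the report's own code: the function name `classify_shock` is hypothetical, both figures are assumed to be quoted to one decimal place, and because the report does not say how the 0.1 and 0.2 boundaries are treated, the moderate band is assumed to include both endpoints.

```python
def classify_shock(consensus: float, actual: float) -> str:
    """Label a CPI release by the size of the consensus miss (percentage points)."""
    # Assumption: both figures are quoted to one decimal place, so the gap is
    # rounded to one decimal to avoid float noise such as 0.10000000000000009.
    surprise = round(abs(actual - consensus), 1)
    if surprise < 0.1:
        return "normal"
    # Assumption: "between 0.1 and 0.2" is read as inclusive of both endpoints.
    if surprise <= 0.2:
        return "moderate"
    return "major"
```

For example, a consensus of 3.0 against an actual print of 3.3 would be labeled a major shock under these assumptions.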

Performance Metrics

To assess predictive performance, we utilize the following metrics:

· Mean Absolute Error (MAE): Primary accuracy metric calculated as the average of the absolute differences between predicted values and actual values.

· Win Rate: When the difference between consensus expectations and market forecasts reaches or exceeds 0.1 percentage point (rounded to one decimal place), we record which prediction is closer to the final actual result.

· Forecast Horizon Analysis: We track how the accuracy of market estimates evolves from one week before release to the release day, revealing the value of ongoing information incorporation.
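
The first two metrics above might be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are parallel lists of forecasts and actual prints in percent, and a release where both forecasts are equally far from the actual figure is counted as a disagreement but not as a market win (the report does not specify how such cases are handled).

```python
from typing import Optional, Sequence


def mae(forecasts: Sequence[float], actuals: Sequence[float]) -> float:
    """Mean absolute error between forecasts and realized values."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def win_rate(market: Sequence[float], consensus: Sequence[float],
             actuals: Sequence[float], threshold: float = 0.1) -> Optional[float]:
    """Share of disagreements in which the market forecast is strictly closer."""
    wins = 0
    disagreements = 0
    for m, c, a in zip(market, consensus, actuals):
        # Only count releases where the two forecasts differ by at least the
        # threshold after rounding to one decimal place.
        if round(abs(m - c), 1) < threshold:
            continue
        disagreements += 1
        if abs(m - a) < abs(c - a):
            wins += 1
    # None signals that the two forecasts never disagreed in the sample.
    return wins / disagreements if disagreements else None
```

Running the same two functions on forecasts taken one week ahead, one day ahead, and on the release morning would give the horizon comparison described in the third bullet.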

Results: CPI Prediction Performance

Overall Market-Based Predictions Hold an Edge

Across all market conditions, market-based CPI forecasts have a lower mean absolute error (MAE) than consensus forecasts. The reduction ranges from 40.1% one week ahead to 42.3% one day ahead.
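
A percentage such as "40.1% lower MAE" is a relative reduction of one error against another. A minimal sketch of that arithmetic follows; the function name `mae_reduction` is hypothetical and the report does not publish the underlying MAE levels.

```python
def mae_reduction(mae_market: float, mae_consensus: float) -> float:
    """Fraction by which the market MAE is lower than the consensus MAE."""
    if mae_consensus <= 0:
        raise ValueError("consensus MAE must be positive")
    # A positive result means the market error is smaller than the consensus error.
    return 1.0 - mae_market / mae_consensus
```

With purely illustrative inputs, a market MAE of 0.06 against a consensus MAE of 0.10 would correspond to a reduction of about 40%.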

Furthermore, in cases where consensus expectations and market-implied values disagree, Kalshi's market-based forecasts show a statistically significant win rate ranging from 75.0% one week ahead to 81.2% on the release day. When ties with the consensus (after rounding to one decimal place) are included, market-based forecasts match or outperform the consensus in approximately 85% of cases one week ahead.

This high directional accuracy indicates that when market forecasts and consensus expectations diverge, the divergence itself carries significant information about the likelihood of a shock.

The Existence of "Shock Alpha"

The difference in forecast accuracy is particularly pronounced during shock events. In moderate shocks, the market's mean absolute error (MAE) is 50% lower than the consensus over a comparable time window (one week ahead), with the gap widening to 56.2% or more on the day before data publication. In major shocks, the market's MAE is likewise 50% lower than the consensus over a comparable time window, reaching 60% or more on the day before data publication. In a normal environment without a shock, the market's performance is roughly equivalent to the consensus.

Although the sample of shock events is small (which is reasonable, since shocks are by nature highly unpredictable), the overall pattern is very clear: when the forecasting environment is most challenging, the market's information aggregation advantage becomes most valuable.

More important still, Kalshi's forecasts do not merely perform better during shock periods: the divergence between market forecasts and consensus expectations may itself be a signal of an impending shock. In cases of divergence, the market forecast outperforms the consensus with a win rate of 75% (over a comparable time window). Threshold analysis further indicates that when the market deviates from the consensus by more than 0.1 percentage points, the probability of a shock is approximately 81.2%, rising to around 84.2% on the day before data publication.

This practically significant difference indicates that prediction markets can serve not only as a competitive forecasting tool alongside consensus expectations, but also as a "meta-signal" of forecast uncertainty, turning the market's divergence from the consensus into a quantifiable early warning indicator of potential surprises.
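
The threshold analysis described above amounts to a conditional frequency: among releases where the market and the consensus diverge by more than 0.1 percentage points, how often did a shock follow? A minimal sketch, assuming a hypothetical function name, parallel lists in percent, and a shock defined as a consensus miss of at least 0.1 percentage points (moderate or major):

```python
from typing import Optional, Sequence


def shock_rate_given_divergence(market: Sequence[float],
                                consensus: Sequence[float],
                                actuals: Sequence[float],
                                divergence: float = 0.1,
                                shock: float = 0.1) -> Optional[float]:
    """Share of divergent releases in which the consensus missed by >= `shock`."""
    flagged = 0
    shocks = 0
    for m, c, a in zip(market, consensus, actuals):
        # Rounding to two decimals keeps float noise from pushing a 0.1 gap
        # just above the "more than 0.1" cutoff.
        if round(abs(m - c), 2) <= divergence:
            continue
        flagged += 1
        if round(abs(a - c), 1) >= shock:
            shocks += 1
    # None signals that no release crossed the divergence threshold.
    return shocks / flagged if flagged else None
```

The design choice here is to condition only on information available before the release (the market-consensus gap) and to score it against what was observed afterward, which is what makes the gap usable as an early indicator.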

Further Discussion

An obvious question arises: why do market forecasts outperform consensus forecasts during shock periods? We propose three complementary mechanisms to explain this phenomenon.

Market Participant Heterogeneity and "Wisdom of Crowds"

Although traditional consensus expectations integrate views from multiple institutions, they often share similar methodological assumptions and information sources. Econometric models, Wall Street research reports, and government data releases form a highly overlapping common knowledge base.

In contrast, prediction markets aggregate positions held by participants with different information bases, including proprietary models, industry-specific insights, alternative data sources, and experience-based intuition. This diversity of participants has a strong theoretical basis in the "wisdom of crowds" theory, which holds that when participants have relevant information and their prediction errors are not fully correlated, aggregating independent predictions from diverse sources often yields superior estimates.
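
The role of error correlation can be made concrete with a standard textbook result that the report itself does not state: if n forecasters each have error variance sigma squared and every pair of errors has correlation rho, the error variance of their simple average is sigma squared times (1 + (n - 1) * rho) / n. The sketch below encodes that formula; the function name is hypothetical and the equal-variance, equal-correlation setup is a simplifying assumption.

```python
def crowd_error_variance(sigma: float, n: int, rho: float) -> float:
    """Error variance of the simple average of n equally noisy forecasts."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Assumption: every forecaster has variance sigma**2 and every pair of
    # errors shares the same correlation rho.
    return sigma ** 2 * (1 + (n - 1) * rho) / n
```

With rho equal to 1 (everyone makes the same mistake) averaging does not reduce the error at all, while with rho equal to 0 the variance falls to sigma squared divided by n. Highly overlapping methods and sources push a consensus toward the first case.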

In the macro environment, the value of this information diversity is particularly prominent during a regime shift: as individuals holding scattered, local information interact in the market, their informational fragments combine into a collective signal.

Participant Incentive Structure Differences

At the institutional level, consensus forecasters often operate within a complex organizational and reputational system that systematically diverges from the goal of purely maximizing forecast accuracy. The career risks forecasters face create an asymmetric reward structure: a large forecasting error carries notable reputational costs, while a highly accurate forecast, especially one that deviates significantly from peer consensus, does not necessarily earn a proportionate professional reward.

This asymmetry triggers "herding behavior," in which forecasters cluster their forecasts around the consensus value even when their private information or model output implies a different result. The reason is that, within the professional system, the cost of being wrong alone is often higher than the benefit of being right alone.

In sharp contrast, the incentive mechanism faced by prediction market participants achieves a direct alignment between forecast accuracy and economic outcomes—forecasting accurately means profit, forecasting incorrectly means loss. In this system, the reputation factor is almost non-existent, and the sole cost of deviating from market consensus is economic loss, entirely dependent on whether the forecast is correct. This structure imposes stronger selection pressure for forecast accuracy—participants who can systematically identify consensus forecasting errors will continuously accrue capital and enhance their market influence through larger position sizes; while those who mechanically follow the consensus will continue to suffer losses when the consensus is proven wrong.

In periods of significantly increased uncertainty, when the professional cost of institutional forecasters deviating from the expert consensus reaches its peak, this differentiation in incentive structure is often most pronounced and economically significant.

Information Aggregation Efficiency

A notable empirical fact is that even one week before data publication, a time point that aligns with the typical release window for consensus expectations, market forecasts still exhibit a significant accuracy advantage. This suggests that the market's advantage does not stem solely from the faster information acquisition usually attributed to market participants.

Rather, market forecasts may more efficiently aggregate information that is too dispersed, too specialized, or too vague to be formally incorporated into traditional econometric forecasting frameworks. The relative advantage of market forecasts may lie not in earlier access to public information but in their ability to synthesize heterogeneous information more effectively on the same time scale, whereas survey-based consensus mechanisms, even over the same time window, often struggle to process this information efficiently.

Limitations and Considerations

Our findings come with an important caveat. Because the overall sample covers only about 30 months and major shocks are rare by definition, statistical power remains limited for larger tail events. A longer time series would strengthen future inference, although the current results already strongly suggest the market's forecasting superiority and the distinctiveness of its signal.

Conclusion

We document significant and economically meaningful outperformance of market-based forecasts relative to expert consensus expectations, especially in the periods when forecast accuracy matters most. Market-based CPI forecasts have an overall error approximately 40% lower, with error reductions of up to around 60% during periods of major structural shifts.

Given these findings, several areas for future research become especially important: first, studying whether "Shock Alpha" events themselves can be forecast using volatility and forecast-divergence indicators across a larger sample spanning multiple macroeconomic indicators; second, determining the liquidity threshold above which prediction markets consistently outperform traditional forecasting methods; and third, exploring the relationship between prediction market forecasts and those implied by financial instruments that trade at high frequency.

In an environment where consensus forecasts rely heavily on strongly correlated model assumptions and shared information sets, prediction markets offer an alternative information aggregation mechanism that can detect regime changes earlier and process heterogeneous information more efficiently. For entities that must make decisions in an economy characterized by rising structural uncertainty and more frequent tail events, "Shock Alpha" may represent not just an incremental improvement in predictive ability but a fundamental component of robust risk management infrastructure.
