How accurate are prediction markets?

The Brier score is a proper scoring rule that measures the accuracy of probabilistic predictions. Using API data from various prediction market platforms, we can calculate scores for each market and check how often they were correct.

The results: One month before close, 62% of markets were already within 30 percentage points of the correct resolution, which corresponds to a Brier score of 0.09 or better (n=76,941). The median Brier score at market midpoint was 0.0225 (n=443,535).
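To make those numbers concrete, here is a minimal sketch of the Brier score for a single binary forecast. The function name `brier_score` is ours for illustration and is not taken from the site's code; it assumes probabilities on a 0 to 1 scale and outcomes coded as 1 (YES) or 0 (NO).

```python
def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a forecast probability (0 to 1) and the outcome (0 or 1)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 (NO) or 1 (YES)")
    return (probability - outcome) ** 2


# A forecast 30 points away from the resolution scores 0.09.
print(round(brier_score(0.70, 1), 4))

# A forecast 15 points away scores 0.0225.
print(round(brier_score(0.15, 0), 4))
```

Because the error is squared, a Brier score of 0.0225 corresponds to a forecast that was 15 percentage points from the resolution.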

Additionally, we've matched markets across platforms in a curated collection of questions. Using those matches, we can generate scores for each platform that reward markets that are correct, confident, and early.

Those results indicate that each platform's performance is similar on average. However, each platform outperforms the others in certain categories. See the chart below for those scores.

One month before close, most markets are already within 30% of the resolution.

[Chart: percent of markets within 30% of resolution one month before closing (0 to 100), shown for Kalshi, Manifold, Metaculus, and Polymarket]

When graded directly against each other, some platforms perform better in specific categories.

Letter grade and relative score by category (lower scores are better):

  • Culture: Kalshi A- (-0.013), Manifold D (+0.018), Metaculus F (+0.051), Polymarket B (-0.008)
  • Economics: Kalshi B+ (-0.008), Manifold D+ (+0.011), Metaculus C (+0.004), Polymarket A- (-0.011)
  • Politics: Kalshi B (-0.006), Manifold C- (+0.005), Metaculus C (+0.003), Polymarket C+ (-0.001)
  • Science: Kalshi C (+0.004), Manifold C- (+0.008), Metaculus B (-0.007), Polymarket A (-0.025)
  • Sports: Kalshi A- (-0.015), Manifold D+ (+0.013), Metaculus D (+0.019), Polymarket A- (-0.011)
  • Technology: Kalshi A- (-0.014), Manifold C (+0.005), Metaculus C (+0.004), Polymarket C+ (-0.001)

Top Questions

Each card here asks a general question that several prediction market platforms have independently tried to predict. We find and link markets when they resolve in order to judge all platforms on an even playing field. As of 6/4/2025 we have 942 linked markets across 378 unique questions.

A traditional accuracy analysis would look at a single point in time, usually midway through the market. We calculate our absolute scores this way and show them below as the midpoint Brier score.
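As a rough illustration of a midpoint score, the sketch below picks the last recorded probability at or before the halfway point between a market's open and close, then scores it. Both the helper name `midpoint_probability` and this exact definition of "midpoint" are assumptions made for the example; the site may define it differently.

```python
from bisect import bisect_right
from datetime import datetime


def midpoint_probability(history, opened, closed):
    """Return the last recorded probability at or before the market's time midpoint.

    history: list of (timestamp, probability) tuples sorted by timestamp.
    """
    midpoint = opened + (closed - opened) / 2
    timestamps = [ts for ts, _ in history]
    index = bisect_right(timestamps, midpoint) - 1
    if index < 0:
        raise ValueError("no probability recorded before the midpoint")
    return history[index][1]


# Hypothetical market history, for illustration only.
history = [
    (datetime(2024, 1, 1), 0.50),
    (datetime(2024, 2, 1), 0.65),
    (datetime(2024, 3, 1), 0.80),
]
p_mid = midpoint_probability(history, datetime(2024, 1, 1), datetime(2024, 3, 31))
midpoint_brier = (p_mid - 1) ** 2  # this example market resolved YES
print(p_mid, round(midpoint_brier, 4))
```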

However, that form of scoring often misses a lot of important information. When you look at the probability charts below, you can see how each market's prediction changes over time as it responds to news, polls, and other information. In addition, each prediction platform has different rules on what it predicts and how early its markets open, which makes a direct comparison difficult.

In order to address this, we start by scoring every market on every day that it's open. Then we aggregate them into a relative score - grading that market's performance relative to the other linked markets on each day and rewarding those that were correct earliest. Check out the scoring section for details about the system.

  • Resolved YES
  • Category Politics
  • 18,015
  • $1,794,564,130
  • Kalshi
  • 109
  • $262,334,200
Midpoint Brier score: D- (0.1847)
Relative score: A (-0.027)
  • Manifold
  • 462
  • 8,288
  • $750,630
Midpoint Brier score: D- (0.2414)
Relative score: F (+0.039)
  • Metaculus
  • 1,367
  • 1,041
Midpoint Brier score: F (0.3206)
Relative score: A (-0.026)
  • Polymarket
  • 307
  • 8,686
  • $1,531,479,300
Midpoint Brier score: D- (0.1980)
Relative score: A (-0.017)
Overall
Average Midpoint Brier score: D- (0.2362)

Probability History

[Chart: probability (0 to 100) by date, Apr 2023 to Oct 2024]

Source: brier.fyi

  • Resolved YES
  • Category Politics
  • 10,634
  • $65,015,804
  • Kalshi
  • 54
  • $394,297
Midpoint Brier score: B+ (0.0025)
Relative score: C+ (-0.000)
  • Manifold
  • 205
  • 2,456
  • $69,335
Midpoint Brier score: B (0.0043)
Relative score: B (-0.005)
  • Metaculus
  • 887
  • 237
Midpoint Brier score: C- (0.0183)
Relative score: C+ (-0.000)
  • Polymarket
  • 206
  • 7,941
  • $64,552,172
Midpoint Brier score: C (0.0110)
Relative score: C- (+0.009)
Overall
Average Midpoint Brier score: C (0.0090)

Probability History

[Chart: probability (0 to 100) by date, Oct 2022 to Jan 2025]


  • Kalshi
  • 263
  • $1,230,309
Midpoint Brier score: F (0.5776)
Relative score: D (+0.016)
  • Manifold
  • 339
  • 1,136
  • $24,832
Midpoint Brier score: F (0.5339)
Relative score: F (+0.042)
  • Metaculus
  • 1,730
  • 1,148
Midpoint Brier score: F (0.4100)
Relative score: A (-0.016)
  • Polymarket
  • 275
  • 3,399
  • $22,807,236
Midpoint Brier score: F (0.4422)
Relative score: A (-0.038)
Overall
Average Midpoint Brier score: F (0.4909)

Probability History

[Chart: probability (0 to 100) by date, 2021 to 2025]


  • Resolved YES
  • Category Politics
  • 6,965
  • $54,657,350
  • Manifold
  • 216
  • 1,974
  • $94,178
Midpoint Brier score: F (0.9751)
Relative score: D+ (+0.012)
  • Metaculus
  • 773
  • 1,163
Midpoint Brier score: F (0.8152)
Relative score: A- (-0.011)
  • Polymarket
  • 206
  • 3,828
  • $54,563,172
Midpoint Brier score: F (0.9390)
Relative score: C (+0.001)
Overall
Average Midpoint Brier score: F (0.9098)

Probability History

[Chart: probability (0 to 100) by date, Oct 2022 to Jul 2024]



What's a prediction market?

Predicting the future is hard, but it's also incredibly important. Let's say someone starts making predictions about important events. How much should you believe them when they say the world will end tomorrow? What about when they say there's a 70% chance the world will end in 50 years?

Prediction markets are based on a simple concept: If you're confident about something, you can place a bet on it. If someone else disagrees with you, agree on terms with them and whoever wins takes the money. By aggregating the implied odds of these trades, you can gain insight into the wisdom of the crowd.

Imagine a stock exchange, but instead of trading shares, you trade on the likelihood of future events. Each prediction market offers contracts tied to specific events, like elections, economic indicators, or scientific breakthroughs. You can buy or sell these contracts based on your belief about the outcome - if you are very confident about something, or you have specialized information, you can make a lot of money from a market.
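A small worked example may help show why a confident trader stands to profit. It assumes a simple binary contract that pays a fixed amount if the event happens and nothing otherwise, and it ignores fees; the function name `expected_profit` is illustrative only, and real platforms differ in contract design.

```python
def expected_profit(price: float, belief: float, payout: float = 1.0) -> float:
    """Expected profit per YES contract bought at `price`, given your own
    probability estimate `belief` and a contract that pays `payout` on YES."""
    return belief * payout - price


# A market price of 0.60 implies roughly a 60% chance of YES.
# A trader who believes the true chance is 75% expects to gain about 0.15 per contract.
print(round(expected_profit(price=0.60, belief=0.75), 2))
```

Buying in this situation also pushes the price up, which is how the trader's information ends up reflected in the market's probability.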

Markets give participants a financial incentive to be correct, encouraging researchers and skilled forecasters to spend time investigating events. Individuals with insider information or niche skills can profit by trading, which also updates the market's probability. Prediction markets have outperformed polls and revealed insider information, making them a useful tool for information gathering or profit.

Some popular prediction market platforms include Kalshi, Manifold, Metaculus, and Polymarket, the four platforms compared on this page.


How do you calculate the scores?

The traditional way to score predictions is using Brier scores, which measure how far off your prediction was from reality. While these work great for individual predictions, they struggle to compare predictions across different time periods - being 90% confident a month before an event is more impressive than being 90% confident the day before.

To account for this, we use a relative Brier scoring system. For each matched question across platforms, we compare how early each platform reached the correct probability range. Platforms that arrive at accurate predictions earlier receive more points, while those that take longer or never reach accuracy receive fewer points.

As an example, let's look at the probability history for an actual set of markets.

  • Our first step is preprocessing - for every market we average the probability over each day to minimize transient spikes and normalize the data.
  • Next we narrow the range down to the period where at least two markets are open. For some markets we will also override the start or end dates to limit the scoring period.
  • For each day in the scoring range, we calculate the score for each market. In this case we will use the Brier score, but other scores such as log would also work. This is the daily absolute score.
  • We then calculate the median daily score as a baseline for comparison. Markets that do better than this median will have better scores at the end.
  • For each day, we find the difference between each market's daily absolute score and that median.
  • Finally, we sum all of the market's daily differences and divide them by the total number of scoring days. Not all markets are open for the same duration, so this grants better scores to the markets that were open for longer.
  • This gives us a number that can be graded similarly to a Brier score, in the sense that lower is better. However, the scores can now be from -1 to +1 and most will be centered around 0, the median score.
  • In order to more easily evaluate these scores, we can assign them letter grades at certain cutoffs. These are the grades that you see on each question card.

We calculate these scores for all linked markets, since we are confident that they meet our standards for serious markets making real predictions. We can average these scores together to get overall scores for each platform, category, and combination thereof.
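The scoring steps listed above can be sketched roughly as follows. This is a simplified illustration rather than the site's actual implementation: it assumes the probabilities have already been averaged per day, it treats every day with at least two open markets as a scoring day (the manual start and end overrides are left out), and it skips the letter-grade step because the cutoffs are not given here. The function name `relative_scores` and the sample data are hypothetical.

```python
from statistics import median


def relative_scores(daily_probs, outcome):
    """Relative Brier score per market; lower is better.

    daily_probs: {market: {day: probability}}, already averaged per day.
    outcome: 1 if the question resolved YES, 0 if it resolved NO.
    """
    all_days = sorted({day for probs in daily_probs.values() for day in probs})
    # Scoring range: days on which at least two markets are open.
    scoring_days = [
        day for day in all_days
        if sum(day in probs for probs in daily_probs.values()) >= 2
    ]
    if not scoring_days:
        return {}

    totals = {market: 0.0 for market in daily_probs}
    for day in scoring_days:
        # Daily absolute score: the Brier score of each open market.
        daily = {
            market: (probs[day] - outcome) ** 2
            for market, probs in daily_probs.items()
            if day in probs
        }
        baseline = median(daily.values())
        for market, score in daily.items():
            totals[market] += score - baseline

    # Divide by the total number of scoring days, not each market's own days open.
    return {market: total / len(scoring_days) for market, total in totals.items()}


# Hypothetical daily probabilities for three linked markets that resolved YES.
example_probs = {
    "A": {1: 0.6, 2: 0.7, 3: 0.9},
    "B": {2: 0.5, 3: 0.6},
    "C": {1: 0.4, 2: 0.4, 3: 0.5, 4: 0.5},
}
scores = relative_scores(example_probs, outcome=1)
for market, score in sorted(scores.items()):
    print(market, round(score, 4))
```

In this example, market A is closer to the correct answer than the daily median on every scoring day, so it ends with a negative (better) score, while day 4 is ignored because only one market is open.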

Daily Probabilities

[Chart: probability (0 to 100) by date, Apr 2023 to Oct 2024]

Daily Calculated Brier Scores

[Chart: Brier score (0.0 to 0.6, lower is better) by date, Apr 2023 to Oct 2024]

Difference from Median Brier Score

[Chart: score difference (−0.1 to 0.1, lower is better) by date, Oct 2023 to Oct 2024]


Relative score results:

  • Kalshi: A (-0.027)
  • Manifold: F (+0.039)
  • Metaculus: A (-0.026)
  • Polymarket: A (-0.017)

What about calibration?

Accuracy is a good metric, but another lens we can use for analysis is calibration. For a group of markets to be perfectly calibrated, their average resolution values must match their average prediction values.

For example, let's say there are a handful of markets that will be determined by rolling a 6 on a fair six-sided die. We would expect each market to have an average probability of around 17%, and once they resolve we would expect around 17% of them to resolve positively. If both are true, then those markets were well-calibrated. If not, then some of our assumptions were incorrect.

This plot takes all of the prediction and resolution values and shows how closely they match. They should form a straight line from the bottom-left to the top-right - points significantly under or over that line represent systemic errors.
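A calibration check like this can be sketched by grouping forecasts into probability bins and comparing each bin's average prediction with its average resolution. The helper name `calibration_table` and the choice of 20 equal-width bins are assumptions for illustration, not details taken from the site.

```python
from collections import defaultdict


def calibration_table(predictions, resolutions, n_bins=20):
    """Return (mean prediction, mean resolution, count) for each non-empty bin.

    predictions: forecast probabilities between 0 and 1.
    resolutions: matching outcomes, 1 for YES and 0 for NO.
    """
    bins = defaultdict(list)
    for p, r in zip(predictions, resolutions):
        index = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the top bin
        bins[index].append((p, r))

    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_prediction = sum(p for p, _ in pairs) / len(pairs)
        mean_resolution = sum(r for _, r in pairs) / len(pairs)
        table.append((mean_prediction, mean_resolution, len(pairs)))
    return table


# The die example: six markets priced at 1/6, exactly one of which resolves YES.
predictions = [1 / 6] * 6
resolutions = [1, 0, 0, 0, 0, 0]
for mean_prediction, mean_resolution, count in calibration_table(predictions, resolutions):
    print(round(mean_prediction, 3), round(mean_resolution, 3), count)
```

A bin whose two averages match sits on the diagonal of the plot; a bin whose average resolution is well below its average prediction indicates overconfident YES forecasts in that range.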

Calibration Plot

[Chart: average resolution (0 to 100) versus midpoint prediction (0 to 100), shown for Kalshi, Manifold, Metaculus, and Polymarket]

Calibration plot for all platforms, with market probability at midpoint versus average resolution value. Includes all resolved binary and multiple choice markets. n=443,535 markets
