Numbers are great, but context is better. We do our best to describe our
results, but no textual description will convey as much information as a
chart or visualization of some kind. Our goal is to let you see the
numerical context and draw your own conclusions.
Below you'll find detailed charts for all markets on Polymarket.
Calibration is a very simple metric at its core. If a market's listed
probability is at 70%, we should expect it to resolve YES about 70% of the
time. For all past markets, we can look at the market's midpoint probability
and compare it to the end result. If those numbers match, we say the
platform is well-calibrated. If they don't, there may be some systemic
reason why forecasters routinely under- or over-estimate the odds.
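As a rough illustration of the comparison described above, here is a minimal sketch that groups past markets into probability bins and compares the average prediction in each bin with the share that resolved YES. The function name, the equal-width binning, and the sample data are assumptions made for this example, not the site's actual pipeline.

```python
from collections import defaultdict

def calibration_table(predictions, outcomes, n_bins=10):
    """Bin predictions (0..1) and compare each bin with its observed YES rate.

    Returns a list of (mean_prediction, yes_rate, sample_size) tuples,
    one per non-empty bin, ordered from lowest to highest bin.
    """
    bins = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        # Clamp so that a prediction of exactly 1.0 lands in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    table = []
    for idx in sorted(bins):
        rows = bins[idx]
        mean_pred = sum(p for p, _ in rows) / len(rows)
        yes_rate = sum(y for _, y in rows) / len(rows)
        table.append((mean_pred, yes_rate, len(rows)))
    return table

# Hypothetical data: ten markets listed at 70%, seven of which resolved YES.
preds = [0.7] * 10
outs = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
table = calibration_table(preds, outs)
```

In this made-up sample the mean prediction and the YES rate in the bin are both about 0.7, which is what a well-calibrated platform would look like at that probability level.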
While calibration is focused on how your predictions match reality, accuracy
is focused on how often you were correct. The main accuracy score we use
here is the Brier score, which compares the prediction (usually at the
market's midpoint) with the resolution at the end of the market. The thing
to remember is that the Brier score measures how far off you are, meaning
that a lower score is actually better!
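To make the "lower is better" point concrete, here is a small sketch of the standard Brier score for binary markets: the mean squared difference between the predicted probability and the outcome (1 for YES, 0 for NO). The function name and sample values are illustrative assumptions.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not predictions or len(predictions) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    total = sum((p - y) ** 2 for p, y in zip(predictions, outcomes))
    return total / len(predictions)

# A confident, correct forecast scores close to 0.
confident = brier_score([0.9], [1])
# Always saying 50% scores 0.25 no matter how the markets resolve.
coin_flip = brier_score([0.5, 0.5], [1, 0])
# A confident, wrong forecast scores close to 1.
wrong = brier_score([0.9], [0])
```

A score of 0 means every prediction was perfectly right, 1 means every prediction was confidently wrong, and 0.25 is what you get by answering 50% on everything.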
Basic Filtering
Select markets to include in the calibration plot and accuracy scores
based on key attributes, such as number of traders, market volume, and
duration. We break these down by the percentile range - for every
million-dollar market there's a thousand others with just a few trades.
Here you can filter out the ones without much activity.
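One way to picture a percentile filter is sketched below. It assumes each market is a dictionary with a numeric attribute such as `volume`, and it defines a market's percentile as the share of markets with a value at or below its own. Both the field names and that percentile definition are assumptions for this example and may differ from how the filters on this page are computed.

```python
from bisect import bisect_right

def percentile_ranks(values):
    """Percentile (0-100) of each value: share of values at or below it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_right(ordered, v) / n for v in values]

def filter_by_percentile(markets, key, low=0.0, high=100.0):
    """Keep markets whose `key` attribute falls in the [low, high] percentile range."""
    if not markets:
        return []
    ranks = percentile_ranks([m[key] for m in markets])
    return [m for m, r in zip(markets, ranks) if low <= r <= high]

# Hypothetical markets: mostly tiny, with one very large outlier.
markets = [
    {"id": "a", "volume": 5},
    {"id": "b", "volume": 10},
    {"id": "c", "volume": 50},
    {"id": "d", "volume": 1_000},
    {"id": "e", "volume": 1_000_000},
]
# Drop the bottom half by volume.
active = filter_by_percentile(markets, "volume", low=50)
```

Filtering by percentile rather than by a fixed dollar amount keeps the cutoff meaningful on platforms where typical volumes differ by orders of magnitude.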
Having trouble deciphering the calibration plots? The main calibration
charts stack all of the platform data points in a column, which is
accurate but can be hard to read. This one breaks them out side by side
and shows an uncertainty range, inspired by JHK Forecasts.
The closer we are to an event, the more accurate we should expect the
forecasts to be. For one, traders will be more likely to invest in the
market due to a shorter time to return, giving the market more liquidity.
Two, we will presumably have more information about the upcoming event,
such as polls, research, and generally reduced uncertainty.
This seems to be mostly backed up by the plot below. Kalshi and Manifold
have a gradually decreasing score (lower is better) starting around 6
months before resolution. Metaculus, which does not rely on liquidity,
shows a similar but smaller change. However, for Polymarket you'll notice
a large uptick instead - this is due to the fact that most of their
markets are very short-term. What you're actually seeing is a very low
sample size for most of those data points, since only 10% of their markets
are longer than 50 days.
Similarly, it is often claimed that a large number of traders or high
volume signals higher expected market accuracy. A proper test of this
hypothesis would need to control for variables that we do not have, but
we can at least see whether our unfiltered samples show such a
correlation. Note that several of these data points are based on a small
number of sample markets, and any with fewer than 10 in the sample were
excluded.
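The grouping and the minimum-sample cutoff can be sketched as follows. The bucket edges, the tuple layout, and the function name are hypothetical and chosen for illustration. Only the rule of dropping groups with fewer than 10 markets comes from the text.

```python
from bisect import bisect_right
from collections import defaultdict

MIN_SAMPLE = 10  # groups with fewer markets than this are excluded

def mean_score_by_bucket(markets, bucket_edges, min_sample=MIN_SAMPLE):
    """Average Brier score per trader-count bucket, skipping small samples.

    `markets` is an iterable of (num_traders, brier_score) pairs.
    `bucket_edges` is a sorted list; bucket i holds counts below edge i
    and at or above edge i-1.
    """
    groups = defaultdict(list)
    for num_traders, score in markets:
        groups[bisect_right(bucket_edges, num_traders)].append(score)
    return {
        bucket: sum(scores) / len(scores)
        for bucket, scores in sorted(groups.items())
        if len(scores) >= min_sample
    }

# Hypothetical sample: 12 small markets and only 3 large ones.
sample = [(5, 0.25)] * 12 + [(500, 0.04)] * 3
result = mean_score_by_bucket(sample, bucket_edges=[10, 100])
```

Here the large-market bucket is dropped because it has only three markets, even though its average score looks better. With so few markets that average would be mostly noise.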
At a glance, you might think the resolution of a market shouldn't affect
the accuracy much. However, there tend to be distinct types of markets on
most platforms (often in the form of "Will X event happen by Y date?"). If
you believe that nothing ever happens, then you might expect that betting
NO a lot will make you pretty accurate. And according to these stats, you
would be right.
What is a market, really? Prediction markets are all similar under the
hood: people trade on different outcomes at different prices, leading to a
consensus probability, whether that's with a limit order book, an automated
market maker, or some other mechanism. However, you can see how flexible
that mechanism is by how differently the platforms use it.
Here, we'll look at some of the specific traits we measure on each market to
see how they compare across platforms.
Histograms of Attributes
How much trade volume do these markets typically see? How many traders
participate on markets on each platform? How long are they typically open?
Each platform tends to focus on certain categories. Some of this is site
culture, some is based on how the platform is designed, and some is just
based on marketing strategy. We try to automatically categorize all
markets into our standard set, but how do the distributions turn out per
platform?
Prediction markets have become more popular over time, but are there more
markets to meet that increased demand? Have there been any particular
spikes in market opens/closes?
Market Open and Close Dates
Markets opened (positive) and closed (negative) over time
Are you looking for charts from the old Calibration City site? We're
working on bringing all of those features over here, but in the meantime
you can access it at https://old.calibration.city. Note that it doesn't get data updates as frequently, so it may be a bit
out of date.
Do you have an idea for a potentially interesting chart or visualization?
Contact us with your idea and we'll credit you if we decide to add it!