Will the highest Elo LLM on Chatbot Arena be non-proprietary during 2024?
  • 250
  • 49
  • $96
Chatbot Arena is a platform where large language models (LLMs) are benchmarked through anonymous conversations with human raters. A model's "Arena Elo" score determines its ranking on the public leaderboard. Recently, Meta released LLaMA 3, which has achieved high scores among comparably sized models and may become the highest-ranking model. For this question to resolve, a non-proprietary LLM must be listed as the highest Arena Elo model by April 24th of 2024 or hold the top ranking until at least 2025. If Chatbot Arena shuts down or changes its name, this will not affect the resolution criteria.

Probability History

[Chart: probability (0 to 100) over time, May 2024 to Jan 2025]

Source: brier.fyi

Detailed Scores

| Criterion | Grade | Brier/Quadratic | Logarithmic | Spherical |
| --- | --- | --- | --- | --- |
| At Market Midpoint | C- | 0.0153 | -0.1321 | 0.9902 |
| Time-Weighted Average | C | 0.0130 | -0.1212 | 0.9918 |
| 30 Days Before Close | C- | 0.0153 | -0.1321 | 0.9902 |
| 7 Days Before Close | A- | 0.0019 | -0.0450 | 0.9989 |
| Relative Score | C- | +0.005 | -0.038 | -0.004 |

| Criterion | Grade | Brier/Quadratic | Logarithmic | Spherical |
| --- | --- | --- | --- | --- |
| At Market Midpoint | B+ | 0.0028 | -0.0547 | 0.9984 |
| Time-Weighted Average | B- | 0.0052 | -0.0747 | 0.9970 |
| 30 Days Before Close | A+ | 0.0001 | -0.0122 | 0.9999 |
| 7 Days Before Close | A+ | 0.0001 | -0.0117 | 0.9999 |
| Relative Score | B | -0.005 | +0.038 | +0.004 |

  • Overall
| Criterion | Grade | Brier/Quadratic | Logarithmic | Spherical |
| --- | --- | --- | --- | --- |
| At Market Midpoint | C | 0.0091 | -0.0934 | 0.9943 |
| Time-Weighted Average | C | 0.0091 | -0.0980 | 0.9944 |
| 30 Days Before Close | C | 0.0077 | -0.0721 | 0.9951 |
| 7 Days Before Close | A | 0.0010 | -0.0284 | 0.9994 |
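
The Brier/Quadratic, Logarithmic, and Spherical columns can be related to a single forecast with a short calculation. The sketch below uses the standard textbook definitions of these three scoring rules for a binary outcome; that the site uses exactly these conventions is an assumption, and the function name `binary_scores` and the input 0.8763 are illustrative only. Under that assumption, a forecast that put roughly 0.8763 on the outcome that occurred gives values close to the first table's At Market Midpoint row.

```python
import math


def binary_scores(q: float) -> tuple[float, float, float]:
    """Score a binary forecast.

    q is the probability the forecast assigned to the outcome that
    actually occurred (an assumed convention, not taken from the source).
    Returns (brier, logarithmic, spherical).
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    # Brier/quadratic: squared error against the realized outcome (lower is better).
    brier = (1.0 - q) ** 2
    # Logarithmic: natural log of the probability given to the realized outcome
    # (closer to 0 is better).
    logarithmic = math.log(q)
    # Spherical: probability of the realized outcome, normalized by the
    # Euclidean length of the forecast vector (q, 1 - q) (closer to 1 is better).
    spherical = q / math.sqrt(q ** 2 + (1.0 - q) ** 2)
    return brier, logarithmic, spherical


brier, logarithmic, spherical = binary_scores(0.8763)
print(f"{brier:.4f} {logarithmic:.4f} {spherical:.4f}")
```

All three rules reward the same thing, putting more probability on what actually happened, but they penalize confident misses differently: the logarithmic score grows without bound as `q` approaches 0, while the Brier and spherical scores stay bounded.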

Similar Questions

  • 1,996
  • $436,675

| Platform | Grade | Score |
| --- | --- | --- |
| Kalshi | D | +0.022 |
| Manifold | C+ | -0.002 |
| Polymarket | A- | -0.014 |
  • 1,914
  • $940,331

| Platform | Grade | Score |
| --- | --- | --- |
| Kalshi | B | -0.005 |
| Manifold | D | +0.022 |
| Polymarket | B- | -0.002 |
  • 1,819
  • $912,124

| Platform | Grade | Score |
| --- | --- | --- |
| Kalshi | A | -0.018 |
| Manifold | D | +0.017 |
| Polymarket | C | +0.002 |
  • 1,821
  • $355,680

| Platform | Grade | Score |
| --- | --- | --- |
| Kalshi | C- | +0.007 |
| Polymarket | B | -0.007 |
  • 8
  • $157,163

| Platform | Grade | Score |
| --- | --- | --- |
| Kalshi | D | +0.017 |
| Manifold | A | -0.017 |