Will xAI have the top LLM on LMArena on March 1, 2025?
  • 72
  • 1,653
  • $1,749,240
Several organizations, including Google and OpenAI, are competing for the top position among large language models (LLMs) ranked on LMSys. Google currently leads with two models, while OpenAI recently announced its "o3" model, which some consider potentially AGI-capable. xAI has released Grok 2.5 and promises an upcoming version of Grok that will surpass human performance on LMSys. The top spot on March 1st, New York time, is expected to go to one of these organizations, and the question resolves on their respective Elo scores. A tie for first place between Google and OpenAI would result in a partial resolution that takes the confidence intervals into account.

Probability History

Chart: probability (0 to 100) over time, 22 Dec to 2 Mar.

Source: brier.fyi

Detailed Scores

Criterion              Grade  Brier/Quadratic  Logarithmic  Spherical
At Market Midpoint     F      0.7396           -1.9661      0.1607
Time-Weighted Average  F      0.3105           -0.8147      0.6221
7 Days Before Close    D+     0.0484           -0.2485      0.9624
Relative Score         A      -0.029           +0.197       +0.021

Criterion              Grade  Brier/Quadratic  Logarithmic  Spherical
At Market Midpoint     F      0.8794           -2.7767      0.0662
Time-Weighted Average  F      0.6608           -1.6763      0.2243
30 Days Before Close   F      0.9314           -3.3549      0.0362
7 Days Before Close    D-     0.1147           -0.4134      0.8901
Relative Score         D      +0.021           -0.078       -0.019

Criterion              Grade  Brier/Quadratic  Logarithmic  Spherical
At Market Midpoint     F      0.8883           -2.8560      0.0609
Time-Weighted Average  F      0.4894           -1.2025      0.3946
30 Days Before Close   F      0.9614           -3.9373      0.0199
7 Days Before Close    D+     0.0267           -0.1785      0.9814
Relative Score         C-     +0.008           -0.138       -0.005

  • Overall
  • 72
  • 1,653
  • $1,749,240
Criterion              Grade  Brier/Quadratic  Logarithmic  Spherical
At Market Midpoint     F      0.8358           -2.5329      0.0959
Time-Weighted Average  F      0.4869           -1.2311      0.4137
30 Days Before Close   F      0.9464           -3.6461      0.0280
7 Days Before Close    D      0.0633           -0.2801      0.9447
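
For readers unfamiliar with the three scoring rules named in the column headers, the following is a minimal Python sketch of their standard binary-outcome definitions. It is an illustration, not the site's actual implementation: the function names and the `p_true` parameter (the probability the market assigned to the outcome that eventually occurred) are assumptions introduced here. Under these definitions, a `p_true` of 0.14 reproduces the first table's At Market Midpoint row (0.7396, -1.9661, 0.1607), which suggests, but does not confirm, that the page uses the same formulas.

```python
import math


def brier_score(p_true: float) -> float:
    """Quadratic score: squared distance between forecast and outcome (lower is better)."""
    return (1.0 - p_true) ** 2


def log_score(p_true: float) -> float:
    """Logarithmic score: natural log of the probability given to the realized outcome
    (closer to 0 is better). Undefined for p_true == 0."""
    return math.log(p_true)


def spherical_score(p_true: float) -> float:
    """Spherical score: probability of the realized outcome divided by the Euclidean
    norm of the two-outcome forecast vector (higher is better)."""
    return p_true / math.hypot(p_true, 1.0 - p_true)


# Hypothetical input: the market gave 14% to the outcome that occurred.
p = 0.14
print([round(s, 4) for s in (brier_score(p), log_score(p), spherical_score(p))])
# prints [0.7396, -1.9661, 0.1607]
```

The three rules rank forecasts differently at the extremes: the logarithmic score penalizes confident misses without bound, while the Brier and spherical scores stay within fixed ranges.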

Similar Questions

  • 1,996
  • $436,675

Kalshi      D   +0.022
Manifold    C+  -0.002
Polymarket  A-  -0.014
  • 1,914
  • $940,331

Kalshi      B   -0.005
Manifold    D   +0.022
Polymarket  B-  -0.002
  • 1,819
  • $912,124

Kalshi      A   -0.018
Manifold    D   +0.017
Polymarket  C   +0.002
  • 1,821
  • $355,680

Kalshi      C-  +0.007
Polymarket  B   -0.007
  • 8
  • $157,163

Kalshi      D   +0.017
Manifold    A   -0.017