Confirmed previous forecast
0.242244
Relative Brier Score
Questions Forecasted
Scored Questions
15
Forecasts
1
Upvotes
Forecasting Activity
Forecasting Calendar
 | Past Week | Past Month | Past Year | This Season | All Time |
---|---|---|---|---|---|
Forecasts | 0 | 7 | 47 | 15 | 63 |
Comments | 0 | 2 | 4 | 4 | 5 |
Questions Forecasted | 0 | 7 | 11 | 8 | 17 |
Upvotes on Comments By This User | 0 | 0 | 1 | 1 | 2 |

New Prediction
Probability
Answer
1%
(0%)
Yes
99%
(0%)
No

New Prediction
Probability
Answer
10%
(-75%)
Yes
90%
(+75%)
No
Why do you think you're right?
Not enough time before May 31st; however, I do expect it to become the most valuable by the end of the year.
Why might you be wrong?
An insanely capable reasoning model is released, making it painfully clear to everybody that the world needs more compute.

New Prediction
Probability
Answer
0%
(0%)
Estonia
0%
(0%)
Latvia
0%
(0%)
Lithuania
Confirmed previous forecast

New Prediction
Probability
Answer
1%
(0%)
Moldova
1%
(0%)
Armenia
1%
(0%)
Georgia
1%
(0%)
Kazakhstan
Confirmed previous forecast

New Prediction
Probability
Answer
Forecast Window
4%
(0%)
Yes
Feb 28, 2025 to Aug 28, 2026
96%
(0%)
No
Feb 28, 2025 to Aug 28, 2026
Confirmed previous forecast

New Prediction
Probability
Answer
Forecast Window
1%
(0%)
Yes
Feb 28, 2025 to Aug 28, 2025
99%
(0%)
No
Feb 28, 2025 to Aug 28, 2025
Confirmed previous forecast

New Prediction
This forecast expired on Feb 24, 2025 06:00AM
Probability
Answer
Forecast Window
92%
(+37%)
Yes
Jan 24, 2025 to Jan 24, 2026
8%
(-37%)
No
Jan 24, 2025 to Jan 24, 2026
Why do you think you're right?
I've become better acquainted with the DeepSeek R1 paper since my previous forecast.
Why might you be wrong?
My previous reason for potentially being wrong stands: reasoning abilities may not translate into high scores on the Chatbot Arena. Still, I am fairly confident they will in this case.
I had similar concerns about reasoning models, but it seems that the first one to enter the Arena (Gemini 2.0 Flash Thinking Experimental) immediately topped the leaderboard with a high score of 1380.
Now o3-mini is expected to be released in "~a couple of weeks" (from Jan 17): https://x.com/sama/status/1880356297985638649

New Prediction
This forecast expired on Feb 20, 2025 07:58PM
Probability
Answer
Forecast Window
55%
(+41%)
Yes
Jan 20, 2025 to Jan 20, 2026
45%
(-41%)
No
Jan 20, 2025 to Jan 20, 2026
Why do you think you're right?
DeepSeek R1 is terrifying(ly good)
Why might you be wrong?
Reasoning abilities may not translate into high scores on the Chatbot Arena.
It's subtle but it matters: "on", not "in". I've tested this question on scores and scores of people over the years, and easily >95% have no idea what happened there. All they have in mind is tanks on the square crushing innocent student protesters. Virtually no one is aware of the extremely violent riots, in which police officers and soldiers were killed, that happened a day before and continued on that day.
Why do you think you're right?
I'm anchoring to the current situation and updating from there on plausible future developments (e.g. a peace deal being signed, with Ukraine largely retaining its territory yet losing resources).
Among the possible options, a fluctuation back into the 30-39% range is more plausible than a drop to under 19%.
Why might you be wrong?
I didn't spend enough time researching.