tsm

About:

No Scores Yet

Relative Brier Score
138000
Questions Forecasted

0

Forecasts

0

Upvotes
Forecasting Activity
Forecasting Calendar
No forecasts in the past 3 months
 

                                    Past Week   Past Month   Past Year   This Season   All Time
Forecasts                           0           0            17          0             17
Comments                            0           0            1           0             1
Questions Forecasted                0           0            7           0             7
Upvotes on Comments By This User    0           0            0           0             0
New Comment

The competitiveness of the mainland's LLMs cannot be estimated by watching these rankings. Researchers should discourage the interpretation or assumption that these leaderboards are more than entertainment products, a way of keeping score in an imagined Sino-American rivalry, akin to fantasy football.

Reason one is power consumption. A million tokens from a high-quality LLM with the best answers might cost so much that it only pays to run it where cooling and electricity remain very cheap or free. CPUs offer an analogy, although a CPU is not a piece of software: certain vendors' CPUs are more energy-hungry than the competition, yet those same power-inefficient CPUs are designed for maximum compute power and reach the top ranks. The rankings seldom relate performance to the effort or cost expended, such as power consumption.

Reason two is the upfront cost of the hardware that an LLM requires. High-quality LLMs currently require large amounts of VRAM, unusually large power supply units, and unusual cabling and cooling systems. The purchase alone has become a problem: many AI hobbyists and startups are waiting for datacenters to sell off their last-generation hardware, such as H100 accelerators. An LLM that produces reasonable results on old but cheaper hardware is more desirable than the latest LLM that requires the latest hardware for the best results. The leaderboard doesn't capture this reality.

Reason three is the absence of a productive use case and application. Not all tasks require the same kind of all-purpose LLM. Many task-specific LLMs won't produce good answers to many types of questions, but might be excellent in a narrowly defined application. China's English-language Tongyi (Qwen2.5) is said to be the programmers' favorite because of its reasonable or excellent results in mathematics and programming at lower hardware requirements, despite flaws in other tasks. The leaderboard assumes excellence as a generalist, while the need for computer software is usually specialized.

Reason four is that some of the mainland's LLMs are not taking part, including some by major corporations and LLMs that are not designed for an English-first audience. Only Tongyi (aka Qwen) was explicitly made for English. It has also been widely shared and widely employed, even though the leaderboard doesn't capture its popularity among developers of LLMs.

Reason five is trivial, though: the mainland reads and writes Chinese, while this leaderboard and its audience do not, nor do they read German or French. What the Chinese deem intelligent responses or capacities may not be shared by Americans, for example wit, demeanor, use of culturally specific sayings, attempts to negotiate, or presentation.

In short, this is a flawed attempt to estimate the progress of LLMs outside Silicon Valley, and even more flawed if it is meant to estimate the effectiveness of sabotage.

New Prediction
tsm
made their 5th forecast (view all):
Probability
Answer
5% (0%)
Kharkiv
1% (0%)
Kyiv
0% (0%)
Odesa
Confirmed previous forecast
New Prediction
Confirmed previous forecast
New Prediction
Confirmed previous forecast
New Prediction
tsm
made their 2nd forecast (view all):
Probability
Answer
0% (0%)
Estonia
0% (0%)
Latvia
0% (0%)
Lithuania
Confirmed previous forecast
New Prediction
tsm
made their 2nd forecast (view all):
Probability
Answer
0% (0%)
Moldova
0% (0%)
Armenia
0% (0%)
Georgia
0% (0%)
Kazakhstan
Confirmed previous forecast
New Prediction
Confirmed previous forecast
New Badge
tsm
earned a new badge:

Active Forecaster

New Prediction
tsm
made their 2nd forecast (view all):
This forecast expired on Jan 17, 2025 02:24AM
Probability
Answer
Forecast Window
0% (0%)
Yes
Dec 17, 2024 to Jun 17, 2025
100% (0%)
No
Dec 17, 2024 to Jun 17, 2025
Confirmed previous forecast
New Prediction
tsm
made their 4th forecast (view all):
Probability
Answer
5% (+3%)
Kharkiv
1% (0%)
Kyiv
0% (0%)
Odesa

That doesn't mean the cities' infrastructure and factories are safe or protected.
