In the next 12 months, will a Large Language Model built by a Chinese organization rank in the top 3 overall on the Chatbot Arena LLM Leaderboard? (Scores for forecasts between Oct 7, 2024 and Jan 23, 2025)

Started Oct 7, 2024 08:00PM
Closed Jan 23, 2025 05:00PM

Topics

Science & Technology, Artificial Intelligence

Tags

Cybersecurity

Seasons

2025 Season

The development of large language models (LLMs) has been a key area of competition among global AI organizations. The Chatbot Arena LLM Leaderboard is a benchmark platform that ranks LLMs using side-by-side comparisons, in which participants vote on which model gives the better response across a range of scenarios (KDnuggets). LLMs developed by organizations such as OpenAI and Anthropic have dominated the top positions on the leaderboard. However, Chinese organizations have advanced significantly in AI research and development, presenting a challenge to the global leaders in this space (CNBC, PYMNTS).

Resolution Criteria:

This question will resolve as "Yes" if, within the next 12 months, an LLM built by a Chinese organization (e.g., Alibaba, 01.AI, Zhipu AI, or others) ranks within the top 3 models on the overall Chatbot Arena LLM Leaderboard. For a model to resolve this question as "Yes", all of the following must hold:

“Overall” must be selected in the leaderboard “Category” dropdown.
The model’s “Rank” must be 1, 2, or 3.
The organization listed under the “Organization” column must be a Chinese organization.

For the purposes of this question, a "Chinese organization" is one that meets at least one of the following criteria:

The organization is headquartered in mainland China.
The majority of the organization’s research, development, and production related to LLMs occurs in mainland China.
At least 50% of the organization is owned or controlled by entities, shareholders, or government bodies based in mainland China.

Organizations that do not meet any of the above criteria but have subsidiary operations in China will not count unless the specific model is developed primarily through their Chinese subsidiary.
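
The resolution logic above can be read as a simple decision rule. The following is a minimal sketch of that rule, not an official resolution tool. Every name in it (`Organization`, `LeaderboardRow`, `is_chinese_organization`, `row_qualifies`, `resolves_yes`, and all field names) is hypothetical, the input values are assumed to come from a manual reading of the leaderboard and of public information about each organization, and "majority" is interpreted here as strictly more than half.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Organization:
    # All fields are hypothetical inputs that a forecaster would fill in by hand.
    name: str
    hq_in_mainland_china: bool = False
    # Fraction (0.0 to 1.0) of LLM research, development, and production
    # that occurs in mainland China.
    llm_work_share_in_mainland_china: float = 0.0
    # Fraction (0.0 to 1.0) owned or controlled by mainland-China-based entities.
    mainland_china_ownership_share: float = 0.0


@dataclass
class LeaderboardRow:
    rank: int
    model: str
    organization: Organization
    category: str = "Overall"
    # Covers the subsidiary clause: the model was developed primarily
    # through the organization's Chinese subsidiary.
    developed_primarily_by_chinese_subsidiary: bool = False


def is_chinese_organization(org: Organization) -> bool:
    """True if the organization meets at least one of the three criteria."""
    return (
        org.hq_in_mainland_china
        or org.llm_work_share_in_mainland_china > 0.5   # "majority" (assumed > 50%)
        or org.mainland_china_ownership_share >= 0.5    # "at least 50%"
    )


def row_qualifies(row: LeaderboardRow) -> bool:
    """True if a single leaderboard row would resolve the question as Yes."""
    if row.category != "Overall" or row.rank not in (1, 2, 3):
        return False
    return (
        is_chinese_organization(row.organization)
        or row.developed_primarily_by_chinese_subsidiary
    )


def resolves_yes(rows: Iterable[LeaderboardRow]) -> bool:
    """True if any row in a leaderboard snapshot qualifies."""
    return any(row_qualifies(row) for row in rows)


# Usage with made-up organizations and models (not real leaderboard data).
snapshot = [
    LeaderboardRow(1, "model-a", Organization("Lab A")),
    LeaderboardRow(2, "model-b", Organization("Lab B", hq_in_mainland_china=True)),
    LeaderboardRow(4, "model-c", Organization("Lab C", mainland_china_ownership_share=0.6)),
]
print(resolves_yes(snapshot))  # True, because model-b is rank 2 and Lab B qualifies
```

Note that the three organization criteria are combined with "or", while the category, rank, and organization conditions are combined with "and". A qualifying organization outside the top 3 (such as the hypothetical Lab C above) does not resolve the question on its own.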