Draft:Humanity's Last Exam
In artificial intelligence, Humanity's Last Exam is a project aiming to create a benchmark for large language models consisting of expert-level questions across a wide range of fields. Organized by the nonprofit Center for AI Safety and the company Scale AI, the project began soliciting question submissions from the academic community and other subject-matter experts in September 2024.[1] Its goal is to measure the progress of AI models toward human expert knowledge and abstract reasoning abilities, going beyond the undergraduate-level difficulty of earlier benchmarks such as MMLU.[2]
Humanity's Last Exam was published on January 23, 2025. The dataset consists of 3,000 challenging questions spanning more than a hundred subjects.[3]
Scope of questions
The project aims to collect several thousand questions across all fields, ranging from mathematics, physics, biology and electrical engineering to analytic philosophy.[1][4] Answers to these questions must be objective and self-contained, and have to be submitted together with the question.[5]
Competition and history of the project
The project was originally announced on September 15, 2024, in a blog post[5] by Dan Hendrycks and Alexandr Wang, and was covered by several news sources. To encourage the submission of questions, a prize pool of $500,000, sponsored by Scale AI, was announced for the top 550 questions submitted by November 1, 2024. Additionally, all authors of questions accepted into the benchmark were offered co-authorship of the resulting publication.[2][6]
Submission of questions was later extended beyond the original November deadline.[7]
References
1. Tharin Pillay (December 24, 2024). "AI Models Are Getting Smarter. New Tests Are Racing to Catch Up". Time. Retrieved January 12, 2025.
2. Jeffrey Dastin and Katie Paul (September 16, 2024). "AI experts ready 'Humanity's Last Exam' to stump powerful tech". Reuters. Retrieved January 12, 2025.
3. Humanity's Last Exam. January 23, 2025. Retrieved February 3, 2025.
4. "Humanity's Last Exam Submission Form". Center for AI Safety. Archived from the original on December 26, 2024. Retrieved January 12, 2025.
5. "Submit Your Toughest Questions for Humanity's Last Exam". Center for AI Safety. September 15, 2024. Retrieved January 12, 2025.
6. Carroll, Mickey (September 18, 2024). "Public asked to help create 'humanity's last exam' to spot when AI achieves peak intelligence". Sky News. Retrieved January 15, 2025.
7. Dan Hendrycks [@DanHendrycks] (November 10, 2024). "As we clean up the dataset, we're accepting questions at agi.safe.ai" (Tweet). Retrieved January 12, 2025 – via Twitter.