TSIA - Learning - Specific Topics in Artificial Intelligence (Machine Learning Operations, Software Engineering for Artificial Intelligence) 2024

BenchArk - An efficient and robust benchmarking suite for AI

Submission summary

Numerical evaluation of novel methods, a.k.a. benchmarking, is a pillar of the scientific method in machine learning.
However, due to practical and statistical obstacles, the reproducibility of published results is currently insufficient: many details can invalidate numerical comparisons, from insufficient uncertainty quantification to improper methodology.
In 2022, the Benchopt initiative (https://benchopt.github.io) provided an open-source Python package together with a framework to seamlessly run, reuse, share, and publish benchmarks in numerical optimization.
In this project, we aim to bring Benchopt to the whole machine learning community, making it a new standard in benchmarking by empowering researchers and practitioners with efficient and valid benchmarking methods.
Our goal is to ensure reproducibility and consistency in model evaluation.
We will bring the machine learning community together to develop informative and statistically valid benchmarks, and we will provide methods that lower the identified barriers to adopting such practices.
The results of the project will be integrated into the open-source Benchopt library.

Project coordination

Thomas MOREAU (Centre Inria de Saclay)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

Inria Saclay - MIND/SODA: Centre Inria de Saclay
IMAG: Institut Montpelliérain Alexander Grothendieck
OCKHAM: Optimisation, Connaissances pHysiques, Algorithmes et Modèles

ANR funding: 588,611 euros
Start and duration of the scientific project: September 2024, 48 months
