Recent studies have raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation, and the increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, the authors introduce methodologies for creating condensed maps of the global dynamics of benchmarking.
- Simon Ott
- Adriano Barbosa-Silva
- Matthias Samwald