Most popular benchmark repositories and open source projects

FewCLUE

FewCLUE: a few-shot learning evaluation benchmark, Chinese version

73   509   509  

Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking pla...

66   503   503  

z-bench

Z-Bench 1.0 by ZhenFund: a Chinese test set for large language models, made by a "Muggle". Z-Bench is a...

44   495   495  

LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-con...

32   495   495  

pcam

The PatchCamelyon (PCam) deep learning classification benchmark.

105   492   492  

Visual-Tracking-Development

Visual Object Tracking

58   489   489  

llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementatio...

56   487   487  

kg-gen

Knowledge Graph Generation from Any Text

57   483   483  

RHEL7-CIS

Automated CIS Benchmark Compliance Remediation for RHEL 7 with Ansible

301   478   478  

glmark2

glmark2 is an OpenGL 2.0 and ES 2.0 benchmark

192   471   471  

awesome-state-of-depth-completion

Current state of supervised and unsupervised depth completion methods

24   469   469  

PaddleFleetX

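PaddlePaddle large-model development kit, covering large language models, cross-modal large models, biocomputing large models, and other fields...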

165   468   468  

web-tooling-benchmark

JavaScript benchmark for common web developer workloads

70   466   466  

tf_to_trt_image_classification

Image classification with NVIDIA TensorRT from TensorFlow models.

154   457   457  

prophiler

PHP Profiler & Developer Toolbar (built for Phalcon)

54   442   442  

sympact

🔥 Stupid Simple CPU/MEM "Profiler" for your JS code.

18   441   441  

benchmarks-of-javascript-package-managers

Benchmarks of JavaScript Package Managers

24   441   441  

LayoutFrameworkBenchmark

Benchmark the performance of various Swift layout frameworks (autolay...

33   438   438  

automlbenchmark

OpenML AutoML Benchmarking Framework

136   428   428  

ChineseBLUE

Chinese Biomedical Language Understanding Evaluation benchmark (Chines...

82   423   423  

gymfc

A universal flight control tuning framework

104   423   423  

BlurTestAndroid

This is a simple App to test some blur algorithms on their visual qual...

64   421   421  

pyaf

PyAF is an Open Source Python library for Automatic Time Series Foreca...

67   414   414  

srs-bench

SB (SRS Bench) is a set of benchmark and regression test tools for SRS...

229   414   414  

oltpbench

Database Benchmarking Framework

264   411   411  

superpixel-benchmark

An extensive evaluation and comparison of 28 state-of-the-art superpix...

111   411   411  

modclean

Remove unwanted files and directories from your node_modules folder

14   406   406  

BenchMARL

A collection of MARL benchmarks based on TorchRL

81   403   403  

mixbench

A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational...

72   401   401  

FedScale

FedScale is a scalable and extensible open-source federated learning (...

122   397   397  

TheAgentCompany

An agent benchmark with tasks in a simulated software company.

54   395   395  

ant-application-security-testing-benchmark

The xAST evaluation system, so that security tools are no longer a "black box". The xAST evaluation benchmark ma...

50   393   393  

cob

Continuous Benchmark for Go Project

25   387   387  

jetson_benchmarks

Jetson Benchmark

77   382   382  

CSS-IN-JS-Benchmarks

58   380   380  

bigcodebench

[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI

47   378   378  

package-benchmark

Swift benchmark runner with many performance metrics and great CI supp...

26   376   376  

KernelBench

KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUD...

38   374   374  

Face-landmarks-detection-benchmark

Face landmarks (fiducial points) detection benchmark

129   372   372  

DynamicMap_Benchmark

The First Dynamic Map Removal Benchmark | Included 8 SOTA methods | Co...

25   367   367  

dance

DANCE: a deep learning library and benchmark platform for single-cell...

38   361   361  

SKAB

SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Ano...

57   360   360  

RGBD-SODsurvey

RGB-D Salient Object Detection: A Survey

34   359   359  

MOSE-api

[ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Compl...

5   359   359  

puck

Puck is a high-performance ANN search engine

41   357   357  

vtebench

Generate benchmarks for terminal emulators

22   357   357  

Awesome_Satellite_Benchmark_Datasets

Supplementary material for our paper "THERE IS NO DATA LIKE MORE DATA"...

28   354   354  

EasyCompressor

⚡An Easy-to-Use and Optimized compression library for .NET that unifi...

20   353   353  

devtools

Inspect and Debug your Tauri applications in style 💃

11   351   351  

are-we-fast-yet

Are We Fast Yet? Comparing Language Implementations with Objects, Clos...

39   351   351