Topic

benchmark

Repositories (1623)

Knowledge_distillation_via_TF2.0
Knowledge_distillation_via_TF2.0 sseung0703 Python

Code for recent knowledge distillation algorithms, with benchmark results, using the TF2.0 low-level API

110
MTKD-CD
MTKD-CD circleLZY Python

Official implementation for "JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework"

110
Shuhai
Shuhai RC4ML SystemVerilog

Shuhai is a memory-benchmarking tool that allows FPGA programmers to demystify all the underlying details of memories, e.g., HBM and DDR4, on a Xilinx...

110
kubernetes-iperf3
kubernetes-iperf3 Pharb Shell

Simple wrapper around iperf3 to measure network bandwidth from all nodes of a Kubernetes cluster

108
SubpopBench
SubpopBench YyzHarry Python

[ICML 2023] Change is Hard: A Closer Look at Subpopulation Shift

108
marbert
marbert UBC-NLP

UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic

108
OmniBenchmark
OmniBenchmark ZhangYuanhan-AI Python

[ECCV2022] New benchmark for evaluating pre-trained models; new supervised contrastive learning framework.

108
xVerify
xVerify IAAR-Shanghai Python

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

107
dbbench
dbbench sj14 Go

🏋️ dbbench is a simple database benchmarking tool which supports several databases and custom scripts

107
kaggle-dogs-vs-cats-caffe
kaggle-dogs-vs-cats-caffe mrgloom Python

Kaggle dogs vs cats solution in Caffe

106
WorldScore
WorldScore haoyi-duan Python

Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation

106
EvalNE
EvalNE Dru-Mara Python

Source code for EvalNE, a Python library for evaluating Network Embedding methods.

106
OpenS2V-Nexus
OpenS2V-Nexus PKU-YuanGroup Jupyter Notebook

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

105
peaks-consolidation
peaks-consolidation hkpeaks Go

The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It speci...

105
video_object_detection_paper
video_object_detection_paper junliang230

Updates on video object detection papers (a curated collection of video object detection papers and code)

105
gpumembench
gpumembench ekondis C++

A GPU benchmark suite for assessing on-chip GPU memory bandwidth

105
solidity-benchmarks
solidity-benchmarks alephao Solidity

Benchmarks of popular contract implementations in Solidity

105
ORBIT-Dataset
ORBIT-Dataset microsoft Python

The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes recorded by people who are blind/low-vision on a mobile phone. Th...

105
tastylib
tastylib chuyangliu C++

C++ implementations of data structures, algorithms, and system designs.

104
XRAutomatedTests
XRAutomatedTests Unity-Technologies C#

XRAutomatedTests is where you can find functional, graphics, performance, and other types of automated tests for your XR Unity development.

104
deepmark
deepmark IngestAI PHP

Deepmark AI enables a unique testing environment for language models (LLM) assessment on task-specific metrics and on your own data so your GenAI-powe...

104
RHEL8-STIG
RHEL8-STIG ansible-lockdown YAML

Automated STIG Benchmark Compliance Remediation for RHEL 8 with Ansible

103
pytorch-benchmark
pytorch-benchmark LukasHedegaard Python

Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory and energy consumption

103
annbench
annbench matsui528 Python

A lightweight benchmark for approximate nearest neighbor search

102
playwright-test
playwright-test hugomrdias JavaScript

Run unit tests with several test runners or benchmark inside real browsers with playwright and other JavaScript runtimes.

102
dm_nevis
dm_nevis google-deepmind Python

NEVIS'22: Benchmarking the next generation of never-ending learners

102
NoLiMa
NoLiMa adobe-research Python

Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"

101
mini-nbody
mini-nbody harrism C

A simple gravitational N-body simulation in less than 100 lines of C code, with CUDA optimizations.

101
benchmark-websocket
benchmark-websocket oatpp C++

Websocket Client and Server for benchmarks with Millions of concurrent connections.

101
zk-Harness
zk-Harness zkCollective Python

Benchmarking framework for general-purpose zero-knowledge proof languages and libraries

101
FollowBench
FollowBench YJiangcm Python

[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models

100
pplbench
pplbench facebookresearch Python

Evaluation Framework for Probabilistic Programming Languages

100
mqtt-mock
mqtt-mock daoshenzzg Go

MQTT load-testing tool. Supports subscribe and publish load-testing modes, and supports simulating a given number of client connections.

100
endless-memory-gym
endless-memory-gym MarcoMeter Python

Challenging Memory-based Deep Reinforcement Learning Agents

100
tpch-spark
tpch-spark ssavvides C

TPC-H queries in Apache Spark SQL using native DataFrames API

99
smartbugs-curated
smartbugs-curated smartbugs Solidity

SB Curated is a curated dataset of Solidity smart contracts annotated with tagged vulnerabilities. The dataset was created to evaluate the accuracy of...

99
datacenter-speed-tests
datacenter-speed-tests jakejarvis Shell

⚡ Test speed and pings to all DigitalOcean, Linode, AWS, GCP, and Vultr regions

99
php-arrays-in-memory-comparison
php-arrays-in-memory-comparison morozovsk PHP

How to store 11 million (11kk) items in memory? Comparison of methods: array vs object vs SplFixedArray vs pack vs swoole_table vs swoole_pack vs redis vs memsql v...

98
ping_pong_bench
ping_pong_bench IlyaGusev Python

A benchmark for role-playing language models

97
MMC
MMC FuxiaoLiu Python

[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning

97
PPM
PPM ZHKKKe

A High-Quality Photography Portrait Matting Benchmark

97
best
best salesforce TypeScript

🏆 Delightful Benchmarking & Performance Testing

97
yjit-bench
yjit-bench Shopify Ruby

Set of benchmarks for the YJIT CRuby JIT compiler and other Ruby implementations.

97
unsafe
unsafe bramp Java

Assorted Java classes that make use of sun.misc.Unsafe

96
trajectopy
trajectopy gereon-t Python

Trajectopy - Trajectory Evaluation in Python

96
VisualNews-Repository
VisualNews-Repository FuxiaoLiu Jupyter Notebook

[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning

96
mPLUG-HalOwl
mPLUG-HalOwl X-PLUG Python

mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation

95
pglib-uc
pglib-uc power-grid-lib TeX

Benchmarks for the Unit Commitment Problem

95
DSRL
DSRL liuzuxin Python

🔥 Datasets and env wrappers for offline safe reinforcement learning

95
coir
coir CoIR-team Python

(ACL 2025 Main) A Comprehensive Benchmark for Code Information Retrieval.

95