Shuhai is a memory benchmarking tool that allows FPGA programmers to demystify all the underlying details of memories, e.g., HBM and DDR4, on a Xilinx...
Benchmark for generative image models
Code for recent knowledge distillation algorithms and benchmark results, using the TF2.0 low-level API
[ECCV2022] New benchmark for evaluating pre-trained models; new supervised contrastive learning framework.
Simple wrapper around iperf3 to measure network bandwidth from all nodes of a Kubernetes cluster
UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic
[ICML 2023] Change is Hard: A Closer Look at Subpopulation Shift
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
🏋️ dbbench is a simple database benchmarking tool which supports several databases and custom scripts
Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation
Source code for EvalNE, a Python library for evaluating Network Embedding methods.
Kaggle dogs vs cats solution in Caffe
The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It speci...
A collection of video object detection papers and code
Benchmarks of popular contract implementations in Solidity
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes recorded by people who are blind/low-vision on a mobile phone. Th...
C++ implementations of data structures, algorithms, and system designs.
Deepmark AI provides a unique testing environment for assessing language models (LLMs) on task-specific metrics and on your own data so your GenAI-powe...
XRAutomatedTests is where you can find functional, graphics, performance, and other types of automated tests for your XR Unity development.
Automated STIG Benchmark Compliance Remediation for RHEL 8 with Ansible
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption
NEVIS'22: Benchmarking the next generation of never-ending learners
A lightweight benchmark for approximate nearest neighbor search
Run unit tests with several test runners, or run benchmarks inside real browsers with Playwright and other JavaScript runtimes.
WebSocket client and server for benchmarks with millions of concurrent connections.
Benchmarking framework for general-purpose zero-knowledge proof languages and libraries
A simple gravitational N-body simulation in less than 100 lines of C code, with CUDA optimizations.
Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"
MQTT stress-testing tool. Supports subscribe and publish stress-test modes, and can simulate a configurable number of client connections.
[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
Challenging Memory-based Deep Reinforcement Learning Agents
Evaluation Framework for Probabilistic Programming Languages
TPC-H queries in Apache Spark SQL using native DataFrames API
SB Curated is a curated dataset of Solidity smart contracts annotated with tagged vulnerabilities. The dataset was created to evaluate the accuracy of...
⚡ Test speed and pings to all DigitalOcean, Linode, AWS, GCP, and Vultr regions
How to store 11kk items in memory? Comparison of methods: array vs object vs SplFixedArray vs pack vs swoole_table vs swoole_pack vs redis vs memsql v...
A High-Quality Photography Portrait Matting Benchmark
🏆 Delightful Benchmarking & Performance Testing
Set of benchmarks for the YJIT CRuby JIT compiler and other Ruby implementations.
A benchmark for role-playing language models
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
Assorted Java classes that make use of sun.misc.Unsafe
Trajectopy - Trajectory Evaluation in Python
[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
Benchmarks for the Unit Commitment Problem
(ACL 2025 Main) A Comprehensive Benchmark for Code Information Retrieval.
🔥 Datasets and env wrappers for offline safe reinforcement learning
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation