generalization

Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
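As a rough illustration of the task described above, here is a minimal Python sketch. It is not taken from the lechmazur/generalization repository. The `ThemeItem` fields, the `accuracy` function, and the top-1 scoring rule are all hypothetical names and assumptions chosen for illustration; the actual benchmark may store its items and score model answers differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ThemeItem:
    """One hypothetical benchmark item (field names are assumptions)."""
    examples: List[str]       # items that fit the hidden theme
    anti_examples: List[str]  # near misses that do NOT fit the theme
    candidates: List[str]     # one true match mixed with distractors
    answer_index: int         # position of the true match in candidates


def accuracy(items: List[ThemeItem], pick: Callable[[ThemeItem], int]) -> float:
    """Fraction of items for which pick(item) returns the true index."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if pick(item) == item.answer_index)
    return correct / len(items)


def first_candidate_baseline(item: ThemeItem) -> int:
    """Trivial stand-in for a model: always choose the first candidate."""
    return 0


# Two made-up items, for demonstration only.
ITEMS = [
    ThemeItem(
        examples=["violin", "cello"],
        anti_examples=["piano", "flute"],
        candidates=["viola", "trumpet", "drum"],
        answer_index=0,
    ),
    ThemeItem(
        examples=["Mercury", "Venus"],
        anti_examples=["Jupiter", "Saturn"],
        candidates=["Neptune", "Uranus", "Mars"],
        answer_index=2,
    ),
]

if __name__ == "__main__":
    print(accuracy(ITEMS, first_candidate_baseline))
```

In a real evaluation, `pick` would wrap a call to the LLM under test; the baseline here only shows the shape of the interface.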

benchmark

70 Stars

2 Forks

70 Watchers

100 SrcLog Score

Cost to Build

$2.63M

Market Value

$6.54M

Growth over time (stars, forks, watchers): 4 data points · 2025-06-06 → 2026-04-26

How to clone generalization

Clone via HTTPS

git clone https://github.com/lechmazur/generalization.git

Clone via SSH

git clone git@github.com:lechmazur/generalization.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the generalization issue tracker on GitHub.

Similar to generalization

netdata fashion-mnist FrameworkBenchmarks BenchmarkDotNet jmeter awesome-semantic-segmentation sysbench hyperfine tsung benchmark_results across web-frameworks php-framework-benchmark jsperf.com go-web-framework-benchmark huststore phoronix-test-suite Attabench ann-benchmarks sbt-jmh caffenet-benchmark chillout IocPerformance prophiler TBCF NBench sympact awesome-http-benchmark BlurTestAndroid pytest-benchmark