vincentkoc / tiny_qa_benchmark_pp (Stars: 1)
Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing and regression-catching.
Topics: benchmark, evaluation, dataset, smoke-test, synthetic-data, qa-dataset, huggingface-datasets, llm, llmops, litellm, llm-testing, tinybenchmarks
Updated May 19, 2025. Language: Python
Leftinant / tiny_qa_benchmark_pp (Stars: 0)
Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing and regression-catching.
Topics: benchmark, evaluation, dataset, synthetic-data, qa-dataset, huggingface-datasets, llm, llmops, litellm, llm-testing, tinybenchmarks
Updated May 19, 2025