    Repositories list

    • folktexts

      Public
      Get classification risk scores on tabular tasks using LLMs
      Jupyter Notebook
      MIT License
      01600 Updated Jan 17, 2025
    • Code to reproduce the paper "Questioning the Survey Responses of Large Language Models"
      Jupyter Notebook
      MIT License
      1700 Updated Dec 8, 2024
    • Code to reproduce the experiments in the paper "Training on the Test Task Confounds Evaluation and Emergence"
      Jupyter Notebook
      1900 Updated Dec 3, 2024
    • Jupyter Notebook
      MIT License
      0000 Updated Dec 2, 2024
    • Code to reproduce the paper "Do causal predictors generalize better to new domains?"
      Python
      Other
      7700 Updated Oct 23, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2k100 Updated Sep 20, 2024
    • lawma

      Public
      Lawma: A lightly fine-tuned Llama model for legal classification tasks.
      Jupyter Notebook
      01600 Updated Sep 14, 2024
    • BenchBench is a Python package to evaluate multi-task benchmarks.
      Python
      MIT License
      11300 Updated Jul 18, 2024
    • Datasets derived from US census data
      Python
      MIT License
      1824753 Updated May 15, 2024
    • Achieve error-rate fairness between societal groups for any score-based classifier.
      Python
      MIT License
      41501 Updated Apr 26, 2024
    • tttlm

      Public
      Test-time-training on nearest neighbors for large language models
      Python
      MIT License
      53700 Updated Apr 18, 2024
    • Code for "Is your model predicting the past?"
      Jupyter Notebook
      MIT License
      0100 Updated Mar 10, 2024
    • whynot

      Public
      A Python sandbox for decision making in dynamics
      Python
      MIT License
      4342082 Updated Aug 21, 2023