This repository contains code and data accompanying:
Tuckute, G., Sathe, A., Srikant, S., Taliaferro, M., Wang, M., Schrimpf, M., Kay, K., Fedorenko, E.: Driving and suppressing the human language network using large language models. Nat Hum Behav (2024). https://doi.org/10.1038/s41562-023-01783-7
The code runs in a Python 3.8.11 environment that makes heavy use of pandas, scikit-learn, HuggingFace Transformers, and matplotlib. To install the exact Python environment used in the paper:

```
conda env create -f env_drive-suppress-brains.yml
```
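Then activate it (assuming the environment name inside the yml matches the file name; check the `name:` field in the yml if this fails):

```
conda activate drive-suppress-brains
```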
To populate the `data`, `data_SI`, `model-actv`, and `regr-weights` folders, please see the "Downloading data" section below.
The `data` folder contains a csv file with the event-related data (`brain-lang-data_participant_20230728.csv`; main experiment), a csv file for the blocked experiment (`brain-lang-blocked-data_participant_20230728.csv`), a csv file with the noise ceilings computed based on the event-related data (`NC-allroi-data.csv`), and finally a file with the associated column name descriptions (`column_name_descriptions.csv`). These files are used to run the `Figure_2.ipynb`, `Figure_3.ipynb`, `Figure_4.ipynb`, and `Figure_5.ipynb` notebooks.
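For orientation, here is a minimal sketch of loading the main event-related file with pandas (assuming the files have been downloaded into `data/` and the snippet is run from the repository root):

```python
import pandas as pd

# Event-related data from the main experiment.
df = pd.read_csv("data/brain-lang-data_participant_20230728.csv")

# Descriptions of what each column contains.
col_desc = pd.read_csv("data/column_name_descriptions.csv")

print(df.shape)
print(df.columns.tolist()[:10])  # peek at the first few column names
```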
The `data_SI` folder contains csv files used to run the `SI_Figures.ipynb` notebook.
The `env` folder contains the conda yml file `env_drive-suppress-brains.yml`.
The `model-actv` folder contains pre-computed model activations for GPT2-XL (last-token representations). The file `beta-control-neural-T_actv.pkl` contains the activations for the baseline set in a Pandas DataFrame. The rows correspond to sentences, and the columns are multi-indexed by layer and unit: the first level is layer (49 layers in GPT2-XL) and the second level is unit (1,600 units in each representation vector in GPT2-XL). The file `beta-control-neural-T_stim.pkl` contains the corresponding stimulus metadata in a Pandas DataFrame. The two files are row-indexed using the same identifiers. The files `beta-control-neural-D_actv.pkl` and `beta-control-neural-D_stim.pkl` contain the activations for the baseline set along with the drive/suppress activations (derived via the main search approach).
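As a sketch of how these files can be inspected (the `model-actv/gpt2-xl/` path and the layer indexing below are assumptions, inferred from the folder description in this README):

```python
import pandas as pd

# Rows: sentences. Columns: MultiIndex over (layer, unit) -- 49 layers x 1,600 units.
actv = pd.read_pickle("model-actv/gpt2-xl/beta-control-neural-T_actv.pkl")
stim = pd.read_pickle("model-actv/gpt2-xl/beta-control-neural-T_stim.pkl")

# Both DataFrames share the same row identifiers, so they align directly.
assert actv.index.equals(stim.index)

# Pull out all 1,600 units of a single layer via the first column level.
layers = actv.columns.get_level_values(0).unique()
layer_actv = actv[layers[-1]]  # e.g., the last layer
print(layer_actv.shape)        # (n_sentences, 1600)
```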
The `regr-weights` folder contains the encoding model regression weights in the `fit_mapping` subfolder, with an additional subfolder named according to the parameters that were used to fit the encoding model.
The `results` folder is the default folder for storing outputs from `src/run_analyses`.
The `src` folder contains all code in the following subfolders:
- `plot_data` contains a notebook that reproduces each of the main figures, as well as a notebook for the SI figures.
- `run_analyses` contains code to run all main analyses in the paper.
- `statistics` contains linear mixed effect (LME) statistics (in R).
To download the data used in the paper, run the `download_files.py` script. By default, it will download the files for the `data` folder.
The `data` folder contains a csv file with the event-related data (`brain-lang-data_participant_20230728.csv`; main experiment). This file contains brain responses from the left hemisphere (LH) language regions for n=10 participants (n=5 train participants, n=5 evaluation participants), along with various metadata and behavioral data for each sentence (n=10 linguistic properties). The `data` folder also contains a csv file with brain responses for the blocked experiment (`brain-lang-blocked-data_participant_20230728.csv`, n=4 evaluation participants). The folder also contains the noise ceilings computed based on the event-related data of the n=5 train participants (`NC-allroi-data.csv`). Finally, `column_name_descriptions.csv` contains descriptions of the content of the columns in these csv files.
Using the additional flags, you can specify whether to also download the `data_SI` files, the `model-actv` files, and the `regr-weights` files.
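A hypothetical invocation (the flag names here are illustrative only, not verified; check the argument parser in `download_files.py` for the actual flag names):

```
# Flag names are hypothetical -- see download_files.py for the real options.
python download_files.py --data_SI --model-actv --regr-weights
```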
All code is in `src`.
- The `src/plot_data` folder contains Jupyter notebooks that analyze and generate plots for the main results in the paper.
- The `src/run_analyses` folder contains Python scripts for running analyses. The two main scripts are:
  - `/src/run_analyses/fit_mapping.py` fits an encoding model from the features of a source model (in this case, GPT2-XL, cached in `model-actv/gpt2-xl`) to the participant-averaged brain data; the sketch after this list illustrates the general idea. The script stores outputs in `results` and the fitted regression weights in `regr-weights`.
  - `/src/run_analyses/use_mapping_external.py` loads the regression weights from the encoding model and predicts each sentence in the supplied stimulus set.
- The `src/statistics` folder contains R code to run LME models.
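As a rough illustration of what an encoding-model fit of this kind does, here is a minimal, self-contained sketch. It is not the repository's exact implementation: the cross-validated ridge regression shown here and the random stand-ins for the GPT2-XL activations and brain responses are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 1600))  # stand-in for one GPT2-XL layer (sentences x units)
y = rng.standard_normal(1000)          # stand-in for participant-averaged brain responses

# Fit on training folds, predict held-out sentences.
preds = np.empty_like(y)
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RidgeCV(alphas=np.logspace(-2, 5, 8)).fit(X[train_idx], y[train_idx])
    preds[test_idx] = model.predict(X[test_idx])

# Encoding performance: correlation between predicted and observed responses.
print(f"cross-validated r = {np.corrcoef(preds, y)[0, 1]:.3f}")
```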
If you use this repository or data, please cite:
```bibtex
@article{Tuckute2024,
  title = {Driving and suppressing the human language network using large language models},
  author = {Tuckute, Greta and Sathe, Aalok and Srikant, Shashank and Taliaferro, Maya and Wang, Mingye and Schrimpf, Martin and Kay, Kendrick and Fedorenko, Evelina},
  journal = {Nature Human Behaviour},
  year = {2024},
  date = {2024/01/03},
  abstract = {Transformer models such as GPT generate human-like language and are predictive of human brain responses to language. Here, using functional-MRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. We then use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also non-invasively control neural activity in higher-level cortical areas, such as the language network.},
  doi = {10.1038/s41562-023-01783-7},
  url = {https://doi.org/10.1038/s41562-023-01783-7},
  issn = {2397-3374}
}
```