# corpus-linguistics

Here are 325 public repositories matching this topic...

rahonalab / TEITOK-docker

Open Corpus Workbench with TEITOK Docker compose file

linguistics cwb digital-humanities corpus-linguistics cqp digital-philology opencorpusworkbench

Updated May 30, 2019
Dockerfile

gisly / evenki-corpus

evenki-corpus

nlp corpus linguistics corpus-linguistics evenki evenki-corpus

Updated Jul 2, 2022
Python

PaulCaroline / comm313_S21_Final_Project

Corpus linguistics final project for the course COMM 313: Computational Text Analysis at the University of Pennsylvania. Aims to determine how the anti-vaccination movement has evolved on social media before and during the COVID-19 pandemic.

twitter university sentiment-analysis corpus-linguistics covid19 snscrape

Updated May 8, 2021
Jupyter Notebook

matbahasa / ETA

Easy Text Annotator

visualization nlp information-retrieval annotation linguistics corpus-linguistics annotaton-tool

Updated Feb 1, 2023
JavaScript

stewieboomhauer / IVK-Ler-Corpus

The data and code located in this repository introduce an international preparatory class learner corpus and its complexity analyses.

nlp natural-language-processing complexity corpus-linguistics complexity-analysis

Updated Oct 24, 2022
R

Affenmilchmann / lingwiki

(Module under ongoing development) Fetches the parsed content of Wikipedia articles. Created for collecting text corpus data quickly and easily, but can be freely used for other purposes too.

parser wikipedia multithreading linguistics corpus-linguistics corpus-data corpus-tools article-extractor wikipedia-corpus

Updated Jan 3, 2023
Python

leoalenc / nheengatu

Tools and resources for the computational processing of the Nheengatu language

natural-language-processing machine-translation treebank computational-linguistics corpus-linguistics grammatical-framework grammar-parser nheengatu

Updated Aug 27, 2022
Grammatical Framework

KurdishBLARK / KTC

Kurdish Textbooks Corpus

natural-language-processing corpus corpus-linguistics kurdish language-resources kurdish-language-processing

Updated Feb 9, 2024

sonalsinha / Marwari_recordings

Recordings of Marwari speech by Bharti, a speaker of the language. It includes sentences of all kinds elicited using the translation method, and narrations about health care and the life cycle.

language documentation data corpus speech corpus-linguistics marwari

Updated Jul 4, 2024

craigmateo / pipeline-corpus

Corpus for linguistic study of natural gas pipeline debates.

corpus-linguistics corpus-data

Updated Apr 6, 2024

suomela / bnc-metadata

Extract BNC metadata

corpus-linguistics

Updated May 21, 2018
C++

chrisdrymon / Treebanks

Treebanks modified from PROIEL and Perseus.

linguistics greek treebank ancient-greek perseus computational-linguistics corpus-linguistics perseus-digital-library treebanks perseusdl proiel

Updated Jun 1, 2018

andcarnivorous / CorpusInfo

A module to quickly create Corpus objects containing TTR, tokenized sentences, lexical density, class frequencies and more.

nlp computational-linguistics corpus-linguistics

Updated Jun 30, 2019
Python

UIUCLearningLanguageLab / CreateWikiCorpus

Extract raw text articles from Wikipedia dump

nlp corpus-linguistics

Updated Jun 21, 2022
Python

JorgeFCS / multimodal-annotation-distance

A tool for determining distances between multimodal annotations.

gesture corpus-linguistics data-processing prosody

Updated Oct 16, 2023
Python

sohypmotizin / fr-wikipedia-corpus

2019 project: data analysis of a French Wikipedia corpus

pandas conll corpus-linguistics corpus-data conll-u

Updated Aug 17, 2021
Python

LingConLab / data_oral_khakas_corpus

linguistics corpus-linguistics corpus-data khakas languages-of-russia

Updated Aug 17, 2022
R

CaterinaBi / interrogatives-corpus-work

Paper that Lena Baunaz and I are working on as part of my SNSF-funded 'Focus in diachrony' research project at the University of Cambridge, UK.

python syntax excel nltk data-analysis corpus-linguistics syntactic-analysis corpus-processing wh-movement

Updated Jan 31, 2023
Jupyter Notebook

MevSillire / Corpus_CODIM

All scripts needed to exploit the French corpus and create the associated database for the CODIM Project.

corpus-linguistics

Updated Aug 22, 2023
Jupyter Notebook

zelewskap / BA_heuristics

Heuristics and cognitive biases in public discourse on climate change: linguistic data analysis

annotations transformers bachelor-thesis corpus-linguistics linguistic-analysis corpus-processing data-analysis-python corpus-statistics bachelor-degree corpus-analysis

Updated Jun 30, 2023
Jupyter Notebook
