semi-structured-data

Here are 28 public repositories matching this topic...

snap-stanford / stark

(NeurIPS D&B 2024) STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

nlp information-retrieval graph knowledge-base semi-structured-data multimodal llm

Updated Jul 21, 2025
Python

VorTECHsa / refinery

Refinery is a tool to extract and transform semi-structured data from Excel spreadsheets of different layouts in a declarative way.

excel extraction poi data-extraction semi-structured-data wrangling excel-extraction-api

Updated Jul 28, 2025
Kotlin

BartJongejan / Bracmat

Programming language for symbolic computation with an unusual combination of pattern-matching features: tree patterns, associative patterns, and expressions embedded in patterns.

Updated Apr 4, 2025
C

rub-ksv / MyFixit-Dataset

A dataset for extracting information from repair manuals

nlp information-extraction semi-structured-data repair-manuals instructional-text procedural-task

Updated Jul 4, 2020
Python

utahnlp / infotabs-code

Implementation of the semi-structured inference model in our ACL 2020 paper, INFOTABS: Inference on Tables as Semi-structured Data.

nlp wikipedia svm inference transformer nlp-machine-learning tables semi-structured-data nli nlp-datasets roberta acl2020 infotabs

Updated Dec 7, 2021
Python

utahnlp / knowledge_infotabs

Repository containing code for the NAACL 2021 paper (Incorporating External Knowledge to Enhance Tabular Reasoning)

nlp naacl knowledge wikipedia inference transformer nlp-machine-learning tables semi-structured-data nli naacl2021 infotabs

Updated Jun 20, 2021
Python

ropensci / EndoMineR

Extraction of various kinds of endoscopic and pathological data

text-mining r rstats r-package semi-structured-data gastroenterology endoscopy peer-reviewed

Updated Sep 2, 2024
R

mansakondo / activemodel-embedding

An ActiveModel extension to model your semi-structured data using embedded associations

ruby rails json database activemodel jsonb denormalization semi-structured-data type-casting document-modelling

Updated Oct 28, 2021
Ruby

cyk1337 / UrbanDict

Urban Dict spelling variant dataset. Source code for the paper "How to Evaluate Word Representations of Informal Domain?"

information-extraction urban-dictionary sequence-tagger semi-structured-data word-representation-evaluation hashtag-prediction

Updated Feb 5, 2020
Jupyter Notebook

Dibyakanti / AutoTNLI-code

This repository contains the official code for the paper "Realistic Data Augmentation Framework for Enhancing Tabular Reasoning" (Findings of EMNLP 2022).

nlp wikipedia inference transformer nlp-machine-learning tables semi-structured-data nli nlp-datasets roberta infotabs emnlp2022 autotnli

Updated Nov 26, 2024
HTML

taehyounpark / queryosity

Coherent data analysis library

cpp data-analysis semi-structured-data

Updated Dec 24, 2024
C++

lucaliechti / FCAInference

Schema inference for semistructured data using Formal Concept Analysis

schema-inference formal-concept-analysis semi-structured-data

Updated May 4, 2017
Java

rub-ksv / MyFixit-Annotator

A semi-automatic web-based annotation tool for the MyFixit dataset

nlp information-extraction sequence-labeling annotation-tool semi-structured-data instructional-text procedural-task repair-manual semi-automatic-annotation

Updated Sep 20, 2022
CSS

Info-Sync / InfoSync

Implementation of the semi-structured inference model in our ACL 2023 paper: INFOSYNC: Information Synchronization across Multilingual Semi-structured Tables.

multilingual nlp translation wikipedia nlp-machine-learning tables semi-structured-data infosync acl2023 information-synchronization wikipedia-update

Updated Sep 5, 2023
HTML

Web-based workflow management system that computes candidate tool workflows from the input file(s) and the user's requirements for the output, then runs the workflow that the user selects from the list of candidates. Implemented in Bracmat (~75%) and Java (~25%).