A system for quickly generating training data with weak supervision
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
Synthetic data generators for tabular and time-series data
skweak: A software toolkit for weak supervision applied to NLP tasks
Computer-vision-based ML training data generation tool 🚀
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Web application for image labeling and segmentation
🏖TagEditor - Annotation tool for spaCy
A lightweight web application for brushing labels onto time series data; useful for building training sets.
Augmenty is an augmentation library based on spaCy for augmenting texts.
Natural Language Data Augmentation Tool for Conversational Systems
Generating training data from the CARLA driving simulator in the KITTI dataset format
Aubo i5 Dual Arm Collaborative Robot - RealSense D435 - 3D Object Pose Estimation - ROS
Collection of casual conversations that can be used with the Rasa Stack
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
COVID-19 cough files for training AI models
Full resources supporting the publication "A Pragmatic Guide to Geoparsing Evaluation."
Convert all files in git repository to .txt files. Useful for training LLMs on your codebase.