Pull requests: NVIDIA/NeMo-Curator

Pull requests list

Add tests/test_classifiers.py PyTests (label: gpuci)
#421 opened Dec 11, 2024 by sarahyurick Draft
2
Create notebook tutorials for distributed data classifiers (label: documentation)
#415 opened Dec 6, 2024 by sarahyurick
Added LookUp error handling during encoding detection.
#412 opened Dec 6, 2024 by ggcr
Create separate files for each deduplication class
#409 opened Dec 3, 2024 by sarahyurick
Version bump to 0.6.0rc1.dev0
#396 opened Nov 27, 2024 by github-actions bot
Fix GPU error messages for fuzzy deduplication
#387 opened Nov 22, 2024 by sarahyurick Draft
1 of 2 tasks
Fuzzy Dedup: Make skipping the False positive check the default (labels: enhancement, gpuci)
#386 opened Nov 21, 2024 by ayushdg
2 of 3 tasks
Remove max_text_bytes_per_part (label: gpuci)
#385 opened Nov 20, 2024 by sarahyurick
Global cache_dir variable for exact, fuzzy, and semantic deduplication (label: gpuci)
#384 opened Nov 19, 2024 by sarahyurick
3 tasks done
Allow users to write to single file
#383 opened Nov 19, 2024 by sarahyurick
ci: Add copyright-check workflow
#369 opened Nov 14, 2024 by ko3n1g
3 tasks
Task-Complexity Classifier
#364 opened Nov 13, 2024 by sarahyurick Draft
Content Type Classifier
#361 opened Nov 13, 2024 by sarahyurick
Dapt data curation tutorial fuzzy and semantic dedupe (label: gpuci)
#322 opened Oct 24, 2024 by ruchaa-apte
Added example notebook for translation with ct2 model. (label: documentation)
#262 opened Sep 25, 2024 by uahmed93 Draft
3 tasks
Fixed bug: changed to correct model name
#186 opened Aug 6, 2024 by ByteWrite
1 of 3 tasks
Adding an example for executing NeMo modules using kubernetes Python … (label: documentation)
#148 opened Jul 9, 2024 by dpadmanabhan03
2 of 3 tasks