Issues: NVIDIA/NeMo-Curator

Issues list

Domain / Quality Classifiers failing on nightly [bug]
#419 opened Dec 10, 2024 by praateekmahajan
LookupError not caught during Encoding handling [bug]
#411 opened Dec 6, 2024 by ggcr
Add Jupyter notebook tutorials for data classifiers [documentation]
#406 opened Dec 2, 2024 by sarahyurick
5 of 7 tasks
Add CI tests for Hugging Face classifiers
#405 opened Dec 2, 2024 by sarahyurick
1 of 8 tasks
Add Trafilatura text extraction [enhancement]
#400 opened Dec 2, 2024 by sarahyurick
Update columns documentation [documentation]
#378 opened Nov 18, 2024 by sarahyurick
Use CrossFit for TokenizerFertilityFilter [enhancement]
#377 opened Nov 15, 2024 by sarahyurick
Add GPU test with NeMo 2.0
#376 opened Nov 15, 2024 by sarahyurick
[IMP] Decrease Merge Peak Memory Usage of ConnectedComponents [bug]
#375 opened Nov 15, 2024 by VibhuJawa
Zyda2 tutorial - key error when running compute_counts script [bug]
#345 opened Nov 5, 2024 by ronjer30
Zyda2 tutorial - TypeError when initializing Dask CPU cluster [bug]
#344 opened Nov 5, 2024 by ronjer30
Deprecate max_text_bytes_per_part [enhancement]
#331 opened Oct 28, 2024 by sarahyurick
Improve Pytorch Model Performance [enhancement]
#329 opened Oct 28, 2024 by VibhuJawa
3 tasks
Resuming the job on slurm after it gets cancelled. [enhancement]
#297 opened Oct 11, 2024 by uahmed93
Unmanaged memory is high and frozen execution [bug]
#295 opened Oct 11, 2024 by pappagari
Check Pytorch cuda context is valid across GPUs [bug]
#284 opened Oct 8, 2024 by VibhuJawa