Issues: IBM/data-prep-kit
Issues list
[Bug] ededup removes all samples if the document_id is an int
bug
Something isn't working
#868
opened Dec 10, 2024 by
burn2l
2 tasks done
[Bug] dpk_web2parquet module is missing 0.2.2 release
bug
Something isn't working
#863
opened Dec 6, 2024 by
sujee
1 of 2 tasks
[Bug] Only up to 100 actors can be used
bug
Something isn't working
#858
opened Dec 5, 2024 by
dtsuzuku-ibm
2 tasks done
Making sure that we run the Jupyter lab inside venv, when running notebooks locally
enhancement
New feature or request
#856
opened Dec 4, 2024 by
shahrokhDaijavad
2 tasks done
Link to the three example notebooks of fdedup in the README file
enhancement
New feature or request
#848
opened Dec 2, 2024 by
shahrokhDaijavad
1 of 2 tasks
Add the first Google Colab Compatible Notebook as a template for all transforms
enhancement
New feature or request
#844
opened Nov 30, 2024 by
shahrokhDaijavad
2 tasks done
[Feature] HAP example for kickstart
enhancement
New feature or request
#843
opened Nov 29, 2024 by
AishaDarga
2 tasks done
[Feature] PII example for kickstart
enhancement
New feature or request
#842
opened Nov 29, 2024 by
PoojaHolkar
2 tasks done
[Discussion] Could someone kindly help to answer the question in the discussion area
enhancement
New feature or request
#841
opened Nov 29, 2024 by
vincent-pli
1 of 2 tasks
[Bug] Fix link to language modules listed in pypi
bug
Something isn't working
#827
opened Nov 25, 2024 by
dtsuzuku-ibm
1 of 2 tasks
[Feature] Add discord link to the front page (README.md) and add Fuzzy Dedup python only to the table
enhancement
New feature or request
#820
opened Nov 20, 2024 by
sujee
2 tasks done
[Feature] Some subdirs not cleaning up venv on make clean
enhancement
New feature or request
#819
opened Nov 20, 2024 by
daw3rd
2 tasks done
[Bug] parquet files with columns containing large list of byte arrays can not be read by pyarrow.
bug
Something isn't working
#816
opened Nov 20, 2024 by
daw3rd
2 tasks done
[Bug] pdf2parquet: identical PDF files have different contents
bug
Something isn't working
#812
opened Nov 19, 2024 by
sujee
1 of 2 tasks
[Bug] Cannot run KFP pipeline for fuzzy dedup with more than 100 actors
bug
Something isn't working
#803
opened Nov 16, 2024 by
cmadam
2 tasks done
[Feature] Create a 'User Feedback' section in discussions
enhancement
New feature or request
#802
opened Nov 14, 2024 by
sujee
1 of 2 tasks
[Feature] RAG: when saving DPK processed data into vector database, optionally save it in llama-index format
enhancement
New feature or request
#795
opened Nov 12, 2024 by
sujee
2 tasks done
[Feature] Modify pdf2parquet to accept a parquet file with the payload in the content column
enhancement
New feature or request
#792
opened Nov 11, 2024 by
touma-I
1 of 2 tasks
[Feature] add an example of html2pq in the documentation
documentation
Improvements or additions to documentation
#788
opened Nov 8, 2024 by
sujee
2 tasks done
Enable DPK on native windows and then add info to readme
med priority
simplify-DPK
#783
opened Nov 6, 2024 by
Bytes-Explorer
Rename the "Intro" notebooks to call out the specific functionality they support (PDF to Embeddings)
#782
opened Nov 6, 2024 by
Bytes-Explorer
Pass parameters to modules in a way familiar to Python users/developers
enhancement
New feature or request
simplify-DPK
#776
opened Nov 5, 2024 by
shahrokhDaijavad
1 of 2 tasks