README Mismatch, Missing Dependency & Permission Errors #4

@FrankLiu1102

1. README and Code Structure Mismatch

  • The README refers to training/, but the actual directory is ds_training/.
  • The README mentions generate_all_train_datasets.sh, but that script does not exist. Only generate_all_train_datasets_v1.sh and generate_all_train_datasets_v2.sh are present.
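
A check like the one below could catch this kind of drift between the README and the repository. It is only a sketch: `find_missing_paths` is a hypothetical helper, not part of this repository, and the list of referenced paths is assumed to be supplied by hand (the location of the `.sh` scripts relative to the repo root is also an assumption).

```python
from pathlib import Path


def find_missing_paths(repo_root, referenced):
    """Return the referenced relative paths that do not exist under repo_root.

    Hypothetical helper for illustration; `referenced` is a hand-maintained
    list of paths that the README mentions.
    """
    root = Path(repo_root)
    return [rel for rel in referenced if not (root / rel).exists()]


# Paths named in the README according to this issue (locations assumed).
README_PATHS = ["training", "generate_all_train_datasets.sh"]

if __name__ == "__main__":
    for rel in find_missing_paths(".", README_PATHS):
        print(f"README references a missing path: {rel}")
```

Run from the repository root, this prints one line per README path that is not on disk.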

2. Missing Dependency (sentencepiece)

  • LlamaTokenizer requires sentencepiece, but it is not listed in requirements.txt.
  • Running the code without sentencepiece installed raises an ImportError.
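
Besides adding `sentencepiece` to requirements.txt, a small pre-flight check can report the missing package up front. The sketch below uses only the standard library. `missing_modules` is a hypothetical helper name, and the required list contains only the one package this issue identifies.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that are not importable.

    Uses importlib.util.find_spec, which locates a top-level module
    without importing it and returns None when it cannot be found.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Only the dependency named in this issue; extend as needed.
REQUIRED = ["sentencepiece"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
        print("Install with: pip install " + " ".join(missing))
```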

3. Default /output Directory Causes Permission denied

  • The scripts write to /output by default, a top-level directory that normally only root can create or write to.
  • Users without root permissions therefore hit Permission denied errors.
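
One possible fix is to fall back to a directory the user can write to when the default is unavailable. The following is a minimal sketch, not the repository's code: `resolve_output_dir` and the `./output` fallback are assumptions made for illustration.

```python
import os
from pathlib import Path


def resolve_output_dir(preferred="/output", fallback="./output"):
    """Return the first candidate directory that can be created and written to.

    Tries `preferred` first, then `fallback`. Both defaults are
    illustrative assumptions, not values taken from the repository.
    """
    for candidate in (preferred, fallback):
        path = Path(candidate).expanduser()
        try:
            path.mkdir(parents=True, exist_ok=True)
        except OSError:
            # Covers PermissionError and other filesystem failures.
            continue
        if os.access(path, os.W_OK):
            return path
    raise PermissionError(
        f"Neither {preferred!r} nor {fallback!r} is writable; "
        "pass an explicit output directory."
    )
```

A script could call `resolve_output_dir()` once at startup and write all results under the returned path. Exposing the directory as a command-line option or environment variable would be an alternative design.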

These issues prevent users from running the code without modifications.
