Missing duration files for extract-alignments #576

Open
SamuelLarkin opened this issue Nov 1, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@SamuelLarkin
Collaborator

SamuelLarkin commented Nov 1, 2024

Bug description

According to the aligner's preprocess help message, running python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli preprocess config/everyvoice-aligner.yaml should generate all the files required to run python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli extract-alignments config/everyvoice-aligner.yaml --no-predict. However, the duration files are not generated.

If they aren't generated because sox is not installed, a clear warning should be displayed.
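
To illustrate the kind of warning meant here, below is a minimal sketch that checks whether an external tool is on the PATH and logs a warning if it is not. This is not EveryVoice code: the function name warn_if_tool_missing is hypothetical, and whether a missing sox is actually the cause of this bug has not been confirmed.

```python
import logging
import shutil

logger = logging.getLogger(__name__)


def warn_if_tool_missing(tool: str = "sox") -> bool:
    """Return True if `tool` is found on PATH; otherwise log a warning and return False.

    Hypothetical helper for illustration only, not part of EveryVoice.
    """
    if shutil.which(tool) is None:
        logger.warning(
            "%s was not found on PATH; files that depend on it may not be generated.",
            tool,
        )
        return False
    return True
```

A preprocessing step could call such a check once at start-up, so that the user sees the warning before any files are skipped.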

How to reproduce the bug

Preprocess

python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli \
  preprocess \
  config/everyvoice-aligner.yaml

Extract Alignments

python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli \
  extract-alignments \
  config/everyvoice-aligner.yaml \
  --no-predict

Error messages and logs

python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli \
  extract-alignments \
  config/everyvoice-aligner.yaml \
  --no-predict
2024-11-01 12:16:57.455 | INFO     | __main__:extract_alignments:144 - Loading modules for alignment...
  0%|                                                                                     | 0/5000 [00:00<?, ?it/s]
╭─────────────────────────────────────── Traceback (most recent call last) ───────────────────────────────────────╮
│ /fs/hestia_Hnrc/ict/sam037/git/EveryVoice/everyvoice/model/aligner/DeepForcedAligner/dfaligner/cli.py:211 in    │
│ extract_alignments                                                                                              │
│                                                                                                                 │
│   208 │   │   speaker = item["speaker"]                                                                         │
│   209 │   │   language = item["language"]                                                                       │
│   210 │   │   tokens = item["tokens"].cpu()                                                                     │
│ ❱ 211 │   │   pred = np.load(                                                                                   │
│   212 │   │   │   save_dir                                                                                      │
│   213 │   │   │   / "duration"                                                                                  │
│   214 │   │   │   / SEP.join([basename, speaker, language, "duration.npy"])                                     │
│                                                                                                                 │
│ /home/sam037/.conda/envs/EveryVoice.sl/lib/python3.10/site-packages/numpy/lib/npyio.py:427 in load              │
│                                                                                                                 │
│    424 │   │   │   fid = file                                                                                   │
│    425 │   │   │   own_fid = False                                                                              │
│    426 │   │   else:                                                                                            │
│ ❱  427 │   │   │   fid = stack.enter_context(open(os_fspath(file), "rb"))                                       │
│    428 │   │   │   own_fid = True                                                                               │
│    429 │   │                                                                                                    │
│    430 │   │   # Code to distinguish from NumPy binary files and pickles.                                       │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: '/gpfs/fs3c/nrc/dt/sam037/exp/EveryVoice/tiny.lj/263_wrong_checkpoint/preprocessed/duration/LJ008-0036--speaker_0--eng--duration.npy'
Loading EveryVoice modules: 100%|████████████████████████████████████████████████████| 6/6 [00:32<00:00,  5.38s/it]
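
The traceback above stops at the first missing file. As a rough sketch of a friendlier pre-check, the function below lists every expected duration file that is absent before any np.load call is attempted. Assumptions: SEP is "--" (inferred from the file name in the error), each item carries basename, speaker and language keys (only the last two are visible in the traceback), and find_missing_durations is a hypothetical name, not an existing EveryVoice function.

```python
from pathlib import Path

SEP = "--"  # assumption: inferred from "LJ008-0036--speaker_0--eng--duration.npy"


def find_missing_durations(save_dir, items, sep=SEP):
    """Return the expected duration file paths that do not exist under save_dir/duration.

    Hypothetical helper; the item keys are assumed from the traceback.
    """
    missing = []
    for item in items:
        name = sep.join(
            [item["basename"], item["speaker"], item["language"], "duration.npy"]
        )
        path = Path(save_dir) / "duration" / name
        if not path.is_file():
            missing.append(path)
    return missing
```

If the returned list is not empty, the command could exit early with a message naming the missing files and the preprocessing step expected to produce them, instead of raising FileNotFoundError from inside NumPy.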

Environment

Current environment
#- EveryVoice Version:
#- PyTorch Lightning Version (e.g., 2.4.0):
#- PyTorch Version (e.g., 2.4):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed EveryVoice (`conda`, `pip`, source):

More info

Help Message

python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli preprocess --help

 Usage: python -m everyvoice.model.aligner.DeepForcedAligner.dfaligner.cli preprocess
            [OPTIONS] CONFIG_FILE

 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
 ┃                                                Preprocess Help                                                ┃
 ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
 This command will preprocess all of the data you need for use with DeepForcedAligner. For example:

 dfaligner preprocess config/everyvoice-aligner.yaml

╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *    config_file      FILE  The path to your model configuration file.                                          │
│                             [default: None]                                                                     │
│                             [required]                                                                          │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --steps        -s      [audio|spec|text]  Which steps of the preprocessor to use. If none are provided, all     │
│                                           steps will be performed.                                              │
│                                           [default: audio, spec, text]                                          │
│ --config-args  -c      TEXT               Override the configuration.                                           │
│                                           [default: None]                                                       │
│ --cpus         -C      INTEGER            How many CPUs to use when preprocessing                               │
│                                           [default: 4]                                                          │
│ --overwrite    -O                         Redo all preprocessing, even if files already exist and aren't        │
│                                           expected to change.                                                   │
│ --debug        -D                         Enable debugging.                                                     │
│ --help         -h                         Show this message and exit.                                           │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
SamuelLarkin added the bug label Nov 1, 2024
@joanise
Member

joanise commented Nov 1, 2024

I also saw the same problem when trying to test #565: I was unable to create a DFA model to see the changes from #565 in action.

@joanise
Member

joanise commented Nov 1, 2024

And while the help messages are already clearer with EveryVoiceTTS/DeepForcedAligner#26 merged in, this issue highlights that some more improvement might still be required: it should be clear how to use dfaligner from its help messages.
