pncc_asr

An implementation of "Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition" (Chanwoo Kim and Richard M. Stern, 2016).

This is an unofficial implementation. The original code is written in C (please see the paper).

Usage

  • This code is built on torchaudio, so audio data must be loaded with torchaudio.
  • All default parameters follow the original paper.
  • GPU computation is supported.

First, clone this repository and install the requirements.

cd ~
git clone https://github.com/LimDoHyeon/pncc_asr.git
cd pncc_asr
pip install -r requirements.txt

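Then call the pncc function, which has the following signature:

def pncc(
    wav: torch.Tensor,      # waveform loaded with torchaudio
    sr: int,                # sample rate of wav
    n_fft: int = 1024,      # FFT size
    win_length: int = 400,  # window length in samples
    hop_length: int = 160,  # hop (frame shift) in samples
    n_ch: int = 40,         # number of filter channels
    f_min: int = 200,       # lowest filter frequency
    f_max: int = 8000,      # highest filter frequency
    n_ceps: int = 13,       # number of cepstral coefficients returned
) -> torch.Tensor:
    ...

Example:

import torchaudio
import pncc_asr as PNCC

# torchaudio.load returns the waveform tensor and its sample rate
y, sr = torchaudio.load('your_audiofile.wav')  # or use a DataLoader
pncc = PNCC.pncc(y, sr)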
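
The default win_length, hop_length, and n_fft are given in samples, so what they mean in time and frequency depends on the sample rate. As a rough sketch, the hypothetical helper below (describe_framing, not part of pncc_asr) converts them to physical units. It assumes 16 kHz audio, which this README does not state, and it estimates the frame count assuming a centered STFT, which is the torch.stft default. The repository's actual framing may differ.

```python
def describe_framing(
    sr: int = 16000,          # assumed sample rate, not stated in the README
    win_length: int = 400,
    hop_length: int = 160,
    n_fft: int = 1024,
    f_max: int = 8000,
    n_samples: int = 16000,   # length of the input signal in samples
) -> dict:
    """Convert sample-based framing parameters into physical units."""
    nyquist_hz = sr / 2
    return {
        "window_ms": 1000 * win_length / sr,   # analysis window duration
        "hop_ms": 1000 * hop_length / sr,      # shift between frames
        "bin_hz": sr / n_fft,                  # FFT frequency resolution
        "nyquist_hz": nyquist_hz,
        # f_max above the Nyquist frequency cannot be represented
        "f_max_ok": f_max <= nyquist_hz,
        # Assumption: centered STFT, giving 1 + n_samples // hop_length frames
        "n_frames": 1 + n_samples // hop_length,
    }


info = describe_framing()
print(info)
```

Under these assumptions the defaults correspond to a 25 ms window with a 10 ms hop, and f_max = 8000 sits exactly at the Nyquist frequency. For audio sampled below 16 kHz, f_max would likely need to be lowered.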

Author: Do-Hyeon Lim
