✋ [Boostcamp-AI-Tech-Level3] HiBoostCamp ✋

Final Project Goal

Generate sheet music and music files from piano performance videos using visual information alone.

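Background and Motivation

People want to play a piece they saw in an online video, but doing so is difficult without sheet music.
Existing sheet music generation models have the drawback that they cannot be used when the audio source is noisy or when several instruments are recorded together.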

Problem Statement

Predict the keys played in each frame with multi-label classification (Piano Roll).
Generate sheet music from the predicted keys.
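As a rough illustration of the multi-label formulation, the sketch below turns 88 per-key scores for one frame into a binary piano-roll row. The 0.5 threshold, the helper name `logits_to_pressed_keys`, and the example scores are assumptions made for illustration and are not taken from the project code.

```python
import math

NUM_KEYS = 88  # a standard piano has 88 keys


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logits_to_pressed_keys(logits, threshold=0.5):
    """Return one piano-roll row: 1 where the key is predicted as pressed.

    Each key is decided independently, which is what makes this
    multi-label rather than multi-class classification.
    """
    if len(logits) != NUM_KEYS:
        raise ValueError(f"expected {NUM_KEYS} logits, got {len(logits)}")
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# Hypothetical model output for one frame: two keys with high scores.
logits = [-4.0] * NUM_KEYS
logits[39] = 3.0
logits[43] = 2.5

frame_labels = logits_to_pressed_keys(logits)
pressed = [i for i, v in enumerate(frame_labels) if v]
print(pressed)  # [39, 43]
```

Because every key gets its own yes/no decision, chords (several keys in one frame) fall out naturally.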

Model & Training

1. Dataset

1.1. Video Collection

Collected top-view piano performance videos from YouTube.

1.2. Labeling

Used Onsets and Frames to pseudo-label the keys played in each frame.
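One way to picture the pseudo-labeling step: a transcription model yields note events, and each event is spread over the video frames it covers. The sketch below assumes events are given as (onset seconds, offset seconds, MIDI pitch) and that the video runs at 25 fps. Both the event format and the frame rate are assumptions for illustration, not details confirmed by this README.

```python
NUM_KEYS = 88
LOWEST_MIDI_PITCH = 21  # MIDI note number of A0, the lowest piano key


def notes_to_frame_labels(notes, num_frames, fps=25.0):
    """Convert note events into per-frame binary key labels.

    notes: iterable of (onset_sec, offset_sec, midi_pitch) tuples.
    Returns a list of num_frames rows with NUM_KEYS entries each.
    """
    labels = [[0] * NUM_KEYS for _ in range(num_frames)]
    for onset, offset, pitch in notes:
        key = pitch - LOWEST_MIDI_PITCH
        if not 0 <= key < NUM_KEYS:
            continue  # ignore pitches outside the piano range
        for t in range(num_frames):
            frame_time = t / fps
            if onset <= frame_time < offset:
                labels[t][key] = 1
    return labels


# Two hypothetical overlapping notes: C4 (pitch 60) and E4 (pitch 64).
notes = [(0.0, 0.1, 60), (0.07, 0.2, 64)]
labels = notes_to_frame_labels(notes, num_frames=5, fps=25.0)
print([row[39] for row in labels])  # [1, 1, 1, 0, 0]
print([row[43] for row in labels])  # [0, 0, 1, 1, 1]
```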

1.3. Pre-Processing

Extracted only the keyboard region from the YouTube videos to build the input images.
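The cropping step can be sketched with a plain rectangular crop. The bounding box values and the helper `crop_region` below are hypothetical. How the keyboard region is actually located in this project is not described here.

```python
def crop_region(frame, top, left, height, width):
    """Crop a rectangle from a frame stored as a list of pixel rows."""
    if (top < 0 or left < 0
            or top + height > len(frame)
            or left + width > len(frame[0])):
        raise ValueError("crop box falls outside the frame")
    return [row[left:left + width] for row in frame[top:top + height]]


# Toy 6x8 "frame" whose pixels record their own (row, col) position.
frame = [[(r, c) for c in range(8)] for r in range(6)]

# Assumed keyboard bounding box: rows 3-4, columns 1-6.
keyboard = crop_region(frame, top=3, left=1, height=2, width=6)
print(len(keyboard), len(keyboard[0]))  # 2 6
```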

2. Model

reference : Audeo: Audio Generation for a Silent Performance Video (Su et al., NeurIPS 2020)

To determine accurately whether a key is pressed, the input consists of 5 frames in total: the target frame plus the 2 frames before and the 2 frames after it.
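The 5-frame input can be sketched as an index window around the target frame. Clamping indices at the start and end of the video (repeating the first or last frame) is an assumption made here for illustration. The README does not say how boundary frames are handled.

```python
def frame_window(num_frames, target, radius=2):
    """Return the frame indices target-radius .. target+radius.

    Indices that fall outside the video are clamped to the nearest
    valid frame (an assumed boundary policy).
    """
    if not 0 <= target < num_frames:
        raise IndexError("target frame is outside the video")
    return [
        min(max(i, 0), num_frames - 1)
        for i in range(target - radius, target + radius + 1)
    ]


print(frame_window(100, 50))  # [48, 49, 50, 51, 52]
print(frame_window(100, 0))   # [0, 0, 0, 1, 2]
print(frame_window(100, 99))  # [97, 98, 99, 99, 99]
```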

Video2RollNet	Infers the keys pressed in the target frame and outputs a Piano Roll (5, 88)
Roll2MidiNet	Refines the Piano Roll and outputs a Pseudo MIDI (5, 88), i.e. a cleaner Piano Roll
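To show how a binary piano roll relates to playable notes, the sketch below merges consecutive active frames of each key into note events. It is a generic conversion, not the logic of `roll_to_midi.py`. The 25 fps frame rate and the function name `roll_to_notes` are assumptions, while the offset of 21 is the MIDI note number of the lowest piano key (A0).

```python
LOWEST_MIDI_PITCH = 21  # MIDI note number of A0


def roll_to_notes(roll, fps=25.0):
    """Turn a binary roll (frames x keys) into note events.

    Returns a sorted list of (onset_sec, offset_sec, midi_pitch).
    """
    notes = []
    num_frames = len(roll)
    num_keys = len(roll[0]) if roll else 0
    for key in range(num_keys):
        start = None
        # Iterate one step past the end so open notes get closed.
        for t in range(num_frames + 1):
            active = t < num_frames and roll[t][key] == 1
            if active and start is None:
                start = t
            elif not active and start is not None:
                notes.append((start / fps, t / fps, key + LOWEST_MIDI_PITCH))
                start = None
    return sorted(notes)


# Toy roll: key 39 held for frames 0-2, key 43 held for frames 2-4.
roll = [[0] * 88 for _ in range(5)]
for t in (0, 1, 2):
    roll[t][39] = 1
for t in (2, 3, 4):
    roll[t][43] = 1

notes = roll_to_notes(roll)
for onset, offset, pitch in notes:
    print(pitch, onset, offset)
```

The resulting events could then be written out with a MIDI library of choice.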

3. Loss & Metric

The F1 score is used.

$$ F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $$
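The formula can be applied to two binary piano rolls by counting true positives, false positives, and false negatives over every (frame, key) entry. Pooling all entries before computing the score (micro-averaging) is an assumption. The README does not state which averaging the project uses.

```python
def f1_score(y_true, y_pred):
    """Micro-averaged F1 between two binary rolls of equal shape."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1:
                fp += 1
            elif t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with 2 frames and 4 keys: 2 hits, 1 false alarm, 1 miss.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0]]
score = f1_score(y_true, y_pred)
print(round(score, 4))  # 0.6667
```

Here precision and recall are both 2/3, so F1 is also 2/3.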

Serving Process

Baseline Models

Su, Kun, Xiulong Liu, and Eli Shlizerman. "Audeo: Audio generation for a silent performance video." Advances in Neural Information Processing Systems 33 (2020): 3325-3337.
Gan, Chuang, et al. "Foley music: Learning to generate music from videos." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16. Springer International Publishing, 2020.

Dataset in Use

Repository Structure

The repository is organized as follows.


├── models
|      ├── make_wav.py
|      ├── roll_to_midi.py
|      └── video_to_roll.py
├── server_training
|      ├── dataset
|      ├── model
|      ├── tools
|      ├── trainer
|      ├── util
|      ├── notebooks
|            ...
├── README.md
├── frontend.py
├── game.py
├── generate_score.py
├── midi_file.py
├── preprocess.py
├── inference.py
└── process.py

Roles

Role	Developer
Data Collection, Audeo Model Experiment, Research	이종목
Data Collection, Foley Model Experiment, Research	정성혜
Data preprocessing development and optimization, sheet music generation development, Multi-Modal experiments	강나훈
Frontend / Backend	김근욱
Frontend Application development, GitHub Actions CD development	김희상

Contributors

강나훈

이종목

김희상

김근욱

정성혜
