Skip to content

Parsing, Ini matrix#9

Open
maxtretiakov wants to merge 11 commits intovmkononenko:devfrom
maxtretiakov:dev
Open

Parsing, Ini matrix#9
maxtretiakov wants to merge 11 commits intovmkononenko:devfrom
maxtretiakov:dev

Conversation

@maxtretiakov
Copy link
Copy Markdown
Collaborator

No description provided.

Copy link
Copy Markdown
Owner

@vmkononenko vmkononenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Try to go further to your end goal and try to use your dataset classes from the tct_hmm.py script. It'll highlight and dictate some of the structural changes.
  2. Your commit messages need to be improved. Get familiar with these guidelines: https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53

Comment thread tools/DatasetExtraction/dataset.py Outdated
pass


class Iso_Dataset(Dataset):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first things first:

  1. Each child dataset implementation should have a separate class.
  2. Please decide which naming style are you going to use - if it's camel case then you'll have IsoDataset.

Comment thread tools/DatasetExtraction/dataset.py Outdated
class Iso_Dataset(Dataset):

# assume we have chord_id dictionary which maps Chords into ids
def get_hmm_matrix_i(self, chord_id, dir_path):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of having this function duplicated in each child class? We discussed earlier that this is a parent class job.

Comment thread tools/DatasetExtraction/dataset.py Outdated
# assume we have chord_id dictionary which maps Chords into ids
def get_hmm_matrix_i(self, chord_id, dir_path):
chord_id = {}
song_list = self.load(dir_path)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you use songlist defined in parent?

Comment thread tools/DatasetExtraction/dataset.py Outdated
song_list = self.load(dir_path)
i_matrix = np.array(len(chord_id), dtype=float)
for song in song_list:
i_matrix[chord_id.get(song.chords_list[0].label)] += 1 #для каждого аккорда получили кол-во раз, когда аккорд являлся первым в песне
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use English for all comments/code/variable names etc.

Comment thread tools/DatasetExtraction/dataset.py Outdated
i_matrix = np.array(len(chord_id), dtype=float)
for song in song_list:
i_matrix[chord_id.get(song.chords_list[0].label)] += 1 #для каждого аккорда получили кол-во раз, когда аккорд являлся первым в песне
i_matrix = i_matrix/len(song_list) #оцениваем вероятность через частоту
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use English for all comments/code/variable names etc.

Comment thread tools/DatasetExtraction/dataset.py Outdated


if __name__ == '__main__':
Iso_Dataset().load("D:\\ChordsRecognition\\music-dsp\\tools\\DatasetExtraction\\test_chords_txt\\*.txt")
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no hardcoding of your dev machine specific stuff, no one needs it except of you

def __init__(self, chords_list):
self.chords_list = chords_list
try:
if chords_list[0].label == 'N':
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you removing "no chord" representation?

@@ -0,0 +1,62 @@
0.000000 2.612267 N
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far I understand these are your test files generated for checking correctness of the loading mechanism?
In this case we don't need them in the repo. Otherwise think through file/directory structure and come up with self explanatory naming. How's one supposed to know what dataset and song is represented by 1.txt?

Comment on lines +67 to +74
if label == 'Emin/4':
label = 'E:min/4'
elif label == 'A7/3':
label = 'A:7/3'
elif label == 'Bb7/3':
label = 'Bb:7/3'
elif label == 'Bb7/5':
label = 'Bb:7/5'
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this can all be done with a simple generic logic for all cases without hardcoding.

Comment on lines +57 to +61
splits = line.split()
if len(splits) == 3:
s = splits[0]
e = splits[1]
l = self.label_error_modify(splits[2])
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this types of parsing always deserves a comment. Just give an example of a string to be parsed and highlight what is s, e and l. I had to go and look up dataset format to find out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants