Hi,
I ran the BLISS experiment with the pretrained embeddings downloaded from https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.vec for two languages (English and Tamil). It works great, and I am getting good results.
But when I try to create my own embeddings by
downloading the Wikipedia XML dump (https://dumps.wikimedia.org/tawiki/latest/tawiki-latest-pages-articles.xml.bz2),
extracting it to text (using http://wiki.apertium.org/wiki/Wikipedia_Extractor), and
training word embeddings with fastText (https://fasttext.cc/docs/en/unsupervised-tutorial.html) [$ ./fasttext skipgram -input input_data_loc/wikidump_ta.txt -output result/ta -dim 300],
I get "0" precision for some languages, and for other languages this error: Dimension out of range (expected to be in range of [-1, 0], but got 1)
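A common cause of this kind of failure with a custom pipeline is a malformed .vec file: the first line must be a header of the form `<vocab_size> <dim>`, and every following line must hold one token plus exactly `dim` floats. A quick sanity check might look like this (a hedged sketch — `check_vec_file` is a hypothetical helper, not part of BLISS or fastText):

```python
def check_vec_file(path, expected_dim=300, max_lines=100000):
    """Verify a fastText .vec text file: header plus consistent vector width.

    Returns (vocab_size, dim, bad_line_numbers). Any entry in
    bad_line_numbers marks a line whose field count is not dim + 1,
    e.g. a token containing a raw space or a truncated vector.
    """
    with open(path, encoding="utf-8", errors="replace") as f:
        vocab_size, dim = map(int, f.readline().split())
        bad = []
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            parts = line.rstrip("\n").split(" ")
            if len(parts) != dim + 1:
                bad.append(i + 2)  # 1-based line number within the file
    return vocab_size, dim, bad
```

If `dim` in the header does not match what the loader expects, or `bad` is non-empty, the loader may end up with a flat (1-D) tensor instead of a (vocab × dim) matrix, which would explain the error below.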
Here is one of the execution results:
Loading faiss with AVX2 support.
2019-11-20 09:53:10,122: INFO: data params
data_dir: ./muse_data/
languages: [{'filename': 'wiki.en.vec', 'name': 'en'}, {'filename': 'ta.vec', 'name': 'ta'}]
mean_center: False
mode: rand
output_dir: ./output/en-ta
supervised: {'fname': 'en-ta.0-5000.txt', 'max_freq': -1}
unit_norm: True
unsupervised: True
save_dir: ./output/en-ta/run-13
2019-11-20 09:53:10,123: INFO: generator parameters
embed_dim: 300
init: eye
2019-11-20 09:53:10,123: INFO: discriminator parameters
dropout_prob: 0.1
embed_dim: 300
hidden_dim: 2048
max_freq: 75000
2019-11-20 09:53:10,123: INFO: GAN parameters
src: en
tgt: ta
2019-11-20 09:53:10,123: INFO: Training Parameters
batch_sz: 32
epochs: 200
eval_batches: 500
factor: {'ortho': 1.0, 'sup': 1.0, 'unsup': 1.0}
iters_per_epoch: 5000
k: 10
log_after: 500
lr_decay: 0.98
lr_local_dk: 0.5
num_disc_rounds: 5
num_gen_rounds: 1
num_nbrs: 100000
num_supervised_rounds: 1
opt: SGD
opt_params: {'lr': 0.1}
ortho_params: {'ortho_type': 'none'}
orthogonal: auto_loss
patience: 2
procrustes_dict_size: 0
procrustes_iters: 3
procrustes_tgt_rank: 15000
procrustes_thresh: 0.0
smoothing: 0.1
eval_metric: unsupervised
sup_opt: SGD
supervised_method: rcsls
2019-11-20 09:53:16,774: INFO: Unit Norming
2019-11-20 09:53:17,026: INFO: Unit Norming
Traceback (most recent call last):
  File "/home/bharaj/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/bharaj/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/bharaj/third_work/BLISS/bliss/main.py", line 86, in <module>
    lang.load(w['filename'], data_dir, max_freq=75000)
  File "/home/bharaj/third_work/BLISS/bliss/data/data.py", line 104, in load
    self.embeddings.div(self.embeddings.norm(2, 1, keepdim=True))
  File "/home/bharaj/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 253, in norm
    return torch.norm(self, p, dim, keepdim, dtype=dtype)
  File "/home/bharaj/anaconda3/lib/python3.6/site-packages/torch/functional.py", line 705, in norm
    return torch._C._VariableFunctions.norm(input, p, dim, keepdim=keepdim)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
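The traceback points at the unit-norming step: `norm(2, 1, keepdim=True)` requires a 2-D (vocab × dim) tensor and raises exactly this IndexError when the loaded embeddings come out 1-D. A minimal reproduction with placeholder data (this is not the BLISS loader itself, just an illustration of the shape requirement):

```python
import torch

# A 1-D tensor has only dimension 0, so taking the norm over dim=1
# fails with the same IndexError as in the traceback above:
flat = torch.randn(300)
try:
    flat.norm(2, 1, keepdim=True)
except IndexError as e:
    print(e)  # Dimension out of range (expected to be in range of [-1, 0], but got 1)

# A correctly loaded embedding matrix is 2-D (vocab_size x embed_dim);
# unit-norming along dim=1 then works as data.py intends:
emb = torch.randn(5, 300)
emb = emb.div(emb.norm(2, 1, keepdim=True))
print(emb.norm(2, 1))  # each row now has unit L2 norm
```

So the thing to verify is the shape of `self.embeddings` right after loading: if it is 1-D, the .vec file was most likely parsed into a single flat vector rather than a matrix.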