diff --git a/FAQ.md b/FAQ.md index 081cb50..ebda5db 100644 --- a/FAQ.md +++ b/FAQ.md @@ -1,6 +1,6 @@ --- layout: page -title: FAQ +title: Frequently asked questions order: 4 --- @@ -10,73 +10,100 @@ order: 4 ## General -### Why neural machine translation (NMT)? +### Why should I use OpenNMT? -We care about neural machine translation for several reasons. +OpenNMT has been successfully used in [many research and industry applications](/publications). It is now a robust and complete ecosystem for machine translation. -1) Results show that NMT produces automatic translations that are -significantly preferred by humans to other machine translation -outputs. This has led several companies to switch NMT-based -translation systems. +Here are some reasons to use OpenNMT: -2) Similar methods (often called seq2seq) are also effective for many -other NLP and language-related applications such as dialogue, image -captioning, and summarization. A recent talk from HarvardNLP describing some of these recent -advances is available here. +* The 2 supported implementations, [OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py) and [OpenNMT-tf](https://github.com/OpenNMT/OpenNMT-tf), give the choice between [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/), which are 2 of the most popular deep learning toolkits. +* The implementations are packed with many configurable models and features, including support for other generation tasks: summarization, speech to text, image to text, etc. +* The [OpenNMT ecosystem](https://github.com/OpenNMT) includes tools and projects to cover the full NMT workflow: from advanced tokenization to training automation. +* The NMT systems are able to reach high translation quality and speed, on par with the best online translation offerings. +* You get responsive and free support on the [community forum](http://forum.opennmt.net/). -3) NMT has been used as a representative application of the recent -success of deep learning-based artificial intelligence. For instance a -recent NYT -magazine cover story focused on Google's NMT system. +### Who is behind OpenNMT? +OpenNMT is currently supported by the companies [SYSTRAN](http://www.systransoft.com/) and [Ubiqus](https://www.ubiqus.com/). The main maintainers are: -### Where can I go to learn about NMT? +* Guillaume Klein (SYSTRAN) +* Vincent Nguyen (Ubiqus) +* Jean Senellart (SYSTRAN) -We recommend starting with the ACL'16 NMT tutorial -produced by researchers at Stanford and NYU. The tutorial also -includes a detailed bibliography describing work in the field. +The project was initiated in December 2016 by the [Harvard NLP](https://nlp.seas.harvard.edu/) group and SYSTRAN. -### What do I need to train an NMT model? +### Which OpenNMT implementation should I use? -You just need two files: a source file and a target file. Each with -one sentence per line with words space separated. These files can come from -standard free translation corpora such a WMT, or it can be any other sources -you want to train from. +OpenNMT has 2 main implementations: [OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py) and [OpenNMT-tf](https://github.com/OpenNMT/OpenNMT-tf). +You first need to decide between [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/). Both frameworks have strengths and weaknesses; see which one is better suited to your needs and easier for you to integrate. -## Installation / Setup +Then, each OpenNMT implementation has its own design and set of unique features.
For example, OpenNMT-py has better support for other tasks (summarization, speech, image) and is generally faster, while OpenNMT-tf supports modular architectures and language modeling. See their respective GitHub repositories for more details. + +### How to cite the OpenNMT project? + +If you are using the project for academic work, please cite the initial [system demonstration paper](https://www.aclweb.org/anthology/P17-4012): + +``` +@inproceedings{klein-etal-2017-opennmt, + title = "{O}pen{NMT}: Open-Source Toolkit for Neural Machine Translation", + author = "Klein, Guillaume and + Kim, Yoon and + Deng, Yuntian and + Senellart, Jean and + Rush, Alexander", + booktitle = "Proceedings of {ACL} 2017, System Demonstrations", + month = jul, + year = "2017", + address = "Vancouver, Canada", + publisher = "Association for Computational Linguistics", + url = "https://www.aclweb.org/anthology/P17-4012", + pages = "67--72", +} +``` + +### Where can I get support? + +The [OpenNMT forum](http://forum.opennmt.net/) is the place to go for asking general questions or requesting support. You can usually expect a reply within 1 or 2 working days. + +For bugs, please report them directly on GitHub. + +### Where can I go to learn about NMT? + +We recommend starting with the [ACL'16 NMT tutorial](https://sites.google.com/site/acl16nmt/home) produced by researchers at Stanford and NYU. The tutorial also includes a detailed bibliography describing work in the field. + +## Requirements + +### What do I need to train an NMT model? + +You just need two files: a source file and a target file, each with one sentence per line and words separated by spaces. These files can come from standard free translation corpora such as WMT, or from any other source you want to train from. ### What type of computer do I need to train with? -While in theory you can train on any machine; in practice for all but -trivally small data sets you will need a GPU that supports CUDA if you -want training to finish in a reasonable amount of time. For -medium-size models you will need at least 4GB; for full-size -state-of-the-art models 8-12GB is recommend. +While in theory you can train on any machine, in practice for all but trivially small data sets you will need a GPU that supports CUDA if you want training to finish in a reasonable amount of time. -## Models +We recommend a GPU with at least 8GB of memory. -### How can I replicate the full-scale NMT translation results? +## Models -We have posted a complete tutorial for training a German-to-English translation system on standard data. +### How can I replicate the full-scale NMT translation results? +We published a [baseline training script](https://github.com/OpenNMT/OpenNMT-tf/tree/master/scripts/wmt) to train a robust German-English translation model using OpenNMT-tf. If you are using OpenNMT-py, you can apply the same preprocessing and train a Transformer model with the [recommended options](http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model-do-you-support-multi-gpu). ### Are there pretrained translation models that I can try? -There are several different pretrained models available on the pretrained models page for the Torch version. +There are several pretrained models available for [OpenNMT-py](/Models-py) and [OpenNMT-tf](/Models-tf). -### Where can I get training data for translation from X-to-X? +### Where can I get training data for translation? -Try the OPUS Project. An open-source collection of parallel corpora. After stripping XML tags, you should be able to use the raw files directly in OpenNMT. +Try the [OPUS](http://opus.lingfil.uu.se/) project, an open-source collection of parallel corpora. After stripping XML tags, you should be able to use the raw files directly in OpenNMT.
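As a purely illustrative aside on that last answer, here is a minimal Python sketch of such a cleanup pass. The file names are hypothetical and the tag stripping is deliberately naive (many OPUS corpora can also be downloaded as pre-extracted plain text), so treat it as a starting point rather than a reference implementation.

```python
import re
from pathlib import Path

# Hypothetical file names: adapt them to the corpus you actually downloaded.
RAW_SRC, RAW_TGT = Path("opus-raw.de"), Path("opus-raw.en")
CLEAN_SRC, CLEAN_TGT = Path("train.de"), Path("train.en")

TAG_RE = re.compile(r"<[^>]+>")  # naive removal of XML-like markup


def clean_file(raw_path: Path, out_path: Path) -> int:
    """Strip markup, normalize whitespace, and keep one sentence per line."""
    count = 0
    with raw_path.open(encoding="utf-8") as raw, out_path.open("w", encoding="utf-8") as out:
        for line in raw:
            text = " ".join(TAG_RE.sub(" ", line).split())
            out.write(text + "\n")
            count += 1
    return count


n_src = clean_file(RAW_SRC, CLEAN_SRC)
n_tgt = clean_file(RAW_TGT, CLEAN_TGT)

# The source and target files must stay line-aligned: line N of one file
# is the translation of line N of the other.
if n_src != n_tgt:
    raise SystemExit(f"Corpus is not aligned: {n_src} source lines vs {n_tgt} target lines")
print(f"Wrote {n_src} aligned sentence pairs")
```

The check at the end is the important invariant: as noted in the answer about training data above, OpenNMT simply pairs the two files line by line, so any cleanup must keep them the same length.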
### I am interested in other seq2seq-like problems such as summarization, dialogue, tree-generation. Can OpenNMT work for these? -Yes. OpenNMT is a general-purpose attention-based seq2seq system. There is very little code that is translation specific, and so it should be effective for many of these applications. +Yes. OpenNMT is a general-purpose attention-based seq2seq system. There is very little code that is translation specific, and so it should be effective for many of these applications. -For the case of summarization, OpenNMT has been shown to be more effective than neural systems like NAMAS, and will be supported going forward. See the models page for a pretrained summarization system on the Gigaword dataset. +For the case of summarization, OpenNMT has been shown to be more effective than neural systems like [NAMAS](https://github.com/facebook/NAMAS), and will be supported going forward. See the [OpenNMT-py models](/Models-py) page for a pretrained summarization system on the Gigaword dataset. -### I am interested in variants of seq2seq such as image-to-sequence generation. Can OpenNMT work for these? +### I am interested in variants of seq2seq such as image-to-text generation. Can OpenNMT work for these? -Yes. As an example, we have implemented a relatively general-purpose im2text system, with a small amount of additional code. Feel free to use this as a model for extending OpenNMT. +Yes. OpenNMT-py includes a relatively general-purpose [im2text](http://opennmt.net/OpenNMT-py/im2text.html) system, with a small amount of additional code. Feel free to use this as a model for extending OpenNMT. diff --git a/LICENSE.md b/LICENSE similarity index 93% rename from LICENSE.md rename to LICENSE index c344d14..983fa23 100644 --- a/LICENSE.md +++ b/LICENSE @@ -1,9 +1,9 @@ -# Released under MIT License +MIT License -Copyright (c) 2013 Mark Otto. +Copyright (c) 2019 The OpenNMT authors. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. \ No newline at end of file +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/Models-py.md b/Models-py.md index ba38163..1b95ca1 100644 --- a/Models-py.md +++ b/Models-py.md @@ -1,58 +1,74 @@ --- layout: page -title: PyTorch Models -order: 3 +title: OpenNMT-py models --- -## Pretrained - - -Available models trained using OpenNMT. - -* [German to English](http://lstm.seas.harvard.edu/latex/opennmt-py-models/translate/de-en/baseline-brnn2.s131_acc_62.71_ppl_7.74_e20.pt) -* [English Summarization](http://lstm.seas.harvard.edu/latex/opennmt-py-models/summary/model-copy_acc_51.78_ppl_11.71_e20.pt) -* [Chinese Summarization](http://lstm.seas.harvard.edu/latex/opennmt-py-models/summary/LCSTS/model_acc_56.86_ppl_10.97_e11.pt) -* [Dialog System](http://lstm.seas.harvard.edu/latex/opennmt-py-models/dialog/model_acc_39.74_ppl_26.63_e13.pt) - -## Benchmarks - -This page benchmarks training results of open-source NMT systems with generated models of OpenNMT and other systems. - -### English-> German (WMT) - - -| Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -| 2018/03/15
Baseline | [WMT](https://s3.amazonaws.com/opennmt-trainingdata/wmt_ende_sp.tar.gz) | OpenNMT | 6 layers, LSTM 512, BPE, Transformer | | | [here](http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model) | BLEU Score: WMT14 26.89 WMT17: 28.09 | [here](https://s3.amazonaws.com/opennmt-models/transformer-ende-wmt-pyOnmt.tar.gz) | - -### German->English - -| Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -| 2018/02/11
Baseline | [IWSLT '14 DE-EN](https://github.com/pytorch/fairseq/blob/e734b0fa58fcf02ded15c236289b3bd61c4cffdf/data/prepare-iwslt14.sh) | OpenNMT `d4ab35a` | 2 layers, LSTM 500, WE 500, encoder_type brnn input feed
20 epochs | Trained on 1 GPU TITAN X | | | BLEU Score: 30.33 | 203MB [here](https://s3.amazonaws.com/opennmt-models/iwslt-brnn2.s131_acc_62.71_ppl_7.74_e20.pt) | - -### English Summarization - -| Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -| 2018/02/11
Baseline | [Gigaword Standard](https://github.com/harvardnlp/sent-summary) | OpenNMT `d4ab35a` | 2 layers, LSTM 500, WE 500, input feed
20 epochs | Trained on 1 GPU TITAN X | | | Gigaword F-Score R1: 33.60 R2: 16.29 RL: 31.45 | 331MB [here](https://s3.amazonaws.com/opennmt-models/gigaword_nocopy_acc_51.33_ppl_12.74_e20.pt) | -| 2018/02/22
Baseline | [Gigaword Standard](https://github.com/harvardnlp/sent-summary) | OpenNMT `338b3b1` | 2 layers, LSTM 500, WE 500, input feed, copy_attn, reuse_copy_attn
20 epochs | Trained on 1 GPU TITAN X | | replace_unk | Gigaword F-Score R1: 35.51 R2: 17.35 RL: 33.17 | 331MB [here](https://s3.amazonaws.com/opennmt-models/gigaword_copy_acc_51.78_ppl_11.71_e20.pt) | - - - | Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -| 2018/03/20
| [CNN/Daily Mail](https://github.com/harvardnlp/sent-summary) | OpenNMT | Transfomer 6x512 | Trained on 1 GPU TITAN X | | [here](http://opennmt.net/OpenNMT-py/Summarization.html) | Gigaword F-Score R1: R2: RL: | 1.1GB [here](https://s3.amazonaws.com/opennmt-models/sum_transformer_model_acc_57.25_ppl_9.22_e16.pt) | -| 2018/03/20
| [CNN/Daily Mail](https://github.com/harvardnlp/sent-summary) | OpenNMT | 1 layers BiLSTM 512 | Trained on 1 GPU TITAN X | | | Gigaword F-Score R1: 39.12 R2: 17.35 RL: 36.12 | 900MB [here](https://s3.amazonaws.com/opennmt-models/ada6_bridge_oldcopy_tagged_larger_acc_54.84_ppl_10.58_e17.pt) | - -### Chinese Summarization - -| Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -|[playma](https://github.com/playma) 2018/02/25 | [LCSTS](http://icrc.hitsz.edu.cn/Article/show/139.html)
src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100 | OpenNMT `338b3b1` | 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed
AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15
30 epochs | | | | Gigaword F-Score R1: 35.67 R2: 23.06 RL: 33.14 | 99MB [here](https://s3.amazonaws.com/opennmt-models/lcsts_acc_56.86_ppl_10.97_e11.pt) | - -### Dialog System - -| Who/When | Corpus Prep | Training Tool | Training Parameters | Server Details | Training Time/Memory | Translation Parameters | Scores | Model | -|:------------- |:--------------- |:-------------|:-------------------|:---------------|:-------------|:------------|:------|:-----| -| 2018/02/22
Baseline | [Opensubtitles](http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz) | OpenNMT `338b3b1` | 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7
13 epochs | Trained on 1 GPU TITAN X | | | TBD | 355MB [here](https://s3.amazonaws.com/opennmt-models/dialog_acc_39.74_ppl_26.63_e13.pt) | - +This page lists pretrained models for OpenNMT-py. + +* TOC +{:toc} + +## Translation + +{:.pretrained} +| | English-German - Transformer ([download](https://s3.amazonaws.com/opennmt-models/transformer-ende-wmt-pyOnmt.tar.gz)) | +| --- | --- | +| Configuration | Base Transformer configuration with standard [training options](http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model-do-you-support-multi-gpu) | +| Data | [WMT](https://s3.amazonaws.com/opennmt-trainingdata/wmt_ende_sp.tar.gz) with shared SentencePiece model | +| BLEU | newstest2014 = 26.89
newstest2017 = 28.09 | + +{:.pretrained} +| | German-English - 2-layer BiLSTM ([download](https://s3.amazonaws.com/opennmt-models/iwslt-brnn2.s131_acc_62.71_ppl_7.74_e20.pt)) | +| --- | --- | +| Configuration | 2-layer BiLSTM with hidden size 500 trained for 20 epochs | +| Data | [IWSLT '14 DE-EN](https://github.com/pytorch/fairseq/blob/e734b0fa58fcf02ded15c236289b3bd61c4cffdf/data/prepare-iwslt14.sh) | +| BLEU | 30.33 | + +## Summarization + +### English + +{:.pretrained} +| | 2-layer LSTM ([download](https://s3.amazonaws.com/opennmt-models/gigaword_nocopy_acc_51.33_ppl_12.74_e20.pt)) | +| --- | --- | +| Configuration | 2-layer LSTM with hidden size 500 trained for 20 epochs | +| Data | [Gigaword standard](https://github.com/harvardnlp/sent-summary) | +| Gigaword F-Score | R1 = 33.60
R2 = 16.29
RL = 31.45 | + +{:.pretrained} +| | 2-layer LSTM with copy attention ([download](https://s3.amazonaws.com/opennmt-models/gigaword_copy_acc_51.78_ppl_11.71_e20.pt)) | +| --- | --- | +| Configuration | 2-layer LSTM with hidden size 500 and copy attention trained for 20 epochs | +| Data | [Gigaword standard](https://github.com/harvardnlp/sent-summary) | +| Gigaword F-Score | R1 = 35.51
R2 = 17.35
RL = 33.17 | + +{:.pretrained} +| | Transformer ([download](https://s3.amazonaws.com/opennmt-models/sum_transformer_model_acc_57.25_ppl_9.22_e16.pt)) | +| --- | --- | +| Configuration | See OpenNMT-py [summarization example](http://opennmt.net/OpenNMT-py/Summarization.html) | +| Data | [CNN/Daily Mail](https://github.com/harvardnlp/sent-summary) | + +{:.pretrained} +| | 1-layer BiLSTM ([download](https://s3.amazonaws.com/opennmt-models/ada6_bridge_oldcopy_tagged_larger_acc_54.84_ppl_10.58_e17.pt)) | +| --- | --- | +| Configuration | See OpenNMT-py [summarization example](http://opennmt.net/OpenNMT-py/Summarization.html) | +| Data | [CNN/Daily Mail](https://github.com/harvardnlp/sent-summary) | +| Gigaword F-Score | R1 = 39.12
R2 = 17.35
RL = 36.12 | + +### Chinese + +{:.pretrained} +| | 1-layer BiLSTM ([download](https://s3.amazonaws.com/opennmt-models/lcsts_acc_56.86_ppl_10.97_e11.pt)) | +| --- | --- | +| Author | [playma](https://github.com/playma) | +| Configuration | **Preprocessing options:** src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100.
**Training options:** 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed, AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15, 30 epochs | +| Data | [LCSTS](http://icrc.hitsz.edu.cn/Article/show/139.html) | +| Gigaword F-Score | R1 = 35.67
R2 = 23.06
RL = 33.14 | + +## Dialog + +{:.pretrained} +| | 2-layer LSTM ([download](https://s3.amazonaws.com/opennmt-models/dialog_acc_39.74_ppl_26.63_e13.pt)) | +| --- | --- | +| Configuration | 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7, 13 epochs | +| Data | [OpenSubtitles](http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz) | diff --git a/Models-tf.md b/Models-tf.md new file mode 100644 index 0000000..1ebb9de --- /dev/null +++ b/Models-tf.md @@ -0,0 +1,15 @@ +--- +layout: page +title: OpenNMT-tf models +--- + +This page lists pretrained models for OpenNMT-tf. + +## Translation + +{:.pretrained} +| | English-German - Transformer ([checkpoint](https://s3.amazonaws.com/opennmt-models/averaged-ende-ckpt500k.tar.gz), [SavedModel](https://s3.amazonaws.com/opennmt-models/averaged-ende-export500k.tar.gz)) | +| --- | --- | +| Configuration | Base Transformer configuration with standard [training options](https://github.com/OpenNMT/OpenNMT-tf/tree/master/scripts/wmt) | +| Data | [WMT](https://s3.amazonaws.com/opennmt-trainingdata/wmt_ende_sp.tar.gz) with shared SentencePiece model | +| BLEU | newstest2014 = 26.9
newstest2017 = 28.0 | diff --git a/Models.md b/Models.md deleted file mode 100644 index eb351bc..0000000 --- a/Models.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: page -title: Torch Models -order: 3 ---- - -Available models trained using OpenNMT. - -### English-German Translation - -| Corpus | Tokenization | Model | BLEU score | | -| --- | --- | --- | --- | --- | -| [WMT 17](http://www.statmt.org/wmt17/translation-task.html) | `-mode aggressive -segment_case -joiner_annotate` | 2 layers, 1024 hidden size, 32K shared BPE | *newstest2017*: 25.1 | [download](https://s3.amazonaws.com/opennmt-models/wmt-ende_l2-h1024-bpe32k_release.tar.gz) | -| [WMT 17](http://www.statmt.org/wmt17/translation-task.html) (with back-translation) | `-mode aggressive -segment_case -joiner_annotate` | 2 layers, 1024 hidden size, 32K shared BPE | *newstest2016*: 32.8 | [download](https://s3.amazonaws.com/opennmt-models/wmt-ende-with-bt_l2-h1024-bpe32k_release.tar.gz) | - -### Multi-way Translation (FR,ES,PT,IT,RO<>FR,ES,PT,IT,RO) - -| Corpus | Tokenization | Model | | -| --- | --- | --- | -| [Multi](https://s3.amazonaws.com/opennmt-trainingdata/multi-esfritptro-parallel-tokenized.tgz) | `-mode aggressive` | 4 layers, 1000 hidden size,
600 embedding size, brnn
32K shared BPE | [download](https://s3.amazonaws.com/opennmt-models/onmt_esfritptro-4-1000-600_epoch13_3.12_release_v2.t7) | - -More details on the [forum](http://forum.opennmt.net/t/training-romance-multi-way-model/86). - -### English Summarization - -| Corpus | Model | Score | | -| --- | --- | --- | --- | -| [Gigaword standard](https://github.com/harvardnlp/sent-summary) | 2 layers, 500 hidden size,
500 embedding size, 11 epochs | R1: 33.13
R2: 16.09
RL: 31.00 | [download](https://s3.amazonaws.com/opennmt-models/textsum_epoch7_14.69_release.t7) | diff --git a/_config.yml b/_config.yml index 6246249..442e67c 100644 --- a/_config.yml +++ b/_config.yml @@ -13,10 +13,3 @@ description: 'An open source neural machine translation system.' url: http://opennmt.net baseurl: http://opennmt.net google_analytics: UA-89222039-1 -author: - name: 'Yoon Kim and HarvardNLP' - url: https://twitter.com/harvardnlp - - -# Custom vars -version: 0.1 diff --git a/_includes/footer.html b/_includes/footer.html new file mode 100644 index 0000000..1470fdf --- /dev/null +++ b/_includes/footer.html @@ -0,0 +1,23 @@ + diff --git a/_includes/head.html b/_includes/head.html index b4a2ff5..943ebb1 100644 --- a/_includes/head.html +++ b/_includes/head.html @@ -41,6 +41,4 @@ - - diff --git a/_includes/header.html b/_includes/header.html new file mode 100644 index 0000000..fff8356 --- /dev/null +++ b/_includes/header.html @@ -0,0 +1,29 @@ +
+
+ + +
+

+ + + {{ site.title }} + +

+

{{ site.description }}

+
+ + +
+
diff --git a/_includes/sidebar.html b/_includes/sidebar.html deleted file mode 100644 index 92fae0a..0000000 --- a/_includes/sidebar.html +++ /dev/null @@ -1,57 +0,0 @@ - diff --git a/_layouts/default.html b/_layouts/default.html index 111ae9f..7223551 100644 --- a/_layouts/default.html +++ b/_layouts/default.html @@ -3,12 +3,15 @@ {% include head.html %} - - {% include sidebar.html %} + + {% include header.html %} -
- {{ content }} +
+
+ {{ content }} +
+ {% include footer.html %} diff --git a/_layouts/index.html b/_layouts/index.html new file mode 100644 index 0000000..f7b5bf0 --- /dev/null +++ b/_layouts/index.html @@ -0,0 +1,7 @@ +--- +layout: default +--- + +
+ {{ content }} +
diff --git a/about.md b/about.md deleted file mode 100644 index 25a6d01..0000000 --- a/about.md +++ /dev/null @@ -1,30 +0,0 @@ ---- -layout: page -title: About -order: 5 ---- - - -OpenNMT was originally developed by Yoon Kim and harvardnlp. - -
-Natural language processing research group at Harvard SEAS -
- -Major source contributions and support come from SYSTRAN. - -
-SYSTRAN - PURE NEURAL MACHINE TRANSLATION - ARTIFICIAL INTELLIGENCE AND DEEP LEARNING -
- -## Technical report - -A technical report on OpenNMT is available. If you use the system for academic work, please cite: - - @ARTICLE{2017opennmt, - author = { {Klein}, G. and {Kim}, Y. and {Deng}, Y. - and {Senellart}, J. and {Rush}, A.~M.}, - title = "{OpenNMT: Open-Source Toolkit - for Neural Machine Translation}", - journal = {ArXiv e-prints}, - eprint = {1701.02810} } diff --git a/atom.xml b/atom.xml deleted file mode 100644 index 96c9681..0000000 --- a/atom.xml +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: null ---- - - - - - {{ site.title }} - - - {{ site.time | date_to_xmlschema }} - {{ site.url }} - - {{ site.author.name }} - {{ site.author.email }} - - - {% for post in site.posts %} - - {{ post.title }} - - {{ post.date | date_to_xmlschema }} - {{ site.url }}{{ post.id }} - {{ post.content | xml_escape }} - - {% endfor %} - - diff --git a/history.md b/history.md new file mode 100644 index 0000000..8277050 --- /dev/null +++ b/history.md @@ -0,0 +1,16 @@ +--- +layout: page +title: History +order: 2 +--- + +{:#history} +| June 2016 | [Yoon Kim](http://www.people.fas.harvard.edu/~yoonkim/) from the Harvard NLP group publishes the project [seq2seq-attn](https://github.com/harvardnlp/seq2seq-attn) that lays the foundation of the OpenNMT initiative. | +| December 2016 | Initial release of [OpenNMT](https://github.com/OpenNMT/OpenNMT), the original implementation using LuaTorch. | +| January 2017 | Release of [CTranslate](https://github.com/OpenNMT/CTranslate), a custom and lightweight inference engine for OpenNMT models. | +| March 2017 | The PyTorch version [OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py) is released in collaboration with the Facebook AI Research team. | +| July 2017 | The project is awarded "Best Demonstration Paper Runner-Up" at [ACL 2017](https://www.aclweb.org/anthology/P17-4012). | +| November 2017 | The TensorFlow version [OpenNMT-tf](https://github.com/OpenNMT/OpenNMT-tf) is released. | +| March 2018 | The first [OpenNMT workshop](http://workshop-paris-2018.opennmt.net/) is held in Paris gathering more than 100 people from around the world. | +| July 2018 | Last version of the original LuaTorch implementation, now fully superseded by OpenNMT-py and PyTorch. | +| August 2018 | OpenNMT publishes the fastest model running on a single CPU core at [WNMT 2018](https://aclweb.org/anthology/papers/W/W18/W18-2715/) using the CTranslate engine. | diff --git a/index.md b/index.md index cbda812..6a572af 100644 --- a/index.md +++ b/index.md @@ -1,27 +1,71 @@ --- -layout: page +layout: index title: Home order: 1 --- -[OpenNMT](http://opennmt.net/) is an open source (MIT) initiative for neural machine translation and neural sequence modeling. +**OpenNMT** is an open source ecosystem for neural machine translation and neural sequence learning. -
+
-Since its launch in December 2016, OpenNMT has become a collection of implementations targeting both academia and industry. The systems are designed to be simple to use and easy to extend, while maintaining efficiency and state-of-the-art accuracy. +Started in December 2016 by the [Harvard NLP](https://nlp.seas.harvard.edu/) group and SYSTRAN, the project has since been used in [several research and industry applications](/publications). It is currently maintained by [SYSTRAN](http://www.systransoft.com/) and [Ubiqus](https://www.ubiqus.com/). -**OpenNMT has currently 3 main implementations:** +**OpenNMT provides implementations in 2 popular deep learning frameworks:** -* [OpenNMT-lua](https://github.com/OpenNMT/OpenNMT) (a.k.a. OpenNMT): the original project developed with [LuaTorch](http://torch.ch).
Full-featured, optimized, and stable code ready for quick experiments and production. -* [OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py): an OpenNMT-lua clone using the more modern [PyTorch](http://pytorch.org).
Initially created by the Facebook AI research team as an example, this implementation is easier to extend and particularly suited for research. -* [OpenNMT-tf](https://github.com/OpenNMT/OpenNMT-tf): a [TensorFlow](https://www.tensorflow.org/) alternative.
The more recent project focusing on large scale experiments and high performance model serving using the latest TensorFlow features. +

+

+

-All versions are currently maintained. +Each implementation has its own set of unique features but shares similar goals: -**Common features include:** +* Highly configurable model architectures and training procedures +* Efficient model serving capabilities for use in real world applications +* Extensions to allow other tasks such as text generation, tagging, summarization, image to text, and speech to text -* Simple general-purpose interface, requiring only source/target files. -* Highly configurable models and training procedures. -* Recent research features to improve system performance. -* Extensions to allow other sequence generation tasks such as summarization, image-to-text, or speech-recognition. -* Active community welcoming both academic and industrial requests and contributions. +**The OpenNMT ecosystem also includes projects to cover the full NMT workflow:** + +

+

+

diff --git a/public/css/general.css b/public/css/general.css index 5e0969b..33eda1a 100644 --- a/public/css/general.css +++ b/public/css/general.css @@ -1,69 +1,210 @@ -/* Language top bar */ -#select-language { - margin-top: 20px; - font-size: 0.7em; +html, body { + height: 100%; +} + +body { + background-color: #eee; +} + +#main-container { + position: relative; + overflow: hidden; + background-color: white; +} + +main a { + color: #ac4142; +} + +/* Header */ +header { + background-color: #ac4142; + box-shadow: 0 3px 4px -1px rgba(0,0,0,0.25); + padding-bottom: 10px; + z-index: 10; + position: relative; +} +header, header a { color: white; } -#select-language ul { + +#meta-links { + font-size: 0.8em; +} +#meta-links ul { list-style-type: none; padding: 0; -} -#select-language ul li { - display: inline-block; - padding: 0 5px; + margin: 0; } -/* Sidebar customization */ -.sidebar { - padding: 0 1rem; +#meta-links ul li { + display: inline-block; + margin-left: 15px; + border: 1px solid rgba(255, 255, 255, 0.3); + margin-top: -1px; } -.sidebar-about h1 { +#meta-links ul li a { display: block; - margin: 0 auto -1rem auto; - width:200px + color: rgba(255, 255, 255, 0.6); + padding: 3px 10px; } -.sidebar-about h1 center a img { - margin: 0; - border-radius: 0; - width: auto; +#meta-links ul li a:hover { + background-color: #d67272; + color: rgba(255, 255, 255, 1); + text-decoration: none; } -.lead { - font-size: 1.1rem; - line-height: 1.1rem; - padding: 1rem 0 1rem 0; +#title { text-align: center; } -.sidebar-nav ul { - list-style-type: none; - margin: 0; - padding-left: 20px; - font-size: 0.9em; +#title h1 { + font-family: serif; } +#title h1 img { + display: inline-block; + width: 60px; + vertical-align: middle; + margin: 0; +} +#title h1 a:hover { + text-decoration: none; +} +#title .description { + font-size: 0.9em; + color: #ddd; +} - -.container.sidebar-sticky > p { - font-size: 0.7rem; - padding: 2rem 0; +nav ul { + list-style-type: none; + padding: 0; + margin: 0; + text-align: center; +} +nav ul li { + display: inline-block; +} +nav ul li+li { + margin-left: 20px; } +/* Index cards */ +.cards { + flex-direction: row; + display: flex; + justify-content: center; +} +.cards .card { + display: inline-block; + flex: 0.3; + border-radius: 12px; + box-shadow: 0 1px 2px 0 rgba(60,64,67,0.302), 0 1px 3px 1px rgba(60,64,67,0.149); + overflow: hidden; +} +.cards .card+.card { + margin-left: 15px; +} +.cards .card .card-main { + display: block; + padding: 5px 15px; + color: inherit !important; +} +.cards .card .card-main:hover { + text-decoration: none; + background-color: #f1f1f1; +} +.cards .card img { + width: 80px; + margin: 0 auto 10px; +} +.cards .card .project { + text-align: center; + font-size: 1.3em; + margin-bottom: 10px; +} +.cards .card ul { + padding-top: 10px; + padding-bottom: 10px; + margin: 0; +} +.cards .card ul { + border-top: 1px solid #ccc; +} +.cards .card ul a { + color: #777; +} +#cards-primary .card { + flex: 0.4; +} +#cards-secondary .project { + font-size: 1.1em; +} +#cards-secondary .card-main { + height: 100%; +} +#history, #history td { + border: none; +} +#history tr+tr { + border-top: 1px solid #e5e5e5; +} +#history td { + padding-top: 1rem; + padding-bottom: 1rem; +} +#history tbody tr:nth-child(odd) td, #history tbody tr:nth-child(odd) th { + background-color: inherit; +} +#history td:first-child { + font-size: 0.8em; + font-weight: bold; + white-space: nowrap; + vertical-align: middle; + text-align: right; + padding-right: 0.8rem; +} -/* small devices */ 
-@media (min-width: 48em) { - .lead { - line-height: 1.2rem; - } +.pretrained tbody tr:nth-child(odd) td, .pretrained tbody tr:nth-child(odd) th { + background-color: inherit; +} +.pretrained td { + font-size: 0.9em; +} +.pretrained td:first-child { + width: 25%; + white-space: nowrap; + font-weight: bold; + background-color: #f9f9f9 !important; + vertical-align: top; } -@media (max-width: 48em) { - .sidebar-about h1 { - margin: 0 auto; - } +/* Footer */ +footer { + border-top: 1px solid #ccc; + padding: 20px 0; + font-size: 0.8em; + text-align: center; +} +#quick-links, #quick-links ul { + list-style-type: none; + padding: 0; + margin: 0; +} +#quick-links > li { + display: inline-block; +} +#quick-links > li+li { + margin-left: 15%; +} +#quick-links a { + color: inherit; +} +#quick-links ul { + text-align: left; } diff --git a/public/css/hyde.css b/public/css/hyde.css index b08f5d6..4e57e56 100644 --- a/public/css/hyde.css +++ b/public/css/hyde.css @@ -36,12 +36,12 @@ html { } @media (min-width: 48em) { html { - font-size: 16px; + font-size: 15px; } } @media (min-width: 58em) { html { - font-size: 20px; + font-size: 18px; } } diff --git a/public/pytorch.png b/public/pytorch.png new file mode 100644 index 0000000..bad49bf Binary files /dev/null and b/public/pytorch.png differ diff --git a/simple-attn.png b/public/simple-attn.png similarity index 100% rename from simple-attn.png rename to public/simple-attn.png diff --git a/public/tensorflow.png b/public/tensorflow.png new file mode 100644 index 0000000..b4b067e Binary files /dev/null and b/public/tensorflow.png differ diff --git a/publications.md b/publications.md new file mode 100644 index 0000000..460378d --- /dev/null +++ b/publications.md @@ -0,0 +1,46 @@ +--- +layout: page +title: Publications +--- + +If you are using OpenNMT for academic work, please cite the initial [system demonstration paper](https://www.aclweb.org/anthology/P17-4012): + +``` +@inproceedings{klein-etal-2017-opennmt, + title = "{O}pen{NMT}: Open-Source Toolkit for Neural Machine Translation", + author = "Klein, Guillaume and + Kim, Yoon and + Deng, Yuntian and + Senellart, Jean and + Rush, Alexander", + booktitle = "Proceedings of {ACL} 2017, System Demonstrations", + month = jul, + year = "2017", + address = "Vancouver, Canada", + publisher = "Association for Computational Linguistics", + url = "https://www.aclweb.org/anthology/P17-4012", + pages = "67--72", +} +``` + +## Research + +Here is a list of selected papers using OpenNMT: + +* [Challenges in Data-to-Document Generation](http://arxiv.org/abs/1707.08052). Sam Wiseman, Stuart M. Shieber, Alexander M. Rush. 2017. +* [Model compression via distillation and quantization](http://arxiv.org/abs/1802.05668). Antonio Polino, Razvan Pascanu, Dan Alistarh. 2018. +* [A causal framework for explaining the predictions of black-box sequence-to-sequence models](http://arxiv.org/abs/1707.01943). David Alvarez-Melis, Tommi S. Jaakkola. 2017. +* [Deep Learning Scaling is Predictable, Empirically](http://arxiv.org/abs/1712.00409). Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory F. Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou. 2017. +* [What You Get Is What You See: A Visual Markup Decompiler](http://arxiv.org/abs/1609.04938). Yuntian Deng, Anssi Kanervisto, Alexander M. Rush. 2016. +* [Semantically Equivalent Adversarial Rules for Debugging NLP models](https://www.aclweb.org/anthology/P18-1079). Ribeiro, Marco Tulio, Singh, Sameer, Guestrin, Carlos. 2018. 
+* [A Regularized Framework for Sparse and Structured Neural Attention](http://papers.nips.cc/paper/6926-a-regularized-framework-for-sparse-and-structured-neural-attention.pdf). Vlad Niculae, Mathieu Blondel. 2017. +* [Controllable Invariance through Adversarial Feature Learning](http://papers.nips.cc/paper/6661-controllable-invariance-through-adversarial-feature-learning.pdf). Qizhe Xie, Zihang Dai, Yulun Du, Eduard Hovy, Graham Neubig. 2017. +* [Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations](http://arxiv.org/abs/1705.09980). Rik van Noord, Johan Bos. 2017. +* [When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size)](http://arxiv.org/abs/1809.00069). Liang Huang, Kai Zhao, Mingbo Ma. 2018. +* [Handling Homographs in Neural Machine Translation](http://arxiv.org/abs/1708.06510). Frederick Liu, Han Lu, Graham Neubig. 2017. +* [Bottom-Up Abstractive Summarization](http://arxiv.org/abs/1808.10792). Sebastian Gehrmann, Yuntian Deng, Alexander M. Rush. 2018. +* [Dataset for a Neural Natural Language Interface for Databases (NNLIDB)](http://arxiv.org/abs/1707.03172). Florin Brad, Radu Iacob, Ionel Hosu, Traian Rebedea. 2017. +* [Coarse-to-Fine Attention Models for Document Summarization](https://www.aclweb.org/anthology/W17-4505). Jeffrey Ling, Alexander Rush. 2017. +* [Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models](http://arxiv.org/abs/1804.09299). Hendrik Strobelt, Sebastian Gehrmann, Michael Behrisch, Adam Perer, Hanspeter Pfister, Alexander M. Rush. 2018. + +Find more references on [Google Scholar](https://scholar.google.fr/scholar?cites=6651054115351140376).