Releases: OpenNMT/OpenNMT-py

OpenNMT-py v2.1.1

30 Apr 10:49
edf4b46

Fixes and improvements

  • Fix potential deadlock (b1a4615)
  • Add more CT2 conversion checks (e4ab06c)

OpenNMT-py v2.1.0

16 Apr 12:58
43e5c2a

New features

  • Allow vocab update when training from a checkpoint (cec3cc8, 2f70dfc)

Fixes and improvements

  • Various transform-related bug fixes
  • Fix beam warning and buffer reuse
  • Handle invalid lines in vocab file gracefully
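
As an illustration of what "gracefully" can mean for the vocab file item above, here is a minimal sketch that skips malformed lines instead of raising. It assumes a tab-separated `token<TAB>count` layout; that layout and the helper name `read_vocab_lines` are assumptions made for this example, not the actual OpenNMT-py implementation.

```python
def read_vocab_lines(lines):
    """Parse 'token<TAB>count' lines, skipping malformed ones.

    Hypothetical helper: the file layout is an assumption for illustration.
    Returns the parsed vocab and the 1-based numbers of skipped lines.
    """
    vocab, skipped = {}, []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # A valid line has exactly two fields: a non-empty token and an integer count.
        if len(fields) != 2 or not fields[0] or not fields[1].isdigit():
            skipped.append(line_no)
            continue
        vocab[fields[0]] = int(fields[1])
    return vocab, skipped
```

Returning the skipped line numbers lets the caller log a warning rather than aborting on the first bad line.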

OpenNMT-py v2.0.1

27 Jan 09:21
1cf165a

Fixes and improvements

  • Support embedding layer for larger vocabularies with GGNN (e8065b7)
  • Reorganize some inference options (9fb5f30)

OpenNMT-py v2.0.0

20 Jan 13:35
58bae87

First official release of the OpenNMT-py major upgrade to 2.0!

New features

  • Language Model (GPT-2 style) training and inference
  • Nucleus (top-p) sampling decoding
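
For readers unfamiliar with nucleus (top-p) sampling, the sketch below shows the general technique in plain Python: keep the smallest set of most probable tokens whose cumulative probability reaches `top_p`, renormalize, and sample from that set. The function names are hypothetical and this is not the OpenNMT-py code, which operates on batched tensors.

```python
import random


def top_p_filter(probs, top_p):
    """Return {token_index: renormalized_prob} for the nucleus of `probs`."""
    # Rank token indices from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(prob for _, prob in kept)
    return {index: prob / total for index, prob in kept}


def sample_top_p(probs, top_p, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = top_p_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

With `probs = [0.5, 0.3, 0.15, 0.05]` and `top_p = 0.75`, only the first two tokens remain candidates, so low-probability tokens in the tail can never be drawn.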

Fixes and improvements

  • Fix some BART default values

OpenNMT-py v2.0.0rc2

10 Nov 18:09
f12dd51

Fixes and improvements

  • Parallelize onmt_build_vocab (422d824)
  • Some fixes to the on-the-fly transforms
  • Some CTranslate2-related updates
  • Some fixes to the docs

This will be the first release to be automatically deployed via GitHub Actions.

OpenNMT-py v2.0.0rc1

25 Sep 17:15
2c63a53

This is the first release candidate for the OpenNMT-py major upgrade to 2.0.0!

The main idea behind this release is the almost complete makeover of the data loading pipeline. A new 'dynamic' paradigm is introduced, which allows transforms to be applied to the data on the fly.

This has a few advantages, amongst which:

  • removing or drastically reducing the preprocessing required to train a model;
  • increasing and simplifying the possibilities for data augmentation and manipulation through on-the-fly transforms.

These transforms can be specific tokenization methods, filters, noising, or any custom transform users may want to implement. Custom transform implementation is quite straightforward thanks to the existing base class and example implementations.
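
To make the idea of on-the-fly transforms concrete, here is a minimal, self-contained sketch of a transform chain applied lazily while reading examples. The `Transform` base class, the two example transforms, and `apply_transforms` are hypothetical names written for this illustration; they are not the OpenNMT-py base class or its API, which are described in the project docs.

```python
class Transform:
    """Hypothetical base class: takes one example, returns it or None to drop it."""

    def apply(self, example):
        raise NotImplementedError


class FilterTooLong(Transform):
    """Filter transform: drop examples whose source or target exceeds max_len tokens."""

    def __init__(self, max_len):
        self.max_len = max_len

    def apply(self, example):
        if len(example["src"]) > self.max_len or len(example["tgt"]) > self.max_len:
            return None
        return example


class Lowercase(Transform):
    """Normalization transform: lowercase every token on both sides."""

    def apply(self, example):
        return {side: [tok.lower() for tok in toks] for side, toks in example.items()}


def apply_transforms(examples, transforms):
    """Lazily yield transformed examples; nothing is preprocessed ahead of time."""
    for example in examples:
        for transform in transforms:
            example = transform.apply(example)
            if example is None:
                break  # a filter dropped this example
        else:
            yield example
```

Because `apply_transforms` is a generator, each example is transformed only when the training loop asks for it, which is what removes the separate preprocessing step.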

You can check out how to use this new data loading pipeline in the updated docs and examples.

All the readily available transforms are described here.

Performance

Given sufficient CPU resources relative to GPU computing power, most of the transforms should not slow training down. (Note: for now, one producer process is spawned per GPU, so you would ideally need 2N CPU threads for N GPUs.)

Breaking changes

A few features are dropped, at least for now:

  • audio, image and video inputs;
  • source word features.

Some very old checkpoints with previous fields and vocab structure are also incompatible with this new version.

For users who still need some of these features, the previous codebase will be retained as legacy in a separate branch. It will no longer receive extensive development from the core team, but PRs may still be accepted.

OpenNMT-py v1.2.0

17 Aug 15:26
60125c8

Fixes and improvements

  • Support pytorch 1.6 (e813f4d, eaaae6a)
  • Support official torch 1.6 AMP for mixed precision training (2ac1ed0)
  • Flag to override batch_size_multiple in FP16 mode, useful in some memory constrained setups (23e5018)
  • Pass a dict and allow custom options in preprocess/postprocess functions of REST server (41f0c02, 8ec54d2)
  • Allow different tokenization for source and target in REST server (bb2d045, 4659170)
  • Various bug fixes

OpenNMT-py v1.1.1

20 Mar 08:59

Fixes and improvements

  • Fix backward compatibility when there is no 'corpus_id' field (c313c28)

OpenNMT-py v1.1.0

19 Mar 14:45

New features

  • Support CTranslate2 models in REST server (91d5d57)
  • Extend support for custom preprocessing/postprocessing function in REST server by using return dictionaries (d14613d, 9619ac3, 92a7ba5)
  • Experimental: BART-like source noising (5940dcf)

Fixes and improvements

  • Add options to CTranslate2 release (e442f3f)
  • Fix dataset shard order (458fc48)
  • Rotate only the server logs, not training (189583a)
  • Fix alignment error with empty prediction (91287eb)

OpenNMT-py v1.0.2

05 Mar 14:53

Fixes and improvements

  • Enable CTranslate2 conversion of Transformers with relative position (db11135)
  • Adapt -replace_unk to use learned alignments if they exist (7625b53)