This version of PyLaia is being upgraded to support torch 1.8.1 so that it can run on A100 GPUs.
PyLaia is a device-agnostic, PyTorch-based deep learning toolkit for handwritten document analysis.
It is also a successor to Laia.
Get started by having a look at our Wiki!
Several (mostly undocumented) examples of its use are provided at PyLaia-examples.
In order to install PyLaia, follow this recipe:
git clone https://github.com/jpuigcerver/PyLaia
cd PyLaia
pip install -e .
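After the install finishes, it can be useful to confirm which torch version actually ended up in the environment, since this version targets torch 1.8.1. The following is a minimal sketch using only the standard library (Python 3.8+); the helper names `installed_version` and `version_tuple` are hypothetical and not part of PyLaia, and the comparison against 1.8.1 is only an assumption drawn from the note at the top of this README.

```python
import re
from importlib import metadata  # available in Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a string such as '1.8.1+cu111' into (1, 8, 1).

    Any local build suffix after '+' is ignored, and parsing stops at the
    first component that does not start with a digit.
    """
    release = version.split("+", 1)[0]
    numbers = []
    for piece in release.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


if __name__ == "__main__":
    found = installed_version("torch")
    if found is None:
        print("torch is not installed in this environment")
    elif version_tuple(found) < (1, 8, 1):
        print(f"torch {found} is older than 1.8.1")
    else:
        print(f"torch {found} looks recent enough")
```

Reading the version from package metadata avoids importing torch itself, so the check stays fast and works even when the GPU drivers are not set up yet.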
The following Python scripts will be installed on your system:
pylaia-htr-create-model
: Create a VGG-like model with BLSTMs on top for handwritten text recognition. The script has several options to customize the model. The architecture is based on the paper "Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?" (2017) by J. Puigcerver.
pylaia-htr-train-ctc
: Train a model using the CTC algorithm and a set of text-line images and their transcripts.
pylaia-htr-decode-ctc
: Decode text-line images using a trained model and the CTC algorithm. It can also output the character/word segmentation boundaries of the recognized symbols.
pylaia-htr-netout
: Dump the output of the model for a set of text-line images in order to decode using an external language model.
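To check that the four scripts listed above are reachable after installation, something like the following sketch may help. It only looks the names up on `PATH` with the standard library and does not run the scripts or assume anything about their options; `PYLAIA_SCRIPTS` and `locate_scripts` are hypothetical names introduced here for illustration.

```python
import shutil

# Script names as listed in this README.
PYLAIA_SCRIPTS = (
    "pylaia-htr-create-model",
    "pylaia-htr-train-ctc",
    "pylaia-htr-decode-ctc",
    "pylaia-htr-netout",
)


def locate_scripts(names=PYLAIA_SCRIPTS):
    """Map each script name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_scripts().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a script is reported as missing, the usual suspects are a virtual environment that is not activated or a user-level `bin` directory that is not on `PATH`.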
Work on this toolkit was financially supported by the Pattern Recognition and Human Language Technology (PRHLT) Research Center.
@misc{puigcerver2018pylaia,
author = {Joan Puigcerver and Carlos Mocholí},
title = {PyLaia},
year = {2018},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jpuigcerver/PyLaia}},
commit = {commit SHA}
}