
Enhance Keras Compatibility, Tokenization, and Performance Optimizations #28

Merged · 54 commits · Jan 6, 2025

Conversation

TimKoornstra (Collaborator) commented Oct 22, 2024

Pull Request Notes

This pull request introduces several significant updates aimed at enhancing compatibility, performance, and flexibility:

  • Keras 2 and 3 Compatibility: Forward and backward compatibility between Keras 2 and 3 has been introduced. This ensures smooth upgrades from TensorFlow 2.14.1 to 2.17.1 and intermediate versions. However, we have observed that TensorFlow versions >= 2.16 may run slower. As a workaround, we advise using the --use_float32 parameter.
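
  The PR does not show how the flag is wired up; as a minimal illustrative sketch (the parser setup below is an assumption, not the project's actual code), a --use_float32 style switch could be declared like this:

  ```python
  import argparse

  # Illustrative sketch only: how a --use_float32 style flag might be wired
  # into an argument parser. The parser layout is an assumption, not the
  # project's actual implementation.
  parser = argparse.ArgumentParser(description="Training options (sketch)")
  parser.add_argument(
      "--use_float32",
      action="store_true",
      help="Force float32 computation instead of mixed precision "
           "(workaround for slowdowns on TensorFlow >= 2.16)",
  )

  args = parser.parse_args(["--use_float32"])
  print(args.use_float32)  # True
  ```

  Downstream, the flag would typically select a float32 compute policy instead of mixed precision before the model is built.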

  • Broken Parameter: The --steps_per_epoch parameter is currently broken.

  • Tokenizer Update: The charlist.txt file has been replaced by a tokenizer.json file. This change introduces more flexible tokenization schemes and improves readability. Padding and OOV tokens have been updated to "[PAD]" and "[UNK]", respectively. Any existing charlist.txt files will be automatically converted to the new tokenizer.json format.
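
  A rough sketch of the conversion idea, assuming only what the PR states (reserved "[PAD]" and "[UNK]" tokens); the actual tokenizer.json schema and conversion code in the project may differ:

  ```python
  import json

  def charlist_to_tokenizer(chars):
      """Sketch: build a tokenizer-style vocabulary from a character list.
      Only the reserved-token convention ([PAD], [UNK]) is taken from the
      PR notes; the real tokenizer.json schema is not reproduced here."""
      # Reserve the padding and OOV tokens first, as described above.
      vocab = {"[PAD]": 0, "[UNK]": 1}
      for ch in chars:
          if ch not in vocab:
              vocab[ch] = len(vocab)
      return vocab

  vocab = charlist_to_tokenizer(["a", "b", "c"])
  print(json.dumps(vocab))
  # {"[PAD]": 0, "[UNK]": 1, "a": 2, "b": 3, "c": 4}
  ```

  An automatic migration step along these lines lets existing charlist.txt files keep working without manual intervention.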

  • VGSL Specification Update: The local implementation of the Variable-Size Graph Specification Language (VGSL) has been replaced by the vgslify package, which simplifies the codebase. Although the VGSL specifications have been slightly modified, the model library should function as before. Please refer to the VGSLify documentation for further details.

  • Training Log Enhancement: Learning rate values are now logged at each training step.
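
  The project presumably implements this as a Keras callback; the plain-Python stand-in below only illustrates the idea of recording the learning rate at every training step (the decay schedule is an arbitrary example, not the project's):

  ```python
  class LearningRateLogger:
      """Minimal stand-in for a per-step learning-rate logger.
      Illustrative only; the real implementation is likely a Keras
      callback hooked into the training loop."""

      def __init__(self, schedule):
          self.schedule = schedule  # maps step -> learning rate
          self.history = []

      def on_train_batch_end(self, step):
          lr = self.schedule(step)
          self.history.append((step, lr))
          print(f"step {step}: lr={lr:.6f}")

  # Example schedule: simple exponential decay (an assumption, for illustration).
  logger = LearningRateLogger(lambda step: 1e-3 * (0.95 ** step))
  for step in range(3):
      logger.on_train_batch_end(step)
  ```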

  • CTCLoss Update: CTCLoss has been refactored as a subclass of the Keras Loss class, instead of a simple function. It now also uses tf.function for performance improvement.

  • Augmentation Layers: All augmentation layers have been updated to use tf.function for faster execution.

  • Deprecated Arguments Removed: All arguments and configuration items marked for removal by May 2024 have now been removed.

  • Dataloader Optimization: All dataloader functions have been converted to tf.function to improve performance.

  • API Model Path: The environment variables for loading models have changed, making them similar to the way Laypa model loading works. Supply a LOGHI_BASE_MODEL_DIR and a LOGHI_MODEL_NAME that refer to the directory where your models are stored and the specific model directory, respectively.
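
  A sketch of how the API might resolve the model directory from these two variables (the join logic and the model name used in the example are assumptions):

  ```python
  import os

  def resolve_model_path(env=os.environ):
      """Sketch: combine the two environment variables described above
      into a model directory path. The exact resolution logic in the
      project may differ."""
      base = env["LOGHI_BASE_MODEL_DIR"]
      name = env["LOGHI_MODEL_NAME"]
      return os.path.join(base, name)

  # "htr-generic-2024" is a hypothetical model name, for illustration only.
  path = resolve_model_path({
      "LOGHI_BASE_MODEL_DIR": "/models",
      "LOGHI_MODEL_NAME": "htr-generic-2024",
  })
  print(path)  # /models/htr-generic-2024 (on POSIX)
  ```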

  • Enhanced API Processing: The image preparation worker has been removed; image preparation is now handled by a tf.data.Dataset generator function. This brings API processing closer to the non-API workflow.

  • Inference Time Improvement for Beam Search: Beam search decoding has been significantly optimized. Inference, validation, and test times have been dramatically reduced for beam search with higher thread counts.

  • New Argument: The introduction of the --decoding_threads parameter provides additional flexibility for decoding performance tuning.
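
  As an illustrative sketch of the concept (not the project's actual decoder), fanning per-sample decoding out over a thread pool could look like this; decode_fn stands in for a beam search decoder:

  ```python
  from concurrent.futures import ThreadPoolExecutor

  def decode_batch(predictions, decode_fn, decoding_threads=4):
      """Sketch: decode a batch of per-sample predictions in parallel,
      in the spirit of the --decoding_threads parameter. Threads help
      here when decode_fn releases the GIL (e.g. TF's C++ ops)."""
      with ThreadPoolExecutor(max_workers=decoding_threads) as pool:
          return list(pool.map(decode_fn, predictions))

  # Toy decoder for demonstration: pick the argmax "character" per timestep.
  toy = lambda pred: "".join(
      chr(ord("a") + max(range(len(p)), key=p.__getitem__)) for p in pred
  )
  results = decode_batch([[[0.1, 0.9], [0.8, 0.2]]], toy)
  print(results)  # ['ba']
  ```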

  • Unified Inference, Test, and Validation Functions: The inference, test, and validation functions have been refactored into a single, unified implementation with slight adaptations for specific use cases. This change improves code stability and maintainability.

@rvankoert rvankoert merged commit c2ea0bd into master Jan 6, 2025
5 checks passed