MangaQuick is a Streamlit-powered web application for automatic manga translation. It is part of my Final Degree Project, "Diseño y desarrollo de un traductor de comics" ("Design and development of a comic translator"; UPM, written in Spanish). It translates manga pages one at a time or in batches. The application integrates Manga Text Segmentation for text segmentation and detection, LaMa for image inpainting, and manga-ocr for optical character recognition.
If you are searching for automatic manga translator applications, the open-source community has produced other excellent tools that you might find valuable:
🔄 Versatile manga and comic translation pipeline
🖌️ High-quality text detection and inpainting
🔤 Support for various OCR engines and translation services
💻 Both CLI and web-based interfaces for flexibility
🔬 Focused on precise text segmentation in manga images
🎯 Accurately identifies and isolates text regions
🖼️ Handles complex manga layouts and artistic text styles
🔗 Can be integrated as a crucial step in translation pipelines
📚 Specialized OCR tool for manga and comics
🇯🇵 Highly accurate for Japanese text in manga-style fonts
🧠 Based on deep learning for robust recognition
🔧 Easy to integrate into other manga translation pipelines
🚀 Powerful, feature-rich manga translation tool
🔍 Advanced text detection and segmentation
🌐 Support for multiple languages and translation services
🖥️ User-friendly GUI for easy editing and fine-tuning
Using a virtual environment to manage dependencies and isolate the project is highly recommended; conda is a great tool for this purpose:
Create a new conda environment named 'MangaQuick' with Python 3.11
conda create --name MangaQuick python=3.11
Activate the 'MangaQuick' environment
conda activate MangaQuick
- Clone the MangaQuick repository:
git clone https://github.com/yourusername/MangaQuick.git
- Navigate to the MangaQuick directory:
cd MangaQuick
- Install the required dependencies:
pip install -r requirements.txt
To use a GPU, install the PyTorch build that matches your system and CUDA setup. The appropriate installation commands are listed at:
https://pytorch.org/get-started/locally/
This application has been tested on an RTX 3080 GPU with 10 GB of VRAM, and it uses nearly all of that memory. For smooth operation, a GPU with at least 10 GB of VRAM is recommended.
The application also runs on CPU: the web interface lets you choose CPU or GPU separately for each model. The Text Segmentation model is the most resource-intensive component.
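As a rough way to decide between "cuda" and "cpu" before launching the app, the check below asks PyTorch whether a GPU is visible and how much memory it has. This is only a sketch: `pick_device` is a hypothetical helper, not part of MangaQuick, and the 9.5 GiB default is an assumption chosen because the reported total can be slightly below the nominal 10 GB.

```python
def pick_device(min_vram_gb: float = 9.5) -> str:
    """Return "cuda" when a large-enough GPU is visible to PyTorch, else "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    # total_memory is reported in bytes; convert to GiB before comparing.
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return "cuda" if total_gib >= min_vram_gb else "cpu"


print(pick_device())
```

The result only suggests a starting point; the device for each model is still selected in the web interface.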
To download the Text Segmentation model, visit the Manga Text Segmentation GitHub repository. The repository offers five model variants; you may download one or all of them and switch between them in the web application.
Create a models folder inside components/text_detection and place the downloaded .pkl model file(s) inside it following this directory structure:
components/text_detection/models/fold.0.-.final.refined.model.2.pkl
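To confirm the model files landed in the right place, a short script can list the .pkl files in that folder. This is a minimal sketch that assumes it is run from the repository root; `list_segmentation_models` is a hypothetical helper name, not a MangaQuick function.

```python
from pathlib import Path


def list_segmentation_models(repo_root: str = ".") -> list[str]:
    """Return the names of the .pkl files in components/text_detection/models."""
    models_dir = Path(repo_root) / "components" / "text_detection" / "models"
    if not models_dir.is_dir():
        return []  # folder not created yet
    return sorted(p.name for p in models_dir.glob("*.pkl"))


print(list_segmentation_models())
```

An empty list means the models folder is missing or contains no .pkl files.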
Download the LaMa inpainting model (hosted on Hugging Face) with the following commands:
curl -LJO https://huggingface.co/smartywu/big-lama/resolve/main/big-lama.zip
unzip big-lama.zip
Create a models folder inside components/image_inpainting and move the big-lama folder into it, resulting in the following path: components/image_inpainting/models/big-lama
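If you prefer to do the unzip-and-move step from Python, the sketch below extracts the archive straight into the models folder. It assumes the zip contains a top-level big-lama folder, as the steps above imply; `install_lama` is a hypothetical helper, not part of MangaQuick.

```python
import zipfile
from pathlib import Path


def install_lama(zip_path: str, repo_root: str = ".") -> Path:
    """Extract big-lama.zip into components/image_inpainting/models."""
    models_dir = Path(repo_root) / "components" / "image_inpainting" / "models"
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(models_dir)
    target = models_dir / "big-lama"
    # Fail loudly if the archive layout differs from the assumption above.
    if not target.is_dir():
        raise FileNotFoundError(f"expected {target} after extraction")
    return target
```

Calling `install_lama("big-lama.zip")` from the repository root should leave the model at components/image_inpainting/models/big-lama.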
To start using MangaQuick, follow these steps:
- Launch the application:
streamlit run MangaQuick.py
Upon launching, you will see the MangaQuick web interface in your browser:
(source: manga109, © Yagami Ken)
To use MangaQuick in Google Colab:
- Download the repository and place it inside your Google Drive.
- Open the example Colab notebook (link below) and follow the instructions in the comments.
- If you encounter any network issues, try refreshing the webpage as indicated.
- Text segmentation: Select the preferred model and the processing unit, either GPU ("cuda") or CPU ("cpu"), to fit your hardware capabilities.
- Text block detection: Set the mask dilation and remove unnecessary text blocks, which is particularly useful for reducing false positives.
- OCR: Select either GPU ("cuda") or CPU ("cpu").
- Translation: Enter your DeepL API key and select the desired target language to translate the manga into your preferred language.
- Inpainting: Select either GPU ("cuda") or CPU ("cpu").
- Text injection: Choose the font size and style. Match the font style to the target language for a coherent look.
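The settings above can be pictured as one small configuration object. The sketch below is purely illustrative: `PipelineSettings`, its field names, and its defaults are assumptions made for this example and do not mirror MangaQuick's internal code. Only the "cuda"/"cpu" device strings come from the list above.

```python
from dataclasses import dataclass

VALID_DEVICES = ("cuda", "cpu")


@dataclass
class PipelineSettings:
    """Hypothetical container for the options chosen in the web interface."""
    segmentation_device: str = "cpu"
    ocr_device: str = "cpu"
    inpainting_device: str = "cpu"
    target_language: str = "EN-US"  # assumed default; a DeepL target language code
    font_size: int = 20             # assumed default

    def __post_init__(self) -> None:
        # Reject anything other than the two device strings the UI offers.
        for name in ("segmentation_device", "ocr_device", "inpainting_device"):
            value = getattr(self, name)
            if value not in VALID_DEVICES:
                raise ValueError(f"{name} must be one of {VALID_DEVICES}, got {value!r}")
        if self.font_size <= 0:
            raise ValueError("font_size must be positive")


# Example: heavy segmentation model on the GPU, everything else on the CPU.
settings = PipelineSettings(segmentation_device="cuda")
print(settings)
```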
To store your DeepL key, create a .env file and include the following line:
DEEPL_KEY=<your_deepl_key>
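To check that the key is readable, the sketch below parses a .env file using only the standard library. It assumes the file sits in the directory you run it from and uses simple KEY=value lines; `read_env_value` is a hypothetical helper, and MangaQuick itself may load the file differently.

```python
from pathlib import Path
from typing import Optional


def read_env_value(key: str, env_path: str = ".env") -> Optional[str]:
    """Return the value stored for key in a KEY=value file, or None if absent."""
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip('"').strip("'")
    return None


print("DeepL key found" if read_env_value("DEEPL_KEY") else "DeepL key missing")
```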
- Activate the "Modify text boxes" option to enable editing.
- Within the user interface, adjusting detection boxes is straightforward: double-click any box you wish to exclude. This is particularly useful for eliminating unnecessary or incorrect detections.
- The functionality is focused solely on the removal of boxes; additional modifications to the boxes are not supported.
(source: manga109, © Yagami Ken)
- All processing steps run in a single pass once you click the "Process Files" button. Adjust detection boxes and make any other changes before starting the process.
- When multiple files are uploaded, they are processed as a batch rather than one by one: every image goes through text segmentation, then every image goes through text block detection, and so on, instead of each image being processed from start to finish before the next begins. As a result, you can adjust the text boxes for all uploaded images at the same time.
- Once the images are processed, you can download the translated manga as a zip file, ready for reading in your chosen language.
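The batch order described above (each stage applied to every image before the next stage starts) can be sketched as follows. `run_batch` and the stand-in stages are hypothetical and only illustrate the ordering; they are not MangaQuick's actual functions.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_batch(items: Iterable[T], stages: List[Callable[[T], T]]) -> List[T]:
    """Apply each stage to the whole batch before moving on to the next stage."""
    current = list(items)
    for stage in stages:
        current = [stage(item) for item in current]
    return current


# Stand-in stages that just tag the page name so the order is visible.
def segment(page: str) -> str:
    return page + ">segmented"


def detect_blocks(page: str) -> str:
    return page + ">blocks"


print(run_batch(["page1", "page2"], [segment, detect_blocks]))
```

Here both pages are segmented before block detection runs on either of them, which is why box adjustments can be made for all uploaded images at once.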
- Manga Text Segmentation: https://github.com/juvian/Manga-Text-Segmentation
- Manga inpainting: https://github.com/advimman/lama
- Manga OCR: https://github.com/kha-white/manga-ocr