Replies: 8 comments 1 reply
I only see this code print the time once.
The time is displayed when `OCR_runnable()` is called.
How about trying OCR on multiple images? (`ocr = PaddleOCR(use_angle_cls=False, lang='en', show_log=False)` only needs to run once.)
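A minimal sketch of that experiment: build the model once, run it over several images, and record the time of each call separately, so a slow first call can be told apart from the steady-state cost. `SlowStartModel` is a hypothetical stand-in (not PaddleOCR) whose first call is deliberately slow, and `time_calls` is an illustrative helper name, not part of any library.

```python
import time


def time_calls(fn, inputs):
    """Call fn on each input; return (results, per-call elapsed seconds)."""
    results, timings = [], []
    for item in inputs:
        start = time.perf_counter()
        results.append(fn(item))
        timings.append(time.perf_counter() - start)
    return results, timings


class SlowStartModel:
    """Hypothetical stand-in: the first call pays a one-time warm-up cost."""

    def __init__(self):
        self._warm = False

    def __call__(self, path):
        if not self._warm:
            time.sleep(0.2)   # simulated one-time warm-up
            self._warm = True
        time.sleep(0.01)      # simulated per-image work
        return f"text from {path}"


if __name__ == "__main__":
    model = SlowStartModel()  # created once, reused for every image
    _, timings = time_calls(model, ["a.png", "b.png", "c.png"])
    for index, elapsed in enumerate(timings):
        print(f"call {index}: {elapsed:.4f} [sec]")
```

To try this against the real model, one could pass `lambda path: ocr.ocr(path, cls=False)` as `fn`, with `ocr` constructed once as in the comment above. If only the first call is slow, the extra time is a one-time cost rather than a per-image one.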
Thank you for your advice. ・Called directly ・Called via multiprocessing
That's strange. In that case, I'm not sure what is going on.
Hey @FRAki73, I am running into the same issue when trying to process multiple images at once. Do you have any advice? A total of 40 minutes for all images is far too long, and my CPU is only running at 10% usage (26-core server CPU).
Hello @asdat3, I have not solved this problem yet, so I have no advice to offer. Sorry.
The significant increase in processing time when using multiprocessing in your code occurs due to the initialization of the PaddleOCR model.

Why is this happening?

How to Fix the Issue

To address the increased processing time, you can use one of the following approaches:

1. Initialize the Model Once and Share It

Instead of initializing the model inside the worker, initialize it once in the main process and pass it to the worker. Example:

```python
import time
import multiprocessing
from paddleocr import PaddleOCR


def OCR_runnable(ocr):
    # Time a single OCR call on the test image
    start_time = time.time()
    result = ocr.ocr("./test.png", cls=False)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Processing time of OCR: {elapsed_time:.4f} [sec]")
    print("OCR Result: " + result[0][0][1][0])


if __name__ == '__main__':
    multiprocessing.freeze_support()
    # Initialize the PaddleOCR model once
    ocr_model = PaddleOCR(use_angle_cls=False, lang='en', show_log=False)
    # Use multiprocessing Pool for process management
    with multiprocessing.Pool(processes=1) as pool:
        pool.apply(OCR_runnable, args=(ocr_model,))
```

Note, however, that `pool.apply` pickles its arguments in order to send them to the worker process, so this only works if the PaddleOCR object can be pickled. Object sharing in multiprocessing might still require extensive handling, as PaddleOCR may not be directly shareable across processes.

2. Use Threading Instead of Multiprocessing

Since the PaddleOCR model is CPU- and GPU-intensive, and the Global Interpreter Lock (GIL) may not be a bottleneck in this case (due to external library usage), you can use the `threading` module and run OCR in a thread of the same process. Example:

```python
import time
from threading import Thread
from paddleocr import PaddleOCR


def OCR_runnable(ocr):
    # Time a single OCR call on the test image
    start_time = time.time()
    result = ocr.ocr("./test.png", cls=False)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Processing time of OCR: {elapsed_time:.4f} [sec]")
    print("OCR Result: " + result[0][0][1][0])


if __name__ == '__main__':
    # Initialize the PaddleOCR model once
    ocr_model = PaddleOCR(use_angle_cls=False, lang='en', show_log=False)
    # Create and start a thread
    thread = Thread(target=OCR_runnable, args=(ocr_model,))
    thread.start()
    thread.join()
```

3. Use a Multiprocessing Pool with Lazy Initialization

If you need to use multiprocessing for scaling, consider using a `Pool` with an `initializer` function, so that each worker process creates its own model once. Example:

```python
import time
from multiprocessing import Pool
from paddleocr import PaddleOCR


def init_worker():
    # Runs once in each worker process and creates that worker's model
    global ocr_model
    ocr_model = PaddleOCR(use_angle_cls=False, lang='en', show_log=False)


def OCR_runnable(_):
    # Time a single OCR call using the worker's own model
    start_time = time.time()
    result = ocr_model.ocr("./test.png", cls=False)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Processing time of OCR: {elapsed_time:.4f} [sec]")
    print("OCR Result: " + result[0][0][1][0])


if __name__ == '__main__':
    with Pool(processes=1, initializer=init_worker) as pool:
        pool.map(OCR_runnable, [None])
```

Here, the `init_worker` function runs once when each worker process starts, and the `ocr_model` it creates is reused for every task that worker handles.

Key Takeaways

By using these improvements, you should be able to significantly reduce the processing time when using multiprocessing for PaddleOCR.

Response generated by feifei-bot | chatgpt-4o-latest
I measured the processing time for OCR by executing the following code.
The execution results are as follows, and the processing time is about 5 times longer when executed with multiprocessing.
I would like to know why the time is getting longer and what to do about it.
Can someone please help me?
Processing time of OCR: 1.1000 [sec]
OCR Result: The difficult thing in the life is
Processing time of OCR: 5.5675 [sec]
OCR Result: The difficult thing in the life is
Originally posted by @FRAki73 in #14221