Reporting undefined behaviors (integer overflow and nullptr dereference) #4299

Open
gal1ium opened this issue Aug 10, 2024 · 4 comments

gal1ium commented Aug 10, 2024

Hi! While testing the Tesseract APIs, we spotted some issues that might lead to undefined behavior.
An API call sequence:

r0 = TessBaseAPICreate()
TessBaseAPIInitForAnalysePage(r0)
TessBaseAPISetImage(r0, 'ffffff\x87\xe2\xe2888888888\xe2\xe2\xe7\xe8fffffff\x99ffkk:::::kwkkkkk\xdaIfeffffffffffffff\x99ff\xea\xea\xea\xea\xea\xea\xea\xea\xea\xea\x00\x00\x00\x01::::kkkkkkk\xdaIfed3eee4333\x00\x04[\xff\xe5\xe5%%%%%%%%\x00\x0034\x8dff\xbb[\xf0-,3\xccz\"C\xd5\x00\xe2\xe2\xe2\xe2\xe2\xf2\xe2\xe2\xe2\xe7\xe8\xe2\xe2\xe2\xe2\xe2\xe2\xe2\xe2\xe2\xe2\xe2\xf3\xe2\xe2\xe2\xe2\xe7\x10\x10 [\xa5\xe2\xe2\xe2@\xe2\x00\xe7\xe8\xe7\x10\x10 r\x00\x01\xe2b\xe2ffffffffffffffffffffffffffffffff\xbb\x92\xc6::@p\"\xe5\xde\xffffffffff\xe5\xe5\xe5\xe5\xe5\xe5\xe5+\xd5/\x0ek3\x7f\x00\xe8\xd5/\x0ek3\x7f\x00\xe8\x00\x00\x10}\x10\x00\x00\xe8\xe8\xe7\xe2\xe2\xe2\xe2ffffssssff\xe2\xe2\xe2vvvvv\x00\x00a+\xff\xdaeeeeeeee66666666eeeeee\x86f\x7ffyffff8:::@ffe\xea\x7ffCff\x9afffBr\x00:@p\"eeeeeeeeeee\xe5\xde\xff\xe5\xd4\xe5\xe5\xe5\xe5\xe5+\xd5/\x0ek', 0x405, 0x1c, 0x0, 0x10)
TessBaseAPIAdaptToWordStr(r0, 0xc1249078, 0x0)

would crash at

word_res->word->set_text(wordstr);

under AddressSanitizer, because wordstr (0x0 in the call above) is never checked for being a valid pointer.
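The missing check could look roughly like the sketch below. It is not Tesseract code: `Word` and `SetTextChecked` are hypothetical stand-ins introduced only to show a null guard placed in front of a `set_text`-style call.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a word object; NOT Tesseract's own class.
struct Word {
  std::string text;
  // Like the reported call site, this assumes `s` is a valid C string.
  void set_text(const char *s) { text = s; }
};

// Hypothetical guarded wrapper: refuses a null wordstr instead of
// forwarding it to set_text. Returns false when nothing was written.
bool SetTextChecked(Word &word, const char *wordstr) {
  if (wordstr == nullptr) {
    return false;  // e.g. the 0x0 argument from the reported sequence
  }
  word.set_text(wordstr);
  return true;
}

// Usage: a null wordstr is rejected, a valid one is stored.
void ExampleNullGuard() {
  Word w;
  assert(!SetTextChecked(w, nullptr));
  assert(SetTextChecked(w, "abc") && w.text == "abc");
}
```

Whether such a check belongs in the library or in the caller is exactly what the rest of this thread discusses.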

Also, execution reaches that set_text call due to an integer overflow in:

inline bool PSM_OSD_ENABLED(int pageseg_mode) {
  return pageseg_mode <= PSM_AUTO_OSD || pageseg_mode == PSM_SPARSE_TEXT_OSD;
}

if the second argument (the PageSegMode) passed to TessBaseAPIAdaptToWordStr is negative, which makes PSM_OSD_ENABLED wrongly return true.
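A minimal sketch of that behavior, with a range check added in front of the predicate. The enum here is a local mirror whose numeric values are assumed to match Tesseract's PageSegMode, and `IsValidPageSegMode` and `PsmOsdEnabledChecked` are hypothetical helpers, not part of the library.

```cpp
#include <cassert>

// Local mirror of the enumerators involved. The numeric values are an
// assumption about Tesseract's PageSegMode, not copied from its headers.
enum PageSegMode {
  PSM_OSD_ONLY = 0,
  PSM_AUTO_OSD = 1,
  PSM_SPARSE_TEXT_OSD = 12,
  PSM_COUNT = 14
};

// The predicate as quoted in this report.
inline bool PSM_OSD_ENABLED(int pageseg_mode) {
  return pageseg_mode <= PSM_AUTO_OSD || pageseg_mode == PSM_SPARSE_TEXT_OSD;
}

// Hypothetical range check: true only for values inside the enum's range.
inline bool IsValidPageSegMode(int pageseg_mode) {
  return pageseg_mode >= PSM_OSD_ONLY && pageseg_mode < PSM_COUNT;
}

// Hypothetical stricter variant: out-of-range values never enable OSD.
inline bool PsmOsdEnabledChecked(int pageseg_mode) {
  return IsValidPageSegMode(pageseg_mode) && PSM_OSD_ENABLED(pageseg_mode);
}

// Usage: any negative value satisfies the original predicate, because it
// is less than PSM_AUTO_OSD; the checked variant rejects it.
void ExamplePsmCheck() {
  assert(PSM_OSD_ENABLED(-1));
  assert(!PsmOsdEnabledChecked(-1));
  assert(PsmOsdEnabledChecked(PSM_SPARSE_TEXT_OSD));
}
```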

Member

stweil commented Aug 10, 2024

Isn't it normal that API functions will do weird things or even crash if they are called with illegal values? Would you expect that strcpy or other functions of the standard C library also work with nullptr arguments?

Author

gal1ium commented Aug 10, 2024

At least for PSM_OSD_ENABLED, I would expect code that relies on an integer comparison to guard against integer overflow.

Member

stweil commented Aug 10, 2024

I don't think there is an integer overflow. You call the function with a negative argument which is not a valid PSM value, and the function compares this int argument with an enum value. For invalid arguments any result is okay.
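To make the comparison concrete, here is a small sketch under two assumptions: that int is 32 bits wide, and that the test harness passed the raw bit pattern 0xc1249078 as the mode argument. `AsSigned32` and `LessEqualAutoOsd` are hypothetical helpers, and the value 1 for PSM_AUTO_OSD is likewise assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a signed value. memcpy keeps the
// conversion well-defined regardless of the language standard in use.
inline std::int32_t AsSigned32(std::uint32_t bits) {
  std::int32_t value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// Same shape as the first half of PSM_OSD_ENABLED: an ordinary signed
// comparison against an enum constant (assumed to be 1 here).
inline bool LessEqualAutoOsd(std::int32_t mode) {
  constexpr std::int32_t kAutoOsd = 1;  // assumed value of PSM_AUTO_OSD
  return mode <= kAutoOsd;
}

// Usage: the reported argument is simply a large negative number, so the
// comparison is true without any arithmetic taking place.
void ExampleSignedCompare() {
  const std::int32_t mode = AsSigned32(0xc1249078u);
  assert(mode == -1054568328);
  assert(LessEqualAutoOsd(mode));
}
```

Nothing is added or multiplied in the predicate, so there is no operation that could overflow; the value is already negative when it arrives.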

Collaborator

amitdo commented Aug 14, 2024

@stweil,

AdaptToWordStr() looks like an obsolete method.

3 participants