Address 4186 #4189

Open
Wants to merge 1 commit into base: main

JorjMcKie
Collaborator

JPEG images with a CMYK color space need their pixel colors inverted.
This problem had gone undetected so far.
The fix checks whether an extracted image is a JPEG with a CMYK color space, i.e. whether its number of components is 4.
If so, we convert / reprocess the image buffer, specifying the option "invert_cmyk". Because we cannot know the original JPEG quality, the reprocessing uses JPEG quality 95 (close to lossless).
This fix covers both Document.extract_image(xref) and page.get_text("dict") outputs.
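
For illustration only, here is a minimal stdlib-only sketch of the two ideas in the description: detecting a 4-component JPEG and inverting CMYK samples. It is not the code in this PR. The helper names `jpeg_component_count` and `invert_cmyk_samples` are hypothetical, and the sketch assumes that "inversion" means replacing each 8-bit sample `v` with `255 - v`. The actual fix reprocesses the buffer via the "invert_cmyk" option instead.

```python
import struct
from typing import Optional

# Start-of-frame markers (SOF0..SOF15, excluding DHT 0xC4, JPG 0xC8, DAC 0xCC).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_component_count(data: bytes) -> Optional[int]:
    """Return the component count from the JPEG frame header, or None.

    Hypothetical helper: walks the marker segments until the first SOF
    segment and reads its "number of components" byte.
    """
    if data[:2] != b"\xff\xd8":  # missing SOI marker: not a JPEG stream
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # lost marker synchronization
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:
            pos += 2  # standalone marker without a length field
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Layout: marker(2) length(2) precision(1) height(2) width(2) ncomp(1)
            if pos + 10 > len(data):
                return None
            return data[pos + 9]
        if marker == 0xDA:  # start of scan reached without a frame header
            return None
        pos += 2 + length
    return None


def invert_cmyk_samples(samples: bytes) -> bytes:
    """Invert 8-bit samples (assumption: inversion is v -> 255 - v)."""
    return bytes(255 - v for v in samples)
```

A caller could treat `jpeg_component_count(buffer) == 4` as the signal that an extracted JPEG is CMYK and needs reprocessing. Note that `invert_cmyk_samples` operates on decoded samples, not on the compressed JPEG stream, which is why the fix has to re-encode the image.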

There seems to be no way to verify that this succeeds.

Note:
The MuPDF CLI tool mutool extract also gets this wrong: it extracts JPEG/CMYK images with inverted (wrong) colors.
