Support Webpack, don't use require.resolve() #76

Open
quintesse opened this issue Nov 16, 2023 · 6 comments

Comments

@quintesse

quintesse commented Nov 16, 2023

Is your feature request related to a problem? Please describe.

Because this code calls require.resolve(): https://github.com/opengovsg/pdf2md/blob/master/lib/util/pdf.js#L19, using pdf2md in my project causes runtime errors in production builds. The project uses Webpack to bundle the code, and that particular function cannot be used in a bundle.

An explanation of the problem can be found in this Webpack issue: webpack/webpack#13931

(It's basically because once a project has been bundled, asking a module for its filename just doesn't make sense anymore.)
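
To make this concrete, here is a minimal sketch (not taken from pdf2md) of a guard that checks whether a value returned by require.resolve() can be used as a file path. It assumes the documented Webpack behaviour that require.resolve() yields a module ID rather than a filename; the helper name looksLikeFilePath is hypothetical.

```javascript
const path = require('path');

// Hypothetical helper, not part of pdf2md or Webpack.
// In plain Node.js, require.resolve() returns an absolute filename for file modules.
// In a Webpack bundle it returns a module ID (a number or a short string),
// so path operations on that value are meaningless.
function looksLikeFilePath(resolved) {
  return typeof resolved === 'string' && path.isAbsolute(resolved);
}

// Usage: check the value before deriving directories from it.
// const resolved = require.resolve('pdfjs-dist/package.json');
// if (!looksLikeFilePath(resolved)) { /* skip any path-based logic */ }
```

A check like this does not fix the underlying problem; it only lets code fail gracefully when it runs inside a bundle.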

Describe the solution you'd like

I'd like the code either to avoid that method by using an alternative, or to provide some kind of option/flag that bypasses it.
Also, given the fact that this project is about converting a PDF file to Markdown, which is text-only, is providing font directories even useful? Would it perhaps be possible to do without?
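
As a rough illustration of the option/flag idea, the sketch below lets the caller pass a font directory and only falls back to require.resolve() when none is given. Everything here is an assumption for illustration: the function name resolveFontDir, the fontDir option, and the standard_fonts directory layout inside pdfjs-dist are not confirmed by the pdf2md source.

```javascript
const path = require('path');

// Hypothetical helper: neither the name nor the `fontDir` option exists in pdf2md.
function resolveFontDir(options = {}) {
  // Caller-supplied directory: require.resolve() is never touched.
  if (options.fontDir) {
    return options.fontDir;
  }
  try {
    // Assumption: pdfjs-dist is installed and ships a `standard_fonts` directory.
    const pkgJson = require.resolve('pdfjs-dist/package.json');
    // In a bundle this may be a module ID rather than a path, so verify it.
    if (typeof pkgJson !== 'string' || !path.isAbsolute(pkgJson)) {
      return undefined;
    }
    return path.join(path.dirname(pkgJson), 'standard_fonts') + path.sep;
  } catch (err) {
    // Package missing or not resolvable: continue without a font directory.
    return undefined;
  }
}
```

With something like this, a bundled application could call resolveFontDir({ fontDir: '/path/to/fonts/' }) and never reach the fallback branch, while unbundled Node.js usage would keep working without extra configuration.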

Describe alternatives you've considered

The work-around mentioned in the Webpack issue fortunately works. It's a completely incidental fix that reduces runtime code efficiency, so I'd prefer not to use it.

@LoneRifle
Collaborator

LoneRifle commented Jan 11, 2024

If you are using Next.js, another workaround suggested in #69 (comment) also works: in brief, specify @opendocsg/pdf2md and pdfjs-dist under experimental.serverComponentsExternalPackages in your Next config.
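
For reference, the workaround described above would look roughly like this in next.config.js. This is a sketch based only on the option name quoted in this thread; newer Next.js releases may have renamed or moved the option, so check the documentation for the version you use.

```javascript
// next.config.js
// Sketch of the workaround: keep both packages out of the server bundle
// so they are loaded with Node's native require at runtime.
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ['@opendocsg/pdf2md', 'pdfjs-dist'],
  },
};

module.exports = nextConfig;
```

Because the listed packages are not bundled, require.resolve() inside them runs under plain Node.js and returns real file paths again.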

@LoneRifle
Collaborator

> is providing font directories even useful? Would it perhaps be possible to do without?

I might be wrong, but pdfjs may be using the fonts to better recognise text within the PDF documents. I would be happy for you (or someone else) to see if removing the font directory would have any impact on the conversion process! If it doesn't, I'll gladly accept a PR.

@quintesse
Author

> If you are using Next.js, another workaround suggested in #69 (comment) also works: in brief, specify @opendocsg/pdf2md and pdfjs-dist under experimental.serverComponentsExternalPackages in your Next config.

Can confirm that this solution seems to work as well.

@LoneRifle
Collaborator

Please try this with v0.2.0 of this package, which removes the dependency on node-specific libs, including path.

@quintesse
Author

@LoneRifle I haven't been able to take a good look at it, but this is the error I get by simply updating to the latest version and removing the work-around:

Failed to compile.

./node_modules/unpdf/dist/index.cjs
Module not found: Package path ./pdfjs is not exported from package C:\Projects\test\node_modules\unpdf (see exports field in C:\Projects\test\node_modules\unpdf\package.json)

https://nextjs.org/docs/messages/module-not-found

Import trace for requested module:
./node_modules/@opendocsg/pdf2md/lib/util/pdf.js
./node_modules/@opendocsg/pdf2md/lib/pdf2md.js
...

> Build failed because of webpack errors

If I re-add the work-around (the one where you add serverComponentsExternalPackages to the next.config.js file) it at least compiles. I haven't had time yet to test if it still works.

@whats2000

@LoneRifle Here is my test platform.

Desktop (please complete the following information):

  • OS: Windows
  • Run browser: VSCode
  • Version: 1.96.0

Test pdf2md Version

  • "@opendocsg/pdf2md": "^0.2.1"

The result is that webpack successfully generates chunk files, and both the test and deploy environments are fixed now. However, there is still a warning from the dependency unpdf that might cause issues in other applications:

WARNING in ./node_modules/unpdf/dist/pdfjs.mjs 1:1602511-1602533
Critical dependency: the request of a dependency is an expression
 @ ./node_modules/unpdf/dist/index.cjs 40:35-56
 @ ./node_modules/@opendocsg/pdf2md/lib/util/pdf.js 1:47-63
 @ ./node_modules/@opendocsg/pdf2md/lib/pdf2md.js 1:18-39
 @ ./src/utils/fileOperations/readFile.ts 3:0-39 11:34-40
 @ ./src/utils/fileOperations/fileOperationsProvider.ts 6:0-38 51:15-23
 @ ./src/utils/fileOperations/index.ts 1:0-41 1:0-41
 @ ./src/utils/index.ts 1:0-33 1:0-33
 @ ./src/api/historyManager.ts 18:0-50 268:37-68
 @ ./src/api/index.ts 2:0-33 2:0-33
 @ ./src/extension.ts 3:0-172 11:28-55 12:31-45 24:18-31 32:8-22 34:9-31 35:9-31 36:4-28 36:51-65 37:4-16 38:4-23

1 warning has detailed information that is not shown.
Use 'stats.errorDetails: true' resp. '--stats-error-details' to show it.
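
If the warning turns out to be harmless for a given application, one possible way to silence it is Webpack 5's ignoreWarnings option. The fragment below is only a sketch: the regular expressions are assumptions derived from the log above, and suppressing the warning does not address whatever dynamic request inside unpdf triggers it.

```javascript
// webpack.config.js (fragment)
// Sketch: hide the "Critical dependency" warning only when it comes from unpdf.
const webpackConfig = {
  ignoreWarnings: [
    {
      // Match modules located under node_modules/unpdf (either path separator).
      module: /node_modules[\\/]unpdf[\\/]/,
      // Match the warning text shown in the log above.
      message: /Critical dependency: the request of a dependency is an expression/,
    },
  ],
};

module.exports = webpackConfig;
```

Scoping the rule to both the module path and the message keeps the same warning visible if it ever appears in other dependencies.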

Additional context

[Image: screenshot captioned "line 40"]

I hope the information can help you out! Also, thank you a lot for your update on webpack!
