Confirm folder pre-processor is efficiently parallel #61

Open
makew0rld opened this issue Aug 1, 2024 · 0 comments

makew0rld commented Aug 1, 2024

The folder preprocessor should already be making use of parallelism, since it has goroutines in the code. How well this works needs to be confirmed, perhaps by first testing with two or more large files that will be slow to upload.

Another question is whether it would be more efficient to limit its parallelism to a max number of workers. I'm pretty sure the main task of the workers is just uploading, for generic files at least. Too many workers could overload the webhook server.

First serious ingest attempt results:

  • 12 GiB of data
  • 412 files
  • Largest over 2.5 GiB
  • Most under 4 MiB
  • Server only had 2 gigs of memory, no swap
  • No issues
  • Took 10m20s
    • 19.8 MiB/s (avg)
    • 1.5 seconds per file (avg)
@makew0rld makew0rld added the question Further information is requested label Aug 1, 2024
@makew0rld makew0rld self-assigned this Aug 1, 2024