The folder preprocessor should be taking advantage of parallelism; it has goroutines in the code. How well this works needs to be confirmed, perhaps by first testing with two or more large files that will be slow to upload.
Another question is whether it would be more efficient to cap its parallelism at a maximum number of workers. I'm fairly sure the workers' main task is just uploading, at least for generic files. Too many workers could overload the webhook server.
First serious ingest attempt results:
- 12 GiB of data
- 412 files
- Largest over 2.5 GiB
- Most under 4 MiB
- Server only had 2 GiB of memory, no swap
- No issues
- Took 10m20s
- 19.8 MiB/s (avg)
- 1.5 seconds per file (avg)