OnDiskCorpus actually only evicts the input from memory, not the entire Testcase #2877

Open
riesentoaster opened this issue Jan 21, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@riesentoaster
Contributor

riesentoaster commented Jan 21, 2025

From the docs of OnDiskCorpus:

//! The [`OnDiskCorpus`] stores all [`Testcase`]s to disk.
//!
//! It _never_ keeps any of them in memory.
//! This is a good solution for solutions that are never reused, or for *very* memory-constraint environments.
//! For any other occasions, consider using [`CachedOnDiskCorpus`]
//! which stores a certain number of [`Testcase`]s in memory and removes additional ones in a FIFO manner.

OnDiskCorpus uses CachedOnDiskCorpus under the hood, whose docs say:

//! The [`CachedOnDiskCorpus`] stores [`Testcase`]s to disk, keeping a subset of them in memory/cache, evicting in a FIFO manner.

However, both of these docs are wrong: CachedOnDiskCorpus only ever evicts the actual input from memory. The rest of the Testcase, including all metadata, stays in memory.

We should either implement what the docs say (not sure about the performance implications though), or at least fix the docs.

(Disclaimer: I'm only about 95% sure.)
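
To make the described behavior concrete, here is a minimal toy model in plain Rust. It is not LibAFL code: `ToyTestcase`, `ToyCachedCorpus`, and every method name are hypothetical stand-ins, and the only thing it assumes is what this issue describes (FIFO eviction that clears the input but leaves the rest of the entry alone).

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a testcase: the input is optional, the metadata is not.
struct ToyTestcase {
    /// `None` once evicted; a real corpus would reload it from disk on demand.
    input: Option<Vec<u8>>,
    /// Never touched by eviction in this model.
    metadata: Vec<String>,
}

/// Hypothetical corpus that keeps at most `cache_max` inputs loaded.
struct ToyCachedCorpus {
    entries: Vec<ToyTestcase>,
    /// Ids whose input is currently loaded, oldest first.
    cached: VecDeque<usize>,
    cache_max: usize,
}

impl ToyCachedCorpus {
    fn new(cache_max: usize) -> Self {
        assert!(cache_max > 0, "cache must hold at least one input");
        Self { entries: Vec::new(), cached: VecDeque::new(), cache_max }
    }

    /// Adds an entry, then evicts the oldest loaded inputs if the cache is full.
    fn add(&mut self, input: Vec<u8>, metadata: Vec<String>) -> usize {
        let id = self.entries.len();
        self.entries.push(ToyTestcase { input: Some(input), metadata });
        self.cached.push_back(id);
        while self.cached.len() > self.cache_max {
            if let Some(oldest) = self.cached.pop_front() {
                // Only the input is dropped; the entry and its metadata remain.
                self.entries[oldest].input = None;
            }
        }
        id
    }

    /// Number of entries whose input is still in memory.
    fn loaded_inputs(&self) -> usize {
        self.entries.iter().filter(|t| t.input.is_some()).count()
    }

    /// Total number of metadata items held in memory across all entries.
    fn metadata_items(&self) -> usize {
        self.entries.iter().map(|t| t.metadata.len()).sum()
    }
}
```

In this model, adding five entries to a corpus with `cache_max = 2` leaves two inputs loaded, but all five entries and all of their metadata are still resident, so memory use grows with corpus size regardless of the cache limit.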

riesentoaster added the bug label Jan 21, 2025
@domenukk
Member

Yeah, that's how it is. Fixing the docs sounds good, unless you really, really want to implement it.

@riesentoaster
Contributor Author

riesentoaster commented Jan 21, 2025

Not right now, no. In the future, this may be helpful.

Btw: there is no great way right now to do corpus minimization on the fly, right? The cmin examples show it after fuzzing, not during fuzzing.

@domenukk
Member

Not sure, but it should be doable with a simple-ish stage.
I don't know how useful it would be, though.

@riesentoaster
Contributor Author

I'm currently running a target that has a lot of wait states, so I'm using decently large overcommit values (10 leads to ~30% CPU load), and I'm mainly constrained by memory. Every one of these clients has to keep an entire corpus, so any entry that can be dropped helps me, whether it's offloaded to disk or removed as a duplicate. That's why I was wondering.

That's somewhat off topic though.

@domenukk
Member

Add more swap? Add an async executor? Stochastically remove random entries from the corpus, assuming the others will likely still have a copy?
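
The last suggestion, stochastic removal, could look roughly like the sketch below. It is written against a plain `Vec` rather than any LibAFL corpus type, and `XorShift64` and `prune_random` are hypothetical helpers made up for illustration (the small PRNG is only there so the sketch needs no external crates).

```rust
/// Tiny xorshift64 PRNG, included only to avoid external dependencies.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero, so substitute a fixed constant.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Removes randomly chosen entries until at most `target_len` remain.
/// Returns how many entries were removed. Entry order is not preserved.
fn prune_random<T>(corpus: &mut Vec<T>, target_len: usize, rng: &mut XorShift64) -> usize {
    let mut removed = 0;
    while corpus.len() > target_len {
        // Modulo introduces a tiny bias, which is irrelevant for this purpose.
        let idx = (rng.next_u64() % corpus.len() as u64) as usize;
        corpus.swap_remove(idx);
        removed += 1;
    }
    removed
}
```

Each client would need a different seed so that clients drop different entries. Whether losing entries locally is acceptable depends on the assumption stated above, namely that other clients likely still hold a copy.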
