Add new source hashing methods: content_sha256, content_sha384, content_sha512 #5277
CodSpeed Performance Report: merging #5277 will not alter performance. |
I think this is cool. It would also work nicely with the new proposal for "rendered recipes" (conda/ceps#74). On that note - should we continue adding features to conda-build without any standardization (e.g. CEP) process? |
I'm planning to submit a CEP. I opened this draft to explore what is needed for logic that is stable and robust across platforms. Things like permissions don't translate well to Windows. |
Awesome. Yeah, I also recently looked at a few content hash implementations in Rust but didn't find anything super convincing yet. There are a bunch though (https://crates.io/search?q=content%20hash) |
So far the scheme I followed looks a lot like https://github.com/DrSLDR/dasher?tab=readme-ov-file#hashing-scheme. Things to standardize would be how the tree is sorted, the normalization of the path, the separators (to prevent this), and the allowed algorithms. I've seen a few Merkle-tree-based packages, but we don't need all the proof stuff or leaf querying; we only need to compare the root hash. Maybe it could be implemented recursively, without obtaining the whole file tree beforehand, if that improves performance or simplifies implementation elsewhere. IMO this feels like one of those CEPs that does require prototyping first to see which things have to be standardized. |
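To make the points above concrete, here is a minimal sketch of a directory content hash. It is not the code in this PR or the wording of any CEP. The function names `content_hash` and `file_digest`, the NUL separator, and the choice to hash only regular files (ignoring permissions, symlinks, and empty directories) are all assumptions made for illustration. Each file's contents are digested first, and the fixed-length digest is fed into the root hash together with the normalized relative path, so the boundary between one entry and the next is unambiguous.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the raw digest of one file, read in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.digest()


def content_hash(root, algorithm="sha256"):
    """Hypothetical tree hash: relative paths and file contents only.

    Assumption: permissions, timestamps, symlinks and empty directories
    are ignored, since they do not translate well across platforms.
    """
    root = Path(root)
    # Normalize to POSIX-style relative paths and sort them, so the
    # traversal order is the same on every platform.
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    hasher = hashlib.new(algorithm)
    for rel_path, path in files:
        hasher.update(rel_path.encode("utf-8"))
        # Paths cannot contain NUL, so this separator marks where the
        # path ends and the fixed-length file digest begins.
        hasher.update(b"\x00")
        hasher.update(file_digest(path, algorithm))
    return hasher.hexdigest()
```

Calling `content_hash("some/extracted/source", "sha512")` would return a hex string that depends only on the relative file paths and their bytes, which is the property a root-hash comparison needs.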
Title changed: "content_sha256 hash checks" → "content_md5, content_sha1, content_sha256"
Hm, true, we could add a little subcommand here to make it easier. Although honestly, I usually run |
I don't think it is wise to add support for two hashing algorithms with known vulnerabilities in 2024. Although it may be unlikely, that smells like an avenue for a supply chain attack to me. I think if we were to support multiple hashing algorithms, we should support algorithms that are still deemed viable and secure, like the other SHA-* bit lengths. |
#4793 will probably land soon. I'll update this branch with it once it reaches main and then remove the weak hashes. If we are that concerned about |
We'll need a lot of advance notice to deprecate them on conda-forge. For now we should probably add a lint + minimigrator to move to sha256. |
I am not concerned; I just don't understand why we want to add them. IMO it doesn't add value. Having the MD5 hash available for the regular hash makes sense because some packages might publish the known-good value (and that can be used in the recipe), but for something that we have invented ourselves I don't see a use case where it is justified to use anything other than the best available hash. |
Title changed: "content_md5, content_sha1, content_sha256" → "content_sha256, content_sha384, content_sha512"
LGTM! We should probably vote on the CEP before merging.
Setting as blocked on the CEP vote |
@jezdez, the CEP passed, so I guess this review can be dismissed now. |
Some comments.
CI errors are unrelated. Investigating in #5577. |