OSV spec #240
Comments
Also, if there is fix patch information available, it would be super useful to users if we provide that as well via OSV's git ranges. e.g. in OSV/OSS-Fuzz we already have this: https://github.com/google/oss-fuzz-vulns/blob/53ad4dcef49b6c5f1c26480832011328e472b7c3/vulns/curl/OSV-2022-450.yaml#L28 This would allow users who are using curl as a library (and pulling it in as e.g. a submodule) to automatically make use of curl advisories by just checking git trees. |
- Provide a bogus (curl) Id and make the CVE an alias
- Add a (made up) time to the published string to make it correct syntax
- Provide URL and CWE in a "database_specific" object
- Rename 'last' to 'last_affected' within the affected ranges
Reported-by: Oliver Chang
Ref: #240
Thanks for the excellent feedback!
Being a JSON rookie, what's the easiest way to verify this json against that schema?
We fake the score based on the named severity level (which is the only severity we provide), but I cannot fake a whole vector string. I don't think we can comply with this. I think it's a weakness in the schema. CVSS is a poor system and we don't play along in that game. We only have one of Low, Medium, High or Critical as level. How do we convey that in the metadata? Should we just leave out severity and provide it as database_specific? Feels a bit lame.
Aren't both informative? (I actually didn't see "last_affected" before which is why it uses the wrong name)
I hope you pick another solution. We are not an ecosystem. This problem is not unique to us and it would be a pity if others in our situation would have to claim they too are ecosystems.
Neat. That's indeed a cool idea. I think we have this information for most/many issues but it is going to take some work for us to convert that into structured data that we can pull out and insert into JSON. I will work on it. I made #241 to fix some of the easy things first. |
Thinking further, I think I won't. We already provide exactly every single vulnerable version and it is easy for any git user to figure out if they are in a vulnerable range or not. |
It did not comply with the format anyway so just drop it. Ref: #240
Check out the README here https://github.com/ossf/osv-schema/tree/main/validation :)
We see "fixed" as being the most informative here. last_affected is redundant in such cases because we have the
Indeed it's not ideal, but the best solution we've come up with so far is to use repository URLs to achieve this. Most open source projects can leverage this "namespace/ecosystem" of repository URLs to identify themselves. I do understand that repository URLs aren't totally accurate for curl, but would you still be agreeable to using that as an identifier?
Could I convince you otherwise? :) It's not trivial for git users to map the versions back to git tags/commits in an automated way. Also if the git metadata is provided explicitly, it enables our https://osv.dev API to index it and allow for queries such as:
To return any curl results that match. Thanks! |
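As an illustration of the commit-based lookup described above, here is a minimal Python sketch that builds a request for the OSV query endpoint (`https://api.osv.dev/v1/query`) with a `commit` field. The helper names `build_commit_query` and `query_by_commit` are made up for this example, and the hash passed in is whatever full-length commit the caller wants to check; this is not the exact command from the original comment.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_commit_query(commit_hash: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking OSV about one commit."""
    payload = json.dumps({"commit": commit_hash}).encode("utf-8")
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_by_commit(commit_hash: str) -> dict:
    """Send the query and return the decoded JSON response (needs network)."""
    with urllib.request.urlopen(build_commit_query(commit_hash)) as resp:
        return json.load(resp)
```

Note that the payload carries only the commit hash, with no package name or ecosystem.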
We use vulnerability data and I can confirm that the low/medium/high/critical scales differ wildly between different vendors, the And regarding the I also do hope it doesn't come to defining ecosystems for libraries, the repository URL solution seems like a better one, I do agree with @oliverchang that it all comes back to the source. |
severity
Maybe, but the CVSS score is in no way better. NVD for example re-scores everything anyway, so whatever we provide, they modify it. The CVSS is also too one-dimensional and leaves out a lot of factors, which tends to make the scores higher than they would be when taking a broader view. Which is part of the reason why we in the curl project opt to not play the CVSS game. There simply is no objective way to establish the severity of an issue. There are but opinions of different parties.
compliance
By not providing any info at all about severity or package the json object complies with the schema... Not sure how that makes the JSON better, but at least there are no complaints from the tool.
git ranges
Perhaps. It's a lot of work, and there are now several pieces of metadata we cannot provide anyway in the JSON object so the question is how much value we add by adding data to a partially supported JSON object? Like this:
Doesn't that also need to tell which package it asks about? And you cannot ask about curl because it has no ecosystem so that field cannot be populated: so this command line cannot be run anyway. |
I've started to update the advisories (and thus the json objects) with specific git commit info: introduced and fixed. |
These are extracted and used to populate the JSON objects accordingly. Ref: #240
The |
How does the API know which project/git repo to check then? |
I have now gone through and provided git range information to what I believe is all curl CVEs filed since 2017 (current count: 85). Those advisories have been updated and the JSON objects are now populated automatically with metadata from those. |
How does the API know which project/git repo to check then?
The spec says the git commit hashes must be full-length, which makes them
unique across all projects, making the project or git repo unnecessary. I
notice that the ones in the curl JSON are truncated.
|
So it then scans through all projects to find a match? Didn't expect that.
It sounds like the schema needs to be updated because it does not object to shortened hashes... |
I made the script convert all hashes to the long format now. |
that turned out to be a good idea, since that also verifies that the hash is correctly entered so it helped me find a few mistakes! |
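A rough Python sketch of that kind of conversion follows, assuming a local clone of the repository is available. It is not the script used for the curl advisories; `is_full_hash` and `expand_hash` are hypothetical helper names. The expansion relies on `git rev-parse --verify`, which fails for a hash that does not name a commit, so mistyped hashes surface as errors.

```python
import re
import subprocess

# A full-length SHA-1 commit hash: exactly 40 lowercase hex digits.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_full_hash(value: str) -> bool:
    """Return True if value looks like a full-length SHA-1 commit hash."""
    return FULL_SHA1.fullmatch(value) is not None


def expand_hash(short_hash: str, repo_dir: str = ".") -> str:
    """Expand an abbreviated hash using the git clone in repo_dir.

    Raises subprocess.CalledProcessError if git cannot resolve the hash
    to a commit, which also catches hashes that were entered incorrectly.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--verify",
         f"{short_hash}^{{commit}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```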
Are the old curl advisory IDs easily accessible (e.g. adv_20150108A)? Those are
listed in places under "vendor advisory" so would be good to have in the JSON
as an alias so they can be found that way.
|
We never used them as actual IDs in the project, we used them as URLs. We still have working redirects for those old URLs to end up on the appropriate current page. Is it really a good idea to upgrade those to "official IDs" now when we haven't used them for the last five years? Surely the only users of those are using the URLs? They could be resurrected from git, but then we would also need to invent a way to store it in the advisories and get it into the JSON. |
I suspected it might not be easy any longer. It's probably not worth the effort then.
|
Are there any more details left to fix now? The JSON seems to verify against the schema now. We now provide several different JSON sets:
|
Thank you all and @bagder for this! The JSON output looks pretty great now :) And we'll soon work on ingesting this into osv.dev to make this useful to all OSV users. The very last requests here are to:
Point 2 is also helpful for our own immediate osv.dev ingestion as well, since right now we only support ingesting OSV from git repos, or GCS buckets. We have an in-process FR to support ingesting from HTTP endpoints as well (such as your current feed), but that may take a little bit of time. |
Already done. Here's an example: https://curl.se/docs/CVE-2022-35252.json (there are links to these from the corresponding html version on the web) We also have curl version specific endpoints. Example: https://curl.se/docs/vuln-7.88.1.json - which contains all the CVEs that affect that specific curl release.
Everything that ends up in the JSON objects is data that is already present in git. The JSON files themselves are generated with a script that is also in git. The content for the JSON comes from each separate advisory markdown (example https://github.com/curl/curl-www/blob/master/docs/CVE-2021-22926.md) combined with the vuln.pm file (https://github.com/curl/curl-www/blob/master/docs/vuln.pm). The JSON files themselves are not in git because it seems much better to do it this way. Now we can easily regenerate them whenever we need to, for example if we want to update the JSON objects or if any metadata changes. |
Awesome!! Any way to get https://curl.se/docs/CURL-CVE-2022-35252.json to work as well? (to match the "id" field).
Sure, that sounds reasonable. |
You are already making me regret faking that Id. No. That's not a real Id. That's a fake one we put there only because the schema doesn't allow us to use the only Id we have for the issue: the CVE Id. I don't want to add infra around this fake Id to make it seem like it's real. I made the "URL" key in the JSON use the URL to the CVE specific JSON instead of the HTML one like I had before. I think the official schema should have a URL field like this. |
It's sadly necessary when there are several vulnerability databases that export the same IDs. We need a way to disambiguate them. e.g. https://osv.dev/vulnerability/DLA-3288-1 also refers to the same CVE, but we know this is the Debian specific one because of the DLA ID. For the sake of consistency, https://curl.se/docs/CURL-CVE-2022-35252.json would make it easier for users to discover from a given OSV ID in an automated way. What if https://curl.se/docs/CURL-CVE-2022-35252.json redirected to https://curl.se/docs/CVE-2022-35252.json instead? |
But what would make anyone try accessing |
A few places I can think of:
|
Okay. It would be way better if you allowed the issues to provide working URL themselves instead of assuming how you can request them. But I've made |
Thank you for understanding! I'll close this for now, since I don't think there's any other changes needed on your end here. |
Hey! I'm from the OSV (and OSS-Fuzz :) ) team. Really excited to see #237.
It looks like the current export is not spec-compliant, and we'd like to help work together on making this more compliant and get this integrated into https://osv.dev.
We have a JSON schema available that performs some of this validation that you can run against the current entry, which gives:
Additionally, there are some issues that our JSON schema currently fails to catch:
the "id" field is the CVE ID. Is there a CURL specific ID instead that we can use to disambiguate this from other OSV sources? The "CVE" id can go into the "aliases" field to help link it with other sources in OSV that have the same CVE. A cheap way to do this is to just prepend "CURL-" to the CVE ID or some variation of that.
Only one of "last_affected", or "fixed" should be specified, but not both.
Generally for fields that don't fit into the main OSV fields, you can also use database_specific (https://ossf.github.io/osv-schema/#database_specific-field) to add your own fields.
CC @dfandrich who also raised the issue on the package naming in ossf/osv-schema#94. Hopefully we can reach a resolution on that as well (e.g. by defining a "Curl" ecosystem).
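The suggestions above (a prefixed "id", the CVE as an alias, extra data under database_specific, and only one of "fixed" or "last_affected") can be sketched in Python as follows. This is an illustrative sketch, not the generator used by curl: the helper names and the version numbers in the events are hypothetical, and only the CVE ID and URL are taken from this thread.

```python
def make_osv_id(cve_id: str, prefix: str = "CURL-") -> str:
    """Derive a database-specific OSV id by prepending a prefix to the CVE ID."""
    return prefix + cve_id


def events_are_consistent(events: list) -> bool:
    """Return True unless the events mix 'fixed' and 'last_affected'."""
    keys = {key for event in events for key in event}
    return not {"fixed", "last_affected"} <= keys


cve = "CVE-2022-35252"
record = {
    "id": make_osv_id(cve),
    "aliases": [cve],  # links this entry to other OSV sources for the same CVE
    "database_specific": {
        # Fields that have no home in the main OSV schema go here.
        "URL": "https://curl.se/docs/CVE-2022-35252.json",
    },
    "affected": [{
        "ranges": [{
            "type": "SEMVER",
            # Hypothetical versions, for illustration only.
            "events": [{"introduced": "1.0.0"}, {"fixed": "2.0.0"}],
        }],
    }],
}

# Check every range in the record for the fixed/last_affected conflict.
ok = all(
    events_are_consistent(rng["events"])
    for affected in record["affected"]
    for rng in affected["ranges"]
)
```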