-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The JSON for SSVC "options" splits out keys into individual records #576
Comments
Hello Jay, Thanks for your feedback. I believe the intent in options being an array was for it to be an ordered list, the Decisions Points in schema are considered ordered. This makes it possible to arrange the Decision Point bundles and embed them both in a Computed schema as well the full Decision Tree schema. There was some discussion related to this earlier that may be relevant Some related information from JSON Schema discussions Google Group |
First, I do not understand the need for these to be an ordered list. The "decision tree" in SSVC is simply for visualizing. The ordering of variables will never change the outcome. But that discussion is not what this issue is about. Let's just assume they are ordered, it still does not mean that the individual variables need to be represented in the JSON as ordered. In other words, the representation/presentation can be different than the storage. In #290 it is stated:
Which I think is true, the CVSS vector representation is ordered, so take this example from CVE-2007-3484 where the CVSS is stored in JSON:
Notice how the I maintain that the way the |
I think there's some confusion here on what needs to be "ordered" in SSVC. The rationale is laid out in decision record 0008, but the gist is that within a single decision point (think CVSS vector element), the values must be ordered. So for That is not saying anything about how to represent multiple specific values across decision points, which seems to be what this thread is touching on. I realize this doesn't resolve the question, I just wanted to point out that the "ordering" line of argument might be a red herring to the issue at hand. I haven't had a chance to reexamine the json schema mentioned, so I'm not prepared to comment on the rest of the thread at the moment. |
Ah okay. I think your recommended schema will work. The current schema also works, but I think your request is to primarily optimize the schema? Will the records consuming tools fail to understand or process the schema? If you feel strongly the flattening of the schema is user, you can make a PR suggestion with the schema, and update the JavaScript library for us to run our tests and take it in. We only need that your PR be digitally signed and be evaluated by us before merging it. On a related topic: The reason for the Computed schema For example a valid SSVC computed schema options below states that Exploitation is either "poc" or "active" but NOT "none" and Mission Prevalence has NOT been evaluated by the Role of this metrics developer.
|
So ignoring the current schema for a moment, the CVSS example @jayjacobs mentioned might turn @sei-vsarvepalli's example into
But that would have the unfortunate side effect of having a dictionary whose keys are always strings but whose values could be strings or lists, which could lead to ambiguous parsing. So maybe
would be preferable? Every value is consistently a list of strings. The list could be of length 1 up to the entire decision point (implying that the record is asserting no information about that decision point) This seems logically consistent with what we say in https://certcc.github.io/SSVC/howto/bootstrap/use/#partial-or-incomplete-information
|
I've given this some thought and I cannot open a PR for the work required, I just do not have the time and you do not want me working on javascript. But for future reference, an array of objects is generally treated as a collection of separate objects (think of unique rows in a table), this is the only JSON I've come across (so far) where an array is used to order a single object - meaning each object in the JSON is actually a part of an object defined at the array level. It's just not standard JSON practice. Also, an array denotes a one-to-many relationship. If you what something like "Exploitation" to have a one-to-many relationship with its values, than the whole thing should be an array regardless if one or many values exist, do not store When someone needs to convert JSON data into columnar format that relationship needs to apply across all records and cannot exist in a per-record definition. This means that even simple records would be an array:
|
I think this is consistent with what I intended in my comment above -- that the values would always be an array, even if there is only a single element. However, due to the nearly coincident timing of our respective comments, I'm not sure whether you were responding to mine or whether we were both replying nearly simultaneously. |
Yes sorry, my last comment was for @sei-vsarvepalli and your comment came in right before I posted. I agree @ahouseholder, your comment and mine seem to be very much in line. |
The decision points don't need to be an ordered list; the order of the decision points in a tree is not material to the output. There are some display choices we make about ordering points with a tree's display, but that should be something that is ensured by the display tool, not the JSON format. I agree the values should always be an array for each decision point, just to keep us aligned with the ability to relay partial information (though we could determine that use case is overtaken by events and no one wants to do it). Thanks Jay and Allen for converging on a solution. |
Capturing some additional thoughts on a way forward based on internal discussions:
flowchart TD
SSVC_Computed --> Decision_Point_Group
SSVC_Provision --> Decision_Point_Group
Decision_Point_Group --> Decision_Point
|
I have a branch that tries to resolve this issue with a consistent schema that can be used for Decision Points and Decision Point Groups. https://github.com/sei-vsarvepalli/SSVC/tree/feature/issue_576 specifically take a look at https://github.com/sei-vsarvepalli/SSVC/blob/feature/issue_576/data/schema_examples/Computed-CVE-2014-0751-Coordinator.json if that comes close to what would be reliable way to parse and represent SSVC data via ADP. This requires a bit of coordination with CISA developers to ensure the new schema also gets adopted. |
This part is also resolved in the recent PR #588 |
There are some additional snags we ran into in trying to address this in PR #588:
|
Now we are pursuing #599 as the potential way to represent ADP data. If @jayjacobs has comments, speak now... |
Status summary: I believe the primary issue in this ticket---the need for a schema for a succinct list of decision point selections, which sits upstream of cisagov/vulnrichment#40
In the course of working on that, we tried which was a bit too big to be manageable, but that spawned
However, I believe everything in the above list is peripheral to the resolution offered in #599. |
Looks like this expanded way beyond what the original issue was. I just spent 15 minutes going through various issues and pull requests here and I could not figure out what to comment on. Any chance someone could paste an example of or link to what the |
Hello Jay, It will no longer be Ideally the ADP container adheres to this proposed schema, although the "id" field will be redundant. We are exploring other optionally things like a |
Somewhat off topic, where did |
I'm stuggling a bit with how to interpret multiple decision point values. I understand the rationale, but this seems to imply that the default value for a decsion point is "all the values" then analysis excludes some values, narrowing down the decision. Using Automatable as an example, does this mean the effective starting or defaults look like this? {
"namespace": "ssvc",
"name": "Automatable",
"key": "A",
"version": "2.0.0",
"values": [
["Yes"], ["No"]
]
} Are empty values ( Are the muliple values each an individual boolean/flag, the value is set or not? I don't think logical OR works for what appear to be boolean values, e.g., Automatable cannot IRL be Yes and No at the same time. For other decision points, some values are not mutually exclusive (Exploitation could be PoC and Active at the same time). |
Hello @zmanion Some clarifications
|
Okay, my original and only concern was that the JSON produced a sane data structure when automatically parsed. And it looks like it does now, so thanks. I do hope you don't do an array of arrays like |
Hmm.. I think the array of array is probably a typo error. |
Yes sorry my bad, I either saw that somewhere or accidentally added the internal brackets, should be an array of strings. |
This is mostly resolved. Waiting on CISA adoption of the new schema and models. |
The schema (with example in
data/schema_examples/Computed-CVE-2014-0751-Coordinator.json
) is being leveraged by CISA for Vulnrichment and creates some unfriendly JSON.I opened issue #40 on vulnrichment to talk about it and they suggested they are just following this schema.
Long story short, by specifying each key:value pair in it's own object under
options
, it is flattened (by tools) to be unique records, when all of those key:value pairs represent a single object. (see issue 40 in vulnrichment)It also does not make sense to have the
options
specified as an array if it is a single object tied to the singlecomputed
field in the same record.Fixed example:
The text was updated successfully, but these errors were encountered: