Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to draft 06 #24

Closed
binarylogic opened this issue Oct 29, 2017 · 41 comments
Closed

Update to draft 06 #24

binarylogic opened this issue Oct 29, 2017 · 41 comments

Comments

@binarylogic
Copy link

Draft 06 of the JSON schema has been released, here's a summary of the changes.

I might work on a pull request (time permitting), but wanted to get an issue up.

@TreyE
Copy link

TreyE commented Nov 3, 2017

@binarylogic
I've begun work in this direction in https://github.com/TreyE/ex_json_schema/tree/draft06. It also includes my already existing PR that moves to the official schema testing repository for draft04 as well.

I've got it down to about ~60 failures, which is a good improvement over nothing, however the code still needs cleanup to remove some duplication, and I haven't yet decided if we should split the validators between the draft versions. For now, I've implemented much of the newer functionality as transforms to the draft04 schema (which allows me to leave the old code intact, but makes the error messages slightly confusing).

I haven't seen any updates by @jonasschmidt in almost a year now, so do you know if this project is dead and seeking a maintainer?

@jonasschmidt
Copy link
Owner

Thanks a lot for starting to work on this. I will take a look at the new draft and your branch to get a sense of the scope of those changes.

To be honest, I just haven't found the time to work on the project recently in between work, family and moving cities this year.

On the other hand, the only bigger improvement that was requested multiple times and is still missing is changing the way that errors are returned from strings to a data type that makes it easier to process the errors further. I already did most of the work in https://github.com/jonasschmidt/ex_json_schema/tree/validation-errors but haven't even found the time to take care of the last few %. I will most likely get a chance to finish that work in the following weeks. And also look into the rest of the smaller bugs that were reported.

So in a nutshell, I will put in some time until the end of the year and will decide then whether I can find the necessary time to put int the project. Or if I have to find someone to take over as maintainer.

Thanks a lot for taking an interest and moving the project forward.

@sascha-wolf
Copy link

How is the progress regarding this issue? Is there anything one can do to help?

@handrews
Copy link

Note that draft-07 is out, and we have published some release migration notes explaining the differences.

draft-07 is backwards compatible with draft-06

@asummers
Copy link
Contributor

asummers commented Apr 20, 2018

Just a quick note -- I'm working on resolving the differences for version 6 then eventually version 7. It's likely diverged enough from master to be its own project, unless @jonasschmidt is willing to accept some rather large PRs. I'm down to 4 failures against the official suite for v6 with no skipped tests thanks to the work by @TreyE. I will update here when publicly available and update on further progress on v7.

@jonasschmidt
Copy link
Owner

I'm definitely open to accepting any work done on supporting draft06/07. Still haven't found the time to look into it more deeply, but as soon as someone comes up with a working version, I can help with getting it merged and released.

@asummers
Copy link
Contributor

asummers commented Apr 20, 2018

The tl;dr of what I'm doing is following along with the pattern established by moving the Properties, Format, etc. logic out to their own modules, and then going a little further and giving them control over their internals of even accepting the validation, versus matching in the outer scope (likely introducing a behaviour). From there, since they'll all be given the root schema, they can switch on behavior relative to specific versions. That should dovetail into the work for making better error messages, as they can individually start producing structs or whatever we go with for consumption by the main validation chain in validate.ex. The internals are a little hard to parse with remote refs in schema.ex, but that's the last blocker to getting 6 working. After that, backporting and switching on v4 behavior will happen, then based on the patterns established there, adding in v7 support.

I imagine the internals will look a lot like:

@behaviour ExJsonSchema.Validator

@impl ExJsonSchema.Validator
def validate(root = %{version: 4}, schema, {"$ref" => ref}, data) do
  validate_v4(root, ref, data)
end

Then the outer module can just reduce over a list of Validator modules provided explicitly, removing intimate internal knowledge from the dispatcher.

@asummers
Copy link
Contributor

That also allows non official extensions to be added, such as with the uuid format supported in AJV, as someone can just make a new module with that behaviour, and attach to a list of custom_validators or something.

@handrews
Copy link

@asummers supporting non-official extensions is great as a lot of draft-08 will be about making extensibility easier. And easier to detect than the current situation of "well, if it's not in the spec it's an extension, but who knows what it means?"

@asummers
Copy link
Contributor

@handrews do you have a ballpark timeline of when you expect draft-08 to be released? No real reason behind the question other than curiosity and knowing when the underlying tooling would need to be adapted.

@asummers
Copy link
Contributor

asummers commented Apr 21, 2018

@jonasschmidt master...asummers:v6-upgrades This is the current diff of my branch. I was wrong about only having four failures, there were several false test positives due to a bug I had in the test setup. It now has 16 failures for v6 that I'm working through, mostly in the ref resolution. It has the general shape of what I'd expect it to look like now, so please give it a look over and let me know if you have any fundamental problems with the approach. Otherwise I'll keep on keeping on.

Interesting notes:

  • https://github.com/jonasschmidt/ex_json_schema/compare/master...asummers:v6-upgrades?expand=1#diff-7b11e4a3423b62253a7fd16d8c4b1e1fR2 This file is significantly simpler now. I rather like how that turned out

  • I added the Validator behaviour in that module, but I have no problems moving that to not pollute that file if that makes sense.

  • Test cases are now in a CaseTemplate that you simply give the URL for the tests, and it will go and do the right thing. I removed the ignore tests behavior because I wanted to just truck through it all, and that functionality made that code a lot more complicated, but this could be added back in a simpler fashion if we need to.

  • All JSON Schema concepts now have their own validators so it suggests an entry point for all future spec compliant checks, and since the validator.ex is just given a list, we can augment it with any module that expresses a validate/4 function per the behaviour. Unsure what that API should ultimately look like, but it allows that flexibility.

  • I have not added moduledocs anywhere but will eventually.

  • I removed existing tests as the behavior being checked did not seem to match v6 official tests. I imagine I'll have to fix that behavior when I back in v4.

  • https://github.com/jonasschmidt/ex_json_schema/compare/master...asummers:v6-upgrades?expand=1#diff-f5b1af2dd6c9a42d56541733adb15260R32 This is kind of fancy. I was getting errors with my other approaches and this came out cleanly, I think.

  • I tag all tests now. I found myself needing to run either an explicit test or a whole suite of them, so I adjusted the runner to allow mix test --only json_schema_multipleOf. I needed the prefix because "type" is reserved, apparently. Additionally, I allow you to opt into a test you're looking at individually by expressing its name. This allows more targeted debugging for failures.

  • master...asummers:v6-upgrades#diff-6445d87a5a54d579bdb0848cf521ec3eR10 Shows how some backporting could be done. Not sure how much I love that exact implementation, but it illustrates the idea and passes the test suite.

This is very much a WIP, so I'll be sure to remove all debug statements and get Dialyzer happy before PRing.

@asummers
Copy link
Contributor

asummers commented Apr 22, 2018

Okay further updates:

master...asummers:v6-upgrades

I have added back in ignoring of tests. You can either ignore by suite ("if-then-else" as in v7), or by test name and description ("validation of regular expressions: a regular expression with unclosed parens is invalid"). With this, we have full ability to iterate on v4, v6 and v7 failures, and addition of v3 if we choose to add it. I have no such plans to add v3 to be honest, but the pattern is clear for if that were to be necessary for someone.

There are currently 79 skipped tests among the 3 suites being tested, but many are the same underlying errors around refs and some dependency tests.

With the ignoring of tests, all 3 test suites are passing, and I think this warrants PRing into some feature branch somewhere. It can be iterated on, and the remaining few failures can be moved to zero.

I began introducing moduledocs for the various validators, linking to their docs in the specs. This pattern is clear as to how to proceed forward. Don't know if that documentation is sufficient, but it begins the conversation there.

There is currently one failing Dialyzer error that I'll hopefully figure out in the next day or so.

I did a bunch of cleanup all around, and tried to simplify concepts where I saw them. Please let me know how I should proceed with getting this into the mainline branch.

@asummers
Copy link
Contributor

asummers commented May 6, 2018

@jonasschmidt Hey there, just a ping, wondering if you've had a chance to read through any of the above.

@jonasschmidt
Copy link
Owner

Sorry for losing track of this. I had this filed under "you're still working on it" and it completely slipped my mind.

The work you did so far is really impressive, thanks a lot! I will have some time today to look into the details and give you feedback, and then we can come up with a plan to move forward and get it merged.

@asummers
Copy link
Contributor

asummers commented May 8, 2018

No worries whatsoever! It's still WIP as far as ref management, but the lion's share of features are working as expected. We're using it in production for non-ref'd schemas and it works well. I'm still a bit unclear on the ref semantics, as they're not mentioned in the spec explicitly, but those seem discrete enough that they make sense as a separate PR to the branch, probably.

@jonasschmidt
Copy link
Owner

Yes, the whole ref handling business is a pain to wrap your head around. Especially when it comes to altering the resolution scope inside a schema with the id: http://json-schema.org/draft-04/json-schema-core.html#rfc.section.7.2

It also used to be a bit easier in an earlier implementation because I just used closures to resolve references, which were already bringing their scope with them. Unfortunately that did not work with Erlang releases though, so the whole thing became a bit more involved.

I will also have to look into what the later drafts have to say about refs. I seem to remember some discussions about dropping support for all kinds of complex and mind-boggling ref behaviour.

@asummers
Copy link
Contributor

asummers commented May 8, 2018

My understanding (and maybe @handrews can chime in here) is that Draft-08 is giving them proper treatment. There were several open issues on the spec repo around cleaning up their semantics.

@jonasschmidt
Copy link
Owner

I created a branch that you can send a PR to (https://github.com/jonasschmidt/ex_json_schema/tree/multi-draft-support).

It seems like a bunch of the test failures regarding refs are about those scope changes that I was referring to before. I think most of them fail against the current implementation as well, so they could definitely be handled separately.

If it's not too much effort, it would be great to fix them as part of this overhaul, to get to a clean slate. Otherwise there's the added difficulty of merging/rebasing if those fixes are made on master. I can definitely help with that.

@handrews
Copy link

@asummers Not quite sure of the draft-08 timeline, although I'm still actively working on it. I also just quit my job and am taking the summer off from the tech industry. So it depends on how much tech-related work I feel like doing (but here I am commenting on things so obviously I'm doing some :-) )

I'll comment separately on the other points you and @jonasschmidt raise above.

@handrews
Copy link

@asummers

I'm still a bit unclear on the ref semantics, as they're not mentioned in the spec explicitly

@jonasschmidt

I will also have to look into what the later drafts have to say about refs. I seem to remember some discussions about dropping support for all kinds of complex and mind-boggling ref behaviour.

While I don't like to speak ill of work done in good faith, the approach that draft-04 took to the concepts was very complicated and seemed to confuse many people. Austin Wright rewrote all of that for the following draft (reframing "canonical" vs "inline" in terms of "internal" and "external", although see below for further changes), and we've tweaked the wording every time since then.

In draft-handrews-json-schema-01 (which is still "draft-07" as it's just a bugfix/clarification of handrews-00) we reframed the whole mess again in terms of how you load schemas and dereference $ref for previously loaded vs unknown ref targets. This seems much more straightforward than earlier approaches.

We also added a lot of examples to clarify how $id and $ref work.

As far as I know, this is all essentially the same as draft-04, just disentangled and with the emphasis placed on different ways of thinking about things. Any odd cases that were "changed" between draft-04 and draft-07 we're pretty sure were bugs in the first place. It just boils down to totally normal RFC 3986 URI reference resolution, plus being able to use a previously-loaded schema if available.

Feedback on the latest text for this area is very much appreciated. The concepts aren't really that complicated, they've just been made complicated for... um... idek. Reasons.


$ref changes slightly in draft-08 to clarify its semantics. We're nailing down the language in PRs right now, actually.

Specifically, instead of treating $ref as (conceptually) being replaced by the reference target, we are changing it (again, conceptually) so that the $ref keyword simply has the same result as the reference target. This means that other schema keywords can be used alongside of $ref and combined just as any other keyword result is combined. TL;DR: it does what you'd expect, and what many people try to do already.

By "conceptually" I mean that implementations may actually do something different internally, like make a copy of the reference target, treat it as if it's stuffed into an allOf with the adjacent keywords, etc. As I said, we're sorting through this in PRs right now if anyone wants to comment. It's a significant change, but we think it gets rid of some seriously counter-intuitive behavior.

@handrews
Copy link

Also, modeling $ref as replacement produces some logical inconsistencies when the target and the source use different meta-schemas. I'll spare you the details as it's pretty mind-bending. Modeling it as "has the same result" means that you don't care what meta-schema the target is using. The source and target can be processed independently, with different meta-schemas if necessary.

@jonasschmidt
Copy link
Owner

Thanks @handrews for that in-depth explanation of what's happening with refs. That was really helpful. I have to say that the additional tests around the different ref behaviours in the official test suite help a lot as well. They weren't there when I came up with the original implementation so I had to make sense of the spec (and it turns out I got it slightly wrong). I always find it easier to implement against a set of tests than against a written spec (that official test suite is such an awesome idea in general, every standard should have that).

@jonasschmidt
Copy link
Owner

@jonasschmidt
Copy link
Owner

Huh, just found out that the script that's run on Travis didn't actually run the tests but only Dialyzer 🤦‍♂️ . Turns out that mix dialyzer --halt-exit-status actually halts and the coveralls.travis task that runs the tests is never executed. That behaviour must have changed recently.

Just found out that the build was actually broken since this commit here: 27d4020.

I reverted that commit and fixed the Dialyzer errors only. I hope that doesn't uncover other tests that were broken for an older Elixir version somewhere in between :)

@asummers
Copy link
Contributor

asummers commented Jun 1, 2018

Interesting. Fingers crossed!

@lud
Copy link

lud commented May 24, 2019

Hello. Sorry to jump aboard but I did not want to create an issue for that.
With the current master branch, wich supports dratf 4 as far as I know, will it fait if I use a draft 6/7 schema or will it just ignore the rules ?

Thank you.

@bradleygolden
Copy link

bradleygolden commented Oct 16, 2019

@jonasschmidt, is supporting multiple drafts still something worth pursuing?

@handrews
Copy link

Draft 2019-09 (formerly known as draft-08) is out, and is a major milestone solving numerous long-standing difficult problems. It also has a formal extensibility mechanism. OpenAPI is planning to adopt this draft in their 3.1 specification version, and we hope that there is only one more significant draft, which will be based on feedback to this one.

If you only implement one newer draft, implement this one. draft-06 and draft-07 support is nice if you can add it, but jumping to 2019-09 should be the main goal.

@jonasschmidt
Copy link
Owner

Yes, the problem was that there was some restructuring going on in the project while the multi-draft support PR was being worked on. I will have a look this week if it’s possible to rebase that PR, otherwise I will see how much work it would be to do it more manually.

@icehaunter
Copy link

Sorry to bump a thread, but is there any news about supporting 2019-09 draft? Is there anything that could be done to contribute?

@jonasschmidt
Copy link
Owner

Yes, thanks for bringing this on my radar again. The whole coronavirus didn't help with my work on that 😉
I will have the chance to look into it and hopefully make some progress on next Wednesday. I will update this issue with the status/progress then.

@handrews
Copy link

handrews commented Jul 9, 2020

Note that we will put out a new draft in conjunction with the OpenAPI Specification 3.1, such that OAS 3.1 and the new JSON Schema draft are fully compatible with each other. At this point I'd recommend targeting that rather than 2019-09 (it's not necessary to support every draft, and the purpose of 2019-09, along with OAS 3.1-RC0, has been to get feedback on new issues.

I'm informally keeping track of keyword changes across drafts. The bits on the next draft are not guaranteed accurate at all times, but at least were accurate recently (and should be accurate as of right now).

@jonasschmidt
Copy link
Owner

Nice, that's great to hear that those standards are converging. Supporting the "final" compatible standard would definitely be the goal once the multi-draft support is working for drafts 6/7. I will know more by the end of the day.

@jonasschmidt
Copy link
Owner

Thanks @albertored for fixing most of the filing tests. That was a great help 🙇

I fixed the remaining tests, added if/then/else support and now all the tests that are supposed to be passing for drafts 6 and 7 are green. There are still a few things to clean up around returned error structs, and I have to cover the actual errors that are returned for the new keywords with unit tests. But the bulk of the work is done and I should have a final release out within a few weeks (after my summer holidays 🌞).

@TreyE
Copy link

TreyE commented Jul 21, 2020

I know I haven't checked in here in dogs years, but I'd like to thank everybody involved for all the work.

This is one of the longest running and most complex issues I think I've ever seen tackled on an open source project and you all deserve a real kudos for the perseverance and dedication.

@icehaunter
Copy link

Hello, @jonasschmidt. Very sorry to bump the thread again, but is there any news for this? We've finally hit the point where we need the newer spec, so I was wondering what I could do to help it along?

@jonasschmidt
Copy link
Owner

Yes, thanks for bumping this again. I'm almost done, all the tests for draft 6 and 7 are passing and the error handling for new validation rules is in place. The only thing left is some performance improvements.

If you want to give it a try you can have a look at the https://github.com/jonasschmidt/ex_json_schema/tree/multi-draft-support-rebased branch. The public API doesn't change, so it would definitely help if you could verify everything works as expected.

@bradleygolden
Copy link

@jonasschmidt, I swapped out our custom implementation that we have of draft 7 with your branch and all the tests passed. This is a good indicator for us that it works but I would definitely want to do more thorough testing before completely switching over. Either way this is a good sign and I'm very excited for draft 7 support. 😄

@davidebriani
Copy link

Hi, checking in to see if this is still work in progress. @jonasschmidt any news here? Support for latest drafts would be awesome.
Nice work so far by the way :)

@jonasschmidt
Copy link
Owner

Hey, I made some progress in the mean time and will publish a release candidate today. I will update this thread once it's out.

@jonasschmidt
Copy link
Owner

Ok, version 0.8.0-rc1 is published, which adds full support for drafts 06 and 07. Please let me know if there are any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants