
Criteria for test relevancy #650

Closed
jugglinmike opened this issue May 26, 2016 · 3 comments

@jugglinmike
Contributor

From time to time, we receive suggestions for tests based on known bugs in existing implementations. These tests usually assert some expected behavior that is only implicitly defined by the specification. In other words, one wouldn't expect these tests to fail without some domain knowledge of specific implementation details.

This aspect concerns me somewhat because it can make it harder to set expectations for contributions. If there is a known bug in 3 major engines (e.g. gh-649), then the value of the test is clear. But if, for example, Espruino alone has some surprising behavior in a very specific case, should Test262 include tests for that?
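For concreteness, a test of this kind might look something like the sketch below. The YAML frontmatter and the `assert.sameValue` harness function are real test262 conventions; the scenario (an engine that skips `fromIndex` coercion) is a hypothetical bug invented for illustration.

```js
/*---
description: >
  Array.prototype.includes coerces an object fromIndex via its valueOf
  (hypothetical example of a test motivated by an implementation bug)
---*/

// The spec only implicitly requires this: coercing fromIndex to a number
// calls valueOf on an object argument. A hypothetical engine that skipped
// the coercion and treated an object fromIndex as 0 would find the 0 at
// index 0 and return true, failing this assertion.
var fromIndex = {
  valueOf: function() {
    return 1;
  }
};

assert.sameValue([0, 9].includes(0, fromIndex), false);
```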

A simple policy would be, "yes, we accept any test as long as it conforms to the ECMAScript standard." But as I mentioned elsewhere, an "anything goes" policy like that can make reviewing "greenfield" tests more difficult. In the worst case, the contributor would feel beholden to the arbitrary whims of their reviewer(s). I sometimes worry that @bterlson fell victim to this in the months-long review process for async functions.

I have personally submitted implementation-specific tests in the past, but I've always had mixed feelings about it.

I think what would satisfy me would be a policy like this:

> We accept tests for behaviors that are explicitly defined in the ECMAScript specification. In addition, if a bug has been exhibited by two or more independent implementations, we will accept tests asserting the correct behavior.

Does this formalism seem helpful? Or is it just a case of my own pedantry run amok?

@littledan
Member

I don't think we need a strict policy like this about which tests are irrelevant. It does seem reasonable for some test262 contributors to prioritize their work this way, but I don't see any reason to reject contributed tests that end up passing in 3/4 browsers. I also don't know how one would determine whether a test covers explicitly defined behavior or a bug; a bug means an implementation doesn't follow explicitly defined behavior, right? A vague definition sounds fine for you and others to use in prioritizing your own work, but it's not as appealing as a way to keep tests out.

@bterlson
Member

bterlson commented Jun 1, 2016

I think it would be good to have rough guidelines, but it seems very hard to define them rigorously enough to remove a significant amount of subjectivity. Even in the realm of clearly spec-targeting tests, there is an almost infinite space of tests we could require (e.g., testing with every sort of value, boundary value analysis, etc.). At some point these sorts of tests move from useful to useless, with the exception of those that are useful precisely because of known implementation bugs.
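To make that concrete: even a single integer-valued parameter invites a mechanical pile of boundary and coercion cases. A rough sketch using `String.prototype.charAt` (`assert.sameValue` is the test262 harness assertion; the case selection here is illustrative, not a required minimum):

```js
var s = "ab";

// Boundary values around the valid index range [0, length)
assert.sameValue(s.charAt(-1), "");   // just below the range
assert.sameValue(s.charAt(0), "a");   // lower boundary
assert.sameValue(s.charAt(1), "b");   // upper boundary
assert.sameValue(s.charAt(2), "");    // just past the end

// "Every sort of value": coercion cases multiply the count further
assert.sameValue(s.charAt(NaN), "a"); // NaN coerces to +0
assert.sameValue(s.charAt("1"), "b"); // strings coerce via ToNumber
assert.sameValue(s.charAt(1.9), "b"); // fractions truncate toward zero
```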

Since the space of valid spec-targeting tests is infinite, it might actually be the case that the cost of contributors' time does an effective enough job of gating test contributions. If someone actually goes to the trouble of writing a test and submitting a PR, chances seem very good that it's a useful test for some reason or other.

In other words, documenting the minimal set of tests someone should write for a feature is very good, but I'm not convinced we should document what tests we don't want.

@jugglinmike
Contributor Author

Thank you both; your perspective brought me back from the cliff of "eliminate all subjectivity from everything" (I find myself there often).

When it comes to "documenting the minimal set of tests someone should write for a feature," I'm interested in helping with that. I think it will take some more thought, and the discussion probably doesn't belong on this thread. I'll close this issue and follow up with another when I have some time for that additional documentation.
