
Criteria for test relevancy #650

Closed
jugglinmike opened this issue May 26, 2016 · 3 comments

@jugglinmike
Contributor

From time to time, we receive suggestions for tests based on known bugs in existing implementations. These tests usually assert some expected behavior that is only implicitly defined by the specification. In other words, one wouldn't expect these tests to fail without some domain knowledge of specific implementation details.

This aspect concerns me somewhat because it can make it harder to set expectations for contributions. If there is a known bug in 3 major engines (e.g. gh-649), then the value of the test is clear. But if, for example, Espruino alone has some surprising behavior in a very specific case, should Test262 include tests for that?
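For concreteness, a test of this kind might look something like the sketch below. The YAML frontmatter and the `assert.sameValue` harness function are real test262 conventions; the scenario (an engine that skips `fromIndex` coercion) is a hypothetical bug invented for illustration.

```js
/*---
description: >
  Array.prototype.includes coerces an object fromIndex via its valueOf
  (hypothetical example of a test motivated by an implementation bug)
---*/

// The spec only implicitly requires this: coercing fromIndex to a number
// calls valueOf on an object argument. A hypothetical engine that skipped
// the coercion and treated an object fromIndex as 0 would find the 0 at
// index 0 and return true, failing this assertion.
var fromIndex = {
  valueOf: function() {
    return 1;
  }
};

assert.sameValue([0, 9].includes(0, fromIndex), false);
```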

A simple policy would be, "yes, we accept any test as long as it conforms to the ECMAScript standard." But as I mentioned elsewhere, an "anything goes" policy like that can make reviewing "greenfield" tests more difficult. In the worst case, the contributor would feel beholden to the arbitrary whims of their reviewer(s). I sometimes worry that @bterlson fell victim to this in the months-long review process for async functions.

I have personally submitted implementation-specific tests in the past, but I've always had mixed feelings about it.

I think what would satisfy me would be a policy like this:

> We accept tests for behaviors that are explicitly defined in the ECMAScript specification. In addition, if a bug has been exhibited by two or more independent implementations, we will accept tests asserting the correct behavior.

Does this formalism seem helpful? Or is it just a case of my own pedantry run amok?

@littledan
Member

I don't think we need a strict policy like this about which tests are irrelevant. It does seem reasonable for some test262 contributors to prioritize their work this way, but I don't see any reason to reject contributed tests that end up passing in 3/4 browsers. I also don't know how one would determine whether a test covers explicitly defined behavior or a bug; a bug means an implementation doesn't follow explicitly defined behavior, right? A vague definition sounds fine for you and others to use in prioritizing your own work, but it's not as appealing as a way to keep tests out.

@bterlson
Member

bterlson commented Jun 1, 2016

I think it would be good to have rough guidelines, but it seems very hard to define them rigorously enough to remove a significant amount of subjectivity. Even in the realm of clearly spec-targeting tests, there is an almost infinite space of tests we could require (e.g., testing with every sort of value, boundary value analysis, etc.). At some point these sorts of tests move from useful to useless, with the exception of those that are useful precisely because of known implementation bugs.
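To make that concrete: even a single integer-valued parameter invites a mechanical pile of boundary and coercion cases. A rough sketch using `String.prototype.charAt` (`assert.sameValue` is the test262 harness assertion; the case selection here is illustrative, not a required minimum):

```js
var s = "ab";

// Boundary values around the valid index range [0, length)
assert.sameValue(s.charAt(-1), "");   // just below the range
assert.sameValue(s.charAt(0), "a");   // lower boundary
assert.sameValue(s.charAt(1), "b");   // upper boundary
assert.sameValue(s.charAt(2), "");    // just past the end

// "Every sort of value": coercion cases multiply the count further
assert.sameValue(s.charAt(NaN), "a"); // NaN coerces to +0
assert.sameValue(s.charAt("1"), "b"); // strings coerce via ToNumber
assert.sameValue(s.charAt(1.9), "b"); // fractions truncate toward zero
```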

Since the space of valid spec-targeting tests is infinite, it might actually be the case that the cost of contributors' time does an effective enough job of gating test contributions. If someone actually goes to the trouble of writing a test and submitting a PR, chances seem very good that it's a useful test for some reason or other.

In other words, documenting the minimal set of tests someone should write for a feature is very good, but I'm not convinced we should document what tests we don't want.

@jugglinmike
Contributor Author

Thank you both; your perspective brought me back from the cliff of "eliminate all subjectivity from everything" (I find myself there often).

When it comes to "documenting the minimal set of tests someone should write for a feature," I'm interested in helping with that. I think it will take some more thought, and the discussion probably doesn't belong on this thread. I'll close this issue and follow up with another when I have some time for that additional documentation.
