Test failures on Ubuntu 20.04.2-desktop-amd64 (clang) #180

Closed
posguy99 opened this issue Feb 17, 2021 · 6 comments
Labels
regressfail Regression test failure

Comments

@posguy99

posguy99 commented Feb 17, 2021

It fails:

[163] ubuntu $ bin/shtests types
#### Regression-testing /home/mwilson/src/ksh/arch/linux.i386-64/bin/ksh ####
test types begins at 2021-02-16+20:23:33
test types passed at 2021-02-16+20:23:33 [ 84 tests 0 errors ]
test types(C.UTF-8) begins at 2021-02-16+20:23:33
test types(C.UTF-8) passed at 2021-02-16+20:23:33 [ 84 tests 0 errors ]
test types(shcomp) begins at 2021-02-16+20:23:33
	shcomp-types.ksh[139]: z.r.s should be z.r.x
test types(shcomp) failed at 2021-02-16+20:23:33 with exit code 1 [ 84 tests 1 error ]
Total errors: 1
CPU time       user:      system:
main:      0m00.004s    0m00.004s
tests:     0m00.134s    0m00.054s

Then later it passes, and I didn't change anything or rebuild again.

[183] ubuntu $ bin/shtests types
#### Regression-testing /home/mwilson/src/ksh/arch/linux.i386-64/bin/ksh ####
test types begins at 2021-02-16+20:47:36
test types passed at 2021-02-16+20:47:36 [ 84 tests 0 errors ]
test types(C.UTF-8) begins at 2021-02-16+20:47:36
test types(C.UTF-8) passed at 2021-02-16+20:47:36 [ 84 tests 0 errors ]
test types(shcomp) begins at 2021-02-16+20:47:36
test types(shcomp) passed at 2021-02-16+20:47:36 [ 84 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.004s    0m00.004s
tests:     0m00.146s    0m00.039s

Then I rebuild it, and it fails the test again if I run the whole test suite:

test types(shcomp) begins at 2021-02-16+20:59:22
	shcomp-types.ksh[139]: z.r.s should be z.r.x
test types(shcomp) failed at 2021-02-16+20:59:22 with exit code 1 [ 84 tests 1 error ]

But if I run the test again standalone, it passes:

[187] ubuntu $ bin/shtests types
#### Regression-testing /home/mwilson/src/ksh/arch/linux.i386-64/bin/ksh ####
test types begins at 2021-02-16+20:59:44
test types passed at 2021-02-16+20:59:44 [ 84 tests 0 errors ]
test types(C.UTF-8) begins at 2021-02-16+20:59:44
test types(C.UTF-8) passed at 2021-02-16+20:59:44 [ 84 tests 0 errors ]
test types(shcomp) begins at 2021-02-16+20:59:44
test types(shcomp) passed at 2021-02-16+20:59:44 [ 84 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.000s    0m00.008s
tests:     0m00.134s    0m00.028s

And now if I re-run all the tests without re-building it, types passes again:

test types begins at 2021-02-16+21:10:02
test types passed at 2021-02-16+21:10:02 [ 84 tests 0 errors ]
test types(C.UTF-8) begins at 2021-02-16+21:10:02
test types(C.UTF-8) passed at 2021-02-16+21:10:02 [ 84 tests 0 errors ]
test types(shcomp) begins at 2021-02-16+21:10:02
test types(shcomp) passed at 2021-02-16+21:10:02 [ 84 tests 0 errors ]

A separate failure is this:

[188] ubuntu $ bin/shtests variables
#### Regression-testing /home/mwilson/src/ksh/arch/linux.i386-64/bin/ksh ####
test variables begins at 2021-02-16+21:04:20
	variables.sh[751]: warning: C library does not seem to verify locales: skipping LC_* tests
test variables passed at 2021-02-16+21:04:21 [ 150 tests 0 errors ]
test variables(C.UTF-8) begins at 2021-02-16+21:04:21
	variables.sh[751]: warning: C library does not seem to verify locales: skipping LC_* tests
test variables(C.UTF-8) passed at 2021-02-16+21:04:21 [ 150 tests 0 errors ]
test variables(shcomp) begins at 2021-02-16+21:04:21
	shcomp-variables.ksh[751]: warning: C library does not seem to verify locales: skipping LC_* tests
test variables(shcomp) passed at 2021-02-16+21:04:22 [ 150 tests 0 errors ]
Total errors: 0
CPU time       user:      system:
main:      0m00.007s    0m00.003s
tests:     0m02.350s    0m01.242s

Is the locales failure just a glibc-ism? The bad_LOCALE test also fails on CentOS 8 (sets errmsg to null) if I just test it in a shell prompt.

@McDutchie

McDutchie commented Feb 17, 2021

The locales warning is not a failure, it's just a warning (note 0 errors). It's not a problem in ksh, it is exactly what the message says it is: the system does not verify if a valid locale is set, so ksh can't print a diagnostic message, so we can't test for that. I should probably remove the warning as it is evidently too alarming.

As for the types.sh failure, I'm stumped. Does it only occur in shcomp? The GitHub CI runners also run Ubuntu amd64 and they never fail like this.

@posguy99

Re the locales test... no, leave the warning. Instead, I should learn how to read.

Whenever the types failure occurs, it only occurs in shcomp. So far.

It's a VM with 2 CPU cores and 4 GB of RAM.

I'm going to set up a loop and have it run the test continuously and count, but it's not like that test is timing-related so I dunno.
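
A minimal sketch of such a counting loop follows. The function name `count_failures` and the iteration count in the example are hypothetical, and the sketch assumes that `bin/shtests` returns a non-zero exit status when any test fails, which the output above does not confirm.

```shell
#!/bin/sh
# Run a command repeatedly and count how often it exits non-zero.
# usage: count_failures ITERATIONS COMMAND [ARG...]
# (count_failures is a hypothetical helper, not part of the ksh test suite.)
count_failures() {
    total=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$total" ]; do
        i=$((i + 1))
        # Discard the test output; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
    done
    echo "$failures"
}

# Example (assumes a non-zero exit status from bin/shtests on failure):
# count_failures 100 bin/shtests types
```

If the exit status turns out not to be reliable, the same loop could instead check each run's output for the `Total errors: 0` line shown in the logs above.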

@JohnoKing

I haven't been able to get the same exact test failure, although when I compile with Clang 10 and set CCFLAGS='-O2 -D_std_malloc' the types test fails with memory faults:

test types begins at 2021-02-17+17:43:34
shtests: line 375: 27285: Memory fault
test types failed at 2021-02-17+17:43:34 with exit code 267 [ 84 tests (killed by SIGSEGV) ]
test types(C.UTF-8) begins at 2021-02-17+17:43:34
shtests: line 375: 27290: Memory fault
test types(C.UTF-8) failed at 2021-02-17+17:43:34 with exit code 267 [ 84 tests (killed by SIGSEGV) ]
test types(shcomp) begins at 2021-02-17+17:43:34
shtests: line 410: 27296: Memory fault
test types(shcomp) failed at 2021-02-17+17:43:34 with exit code 267 [ 84 tests (killed by SIGSEGV) ]
Total errors: 3

The failures shown above are caused by commit 096f46e. I can't get the test to fail after reverting that commit.

@McDutchie

Cheers @JohnoKing -- that commit is a Solaris patch with no documentation on what it fixes or how or why, so I'm reverting that now. @posguy99, please test ksh after the revert and let me know if you can still reproduce the regression.

McDutchie added a commit that referenced this issue Feb 18, 2021
This reverts a Solaris patch (105-CR7032068) with no documentation
on what it fixes or how or why. There are reports about it causing
a crash and/or a regression:

#180 (comment)
@posguy99

posguy99 commented Feb 18, 2021 via email

@McDutchie

> sigchld.sh[89]: SIGCHLD trap queueing failed -- expected 'running=0 maxrunning=4', got 'running=1 maxrunning=4'

Urgh. I've never seen that one.

Variable/parameter expansion and signal handling are completely unrelated though, so it's unlikely to have anything to do with that patch being reverted.

The types.sh failure is very much related to variable expansion, though, so it would make sense if that one is gone now along with the crash that @JohnoKing experienced.

> It’s not replicable so far, I run the test again and again and it doesn’t happen. Just putting it out there.

Yes, thanks. It's good to be aware that there must be some race condition in the signal handling code somewhere, causing a rare and intermittent failure. Fixing it is another matter though. Intermittent faults are very hard to fix, particularly in this code base which is inscrutable in many ways, and was written by authors who liked experimenting with shiny new ideas so much that their quality control standards pretty much hit rock bottom. :-/

Speaking of intermittent faults, today I had ksh crash in Terminal.app again, at yet another point in the job control code. <sigh>

> I’ve had the Ubuntu build running in an infinite loop, it hasn’t failed types.sh yet.

Thanks. I'll close this issue for now, then. Please do comment here if it reoccurs.

@McDutchie McDutchie added the regressfail Regression test failure label Feb 18, 2021