-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Process.environ() on *BSD family #1800
Conversation
Oh nice! I see some errors (errno 0): |
Hi, |
Sounds good. ;) |
0a768c7
to
1dc6963
Compare
09fc282
to
5a20c69
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review. Little style request - please use:
if (something) {
...
}
Instead of:
if (something)
{
...
}
psutil/_psutil_bsd.c
Outdated
kd = kvm_openfiles(NULL, NULL, NULL, KVM_NO_FILES, errbuf); | ||
#endif | ||
if (!kd) { | ||
PyErr_SetString(PyExc_RuntimeError, errbuf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a utility function, currently exposed for OpenBSD only, which can convert this kvm error:
psutil/psutil/arch/openbsd/specific.c
Lines 50 to 61 in 994c429
static void | |
convert_kvm_err(const char *syscall, char *errbuf) { | |
char fullmsg[8192]; | |
sprintf(fullmsg, "(originated from %s: %s)", syscall, errbuf); | |
if (strstr(errbuf, "Permission denied") != NULL) | |
AccessDenied(fullmsg); | |
else if (strstr(errbuf, "Operation not permitted") != NULL) | |
AccessDenied(fullmsg); | |
else | |
PyErr_Format(PyExc_RuntimeError, fullmsg); | |
} |
I think it makes sense to move convert_kvm_err
out of psutil/arch/openbsd/specific.c
and put it into psutil/_psutil_common.c
(guarded by an #ifdef BSD...
) so that you can invoke it from this file, like this:
if (! kd) {
convert_kvm_err("kvm_openfiles", errbuf);
return NULL;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, nice, I've moved it.
psutil/_psutil_bsd.c
Outdated
|
||
if (!p) | ||
{ | ||
PyObject *ret = NoSuchProcess(kvm_geterr(kd)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not completely sure this is the correct action.
kvm_getprocs
doc is not clear on whether you can/should use kvm_geterr
, but I suppose you can. What I'm worried about here is that the error may not necessarily be a string like "no such process". Perhaps it can also be "access denied". Or something else (in which case we want RuntimeError
). So you should probably parse the string and decide what to do.
You could try a test: call this function for a PID which does not exist. If kvm_geterr
returns a string indicating that, then it means you can rely on kvm_geterr
.
In that case, what I recommend, is reusing and extending convert_kvm_error
utility function (check if string is something like "no such process"), after you move it into psutil/_psutil_common.c
(see my comment above).
The resulting code here should be something like this:
if (!p) {
convert_kvm_err("kvm_getprocs", kvm_geterr(kd));
kvm_close(kd);
return NULL;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked the libkvm
sources and /usr/src/bin/ps/ps.c
on all three BSD variants; they all use kvm_err()
if there's a failing kvm_*()
call.
I've written a tiny standalone test program to check for all return values (on all three platforms).
Fortunately, they all behave the same, they set errno
.
Unfortunately, OpenBSD returns different errnos than NetBSD and FreeBSD.
-
kvm_open*
isn't likely to fail withEACCESS
, as it doesn't open/dev/[k]mem
as in the old days.
It will only fail if the library cannot allocate internal memory, or cannot open the kernel boot file.errno
will be set appropriate.
The question is which exception should be raised -RuntimeError
orOSError
? -
kvm_getprocs
isn't likely to fail itself.
If the search for the PID fails,cnt
is set to0
. If the function itself returnsNULL
, it means that it was not able to allocate memory for the result, or the call parameters were invalid due to a programming error (&cnt
pointing to invalid storage,kd
invalid or NULL, meaning a previouskvm_open*
did not suceed).
We can raiseRuntimeError
orOSError
if the call itself failed.
If the call reportscnt == 0
, this maps toESRCH
("no such process"). This will also happen if the process with the given PID exists, butsysctl security.bsd.see_other_uids == 0
.
Unfortunately, the call behaviour is not consistent in all test cases ofpsutil/tests/test_process.py
.
I had to tweak error behaviour a bit so test results are consistent which what other platforms do, though I'm not perfect happy with that. On the other hand, a Python application which usespsutil
has generic error handling and should not need platform specific error handling for the same error situation.
E.g.TestProcess::test_halfway_terminated_process
results in different behaviour between*BSD
andLinux
platforms. I've tweaked return results in a way that the unit test passes. -
kvm_getenvv
may returnEPERM
if the process belongs to another user. It may returnENOMEM
if it cannot allocate room for the environment strings.
It may also returnEINVAL
(see below)
I understand your concern mapping system errors to different Python errors;
now from what I've tested, errno
is set for all failing libkvm
calls, thus mapping EINVAL
and ESRCH
to NoSuchProcess()
seems fine.
There is just one race which comes to my mind: When calling kvm_getprocs()
, it may return success.
But before our call to kvm_getenvv()
, we might have been preempted by the kernel, and the process in question might have terminated.
In this case, our later call to kvm_getenvv()
will return EINVAL
, and we probably should remap this to ESRCH
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kvm_open* isn't likely to fail with EACCESS, as it doesn't open /dev/[k]mem as in the old days.
It will only fail if the library cannot allocate internal memory, or cannot open the kernel boot file. errno will be set appropriate.
The question is which exception should be raised - RuntimeError or OSError?
If errno
is set then it's easy: just use PyErr_SetFromErrno(PyExc_OSError)
. My understanding though is that kvm_
functions do not set errno
, so we need to parse kvm_geterr
. As a general rule, if errno
is not set, parse kvm_geterr
result and:
- set
ESRCH
if the string indicates the process is gone (useNoSuchProcess("...");
) - set
EPERM
if the string indicates some lack of privileges (useAccessDenied("...");
) - raise/call
PyErr_NoMemory()
if the string indicates it could not allocate memory - raise
RuntimeError
in all other circumstances; if you can, also include the result ofkvm_geterr
This logic (3 if
s, 1 else
) should be handled by convert_kvm_err
utility function, and (if you feel like it, while you're at it =)) reused by all functions that use kvm_*
APIs.
Maybe convert_kvm_err
could have another if
condition, which is executed first:
if (errno != 0)
PyErr_SetFromErrno(PyExc_OSError)
...but this assumes that one of the previous kvm_
calls did set errno
, but again, my understanding is that they usually don't (the various man pages never mention errno
).
kvm_getprocs isn't likely to fail itself. If the search for the PID fails, cnt is set to 0 [...]
Then raise NoSuchProcess("...");
[...] If the function itself returns NULL, it means that it was not able to allocate memory for the result, or the call parameters were invalid due to a programming error (&cnt pointing to invalid storage, kd invalid or NULL, meaning a previous kvm_open* did not suceed). We can raise RuntimeError or OSError if the call itself failed.
Mmmm... if we have a way to differentiate with kvm_geterr
(is it ENOMEM? is it EINVAL? ....) then I would say to just stick with convert_kvm_err
utility fun.
If the call reports cnt == 0, this maps to ESRCH ("no such process"). This will also happen if the process with the given PID exists, but sysctl security.bsd.see_other_uids == 0.
Damn... This I don't know. =)
There is just one race which comes to my mind: When calling kvm_getprocs(), it may return success.
But before our call to kvm_getenvv(), we might have been preempted by the kernel, and the process in question might have terminated. In this case, our later call to kvm_getenvv() will return EINVAL, and we probably should remap this to ESRCH?
Mmm... hard call. I would say yes, let's map it to ESRCH
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest update of the pull requests contains the changes I've mentioned so far.
Now psutil/tests/test_process.py
passes all Process().environ()
related stuff on all *BSD
variants.
Unfortunately, CirrusCI is now not happy with these latest changes.
Interestingly, the version yesterday passed the checks, which makes me a bit puzzled.
ERROR: psutil.tests.test_contracts.TestFetchAllProcesses.test_all
File "/tmp/cirrus-ci-build/psutil/_psbsd.py", line 674, in environ
return cext.proc_environ(self.pid)
OSError: [Errno 22] Invalid argument
Is it possible that there is some incosistency what test_process.py checks, and what test_contracts.py checks? It looks that someone is expecting an OSError
exception where the other one expects a NoSuchProcess
exception?
I will have to check whether OSError
should not be raised altogether. With Linux as reference platform in mind, it will never raise this exception anyway, as the system interface is completely different (and open(/proc/<pid>/...)
will raise IOError, which is probably remapped later)
The whole test suite does not pass altogether, though it looks that it doesn't pass on the 5.7.2
release tag either?
That happens on all *BSD
platforms here..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding though is that kvm_ functions do not set errno, so we need to parse kvm_geterr
Actually, almost all library functions do some system calls, especially the kvm*
stuff.
So, after any system call which failed, there will be errno
set.
Usually library utility functions do not clear errno
if they return an error condition themself. They probably return addition higher-level error information, like kvm_geterr()
, which in this case is of no useful information, judging from the source code review of the kvm
library code: "cannot read zombproc", "nprocs corrupt", "cannot read nprocs", "out of memory", "cannot allocate memory", "unsupported libelf", "exec filename too long", "bad flags arg", "unsupported architecture", "invalid address", "kernel is not an ELF file". ;-)
This avoids inventing a dozen more library error codes if most stuff may be mapped on typical errno
codes anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, almost all library functions do some system calls, especially the kvm* stuff. So, after any system call which failed, there will be errno set.
Got it. Then I think it makes sense to do this:
#if defined(PSUTIL_FREEBSD) || defined(PSUTIL_OPENBSD) || defined(PSUTIL_NETBSD)
void
convert_kvm_err(const char *syscall, char *errbuf) {
char fullmsg[8192];
sprintf(fullmsg, "(originated from %s: %s)", syscall, errbuf);
if (errno != 0) // <- new
PyErr_SetFromOSErrnoWithSyscall(syscall); // <- new
else if (strstr(errbuf, "Permission denied") != NULL)
AccessDenied(fullmsg);
else if (strstr(errbuf, "Operation not permitted") != NULL)
AccessDenied(fullmsg);
else
PyErr_Format(PyExc_RuntimeError, fullmsg);
}
#endif
As for EINVAL
, it's not clear to me where it originates from in psutil_proc_environ
function. You should put printf()
s to figure that out. At that point I'd say you'll have to decide what to do: whether to raise NoSuchProcess
or AccessDenied
.
From C you can use psutil_pid_exists
to check if the PID exists. If it does raise AD, else NSP. It's done elsewhere in the C code: grep for psutil_pid_exists
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...e.g. here:
psutil/psutil/arch/osx/process_info.c
Lines 166 to 174 in 5e47e0b
if (sysctl(mib, 3, procargs, &argmax, NULL, 0) < 0) { | |
// In case of zombie process we'll get EINVAL. We translate it | |
// to NSP and _psosx.py will translate it to ZP. | |
if ((errno == EINVAL) && (psutil_pid_exists(pid))) | |
NoSuchProcess("sysctl"); | |
else | |
PyErr_SetFromErrno(PyExc_OSError); | |
goto error; | |
} |
Even if this is OSX, it may be that on BSD
kvm_getenvv
may raise EINVAL
in case of zombie process.
setup.py
Outdated
os_major = int(platform.release().split(".")[0]) | ||
if os_major >= 8: | ||
libraries.append("kvm") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary? It wasn't thus far. I'm worried about the string parsing. If it fails, the whole installation breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reminder.
As mentioned, I actually more or less rebased the changes I made years ago, and for whatever reason it seemed neccessary then - probably I used a .0
FreeBSD version which had some library glitches.
With all the (FreeBSD) versions mentioned in the commit log, it is not neccessary to specific libkvm
at all, as libdevstat
pulls it already!
psutil/_psutil_bsd.c
Outdated
@@ -73,7 +74,11 @@ | |||
#include <utmp.h> // system users | |||
#else | |||
#include <utmpx.h> | |||
#endif | |||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to indent =)
83b0db3
to
c08876b
Compare
Judging from the latest test failure it seems you cannot rely on So maybe, something like this (not tested)? #if defined(PSUTIL_NETBSD)
envs = kvm_getenvv2(kd, p, 0);
#else
envs = kvm_getenvv(kd, p, 0);
#endif
if (!envs) {
err = kvm_geterr(kd);
if (err) { // probably wrong, want to check if `strlen() > 0`
convert_kvm_err("kvm_getenvv", err);
goto error;
}
if (psutil_pid_exists(pid))
py_err = AccessDenied("kvm_getenvv");
else
py_err = NoSuchProcess("kvm_getenvv");
goto error;
} |
Well, let's have some fun ;-) For a moment, let's assume that the code is correct,
So, why there are failures then? Well, probably because of good old [nightmare] reasons:
For example: PLATFORM A: > gmake clean install test
======================================================================
FAIL: psutil.tests.test_posix.TestSystemAPIs.test_disk_usage
Ran 542 tests in 7.023s
FAILED (failures=1, skipped=220) Oh well, one test probably broken. Again: > gmake test
======================================================================
FAIL: psutil.tests.test_posix.TestSystemAPIs.test_disk_usage
AssertionError: 0.0 != 100.0 within 1 delta (100.0 difference)
======================================================================
FAIL: psutil.tests.test_bsd.BSDTestCase.test_disks
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ag/repos/git-repos/giampaolo-psutil/psutil/tests/test_bsd.py", line 116, in test_disks
self.assertEqual(usage.total, total)
AssertionError: 17994129408 != 17993093120
----------------------------------------------------------------------
Ran 542 tests in 6.367s
FAILED (failures=2, skipped=220) Hm, well, now two test failing. Interesting! So let's just run tests in a loop to see what's happening! > while :
do
gmake test 2>&1 | fgrep "FAILED "
done
FAILED (failures=3, skipped=220)
FAILED (failures=2, skipped=220)
FAILED (failures=2, skipped=220)
FAILED (failures=3, errors=1, skipped=220)
FAILED (failures=3, skipped=220) Okay, so now we have at least one more test constantly failing.. uhm, probably related to our redirected stdout?! Now let's be more brave, another platform with the same OS and Python version! PLATFORM B: Sources: origin https://github.com/giampaolo/psutil.git gmake test 2>&1 | fgrep FAIL
psutil.tests.test_contracts.TestFetchAllProcesses.test_all ... FAIL
psutil.tests.test_posix.TestSystemAPIs.test_disk_usage ... FAIL
psutil.tests.test_process.TestProcess.test_terminal ... FAIL
psutil.tests.test_bsd.BSDTestCase.test_disks ... FAIL
psutil.tests.test_bsd.FreeBSDSystemTestCase.test_cpu_stats_ctx_switches ... FAIL
psutil.tests.test_bsd.FreeBSDSystemTestCase.test_cpu_stats_interrupts ... FAIL
FAIL
FAIL: psutil.tests.test_contracts.TestFetchAllProcesses.test_all
FAIL: test_exe pid=1426, ret='/usr/local/sbin/cupsd'
FAIL: psutil.tests.test_posix.TestSystemAPIs.test_disk_usage
FAIL: psutil.tests.test_process.TestProcess.test_terminal
FAIL: psutil.tests.test_bsd.BSDTestCase.test_disks
FAIL: psutil.tests.test_bsd.FreeBSDSystemTestCase.test_cpu_stats_ctx_switches
FAIL: psutil.tests.test_bsd.FreeBSDSystemTestCase.test_cpu_stats_interrupts
FAIL: psutil.tests.test_bsd.FreeBSDSystemTestCase.test_cpu_stats_syscalls
FAILED (failures=7, errors=2, skipped=223) Same platform, same OS versions, same python, a server with some more long-running processes, but now 7 failures? Oh heck.. Okay, well. Now let's try a different BSD! OpenBSD openbsd6.localdomain 6.7 GENERIC.MP#182 amd64 Sources: origin https://github.com/giampaolo/psutil.git gmake clean all install
psutil.tests.test_process.TestProcess.test_terminal ... FAIL
psutil.tests.test_system.TestCpuAPIs.test_cpu_times_comparison ... FAIL
psutil.tests.test_connections.TestSystemWideConnections.test_it ... FAIL
FAIL
psutil.tests.test_contracts.TestSystemAPITypes.test_net_connections ... FAIL
FAIL: psutil.tests.test_process.TestProcess.test_terminal
FAIL: psutil.tests.test_system.TestCpuAPIs.test_cpu_times_comparison
FAIL: psutil.tests.test_connections.TestSystemWideConnections.test_it
FAIL: psutil.tests.test_contracts.TestFetchAllProcesses.test_all
FAIL: test_connections pid=0
FAIL: psutil.tests.test_contracts.TestSystemAPITypes.test_net_connections
FAILED (failures=5, errors=2, skipped=266) Okay, 5 failures on recent OpenBSD version. Now let's see if the failure rate keeps stable: while :; do make test 2>&1 | fgrep --line-buffered "FAIL "; done No output anymore! Output redirection now causes the test suite to hang.. oh well. script -c "while :; do gmake test; done" /tmp/make.out
~~~ wait 5 minutes ~~~
CTRL-C
fgrep "FAILED " /tmp/make.out
FAILED (failures=4, errors=2, skipped=266)
FAILED (failures=4, errors=2, skipped=266)
FAILED (failures=4, errors=2, skipped=266)
FAILED (failures=4, errors=2, skipped=266)
FAILED (failures=4, errors=2, skipped=266) Okay, on this platform there seems to be no timing or other race in either the code under test, or the test suite. Now let's move on! Back to the laptop, now with the new sources: Let's check what's Cirrus CI says! (no idea why it's labelled "freebsd_13_py2", actually we shall not fool ourself, actually it's Cirrus CI / freebsd_13_py3 failed 14 hours ago in 2m 59s
Task Summary
Instruction main failed in 00:09
Details
00:08 clone
02:39 install
x 00:09 main
return fun(self, *args, **kwargs)
File "/tmp/cirrus-ci-build/psutil/_psbsd.py", line 673, in environ
return cext.proc_environ(self.pid)
ProcessLookupError: [Errno 3] No such process (originated from kvm_getenvv)
During handling of the above exception, another exception occurred:
psutil.NoSuchProcess: psutil.NoSuchProcess process no longer exists (pid=1, name='init')
During handling of the above exception, another exception occurred:
AssertionError: NoSuchProcess not raised by Process
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/cirrus-ci-build/psutil/tests/test_contracts.py", line 397, in test_all
----------------------------------------------------------------------
Ran 619 tests in 5.078s
FAILED (failures=1, errors=2, skipped=240)
FAILED
*** Error code 1 Okay, then now let's run this test suite locally! Sources: origin [email protected]:ArminGruner/psutil.git gmake clean install
(.venv)git-repos/psutil - [master●] » py psutil/tests/test_contracts.py
psutil.tests.test_contracts.TestAvailConstantsAPIs.test_PROCFS_PATH ... OK
psutil.tests.test_contracts.TestAvailConstantsAPIs.test_linux_ioprio_linux ... OK
psutil.tests.test_contracts.TestAvailConstantsAPIs.test_linux_ioprio_windows ... OK
psutil.tests.test_contracts.TestAvailConstantsAPIs.test_linux_rlimit ... OK
psutil.tests.test_contracts.TestAvailConstantsAPIs.test_win_priority ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_cpu_affinity ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_cpu_num ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_environ ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_gids ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_io_counters ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_ionice ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_memory_maps ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_num_fds ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_num_handles ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_rlimit ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_terminal ... OK
psutil.tests.test_contracts.TestAvailProcessAPIs.test_uids ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_battery ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_cpu_freq ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_sensors_fans ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_sensors_temperatures ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_win_service_get ... OK
psutil.tests.test_contracts.TestAvailSystemAPIs.test_win_service_iter ... OK
psutil.tests.test_contracts.TestFetchAllProcesses.test_all ... OK
psutil.tests.test_contracts.TestProcessWaitType.test_negative_signal ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_boot_time ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_cpu_count ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_cpu_freq ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_cpu_percent ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_cpu_times ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_cpu_times_percent ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_disk_io_counters ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_disk_partitions ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_net_connections ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_net_if_addrs ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_net_if_stats ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_net_io_counters ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_sensors_fans ... skipped: not supported
psutil.tests.test_contracts.TestSystemAPITypes.test_sensors_temperatures ... OK
psutil.tests.test_contracts.TestSystemAPITypes.test_users ... OK
----------------------------------------------------------------------
Ran 40 tests in 1.276s
OK (skipped=1)
SUCCESS
(.venv)git-repos/psutil - [master●] » py psutil/tests/test_process.py
----------------------------------------------------------------------
Ran 80 tests in 3.280s
OK (skipped=10)
SUCCESS
(.venv)git-repos/psutil - [master●] » py3 --version
Python 3.7.8
(.venv)git-repos/psutil - [master●] » uname -a
FreeBSD ikarus.fritz.box 12.1-RELEASE-p5 FreeBSD 12.1-RELEASE-p5 GENERIC amd64 Okay. So, where do we go from here? (-: To be continued... (the only thing which comes to my mind: We don't know the exact FreeBSD platform on Cirrus CI; actually they state in their docs it's some standard google cloud FreeBSD box, but actually I have no access to one to check; probably Cirrus CI is just confused because I've updated |
Hehe. Keeping psutil unit tests stable has always been like a never ending battle that I never really managed to win. There's always something that breaks occasionally. The "matrix" (different OSes, OS versions, CI services, python versions, etc.) is very complex, and 95% of the times I'm on Linux (my laptop) so I'm not sure how stable tests are on other platforms. With that said, for this specific PR feel free to ignore any failure which is unrelated with this change (memory, disk, etc...).
I'm pretty sure the test is legitimate (aka not broken). What is broken is the implementation. I think that's because this part is not correct: if (!envs) {
if (errno == EPERM)
py_err = AccessDenied("kvm_getenvv");
else
py_err = NoSuchProcess("kvm_getenvv");
goto error;
} I think a possible solution is what I suggested in my previous comment #1800 (comment). I didn't try it myself so I can be wrong (but I can try myself after pulling your branch if you're still in trouble - not sure when though). I think we're basically there though. |
7d1128e
to
2378208
Compare
You're surely right.
You know that I'm a great lover of *BSD. So:
Well, no, and ... yes ;-) If You're right, my code assumed that it always fits, because This actually indicates to the library that it should allocate whatever how much it takes (quoting Mario Draghi ;-)),
No, this is clearly stated in the man page. It states if the result does not fit into the allocated buffer, it will be truncated.
No, this is not true, because if ... and that's exactly what Cirrus CI shows with the latest pull request update. I will have to investigate later, now for some less serious work, but vital to earn some money... ;-) Thanks for all your feedback so far!
Actually the experience I reported clearly show how vital it is to do continuous testing on all supported platforms. |
64bc731
to
7aae715
Compare
945a7a8
to
68c65a8
Compare
Good evening .. well, what do you think? -> https://cirrus-ci.com/task/5900109814693888 |
Hey Armin. Mmm... since it seems we cannot avoid if (!envs) {
/* Map the most obvious stuff to general high-level exceptions exported by our "psutil" */
if (errno == EPERM)
AccessDenied("kvm_getenvv");
else if (errno == ESRCH)
NoSuchProcess("kvm_getenvv");
else if (errno == ENOMEM) {
psutil_debug("kvm_getenvv -> ENOMEM, turned into EPERM");
AccessDenied("kvm_getenvv -> ENOMEM, turned into EPERM");
}
else {
sprintf(errbuf, "kvm_getenvv(pid=%d)", pid);
PyErr_SetFromOSErrnoWithSyscall(errbuf);
}
goto error;
} |
5038c40
to
f761464
Compare
Hi Giampaolo, I've filed a bug report on FreeBSD (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248537), let's see if they confirm that there is a problem or how to handle the situation [**] (meanwhile I'm pretty sure there is; I also find stuff like » fgrep getenvv /var/log/messages
Jul 23 15:03:47 ikarus console-kit-daemon[1625]: WARNING: kvm_getenvv failed: in syslog files) I need to check the other *BSD platforms if they are still ok. The CI/CD builds/tests are all green now (well some macosx build seems to be flaky, but it's not related to my changes) [**] » cat clearenv.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int
main(int argc, char **argv, char **env)
{
fprintf(stderr, "pid=%d\n", getpid());
*env = NULL;
sleep(9999);
} » ./clearenv &
pid=40980 » python -c 'from psutil import Process; print(Process(40980).environ())'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ag/repos/git-repos/psutil/psutil/__init__.py", line 862, in environ
return self._proc.environ()
File "/home/ag/repos/git-repos/psutil/psutil/_psbsd.py", line 551, in wrapper
return fun(self, *args, **kwargs)
File "/home/ag/repos/git-repos/psutil/psutil/_psbsd.py", line 673, in environ
return cext.proc_environ(self.pid)
OSError: [Errno 0] No error: 0 (originated from kvm_getenvv(pid=40980)) » cat overflowenv.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
int
main(int argc, char **argv, char **env)
{
int i;
int arg_max = sysconf(_SC_ARG_MAX);
char *v = malloc(arg_max * 2);
assert(v != NULL);
for (i = 0; i < arg_max * 2; i++)
v[i] = 'z';
*env = v;
fprintf(stderr, "pid=%d\n", getpid());
sleep(9999);
}
» ./overflowenv &
pid=48755
» procstat -e 48755
PID COMM ENVIRONMENT
procstat: sysctl(kern.proc.env): Cannot allocate memory
48755 overflowenv - |
1e783a3
to
c090ca1
Compare
Hey Armin. Thanks for investigating this issue so thoroughly. I seem to remember there was a similar problem with "corrupted env" on other platforms, but I don't remember the exact details (or maybe it was the cmdline(), anyway, I remember a similar problem occurring when a process env or cmdline was changed after the process was started). AFAICT, given how thorny this issue appears to be, I think it would be fine to just raise |
430662a
to
986c5ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Armin. I added a couple more (minor) comments. The rest LGTM.
psutil/_psutil_bsd.c
Outdated
#include <sys/param.h> | ||
#include <sys/sysctl.h> | ||
#include <sys/user.h> | ||
#define PSUTIL_PROC_SKIP(p) (!((p)->ki_flag & P_INMEM) || ((p)->ki_flag & P_SYSTEM)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? If it is, I would rather call it PSUTIL_PROC_ENV_SKIP
psutil/_psutil_bsd.c
Outdated
@@ -83,6 +89,7 @@ | |||
#include <sys/file.h> | |||
#undef _KERNEL | |||
#include <sys/sched.h> // for CPUSTATES & CP_* | |||
#define PSUTIL_PROC_SKIP(p) ((p)->p_flag & P_SYSTEM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
psutil/_psutil_bsd.c
Outdated
@@ -93,6 +100,7 @@ | |||
#ifndef DTYPE_VNODE | |||
#define DTYPE_VNODE 1 | |||
#endif | |||
#define PSUTIL_PROC_SKIP(p) ((p)->p_stat == SZOMB) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
psutil/_psutil_bsd.c
Outdated
p = kvm_getproc2(kd, KERN_PROC_PID, pid, sizeof(*p), &cnt); | ||
#endif | ||
if (!p) { | ||
py_err = NoSuchProcess("kvm_getprocs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can avoid using py_err = ...
. Just Calling NSP/AD will set the exception.
psutil/_psutil_bsd.c
Outdated
* On NetBSD, we cannot call kvm_getenvv2() for a zombie process. | ||
* | ||
* To make unittest suite happy, return an empty environment. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor style complaint: could you please use //
comments and limit the length of the string to 80 chars?
psutil/_psutil_bsd.c
Outdated
Py_XDECREF(py_value); | ||
Py_XDECREF(py_retdict); | ||
kvm_close(kd); | ||
return py_err; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just return NULL;
and get rid of py_err
Are we adding support for FreeBSD 7 and/or other BSD platforms? (asking for the HISTORY.rst file entry) |
Tested-on: FreeBSD-13-CURRENT (r363681), FreeBSD-12.1, FreeBSD-11.4, FreeBSD-11.3, FreeBSD-10.4, FreeBSD-9.3, FreeBSD-8.4, FreeBSD-7.1, OpenBSD-6.7, OpenBSD-5.7, NetBSD-9.0, NetBSD-8.2
Well, the code compiles under FreeBSD 7.1, and: > uname -a
FreeBSD FreeBSD-7.1-RELEASE-amd64 7.1-RELEASE FreeBSD 7.1-RELEASE #0: Thu Jan 1 08:58:24 UTC 2009 [email protected]:/usr/obj/usr/src/sys/GENERIC amd64
> make PYTHON=python2.7 test
----------------------------------------------------------------------
Ran 559 tests in 7.601s
FAIL: psutil.tests.test_system.TestMiscAPIs.test_users
FAIL: psutil.tests.test_misc.TestMisc.test_serialization
FAIL: psutil.tests.test_process.TestProcess.test_long_cmdline
FAIL: psutil.tests.test_contracts.TestSystemAPITypes.test_users
FAIL: psutil.tests.test_posix.TestSystemAPIs.test_users
FAILED (failures=5, errors=30, skipped=241) For NetBSD and OpenBSD, I was able to run the test suite, and the |
Merged. Thanks a lot for the great work Armin. It was a good patch. If you want to fix other BSD-related stuff you're very welcome. ;-) BSD* definitively need more maintenance. |
Hi Giampaolo, thanks for merging! I'm still in evaluation for a CI runner and automated CI/CD test setup for the other *BSD family members. |
To be honest, having / maintaining extra CI for Open/NetBSD would be an additional effort which I'd prefer to avoid. I'd rather fix whatever is broken on those platforms and then leave them alone. The code for Open/NetBSD changes pretty rarely anyway. |
Hi, I've added support for FreeBSD almost two years ago, but forgot to create a pull request ;-)
While I've been there, I also added support for NetBSD and OpenBSD lately.
Also, there are some fixes to make the C extension compile again on older FreeBSD, and recent NetBSD versions.
This closes #893
Greetings from Munich,
Armin