Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix cpu count #2211

Closed
wants to merge 1 commit into from
Closed

fix cpu count #2211

wants to merge 1 commit into from

Conversation

NanoNabla
Copy link
Contributor

@NanoNabla NanoNabla commented May 12, 2022

Using os.cpu_count()) can result too high numbers when using cluster management software e.g. SLURM.
Therefore, more processes than available cpus are started which result in low performance.

Besides cluster management software environments with linux standard taskset are also affected.

I replaced os.cpu_count() with len(psutil.Process().cpu_affinity()) which sould return a real value of available processors.

An example is

taskset -c 0 ./main.py
# main.py
import os
import psutil
print(os.cpu_count())
print(len(os.sched_getaffinity(0)))
print(len(psutil.Process().cpu_affinity()))

More information about this issue can be found here

@CLAassistant
Copy link

CLAassistant commented May 12, 2022

CLA assistant check
All committers have signed the CLA.

@wasertech
Copy link
Collaborator

I have introduced this function with #2249

def available_cpu_count():
    """Number of available virtual or physical CPUs on this system, i.e.
    user/real as output by time(1) when called with an optimally scaling
    userspace-only program
    See this https://stackoverflow.com/a/1006301/13561390"""

    # cpuset
    # cpuset may restrict the number of *available* processors
    try:
        m = re.search(r"(?m)^Cpus_allowed:\s*(.*)$", open("/proc/self/status").read())
        if m:
            res = bin(int(m.group(1).replace(",", ""), 16)).count("1")
            if res > 0:
                return res
    except IOError:
        pass

    # Python 2.6+
    try:
        import multiprocessing

        return multiprocessing.cpu_count()
    except (ImportError, NotImplementedError):
        pass

    # https://github.com/giampaolo/psutil
    try:
        import psutil

        return psutil.cpu_count()  # psutil.NUM_CPUS on old versions
    except (ImportError, AttributeError):
        pass

    # POSIX
    try:
        res = int(os.sysconf("SC_NPROCESSORS_ONLN"))

        if res > 0:
            return res
    except (AttributeError, ValueError):
        pass

    # Windows
    try:
        res = int(os.environ["NUMBER_OF_PROCESSORS"])

        if res > 0:
            return res
    except (KeyError, ValueError):
        pass

    # jython
    try:
        from java.lang import Runtime

        runtime = Runtime.getRuntime()
        res = runtime.availableProcessors()
        if res > 0:
            return res
    except ImportError:
        pass

    # BSD
    try:
        sysctl = subprocess.Popen(["sysctl", "-n", "hw.ncpu"], stdout=subprocess.PIPE)
        scStdout = sysctl.communicate()[0]
        res = int(scStdout)

        if res > 0:
            return res
    except (OSError, ValueError):
        pass

    # Linux
    try:
        res = open("/proc/cpuinfo").read().count("processor\t:")

        if res > 0:
            return res
    except IOError:
        pass

    # Solaris
    try:
        pseudoDevices = os.listdir("/devices/pseudo/")
        res = 0
        for pd in pseudoDevices:
            if re.match(r"^cpuid@[0-9]+$", pd):
                res += 1

        if res > 0:
            return res
    except OSError:
        pass

    # Other UNIXes (heuristic)
    try:
        try:
            dmesg = open("/var/run/dmesg.boot").read()
        except IOError:
            dmesgProcess = subprocess.Popen(["dmesg"], stdout=subprocess.PIPE)
            dmesg = dmesgProcess.communicate()[0]

        res = 0
        while "\ncpu" + str(res) + ":" in dmesg:
            res += 1

        if res > 0:
            return res
    except OSError:
        pass

    raise Exception("Can not determine number of CPUs on this system")

Maybe we could use it stt-wide?

@NanoNabla
Copy link
Contributor Author

Since your code seems to work with taskset on Linux I'm fine with using it stt-wide. But since this PR seems to be of no interest I do not have time to take any effort into this anymore.

@wasertech
Copy link
Collaborator

wasertech commented Jul 5, 2022

since this PR seems to be of no interest I do not have time to take any effort into this anymore.

That's because your code didn't pass the lint pre-commit checks, my guess is that you didn't hooked pre-commit correctly like described in the Code Linting section of CONTRIBUTING.rst.

I'll try to take some time to make it happen then.

@wasertech
Copy link
Collaborator

#2253 should do nicely.

@wasertech wasertech mentioned this pull request Jul 5, 2022
@wasertech
Copy link
Collaborator

We can probably also close this PR in favor of #2253

@NanoNabla NanoNabla closed this Jul 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants