Sometimes doesn't link when compiling cuda programs #6089

Closed
tie-pilot-qxw opened this issue Jan 17, 2025 · 10 comments

@tie-pilot-qxw

Xmake Version

v2.9.6+20241218

Operating System Version and Architecture

Ubuntu 20.04

Describe Bug

Sometimes, when compiling a CUDA program, xmake does not link it, and I have to run xmake again to get it to link.

Expected Behavior

xmake should link the program on the first run.

Project Configuration

target("bench-mont")
    set_languages(("c++17"))
    add_cugencodes("native")
    add_options("-lineinfo")
    add_options("--expt-relaxed-constexpr")
    add_files("mont/tests/bench.cu")

Additional Information and Error Logs

[ 50%]: compiling.release mont/tests/bench.cu
/usr/local/cuda/bin/nvcc -c -Xcompiler -fPIE -O3 -I/usr/local/cuda/include --std c++17 -m64 -rdc=true -ccbin=gcc-11 -gencode arch=compute_80,code=sm_80 -DNDEBUG -o build/.objs/bench-mont/linux/x86_64/release/mont/tests/bench.cu.o mont/tests/bench.cu
checking for the cuda linker (culd) ... nvcc
[ 75%]: devlinking.release bench-mont_gpucode.cu.o
/usr/local/cuda/bin/nvcc -o build/.objs/bench-mont/linux/x86_64/release/rules/cuda/devlink/bench-mont_gpucode.cu.o build/.objs/bench-mont/linux/x86_64/release/mont/tests/bench.cu.o -L/usr/local/cuda/lib64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -m64 -ccbin=gcc-11 -gencode arch=compute_80,code=sm_80 -dlink
checking for the linker (ld) ... g++
checking for flags (-Wl,-rpath=/usr/local/cuda/lib64) ... ok

g++ "-Wl,-rpath=/usr/local/cuda/lib64" "-m64" "-m64"
[ 75%]: linking.release bench-mont
/usr/bin/g++ -o build/linux/x86_64/release/bench-mont build/.objs/bench-mont/linux/x86_64/release/mont/tests/bench.cu.o build/.objs/bench-mont/linux/x86_64/release/rules/cuda/devlink/bench-mont_gpucode.cu.o -m64 -L/usr/local/cuda/lib64 -Wl,-rpath=/usr/local/cuda/lib64 -s -lcudadevrt -lcudart_static -lrt -lpthread -ldl

build cache stats:
cache directory: build/.build_cache
cache hit rate: 0%
cache hit: 0
cache hit total time: 0.000s
cache miss: 0
cache miss total time: 0.000s
new cached files: 0
remote cache hit: 0
remote new cached files: 0
preprocess failed: 0
compile fallback count: 0
compile total time: 0.000s

[100%]: build ok, spent 5.724s

@star-hengxing
Contributor

Try set_kind("binary")

@tie-pilot-qxw
Author

> Try set_kind("binary")

That didn't work. I tried your suggestion, but I still sometimes need to run xmake twice before it links.

@waruqi
Member

waruqi commented Jan 20, 2025

It works for me.

ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.594s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.425s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.437s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.423s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.487s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.503s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake -r
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 2.72s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/c

@tie-pilot-qxw
Author

tie-pilot-qxw commented Jan 21, 2025

I think there may be a difference between xmake -r ... and xmake build .... With the former, linking always happens, but with the latter the link step is sometimes skipped.

@waruqi
Member

waruqi commented Jan 21, 2025

This should be a problem with incremental compilation, although I can't reproduce it here.

ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake
checking for Cuda SDK directory ... /usr/local/cuda
[ 50%]: compiling.release src/main.cu
[ 75%]: devlinking.release cuda_console_gpucode.cu.o
[ 75%]: linking.release cuda_console
[100%]: build ok, spent 11.005s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake
[100%]: build ok, spent 0.058s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake
[100%]: build ok, spent 0.052s
ruki@73e655426012:/mnt/xmake/tests/projects/cuda/console$ xmake
[100%]: build ok, spent 0.08s

You can run xmake f --policies=diagnosis.check_build_deps to check it,

or debug this function:

function is_changed(dependinfo, opt)
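
For context, the sketch below illustrates the general kind of timestamp comparison that incremental builds rely on. It is only a generic illustration and not xmake's actual is_changed implementation; the helper name needs_rebuild and the file names are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Generic mtime-based staleness check (illustrative only; NOT xmake's code).
# needs_rebuild is a hypothetical helper: it succeeds (returns 0) when the
# target is missing or any dependency is strictly newer than the target.
needs_rebuild() {
    local target="$1"; shift
    [ -e "$target" ] || return 0
    local dep
    for dep in "$@"; do
        [ "$dep" -nt "$target" ] && return 0
    done
    return 1
}

workdir="$(mktemp -d)"

# Sources last modified at 00:00, object file produced at 00:01.
touch -t 202001010000 "$workdir/a.cu" "$workdir/a.cuh"
touch -t 202001010001 "$workdir/a.cu.o"

if needs_rebuild "$workdir/a.cu.o" "$workdir/a.cu" "$workdir/a.cuh"; then
    first="stale"
else
    first="fresh"
fi

# Simulate editing the header after the object file was built.
touch -t 202001010002 "$workdir/a.cuh"

if needs_rebuild "$workdir/a.cu.o" "$workdir/a.cu" "$workdir/a.cuh"; then
    second="stale"
else
    second="fresh"
fi

echo "before edit: $first, after edit: $second"
```

A check of this kind has to be applied consistently to every step (compile, device link, link); if one step's inputs are judged unchanged while an earlier step reran, that step is skipped until the next invocation.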

@tie-pilot-qxw
Author

To reproduce the problem, you could use the following file structure: a.cuh and a.cu, where a.cuh is included in a.cu. Make a trivial modification to a.cuh (e.g. add extra spaces), then rerun xmake build several times. Alternatively, use our project https://github.com/danzou1ge6/zk0.99c on the dev branch: run xmake build test-ntt-recompute, modify ntt/src/recompute_ntt.cuh, and run the build again until the problem occurs.

@waruqi
Member

waruqi commented Jan 22, 2025

I cannot build it.

[ 24%]: cache compiling.release ntt/src/inplace_transpose/common/gcd.cpp
[ 25%]: cache compiling.release runtime/tests/simple_json.cpp
[ 26%]: compiling.release ntt/tests/test-4step.cu
[ 26%]: compiling.release ntt/tests/test-big.cu
[ 26%]: compiling.release ntt/src/inplace_transpose/cuda/introspect.cu
[ 26%]: compiling.release ntt/src/inplace_transpose/common/reduced_math.cu
[ 26%]: compiling.release ntt/src/inplace_transpose/cuda/timer.cu
[ 26%]: compiling.release ntt/tests/test-int.cu
error: /usr/local/cuda/include/cuda/std/barrier:15:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up."
   15 | #  error "CUDA synchronization primitives are only supported for sm_70 and up."
      |    ^~~~~
In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/barrier:459,
                 from /usr/local/cuda/include/cuda/std/barrier:30:
/usr/local/cuda/include/cuda/std/__cuda/barrier.h:17:4: error: #error "CUDA synchronization primitives are only supported for sm_70 and up."
   17 | #  error "CUDA synchronization primitives are only supported for sm_70 and up."
      |    ^~~~~
  > in ntt/tests/test-big.cu
warning: failed to find cuda devices: cudaErrorInsufficientDriver (CUDA driver version is insufficient for CUDA runtime version)

Did you provide a minimal example project?

@tie-pilot-qxw
Author

Maybe you can try this, modifying a.cuh a little each time and rerunning xmake build:

// a.cuh
#pragma once
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void kernel(int a, int b) {
    printf("%d\n", a+b);
}

// a.cu
#include "a.cuh"

int main() {
    kernel<<<1, 1>>>(1, 2);
    cudaDeviceSynchronize();
    return 0;
}

// xmake.lua
target("test")
    add_files("a.cu")
    set_kind("binary")
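
The steps described above can be scripted roughly as follows. This is a sketch under assumptions: it uses touch rather than a real edit (as in the maintainer's test runs), it assumes xmake prints a line containing "]: linking." when it links (as in the logs in this thread), and the loop count of 5 is arbitrary. If xmake or nvcc is not installed, it only writes the project files.

```shell
#!/usr/bin/env bash
# Write the minimal project from this thread into a scratch directory.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

cat > a.cuh <<'EOF'
#pragma once
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void kernel(int a, int b) {
    printf("%d\n", a + b);
}
EOF

cat > a.cu <<'EOF'
#include "a.cuh"

int main() {
    kernel<<<1, 1>>>(1, 2);
    cudaDeviceSynchronize();
    return 0;
}
EOF

cat > xmake.lua <<'EOF'
target("test")
    set_kind("binary")
    add_files("a.cu")
EOF

missed=0
if command -v xmake >/dev/null 2>&1 && command -v nvcc >/dev/null 2>&1; then
    # Initial full build, then touch the header and rebuild a few times.
    xmake build >/dev/null 2>&1
    for i in 1 2 3 4 5; do
        sleep 1
        touch a.cuh
        # "devlinking.release" also contains "linking", so match the
        # "]: linking." prefix to detect only the final link step.
        if ! xmake build 2>&1 | grep -q ']: linking\.'; then
            missed=$((missed + 1))
        fi
    done
    echo "rebuilds without a link step: $missed of 5"
else
    echo "xmake or nvcc not found; project files written to $workdir"
fi
```

A non-zero count would correspond to the behaviour reported in this issue; the problem is intermittent, so a count of zero on one run does not rule it out.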

@waruqi
Member

waruqi commented Jan 22, 2025

Try this patch: #6104

xmake update -s dev
ruki@73e655426012:/tmp/testcu$ rm -rf build/; xmake; sleep 1; touch a.cuh; sleep 1; xmake    
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.147s
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.216s
ruki@73e655426012:/tmp/testcu$ rm -rf build/; xmake; sleep 1; touch a.cuh; sleep 1; xmake
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.292s
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.273s
ruki@73e655426012:/tmp/testcu$ rm -rf build/; xmake; sleep 1; touch a.cuh; sleep 1; xmake
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.24s
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.245s
ruki@73e655426012:/tmp/testcu$ touch a.cuh
ruki@73e655426012:/tmp/testcu$ xmake
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.227s
ruki@73e655426012:/tmp/testcu$ touch a.cuh
ruki@73e655426012:/tmp/testcu$ xmake
[ 50%]: compiling.release a.cu
[ 75%]: devlinking.release test_gpucode.cu.o
[ 75%]: linking.release test
[100%]: build ok, spent 1.241s

@waruqi waruqi added this to the v2.9.8 milestone Jan 22, 2025
waruqi added a commit that referenced this issue Jan 22, 2025
@tie-pilot-qxw
Author

Thanks! It worked.
