Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cmake] CUDA improved gencode (new attempt) #109

Merged
merged 12 commits into from
Aug 12, 2019
Merged

Conversation

griwodz
Copy link
Member

@griwodz griwodz commented Aug 2, 2019

Choose the CUDA Compute Capabilities that are compiled in one of three ways:
(1) default option: compile for all CCs that are supported by the current SDK (a string that will need future maintenance)
(2) compile only for the CC of the machine that runs the cmake script
(3) specify a list yourself on the command line, in the cache or in cmake-gui

In contrast to the older #95 , this PR uses the CUDA_SELECT_NVCC_ARCH_FLAGS command from FindCUDA.cmake. The manual list creation for step (1) could not be replaced by CUDA_SELECT_NVCC_ARCH_FLAGS(All) because that command does not support giving a lower limit for the supported CCs.

@griwodz griwodz force-pushed the betterCudaGencode branch from fb80ad8 to 5ccf4e7 Compare August 2, 2019 08:52
@simogasp
Copy link
Member

simogasp commented Aug 2, 2019

he manual list creation for step (1) could not be replaced by CUDA_SELECT_NVCC_ARCH_FLAGS(All) because that command does not consider the currently used SDK (i.e. CUDA_VERSION).

Is that so? from the code it seems to check the compatibility with the cuda version
https://github.com/Kitware/CMake/blob/master/Modules/FindCUDA/select_compute_arch.cmake

@griwodz
Copy link
Member Author

griwodz commented Aug 2, 2019

Well, when I specify "All", I get CUDA version below 3.5 because I cannot see how to specify a lower limit. When I specify a list of numbers myself, it goes straight to the end of that file and accepts everything I give it without checking for an upper limit.

If the code would separate the string that turns a version list into gencode strings, it would be feasible to use list(REMOVE_ITEM,...) between the steps, but it seems quite ugly to get rid of the CCs that we don't support after the list of gencode strings has been generated.

@simogasp simogasp added this to the v1.0.0 milestone Aug 2, 2019
@simogasp
Copy link
Member

simogasp commented Aug 2, 2019

Well, when I specify "All", I get CUDA version below 3.5 because I cannot see how to specify a lower limit. When I specify a list of numbers myself, it goes straight to the end of that file and accepts everything I give it without checking for an upper limit.

If the code would separate the string that turns a version list into gencode strings, it would be feasible to use list(REMOVE_ITEM,...) between the steps, but it seems quite ugly to get rid of the CCs that we don't support after the list of gencode strings has been generated.

Ok i didn't get that we have a lower bound for the CC.
Well, maybe removing at the end the undesired one is not a bad option, as we don't get to maintain the code with the list of supported architectures. (cons: of course we need to upgrade cmake each time they provide new versions for cuda then)
After all the code would boil down to something like

# the list of CCs not supported by CCTag
set((CCTAG_UNSUPPORTED_ARCHS 2.0 2.1 3.0 3.2)

if(CCTAG_CUDA_CC_CURRENT_ONLY)
    set(CCTAG_CUDA_CC_LIST_BASIC Auto)
else()
    set(CCTAG_CUDA_CC_LIST_BASIC All)
endif()

#get the flags from findCuda
CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS ${CCTAG_CUDA_CC_LIST_BASIC})

# remove the unsupported ones
foreach (ARCH ${(CCTAG_UNSUPPORTED_ARCHS})
    list(REMOVE_ITEM ARCH_FLAGS ${ARCH})
endforeach ()

# check there is at least one supported architecture
list(LENGTH ARCH_FLAGS NUM_ARCH)
if(${NUM_ARCH} EQUAL 0)
    message(SEND_ERROR "No supported architectures for the current hardware")
endif()

LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})

But I don't know, the pros/cons here is code maintainability Vs relying always on the most recent cmake version (at least the one with the latest cuda updates)

CMakeLists.txt Outdated
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS};-gencode=arch=compute_61,code=sm_61")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS};-gencode=arch=compute_62,code=sm_62")
set(CCTAG_CUDA_CC_LIST_INIT0 3.5 3.7 5.0 5.2 5.3)
if( ( CUDA_VERSION VERSION_EQUAL "9.0" ) OR ( CUDA_VERSION VERSION_GREATER "9.0" ) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are now using version 3.13 u can use VERSION_GREATER_EQUAL
https://cmake.org/cmake/help/v3.13/command/if.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@griwodz
Copy link
Member Author

griwodz commented Aug 2, 2019

foreach (ARCH ${(CCTAG_UNSUPPORTED_ARCHS})
    list(REMOVE_ITEM ARCH_FLAGS ${ARCH})
endforeach ()

If we could it like this I would have chosen the option. But the ARCH_FLAGS results has the form -gencode arch=compute_20,code=compute_20. It would be a little bit annoying to generate the arch= string for each unsupported arch, but doable. However, I have not the slightest idea how to remove the -gencode strings - because according to the list() semantics of CMake, they are separate items. How to find and delete the ones that belong to the arch= items that we have removed?

@griwodz griwodz force-pushed the betterCudaGencode branch from 3c7439e to cd19124 Compare August 5, 2019 07:36
@griwodz
Copy link
Member Author

griwodz commented Aug 5, 2019

I don't understand what's wrong with the CUDA 8.0 version of Travis. I've just installed CUDA 8.0 and a matching OpenCV version and that works without problems. CudaGetErrorString is a core CUDA function that has been there forever, it makes no sense that this fails, but I wanted to confirm that the error isn't really in the code.

And I simply don't understand AppVeyor's problem.

@griwodz griwodz force-pushed the betterCudaGencode branch from cb2f72a to cd19124 Compare August 5, 2019 09:49
Copy link
Member

@simogasp simogasp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could it like this I would have chosen the option. But the ARCH_FLAGS results has the form -gencode arch=compute_20,code=compute_20. It would be a little bit annoying to generate the arch= string for each unsupported arch, but doable. However, I have not the slightest idea how to remove the -gencode strings - because according to the list() semantics of CMake, they are separate items. How to find and delete the ones that belong to the arch= items that we have removed?

Ah I thought it was a simpler list. So let's keep it that way. Sometimes CMake is really disappointing...

@simogasp
Copy link
Member

simogasp commented Aug 5, 2019

I don't understand what's wrong with the CUDA 8.0 version of Travis. I've just installed CUDA 8.0 and a matching OpenCV version and that works without problems. CudaGetErrorString is a core CUDA function that has been there forever, it makes no sense that this fails, but I wanted to confirm that the error isn't really in the code.

@griwodz could it be this
https://gitlab.kitware.com/cmake/cmake/issues/17185
i.e. a problem with sm53? Indeed, if i'm not wrong we were not using 53 before this PR
https://github.com/alicevision/CCTag/pull/109/files#diff-af3b638bc2a3e6c650974192a53c7291L133

@griwodz
Copy link
Member Author

griwodz commented Aug 5, 2019

could it be this
https://gitlab.kitware.com/cmake/cmake/issues/17185
i.e. a problem with sm53?

That makes a lot of sense! 3.2, 5.3, 6.2 and 7.2 are only for Tegras, which may not have the same library functions. I remove those CC.

@griwodz
Copy link
Member Author

griwodz commented Aug 5, 2019

vcpkg fails for both AppVeyor checks and the reason seems to a version problem between vcpkg and the available packages. I've googled and have pushed an appveyor.yml. It contains in comments an upgrade command that I've found on two websites. It would have to be uncommented once, and removed afterwards.
@simogasp
Does that seem like the right thing to do?

@simogasp
Copy link
Member

simogasp commented Aug 5, 2019

I have no idea, I don't understand why it fails. I tried those commands in my fork
https://ci.appveyor.com/project/simogasp/cctag
and even worse it does not manage to compile the latest version of vcpkg.
Normally appveyor ships a given version of the code of vcpkg, maybe it's not a good idea to mess with that.

@griwodz
Copy link
Member Author

griwodz commented Aug 5, 2019

I have no idea, I don't understand why it fails. I tried those commands in my fork
https://ci.appveyor.com/project/simogasp/cctag
and even worse it does not manage to compile the latest version of vcpkg.
Normally appveyor ships a given version of the code of vcpkg, maybe it's not a good idea to mess with that.

Oh well, that does probably mean that this PR is going to hang in this state because it seems to be fine otherwise.

@griwodz
Copy link
Member Author

griwodz commented Aug 7, 2019

@fabiencastan AppVeyor in CCTag complains that vcpkg must be updated. Do you know if it safe to do that with those commands that I added as a comment in the appveyor.yml in this PR #109 ? Or in some other way?

@simogasp
Copy link
Member

simogasp commented Aug 8, 2019

LGTM, same here, if someone manages to build on Windoze just to be sure, then we can merge.

@griwodz griwodz merged commit 027c3b3 into develop Aug 12, 2019
@griwodz griwodz deleted the betterCudaGencode branch August 12, 2019 06:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants