How to reduce GPU initialization time #318

Open
keke444 opened this issue Nov 27, 2024 · 1 comment
keke444 commented Nov 27, 2024

Hi,
I used benchmark to measure the same model's latency on CPU and on GPU. The logs are as follows:

###########################################################################
/benchmark_mind --modelFile=ai-mfnr_v2.ms --enableFp16=true --device=CPU <
ModelPath = ai-mfnr_v2.ms
ModelType = MindIR
InDataPath =
GroupInfoFile =
ConfigFilePath =
InDataType = bin
LoopCount = 10
DeviceType = CPU
AccuracyThreshold = 0.5
CosineDistanceThreshold = -1.1
WarmUpLoopCount = 3
NumThreads = 2
InterOpParallelNum = 1
Fp16Priority = 1
EnableParallel = 0
calibDataPath =
EnableGLTexture = 0
cpuBindMode = HIGHER_CPU
CalibDataType = FLOAT
start unified benchmark run
PrepareTime = 40.809 ms
Running warm up loops...
Running benchmark loops...
Model = ai-mfnr_v2.ms, NumThreads = 2, MinRunTime = 377.393005 ms, MaxRuntime = 384.092987 ms, AvgRunTime = 380.451996 ms
Run Benchmark ai-mfnr_v2.ms Success.

##########################################################################
/benchmark_mind --modelFile=ai-mfnr_v2.ms --enableFp16=true --device=GPU <
ModelPath = ai-mfnr_v2.ms
ModelType = MindIR
InDataPath =
GroupInfoFile =
ConfigFilePath =
InDataType = bin
LoopCount = 10
DeviceType = GPU
AccuracyThreshold = 0.5
CosineDistanceThreshold = -1.1
WarmUpLoopCount = 3
NumThreads = 2
InterOpParallelNum = 1
Fp16Priority = 1
EnableParallel = 0
calibDataPath =
EnableGLTexture = 0
cpuBindMode = HIGHER_CPU
CalibDataType = FLOAT
start unified benchmark run
PrepareTime = 2743.9 ms
Running warm up loops...
Running benchmark loops...
Model = ai-mfnr_v2.ms, NumThreads = 2, MinRunTime = 630.763977 ms, MaxRuntime = 632.585999 ms, AvgRunTime = 631.505981 ms
Run Benchmark ai-mfnr_v2.ms Success.
##################################################################################

PrepareTime on the CPU is only 40.8 ms, but on the GPU it is 2743.9 ms (about 2.7 s).

Is there any way to reduce the GPU initialization time?

Thanks

@zhouyifeng888

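Which platform is this GPU on? If it is an NVIDIA GPU on a server, you could try dynamic graph mode, which starts faster because it does not have to compile the whole graph up front. If you are calling mslite on an edge device, a phone, or some other non-NVIDIA GPU, there does not seem to be a good solution at the moment, and offline inference has no notion of a dynamic graph either.
That said, in any environment the slow initialization only happens the first time the model is loaded. The model can therefore generally be initialized when the application starts, and later inference calls while the application is running return quickly, so for typical application requirements it should not affect the user experience.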
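
The "initialize at application startup" suggestion above can be sketched roughly as below. This is only an illustration under stated assumptions, not MindSpore Lite code: `load_model_stub` and `PreloadedModel` are hypothetical names, and the stub merely sleeps to imitate a slow prepare step. In a real application the `loader` argument would be replaced by whatever call actually builds the model for the GPU.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_model_stub(path, prepare_seconds=0.05):
    """Hypothetical stand-in for a slow model build (e.g. GPU prepare time)."""
    time.sleep(prepare_seconds)  # imitates the one-off initialization cost

    def predict(inputs):
        # Dummy "inference": doubles every input value.
        return [v * 2 for v in inputs]

    return predict


class PreloadedModel:
    """Starts loading the model in a background thread as soon as it is created."""

    def __init__(self, path, loader=load_model_stub):
        executor = ThreadPoolExecutor(max_workers=1)
        # Kick off the slow build immediately, without blocking the caller.
        self._future = executor.submit(loader, path)
        # The submitted task still runs; the worker thread exits when it is done.
        executor.shutdown(wait=False)

    def predict(self, inputs):
        # Blocks only if the background build has not finished yet;
        # afterwards every call reuses the same already-built model.
        model = self._future.result()
        return model(inputs)


if __name__ == "__main__":
    model = PreloadedModel("ai-mfnr_v2.ms")  # created once at application startup
    # ... other startup work can happen here while the model is being built ...
    print(model.predict([1, 2, 3]))
```

Creating the wrapper once at startup means the build cost is paid a single time and can overlap with other startup work. It does not make the build itself any faster.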
