[Dy2Stat] Refine PartialProgramLayer logic #33796

Aurelius84 · 2021-06-28T02:47:36Z

Others


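Optimized the fake-var logic for valid_vars: fake vars are now forced to be empty force_cpu tensors, which avoids extra CUDA memcpy sync calls (these block kernel launch).
Added a stop_gradient check on the Tensors passed to the entry function, reducing grad_op computation (e.g. conv2d_grad_op).
Optimized creation of temp_scope_var: it is now created once in __init__, reducing forward overhead.
Optimized the valid_vars logic for grad_var.
Removed the inheritance from nn.Layer; the layer is now invoked directly through __call__.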
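
For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of three of the ideas above: a plain callable with no nn.Layer base class, a scratch container created once in __init__, and a stop_gradient check on the inputs. It is not the code from this PR and it does not use Paddle; `FakeTensor`, `PartialProgramSketch`, `_tmp_scope_vec` and `scope_creations` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a framework tensor (not a Paddle class)."""
    name: str
    stop_gradient: bool = True


class PartialProgramSketch:
    """Plain callable: no nn.Layer base class, invoked through __call__."""

    def __init__(self):
        # Scratch container built once here instead of on every forward call.
        self._tmp_scope_vec = []
        self.scope_creations = 1

    def __call__(self, inputs):
        # Reuse the cached scratch container; only reset its contents.
        self._tmp_scope_vec.clear()
        # Only inputs with stop_gradient=False need a gradient computed.
        needs_grad = [t.name for t in inputs if not t.stop_gradient]
        self._tmp_scope_vec.append(needs_grad)
        return needs_grad


layer = PartialProgramSketch()
x = FakeTensor("x", stop_gradient=True)   # e.g. an input image
w = FakeTensor("w", stop_gradient=False)  # e.g. a trainable weight
first = layer([x, w])
second = layer([x, w])
print(first, layer.scope_creations)
```

Calling the object twice reuses the same scratch container, and only `w` is reported as needing a gradient. In a real framework the saving would come from skipping the corresponding grad ops, which this sketch does not model.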

paddle-bot-old · 2021-06-28T02:47:40Z

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

zhhsplendid

LGTM

refine temp_scope_vec logic

d67c56e

Aurelius84 added 5 commits June 28, 2021 06:29

polish partial_program

fc5f24e

fix fake var

bfa5913

add stop_gradient in spec

392d0bb

fix fake_var

d7d945b

fix unittest

961ef87

Aurelius84 requested a review from zhhsplendid June 29, 2021 12:30

Aurelius84 changed the title ~~[Dy2Stat] Refine temp_scope_vec logic~~ [Dy2Stat] Refine PartialProgramLayer logic Jun 29, 2021

zhhsplendid approved these changes Jun 30, 2021


Aurelius84 merged commit 97f86d8 into PaddlePaddle:develop Jun 30, 2021

Aurelius84 mentioned this pull request Jul 12, 2021

Upgrade Executor into ParallelExcutor to apply Graph Optimization in @to_static #32283

Merged
