PaddlePaddle source code analysis: Parameter splitting logic #1994
@jacquesqiao In our minds, a model's parameters consist of a set of tensors. The question is: when we split the parameters into blocks, do we split the tensors themselves, or do we just group whole tensors into blocks?
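prepare data in detail
Before sendAndReceiveParameter is called, prepareSendData must be called first. prepareSendData splits the parameter data and assigns each piece to the appropriate pserver. The process for a dense parameter is:
real* buf =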
    sendingPara ? parameter->getBuf(parameterType)->getPoint(0) : nullptr;
uint64_t endDim = 0;
for (uint64_t beginDim = 0; beginDim < paraSize; beginDim = endDim) {
  endDim = std::min<int64_t>(beginDim + blockSize, paraSize);
  int64_t blockId = beginDim / blockSize;
  int serverId = std::abs((blockId + nameHash) % serviceNum_);
  auto& request = sendJob->parallelRequests[serverId];
  ParameterBlock* block = request.add_blocks();
  block->set_para_id(segments.id);
  block->set_block_id(blockId);
  block->set_begin_pos(beginDim);
  block->set_block_size(endDim - beginDim);
  if (buf) {
    sendJob->parallelInputIovs[serverId].push_back(
        {buf + beginDim, sizeof(real) * ((size_t)(endDim - beginDim))});
  }
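}
The split itself happens at the final parallelInputIovs push_back. The SendJob data structure holds the actual data in parallelInputIovs; in the for loop above, beginDim and endDim select the address range within buf, and that is how the buffer is split.
struct SendJob {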
  /// store parameters related blocks data
  InputIovs parallelInputIovs;
  /// store protobuf request
  SendRequest parallelRequests;
  /// store data, such as features for metric learning
  SendDataRequestVec parallelDataRequests;
};
The sendJob produced by prepareSendData is put into sendJobQueue_:
`sendJobQueue_[i]->enqueue(sendJob);`
It is eventually sent out through ProtoClient's send:
clients_[i].send("sendParameter",
                 recvJob->parallelRequests[i],
                 recvJob->parallelInputIovs[i]);
Conclusion: parameter splitting treats a parameter as a single buf and cuts it into pieces of blockSize elements.
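The splitting loop above can be reproduced in isolation. The following is a minimal sketch, not the PaddlePaddle implementation: `BlockView` and `splitIntoBlocks` are hypothetical names introduced for illustration, and `float` stands in for `real`. It shows that each block is only a view (offset plus length) into the original buffer, so no data is copied during the split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one block of a flat parameter buffer.
struct BlockView {
  int64_t blockId;    // beginPos / blockSize, as in the loop above
  uint64_t beginPos;  // offset of the block's first element in the buffer
  uint64_t size;      // number of elements in this block
  const float* data;  // points into the original buffer, nothing is copied
};

// Split a flat buffer of paraSize elements into blocks of at most
// blockSize elements. Only the last block may be shorter.
std::vector<BlockView> splitIntoBlocks(const float* buf,
                                       uint64_t paraSize,
                                       uint64_t blockSize) {
  assert(blockSize > 0);
  std::vector<BlockView> blocks;
  uint64_t endDim = 0;
  for (uint64_t beginDim = 0; beginDim < paraSize; beginDim = endDim) {
    endDim = std::min<uint64_t>(beginDim + blockSize, paraSize);
    blocks.push_back({static_cast<int64_t>(beginDim / blockSize),
                      beginDim,
                      endDim - beginDim,
                      buf + beginDim});
  }
  return blocks;
}
```

For a buffer of 10 elements and a block size of 4, this yields three blocks covering the element ranges [0, 4), [4, 8) and [8, 10). Tensor boundaries play no role in the split, which is consistent with the conclusion above.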
How the pserver organizes/saves/updates parameters
ParameterServer2::addGradient
Related parameters
--parameter_block_size
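- The block size used to split parameters on the parameter server. If unset, a suitable value is computed automatically.
--parameter_block_size_for_sparse
- The block size used to split parameters for sparse updates on the parameter server. If unset, a suitable value is computed automatically.
Related data structures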
parameter_block_size
Locally, the unit is ParameterSegments.
ParameterSegments and parameters correspond one to one.
Code location of the parameter block splitting logic
The detailed computation is in prepareSendData, listed below.
void ParameterClient2::sendAndReceiveParameter()
void ParameterClient2::prepareSendData()
Computing serverId:
nameHash: determined by the parameter name.
blockId: determined by parameter_block_size and the offset of the block currently being processed.
int serverId = std::abs((blockId + nameHash) % serviceNum_);
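To see what this formula does to block placement, here is a small sketch under stated assumptions: `serverIdFor` and `blocksPerServer` are hypothetical helpers written for this illustration, and `nameHash` is assumed to be non-negative. The formula itself is copied from the line above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Same formula as in the source line above; the function name is hypothetical.
int serverIdFor(int64_t blockId, int64_t nameHash, int serviceNum) {
  assert(serviceNum > 0);
  return static_cast<int>(std::abs((blockId + nameHash) % serviceNum));
}

// Count how many of the blocks 0..numBlocks-1 of one parameter
// land on each server.
std::vector<int> blocksPerServer(int64_t numBlocks,
                                 int64_t nameHash,
                                 int serviceNum) {
  std::vector<int> counts(serviceNum, 0);
  for (int64_t blockId = 0; blockId < numBlocks; ++blockId) {
    ++counts[serverIdFor(blockId, nameHash, serviceNum)];
  }
  return counts;
}
```

With a non-negative nameHash, consecutive blocks of one parameter go to consecutive servers in round-robin order, and nameHash only shifts which server receives block 0. For example, 8 blocks with nameHash 5 across 4 servers put 2 blocks on each server. If blockId + nameHash is negative, std::abs folds the negative remainder back into range, so the result is still a valid server index but the order is no longer a plain round robin.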