
PaddlePaddle source code walkthrough: parameter splitting logic #1994

Closed
jacquesqiao opened this issue May 3, 2017 · 3 comments

Comments

jacquesqiao (Member) commented May 3, 2017

Relevant flags

  • --parameter_block_size - Parameter block size on the parameter server. If unset, a suitable value is computed automatically.

    • Type: int32 (default: 0).
  • --parameter_block_size_for_sparse - Parameter block size for sparse updates on the parameter server. If unset, a suitable value is computed automatically.

    • Type: int32 (default: 0).

Relevant data structures

message ParameterBlock {
  // it accurately means parameter id
  required uint64 para_id = 1;
  // global sparse row id or dense block id for each block in the parameter
  required uint64 block_id = 2;
  // offset in (local) storage
  required uint64 begin_pos = 3;
  // actual size of the block; the last block's size is [endDim - beginDim],
  // the others' is parameter_block_size from ParameterConfig
  required uint64 block_size = 4;
}

parameter_block_size

Locally, parameters are handled in units of ParameterSegments.

struct ParameterSegments {
  std::string name;  // name of the parameter  
  size_t id;         // id of the parameter
};

Each ParameterSegments entry corresponds one-to-one with a parameter.

Where the parameter block splitting logic lives

The detailed computation happens in prepareSendData, shown below.
void ParameterClient2::sendAndReceiveParameter()
void ParameterClient2::prepareSendData()

Computing serverId:

nameHash: determined by the parameter name.
blockId: determined by parameter_block_size and the offset of the current block.
int serverId = std::abs((blockId + nameHash) % serviceNum_);

helinwang (Contributor) commented

@jacquesqiao In our mind, the parameters of a model consist of a number of tensors. A question: when we split the parameters into blocks, do we split the tensors themselves, or do we only group whole tensors into blocks?

jacquesqiao (Member, Author) commented May 4, 2017

prepareSendData in detail

Before calling sendAndReceiveParameter, prepareSendData must be called first. prepareSendData is responsible for splitting the parameter data into blocks and assigning each block to the appropriate pserver. Below is the process for a dense parameter:

real* buf =
    sendingPara ? parameter->getBuf(parameterType)->getPoint(0) : nullptr;
uint64_t endDim = 0;
for (uint64_t beginDim = 0; beginDim < paraSize; beginDim = endDim) {
  endDim = std::min<int64_t>(beginDim + blockSize, paraSize);
  int64_t blockId = beginDim / blockSize;
  int serverId = std::abs((blockId + nameHash) % serviceNum_);

  auto& request = sendJob->parallelRequests[serverId];
  ParameterBlock* block = request.add_blocks();
  block->set_para_id(segments.id);
  block->set_block_id(blockId);
  block->set_begin_pos(beginDim);
  block->set_block_size(endDim - beginDim);
  if (buf) {
    sendJob->parallelInputIovs[serverId].push_back(
        {buf + beginDim, sizeof(real) * ((size_t)(endDim - beginDim))});
  }
}

parallelInputIovs is where the actual splitting takes effect: the SendJob structure holds the block data in parallelInputIovs, and in the for loop above, beginDim and endDim control the offsets into buf, which implements the split.

struct SendJob {
  /// store parameters related blocks data
  InputIovs parallelInputIovs;
  /// store protobuf request
  SendRequest parallelRequests;
  /// store data, such as features for metric learning
  SendDataRequestVec parallelDataRequests;
};

The sendJob generated by prepareSendData is put into sendJobQueue_:

`sendJobQueue_[i]->enqueue(sendJob);`

and is eventually sent out via ProtoClient's send:

clients_[i].send("sendParameter",
                 recvJob->parallelRequests[i],
                 recvJob->parallelInputIovs[i]);

Conclusion:

Parameter splitting treats each parameter as a single flat buffer and cuts it into pieces of blockSize.

@jacquesqiao jacquesqiao changed the title from "paddle parameter splitting logic analysis" to "PaddlePaddle source code walkthrough: parameter splitting logic" May 4, 2017
jacquesqiao (Member, Author) commented May 4, 2017

How does the pserver organize/save/update parameters?

ParameterServer2::addGradient
