This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Proposal: Dragonfly supports seed pattern to solve massive proxy requests which may cause huge pressure on rss or supernode. #1284

Open
wangforthinker opened this issue Apr 15, 2020 · 0 comments
Labels
kind/proposal any proposals for project

Comments

wangforthinker commented Apr 15, 2020

Background

Currently, Dragonfly clients must communicate with the supernode when proxying every request. When many machines in a cluster request a large file, the supernode can therefore become the bottleneck that limits further performance improvement.

In other words, every proxy request costs at least four RTTs: register, peer wait, getPeerInfo, and the proxied request to the target peer. The first three can be considered extra RTTs. In our case there is a demand for low latency, and a request may be a small range request, so the cost of the extra RTTs cannot be ignored.

A solution is therefore needed to reduce the extra RTTs and the pressure on the supernode.

Idea

The seed pattern makes communication with the supernode asynchronous: a peer fetches the P2P network info from the supernode asynchronously and then schedules the target peer by itself to get the data.

A seed represents a file, which is always identified by its taskUrl. A peer can be selected as a seed, in which case it provides the seed file for other peers to download. Only the seed node may request the resource data from rss, which also reduces the pressure on rss.
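The self-scheduling idea can be sketched as a local lookup table from taskUrl to seed peers. This is only a minimal illustration under assumptions: the names `SeedTable`, `Update`, `Pick`, and `ErrNoSeed`, the example addresses, and the round-robin choice are all hypothetical and do not come from the Dragonfly code base.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNoSeed is returned when no seed peer is known locally for a taskUrl.
var ErrNoSeed = errors.New("no seed known for task url")

// SeedTable is a hypothetical local cache of seed locations keyed by taskUrl.
// It would be filled asynchronously from the supernode, not on the request path.
type SeedTable struct {
	mu    sync.Mutex
	seeds map[string][]string // taskUrl -> addresses of seed peers
	next  map[string]int      // taskUrl -> round-robin cursor
}

// NewSeedTable returns an empty table.
func NewSeedTable() *SeedTable {
	return &SeedTable{seeds: map[string][]string{}, next: map[string]int{}}
}

// Update replaces the known seed peers for taskURL and resets the cursor.
func (t *SeedTable) Update(taskURL string, peers []string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.seeds[taskURL] = append([]string(nil), peers...)
	t.next[taskURL] = 0
}

// Pick chooses a seed peer for taskURL using only local state, so a proxy
// request does not need a round trip to the supernode.
func (t *SeedTable) Pick(taskURL string) (string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	peers := t.seeds[taskURL]
	if len(peers) == 0 {
		return "", ErrNoSeed
	}
	i := t.next[taskURL] % len(peers)
	t.next[taskURL] = i + 1
	return peers[i], nil
}
```

Round robin is used here only because it is simple; the proposal does not specify how a peer chooses among several seeds.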

Features

  • a long-lived peer server in dfdaemon.
  • the control stream between peer and supernode is asynchronous.
  • the supernode selects the seed nodes.
  • a peer can be selected as a seed for several files, each identified by a different URL.
  • a non-seed peer downloads byte ranges of a file from peer nodes.
  • a seed allows concurrent download requests even if the seed file is not yet cached.
  • the seed info of a peer persists even if the peer restarts.
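As one possible way to keep seed info across peer restarts, a peer could write its seed records to a local file and reload them on startup. The sketch below assumes a JSON file and a hypothetical `SeedRecord` layout; the proposal does not say how or where the seed info is stored.

```go
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// SeedRecord is hypothetical persisted metadata for one seed held by this peer.
type SeedRecord struct {
	TaskURL  string `json:"taskUrl"`
	FilePath string `json:"filePath"`
	Size     int64  `json:"size"`
}

// SaveSeeds writes the records to path by first writing a temporary file in
// the same directory and then renaming it over the target, so a crash during
// the write does not leave a truncated state file behind.
func SaveSeeds(path string, seeds []SeedRecord) error {
	data, err := json.Marshal(seeds)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), "seeds-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// LoadSeeds reads the records back. A missing file means "no seeds yet".
func LoadSeeds(path string) ([]SeedRecord, error) {
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var seeds []SeedRecord
	if err := json.Unmarshal(data, &seeds); err != nil {
		return nil, err
	}
	return seeds, nil
}
```

After a restart, the peer would call `LoadSeeds` and report the result to the supernode again.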

Architecture Diagram

This diagram describes the control flow and part of the data flow.

image

  • The communication between supernode and peer is asynchronous and consists of HeartBeat, Report, and Fetch.
  • HeartBeat reports the peer's basic info to the supernode every several seconds, so the supernode can track the liveness of peers.
  • Report allows a peer to report its local seed info to the supernode.
  • Fetch allows a peer to fetch the seed info of the P2P network, so the peer knows where a seed file is located and can get the data from the seed node.
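The three asynchronous calls can be sketched as a background loop that keeps a local view of the network up to date, while the request path only reads that local view. Everything here is an assumption made for illustration: the `Supernode` interface, its method signatures, the `Peer` type, and the choice to run all three calls in the same round are hypothetical, and `fakeSupernode` is just an in-memory stand-in.

```go
package main

import (
	"sync"
	"time"
)

// Supernode is a hypothetical client interface for the three async calls.
type Supernode interface {
	HeartBeat(peerID string) error
	Report(peerID string, localSeeds []string) error
	Fetch() (map[string][]string, error) // taskUrl -> seed peer addresses
}

// Peer keeps a locally cached view of the seed info of the P2P network.
type Peer struct {
	ID         string
	LocalSeeds []string // taskUrls this peer is a seed for

	mu      sync.RWMutex
	network map[string][]string
}

// SyncOnce performs one round of HeartBeat, Report, and Fetch.
func (p *Peer) SyncOnce(sn Supernode) error {
	if err := sn.HeartBeat(p.ID); err != nil {
		return err
	}
	if err := sn.Report(p.ID, p.LocalSeeds); err != nil {
		return err
	}
	info, err := sn.Fetch()
	if err != nil {
		return err
	}
	p.mu.Lock()
	p.network = info
	p.mu.Unlock()
	return nil
}

// Run syncs immediately and then once per interval until stop is closed.
// On error the previous cached view is kept and the next tick retries.
func (p *Peer) Run(sn Supernode, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		_ = p.SyncOnce(sn)
		select {
		case <-ticker.C:
		case <-stop:
			return
		}
	}
}

// SeedsFor is the request path: it reads the cache and never calls the supernode.
func (p *Peer) SeedsFor(taskURL string) []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.network[taskURL]
}

// fakeSupernode is an in-memory stand-in used only to exercise the sketch.
type fakeSupernode struct {
	heartbeats int
	reported   map[string][]string
	seeds      map[string][]string
}

func (f *fakeSupernode) HeartBeat(peerID string) error { f.heartbeats++; return nil }

func (f *fakeSupernode) Report(peerID string, localSeeds []string) error {
	if f.reported == nil {
		f.reported = map[string][]string{}
	}
	f.reported[peerID] = localSeeds
	return nil
}

func (f *fakeSupernode) Fetch() (map[string][]string, error) { return f.seeds, nil }
```

In a real daemon, `Run` would be started in its own goroutine (for example `go peer.Run(sn, 5*time.Second, stop)`, where the interval is an arbitrary example value), and HeartBeat, Report, and Fetch might well run on different schedules.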

Seed Data Flow

This diagram describes how the seed node provides the data services.

image

  • The downloader requests the data of the seed file from rss.
  • The uploader serves the seed file data for other peers to download.
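One way the downloader and uploader could cooperate, so that range requests are accepted before the seed file is cached, is a shared buffer with a condition variable. This is a simplified sketch and not the proposed implementation: `SeedFile` and its methods are hypothetical names, the data is kept in memory, and the downloader is assumed to fetch the file sequentially from the start.

```go
package main

import (
	"errors"
	"sync"
)

// ErrOutOfRange is returned when a requested range lies beyond the file end.
var ErrOutOfRange = errors.New("range is beyond the end of the seed file")

// SeedFile is a hypothetical in-memory seed file that the downloader fills
// while the uploader concurrently serves byte ranges from it.
type SeedFile struct {
	mu   sync.Mutex
	cond *sync.Cond
	data []byte
	done bool // true once the downloader has fetched the whole file
}

// NewSeedFile returns an empty seed file.
func NewSeedFile() *SeedFile {
	f := &SeedFile{}
	f.cond = sync.NewCond(&f.mu)
	return f
}

// Append is called by the downloader as bytes arrive from rss.
func (f *SeedFile) Append(chunk []byte) {
	f.mu.Lock()
	f.data = append(f.data, chunk...)
	f.mu.Unlock()
	f.cond.Broadcast() // wake readers waiting for more data
}

// Finish marks the download as complete.
func (f *SeedFile) Finish() {
	f.mu.Lock()
	f.done = true
	f.mu.Unlock()
	f.cond.Broadcast()
}

// ReadRange is called by the uploader. It blocks until bytes [off, off+n)
// have been downloaded, or fails if the download ends before reaching them.
func (f *SeedFile) ReadRange(off, n int) ([]byte, error) {
	if off < 0 || n < 0 {
		return nil, ErrOutOfRange
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	for len(f.data) < off+n && !f.done {
		f.cond.Wait()
	}
	if len(f.data) < off+n {
		return nil, ErrOutOfRange
	}
	out := make([]byte, n)
	copy(out, f.data[off:off+n])
	return out, nil
}
```

Several uploader goroutines can call `ReadRange` at the same time; each returns as soon as its own range is available, without waiting for the whole file.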