Skip to content
This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Pull image again encounter 502 error #702

Closed
cd1989 opened this issue Jul 22, 2019 · 11 comments
Closed

Pull image again encounter 502 error #702

cd1989 opened this issue Jul 22, 2019 · 11 comments
Assignees
Labels
areas/images kind/bug This is bug report for project

Comments

@cd1989
Copy link
Contributor

cd1989 commented Jul 22, 2019

Ⅰ. Issue Description

The following error occurs when I:

  • pull one image
  • remove it with docker rmi -f
  • pull the image again
error pulling image configuration: received unexpected HTTP status: 502 Bad Gateway

Logs of dfclient:

2019-07-22 05:31:14.288 INFO sign:175-1563773474.195 : do register to 192.168.21.16:8002, res:{"code":200,"msg":"success","data":{"taskId":"8a991426954e1ad419c72ef55679648b00cc862cfa74dc7189ae30acedc03dec","fileLength":7553,"pieceSize":4194304}} error:<nil>
2019-07-22 05:31:14.288 INFO sign:175-1563773474.195 : do register result:{"code":200,"msg":"success","data":{"taskId":"8a991426954e1ad419c72ef55679648b00cc862cfa74dc7189ae30acedc03dec","fileLength":7553,"pieceSize":4194304}} and cost:0.088s
2019-07-22 05:31:14.289 INFO sign:175-1563773474.195 : downloading piece:{"taskID":"8a991426954e1ad419c72ef55679648b00cc862cfa74dc7189ae30acedc03dec","superNode":"192.168.21.16:8002","dstCid":"","range":"","result":502,"status":700,"pieceSize":0,"pieceNum":0}
2019-07-22 05:31:14.296 INFO sign:175-1563773474.195 : pull rate result:5242880 cost:2.544586ms
2019-07-22 05:31:16.196 WARN sign:164-1563773474.115 : get item timeout(2s) from queue.
2019-07-22 05:31:16.201 WARN sign:158-1563773474.110 : get item timeout(2s) from queue.
2019-07-22 05:31:16.286 WARN sign:169-1563773474.189 : get item timeout(2s) from queue.
2019-07-22 05:31:16.296 WARN sign:175-1563773474.195 : get item timeout(2s) from queue.
2019-07-22 05:31:18.196 WARN sign:164-1563773474.115 : get item timeout(2s) from queue.
2019-07-22 05:31:18.202 WARN sign:158-1563773474.110 : get item timeout(2s) from queue.
2019-07-22 05:31:18.288 WARN sign:169-1563773474.189 : get item timeout(2s) from queue.
2019-07-22 05:31:18.296 WARN sign:175-1563773474.195 : get item timeout(2s) from queue.

Ⅱ. Describe what happened

Image pull failed with 502 error, and wait some time (1 min), pull again, it succeeded.

Ⅲ. Describe what you expected to happen

Image pulled successfully anytime I want.

Ⅳ. How to reproduce it (as minimally and precisely as possible)

  1. pull one image
  2. remove it with docker rmi -f
  3. pull the image again

Ⅴ. Anything else we need to know?

use proxy mode

Ⅵ. Environment:

  • dragonfly version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@pouchrobot pouchrobot added the kind/bug This is bug report for project label Jul 22, 2019
@sxauyhz
Copy link

sxauyhz commented Jul 23, 2019

me too!!,版本0.4.2

1,a b 两个节点,先在a节点执行docker pull完成后,再在b节点执行docker pull如下报错:

error pulling image configuration: received unexpected HTTP status: 503 Service Unavailable
再次执行docker pull是可OK的。

2,dfdaemon节点log:

2019-07-23 13:30:18.061 INFO sign:4615-1563859818.058 : do register result:{"code":200,"msg":"success","data":{"taskId":"07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33","fileLength":129,"pieceSize":4194304}} and cost:0.001s
2019-07-23 13:30:18.061 INFO sign:4615-1563859818.058 : downloading piece:{"taskID":"07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33","superNode":"10.10.1.1:8002","dstCid":"","range":"","result":502,"status":700,"pieceSize":0,"pieceNum":0}
2019-07-23 13:30:18.062 INFO sign:4615-1563859818.058 : pull rate result:17476267 cost:352.28µs
2019-07-23 13:30:19.907 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:19.964 WARN sign:4595-1563859817.959 : get item timeout(2s) from queue.
2019-07-23 13:30:20.062 WARN sign:4615-1563859818.058 : get item timeout(2s) from queue.
2019-07-23 13:30:21.907 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:21.964 WARN sign:4595-1563859817.959 : get item timeout(2s) from queue.
2019-07-23 13:30:22.062 WARN sign:4615-1563859818.058 : get item timeout(2s) from queue.
2019-07-23 13:30:23.907 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:23.964 WARN sign:4595-1563859817.959 : get item timeout(2s) from queue.
2019-07-23 13:30:24.063 WARN sign:4615-1563859818.058 : get item timeout(2s) from queue.
2019-07-23 13:30:25.907 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:25.964 WARN sign:4595-1563859817.959 : get item timeout(2s) from queue.
2019-07-23 13:30:26.063 WARN sign:4615-1563859818.058 : get item timeout(2s) from queue.
2019-07-23 13:30:27.907 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:27.962 ERRO sign:4595-1563859817.959 : download timeout(10.000s)
2019-07-23 13:30:27.963 INFO sign:4595-1563859817.959 : download FAIL cost:10.004s length:610 reason:0
2019-07-23 13:30:28.061 ERRO sign:4615-1563859818.058 : download timeout(10.000s)
2019-07-23 13:30:28.061 INFO sign:4615-1563859818.058 : download FAIL cost:10.004s length:129 reason:0
2019-07-23 13:30:29.908 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:31.908 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:33.908 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:35.908 WARN sign:4445-1563859807.844 : get item timeout(2s) from queue.
2019-07-23 13:30:37.406 WARN sign:4445-1563859807.844 : client range:37748736-41943039 cost:29.501 from peer:10.2.100.20, readCost:29.501, length:4194304
2019-07-23 13:30:37.420 WARN sign:4445-1563859807.844 : client range:29360128-33554431 cost:29.514 from peer:10.2.100.20, readCost:29.514, length:4194304
2019-07-23 13:30:37.420 INFO sign:4445-1563859807.844 : downloading piece:{"taskID":"89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218","superNode":"10.10.1.1:8002","dstCid":"对方ip-3887-1563859755.085","range":"29360128-33554431","result":503,"status":701,"pieceSize":4194304,"pieceNum":7}
2019-07-23 13:30:37.788 ERRO sign:4445-1563859807.844 : read piece cont error:dial tcp 本节点自己IP:27851: connect: connection refused from dst:10.2.100.20:27851, wait 20 ms
。。。。。。

3,supernode的日志:

2019-07-23 13:30:07.902 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:07.902 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:07.903 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:07.903 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: f2f98ce3284f5d4ae995c491ffcbcd8fced00c924897a07c8dbd83d920a400e5
2019-07-23 13:30:07.903 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: 89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218
2019-07-23 13:30:07.903 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: e68bd7fdb0ef53a3f8ae291f4b777648a49aecc5639716ee9b0eb704ee743487
2019-07-23 13:30:07.903 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4445-1563859807.844 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/6ba80063-902f-4ab7-b098-3df14dc4dfe7-4445-1563859807.844 PeerID:pull报错的节点名称-pull报错的节点iP-1563859807903033348 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:2409c3878ba187544bb24e61591e6888e81e01aad3ac711b852813275a9958b0 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:2409c3878ba187544bb24e61591e6888e81e01aad3ac711b852813275a9958b0}
2019-07-23 13:30:07.903 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4444-1563859807.843 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/adf6e34d-9896-40fd-bd37-9dcab3ac4403-4444-1563859807.843 PeerID:pull报错的节点名称-pull报错的节点iP-1563859807902781986 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:2f50f5f71a1b46804f94818db7a320e30c2af97d7c01f6c9a9eb8fa901c34fe3 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:2f50f5f71a1b46804f94818db7a320e30c2af97d7c01f6c9a9eb8fa901c34fe3}
2019-07-23 13:30:07.903 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4447-1563859807.843 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/0484960f-cb98-4e1b-98d3-f541cdfc5665-4447-1563859807.843 PeerID:pull报错的节点名称-pull报错的节点iP-1563859807902775383 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:0dd33c27203433ab6529420a5ce6d4fa9a3cd2aaf795ddd6fa2ff990a962d264 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:0dd33c27203433ab6529420a5ce6d4fa9a3cd2aaf795ddd6fa2ff990a962d264}
2019-07-23 13:30:07.904 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: e68bd7fdb0ef53a3f8ae291f4b777648a49aecc5639716ee9b0eb704ee743487 clientID: pull报错的节点iP-4444-1563859807.843
2019-07-23 13:30:07.904 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: 89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218 clientID: pull报错的节点iP-4445-1563859807.844
2019-07-23 13:30:07.904 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: f2f98ce3284f5d4ae995c491ffcbcd8fced00c924897a07c8dbd83d920a400e5 clientID: pull报错的节点iP-4447-1563859807.843
2019-07-23 13:30:08.001 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:08.001 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: 07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33
2019-07-23 13:30:08.001 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4446-1563859807.845 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/d9c20b06-d0d9-4dfb-a503-851702dd7413-4446-1563859807.845 PeerID:pull报错的节点名称-pull报错的节点iP-1563859808001617270 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:e74ce3a8a46f518b1d4eeac5af09ced86d18fcef7d153438d45306a0a501bb42 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:e74ce3a8a46f518b1d4eeac5af09ced86d18fcef7d153438d45306a0a501bb42}
2019-07-23 13:30:08.002 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: 07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33 clientID: pull报错的节点iP-4446-1563859807.845
2019-07-23 13:30:17.961 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:17.961 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: f2f98ce3284f5d4ae995c491ffcbcd8fced00c924897a07c8dbd83d920a400e5
2019-07-23 13:30:17.961 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4595-1563859817.959 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/0aed2f29-8306-4e79-b4b1-0b107d850d95-4595-1563859817.959 PeerID:pull报错的节点名称-pull报错的节点iP-1563859817961662086 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:0dd33c27203433ab6529420a5ce6d4fa9a3cd2aaf795ddd6fa2ff990a962d264 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:0dd33c27203433ab6529420a5ce6d4fa9a3cd2aaf795ddd6fa2ff990a962d264}
2019-07-23 13:30:17.962 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: f2f98ce3284f5d4ae995c491ffcbcd8fced00c924897a07c8dbd83d920a400e5 clientID: pull报错的节点iP-4595-1563859817.959
2019-07-23 13:30:18.060 INFO sign:9 : success to register peer &{IP:pull报错的节点iP HostName:pull报错的节点名称 Port:27866 Version:0.4.2}
2019-07-23 13:30:18.060 INFO sign:9 : CDN(SUCCESS) is running or has been downloaded successfully for taskID: 07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33
2019-07-23 13:30:18.060 INFO sign:9 : success to register task &{CID:pull报错的节点iP-4615-1563859818.058 CallSystem: Dfdaemon:true Filter:[] Headers:map[Connection:close User-Agent:docker/1.13.1 go/go1.10.3 kernel/4.19.12-1.el7.elrepo.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 (linux))] Identifier: Md5: Path:/peer/file/7593de92-33f5-4d65-8746-736c661caa72-4615-1563859818.058 PeerID:pull报错的节点名称-pull报错的节点iP-1563859818060604390 RawURL:http://docker-registy.com/v2/k8s/blobs/sha256:e74ce3a8a46f518b1d4eeac5af09ced86d18fcef7d153438d45306a0a501bb42 TaskURL:http://docker-registy.com/v2/k8s/blobs/sha256:e74ce3a8a46f518b1d4eeac5af09ced86d18fcef7d153438d45306a0a501bb42}
2019-07-23 13:30:18.061 INFO sign:9 : success update dfgetTask status to RUNNING with taskID: 07ed3d418973e74784b7c23c1d83447152e005e2840c601196cc1eddd0434b33 clientID: pull报错的节点iP-4615-1563859818.058
2019-07-23 13:30:37.852 WARN sign:9 : peerID: pull报错的节点名称-pull报错的节点iP-1563859807903033348 got errors for 6 times which reaches error limit: 5 for taskID(89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218)
2019-07-23 13:30:37.872 WARN sign:9 : peerID: pull报错的节点名称-pull报错的节点iP-1563859807903033348 got errors for 7 times which reaches error limit: 5 for taskID(89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218)
2019-07-23 13:30:37.997 WARN sign:9 : peerID: pull报错的节点名称-pull报错的节点iP-1563859807903033348 got errors for 6 times which reaches error limit: 5 for taskID(89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218)
2019-07-23 13:30:38.300 WARN sign:9 : peerID: pull报错的节点名称-pull报错的节点iP-1563859807903033348 got errors for 6 times which reaches error limit: 5 for taskID(89e525c591ec662c6fef5892049712b1a64d3630e5d0fe5f74d46eaad8b9c218)

@cd1989
Copy link
Contributor Author

cd1989 commented Jul 23, 2019

Set a longer timeout by decrease minrate in dfget.yml seems to work for me.

 minRate: 32k

@yeya24
Copy link
Collaborator

yeya24 commented Jul 23, 2019

@sxauyhz Can you have a try to set minRate in /etc/dragonfly/dfget.yml as @cd1989 said?
Actually you should set minRate to int values, otherwise it will unmarshal fail.

minRate: 20480

This is an example to set minimal rate as 20k.

@sxauyhz
Copy link

sxauyhz commented Jul 25, 2019

starnop/supernode:0.4.3_pre2
starnop/dfclient:0.4.3_pre2 测试暂时未复现

@sxauyhz
Copy link

sxauyhz commented Jul 25, 2019

2019-07-22 05:31:14.289 INFO sign:175-1563773474.195 : downloading piece:{"taskID":"8a991426954e1ad419c72ef55679648b00cc862cfa74dc7189ae30acedc03dec","superNode":"192.168.21.16:8002","dstCid":"","range":"","result":502,"status":700,"pieceSize":0,"pieceNum":0}
此处dstCid为空,表示任务开始,是一种正常状态。

@sxauyhz
Copy link

sxauyhz commented Jul 25, 2019

dfget 的默认的--timeout参数是干啥的,和minRate: 20480区别是啥 ?
@yeya24

@yeya24
Copy link
Collaborator

yeya24 commented Jul 25, 2019

The --timeout specifies the max time for one task to run. If your pulling process exceeds this time and then downloading will fail.
The minRate specifies the minimum download rate, if timeout isn't specified from dfget flag, supernode will calculate timeout using file length / minRate . So the minRate you set smaller, the download timeout you get will be larger.
Actually these two parameters work similarly, all used to specify timeout. But currently we can only add --timeout to dfget. As for the situation of pulling docker images, currently we don't support passing --timeout to dfdaemon, so in order to configure timeout, we can only configure minRate in config files now.

@sxauyhz
Copy link

sxauyhz commented Jul 25, 2019

我感觉这个配置逻辑怪怪的,假如都封装给dfdaemon,透传给dfget就容易理解了。本来dfdaemon日志超时,但是要配置dfget.yaml ,感觉写到dfdaemon.yaml更合理。 @yeya24

@yeya24
Copy link
Collaborator

yeya24 commented Jul 25, 2019

Yes. It is quite confusing. Now we are planning to refactor this part. You can check out dragonflyoss/dragonfly#687 for more details.

@starnop
Copy link
Contributor

starnop commented Oct 30, 2019

Maybe this issue is about docker/cli#1143.

@supereagle
Copy link

Maybe this issue is about docker/cli#1143.

docker/cli#1143 is about the problem of Docker hub, not a private registry.

@lowzj lowzj closed this as completed May 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
areas/images kind/bug This is bug report for project
Projects
None yet
Development

No branches or pull requests

8 participants