Releases: dragonflyoss/dragonfly
v2.2.0
Dragonfly v2.2.0 is released! 🎉🎉🎉 Thanks to the contributors who made this release happen, and welcome to the d7y.io website.
Features
Client written in Rust
The client is written in Rust, which brings advantages such as memory safety and improved performance. The client is a submodule of Dragonfly; refer to dragonflyoss/client.
Client supports bandwidth rate limiting for prefetching
The client now supports rate limiting for prefetch requests, which can prevent network overload and reduce competition with other active download tasks, thereby enhancing overall system performance. Refer to the documentation to configure the `proxy.prefetchRateLimit` option.
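As a rough illustration of why a separate prefetch limit helps, the sketch below gives prefetch traffic its own token bucket, so prefetching cannot consume the budget reserved for regular downloads. It is written in Go for readability and is not the client's actual (Rust) implementation; `tokenBucket`, its fields, and the byte rates are hypothetical, and time is passed in as plain seconds to keep the example deterministic.

```go
package main

import "fmt"

// tokenBucket is a hypothetical, minimal byte-rate limiter.
// It is not safe for concurrent use; a real limiter would need locking.
type tokenBucket struct {
	ratePerSec float64 // bytes added per second
	capacity   float64 // maximum burst in bytes
	tokens     float64 // bytes currently available
	lastSec    float64 // time of the last refill, in seconds
}

// newTokenBucket returns a bucket that starts full at time nowSec.
func newTokenBucket(ratePerSec, capacity, nowSec float64) *tokenBucket {
	return &tokenBucket{ratePerSec: ratePerSec, capacity: capacity, tokens: capacity, lastSec: nowSec}
}

// allow refills the bucket for the elapsed time and reports whether
// n bytes may be sent at time nowSec, consuming them if so.
func (b *tokenBucket) allow(n, nowSec float64) bool {
	if elapsed := nowSec - b.lastSec; elapsed > 0 {
		b.tokens += elapsed * b.ratePerSec
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.lastSec = nowSec
	}
	if n > b.tokens {
		return false
	}
	b.tokens -= n
	return true
}

func main() {
	// Separate budgets: exhausting the prefetch bucket leaves downloads untouched.
	download := newTokenBucket(100, 100, 0)
	prefetch := newTokenBucket(10, 10, 0)

	fmt.Println(prefetch.allow(10, 0)) // true: burst up to capacity
	fmt.Println(prefetch.allow(1, 0))  // false: prefetch budget is spent
	fmt.Println(download.allow(50, 0)) // true: download budget is independent
	fmt.Println(prefetch.allow(5, 1))  // true: refilled after one second
}
```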
The following diagram illustrates the usage of the download rate limit, upload rate limit, and prefetch rate limit for the client.
Client supports leeching
If the user configures the client to disable sharing, it will become a leech.
Optimize client's performance for handling a large number of small I/Os by Nydus
- Add the `X-Dragonfly-Prefetch` HTTP header. If `X-Dragonfly-Prefetch` is set to true and it is a range request, the client will prefetch the entire task. This feature allows Nydus to control which requests need prefetching.
- The client's HTTP proxy adds an independent cache to reduce requests to the gRPC server, thereby reducing request latency.
- Increase the memory cache size in RocksDB and enable prefix search for quickly searching piece metadata.
- Use the `CRC-32-Castagnoli` algorithm with hardware acceleration to reduce the hash calculation cost for piece content.
- Reuse the gRPC connections for downloading and optimize the download logic.
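Two of the items above can be sketched with the Go standard library. The example is illustrative only and is not taken from the Dragonfly client, which is written in Rust: `newPrefetchRangeRequest` and `pieceChecksum` are hypothetical helper names, and the URL and byte range are placeholders. The header name and the CRC-32-Castagnoli algorithm come from the release notes.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"net/http"
)

// castagnoli is the CRC-32C table. Go's hash/crc32 selects a
// hardware-accelerated implementation for it on supported CPUs.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// pieceChecksum returns the CRC-32C checksum of a piece's content.
func pieceChecksum(piece []byte) uint32 {
	return crc32.Checksum(piece, castagnoli)
}

// newPrefetchRangeRequest builds a range request that also asks the
// client to prefetch the entire task via X-Dragonfly-Prefetch.
func newPrefetchRangeRequest(url string, start, end int64) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	req.Header.Set("X-Dragonfly-Prefetch", "true")
	return req, nil
}

func main() {
	// Placeholder URL; the request is built but not sent.
	req, err := newPrefetchRangeRequest("http://example.com/blob", 0, 1023)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Range"), req.Header.Get("X-Dragonfly-Prefetch"))

	// "123456789" is the conventional CRC check input.
	fmt.Printf("%#x\n", pieceChecksum([]byte("123456789"))) // 0xe3069283
}
```

In practice such a request would be sent through the client's HTTP proxy; how the proxy is addressed depends on the deployment and is not shown here.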
Defines the V2 of the P2P transfer protocol
Define the V2 of the P2P transfer protocol to make it more standard, clearer, and better performing, refer to dragonflyoss/api.
Enhanced Harbor Integration with P2P Preheating
Dragonfly improves its integration with Harbor v2.13 for preheating images, including the following enhancements:
- Support for preheating multi-architecture images.
- Users can select the preheat scope for multi-granularity preheating (Single Seed Peer, All Seed Peers, All Peers).
- Users can specify the scheduler cluster IDs for preheating images to the desired Dragonfly clusters.
Refer to documentation for more details.
Task Manager
Users can search all peers of a cached task by task ID or download URL and delete the cache on the selected peers; refer to the documentation.
Peer Manager
Manager will regularly synchronize peers' information and also allows for manual refreshes. Additionally, it will display peers' information on the Manager Console.
Add hostname regexes and CIDRs to cluster scopes for matching clients.
When the client starts, it reports its hostname and IP to the Manager. The Manager then returns the best matching cluster (including schedulers and seed peers) to the client based on the cluster scopes configuration.
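The matching step can be pictured with a small Go sketch. It assumes a simplified model in which a cluster matches when either a hostname regex or a CIDR matches, and it returns the first match; the Manager's real ranking of the "best" cluster is not described in these notes. `clusterScope`, `newClusterScope`, and `pickCluster` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
)

// clusterScope is a hypothetical, simplified view of a cluster's scopes.
type clusterScope struct {
	name      string
	hostnames []*regexp.Regexp
	cidrs     []netip.Prefix
}

// newClusterScope compiles the hostname regexes and parses the CIDRs.
func newClusterScope(name string, hostnameRegexes, cidrs []string) (clusterScope, error) {
	s := clusterScope{name: name}
	for _, expr := range hostnameRegexes {
		re, err := regexp.Compile(expr)
		if err != nil {
			return clusterScope{}, fmt.Errorf("hostname regex %q: %w", expr, err)
		}
		s.hostnames = append(s.hostnames, re)
	}
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return clusterScope{}, fmt.Errorf("cidr %q: %w", c, err)
		}
		s.cidrs = append(s.cidrs, p)
	}
	return s, nil
}

// pickCluster returns the name of the first cluster whose scopes match
// the reported hostname or IP, or "" if none match.
func pickCluster(scopes []clusterScope, hostname, ip string) string {
	// An unparsable IP yields the zero Addr, which no prefix contains.
	addr, _ := netip.ParseAddr(ip)
	for _, s := range scopes {
		for _, re := range s.hostnames {
			if re.MatchString(hostname) {
				return s.name
			}
		}
		for _, p := range s.cidrs {
			if p.Contains(addr) {
				return s.name
			}
		}
	}
	return ""
}

func main() {
	a, err := newClusterScope("cluster-a", []string{`^gpu-.*$`}, nil)
	if err != nil {
		panic(err)
	}
	b, err := newClusterScope("cluster-b", nil, []string{"10.0.0.0/8"})
	if err != nil {
		panic(err)
	}
	scopes := []clusterScope{a, b}
	fmt.Println(pickCluster(scopes, "gpu-01", "192.168.1.5")) // cluster-a
	fmt.Println(pickCluster(scopes, "web-01", "10.1.2.3"))    // cluster-b
}
```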
Supports distributed rate limiting for creating jobs across different clusters
Users can configure rate limiting for job creation across different clusters in the Manager Console.
Support preheating images using self-signed certificates
Preheating requires calling the container registry to parse the image manifest and construct the URLs for downloading blobs. If the container registry uses a self-signed certificate, users can configure that certificate in the Manager's config so that calls to the container registry succeed.
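The manifest-to-URL step can be sketched as follows. This is a minimal illustration, not the Manager's code: it assumes an OCI image manifest with a `layers` array and builds layer blob URLs using the registry API path `/v2/<name>/blobs/<digest>`. The registry address, repository name, and digests are placeholders, and the HTTP call and TLS setup are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor holds the only descriptor field this sketch needs.
type descriptor struct {
	Digest string `json:"digest"`
}

// manifest holds the only manifest field this sketch needs.
type manifest struct {
	Layers []descriptor `json:"layers"`
}

// blobURLs parses an image manifest and returns one download URL per layer.
func blobURLs(registry, repository string, manifestJSON []byte) ([]string, error) {
	var m manifest
	if err := json.Unmarshal(manifestJSON, &m); err != nil {
		return nil, fmt.Errorf("parse manifest: %w", err)
	}
	urls := make([]string, 0, len(m.Layers))
	for _, l := range m.Layers {
		urls = append(urls, fmt.Sprintf("%s/v2/%s/blobs/%s", registry, repository, l.Digest))
	}
	return urls, nil
}

func main() {
	// Placeholder manifest with shortened, non-real digests.
	raw := []byte(`{"layers":[{"digest":"sha256:aaa"},{"digest":"sha256:bbb"}]}`)
	urls, err := blobURLs("https://registry.example.com", "library/demo", raw)
	if err != nil {
		panic(err)
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```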
Support mTLS for gRPC calls between services
By setting self-signed certificates in the configurations of the Manager, Scheduler, Seed Peer, and Peer, gRPC calls between services will use mTLS.
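For readers unfamiliar with mTLS, the sketch below shows what the two sides of such a connection look like with Go's `crypto/tls`: the server requires and verifies a client certificate, and the client verifies the server against the same CA. It is a generic illustration under the assumption that one self-signed CA signs every service certificate; `newMutualTLSConfigs` is a hypothetical helper, not Dragonfly's configuration code.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newMutualTLSConfigs builds a server and a client tls.Config that both
// present the given certificate and both verify the peer against the CA.
func newMutualTLSConfigs(caPEM, certPEM, keyPEM []byte) (server, client *tls.Config, err error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, nil, errors.New("no CA certificate found in caPEM")
	}
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, nil, fmt.Errorf("load key pair: %w", err)
	}
	server = &tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientCAs:    pool,                           // CAs trusted for client certificates
		ClientAuth:   tls.RequireAndVerifyClientCert, // this is what makes TLS mutual
		MinVersion:   tls.VersionTLS12,
	}
	client = &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool, // CAs trusted for the server certificate
		MinVersion:   tls.VersionTLS12,
	}
	return server, client, nil
}

func main() {
	// With empty input the helper fails instead of returning an insecure config.
	_, _, err := newMutualTLSConfigs(nil, nil, nil)
	fmt.Println(err)
}
```

With gRPC in Go, configs like these are typically wrapped with `credentials.NewTLS` from `google.golang.org/grpc/credentials`; that wiring is left out to keep the example dependency-free.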
Observability
Dragonfly recommends using Prometheus for monitoring. Prometheus and Grafana configurations are maintained in the dragonflyoss/monitoring repository.
Grafana dashboards are listed below:
Name | ID | Link | Description |
---|---|---|---|
Dragonfly Manager | 15945 | https://grafana.com/grafana/dashboards/15945 | Grafana dashboard for dragonfly manager. |
Dragonfly Scheduler | 15944 | https://grafana.com/grafana/dashboards/15944 | Grafana dashboard for dragonfly scheduler. |
Dragonfly Client | 21053 | https://grafana.com/grafana/dashboards/21053 | Grafana dashboard for dragonfly client and dragonfly seed client. |
Dragonfly Seed Client | 21054 | https://grafana.com/grafana/dashboards/21054 | Grafana dashboard for dragonfly seed client. |
Nydus
Nydus v2.3.0 is released, refer to Nydus Image Service v2.3.0 for more details.
- builder: support --parent-bootstrap for merge.
- builder/nydusd: support batch chunks mergence.
- nydusify/nydus-snapshotter: support OCI reference types.
- nydusify: support export/import for remote images.
- nydusify: support --push-chunk-size for large size image.
- nydusd/nydus-snapshotter: support basic failover and hot upgrade.
- nydusd: support overlay writable mount for fusedev.
Console
Console v0.2.0 is released, featuring a redesigned UI and an improved interaction flow. Additionally, more functional pages have been added, such as preheating, task manager, and PATs (Personal Access Tokens) manager. Refer to the documentation for more details.
Document
Refactor the website documentation to make Dragonfly simpler and more practical for users, refer to d7y.io.
Significant bug fixes
The following content only highlights the significant bug fixes in this release.
- Fix the thread safety issue that occurs when constructing the DAG (Directed Acyclic Graph) during scheduling.
- Fix the memory leak caused by the OpenTelemetry library.
- Avoid hot reload when dynconfig refreshes data from the Manager.
- Prevent concurrent download requests from causing failures in state machine transitions.
- Use `context.Background()` to avoid stream cancel by dfdaemon.
- Fix the database performance issue caused by clearing expired jobs when there are too many job records.
- Reuse the gRPC connection pool to prevent redundant request construction.
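The `context.Background()` fix in the list above follows a common Go pattern: a long-lived stream gets a context that is not derived from the caller's, so cancelling the caller does not cancel the stream. The sketch below demonstrates the effect in isolation; the exact call site in Dragonfly is not stated in these notes, and `detachedContext` and `cancelStates` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// detachedContext returns a context for a long-lived stream. It deliberately
// ignores the caller's context and derives from context.Background(), so the
// returned cancel function is the only way to stop the stream.
func detachedContext(_ context.Context) (context.Context, context.CancelFunc) {
	return context.WithCancel(context.Background())
}

// cancelStates cancels the caller's context and reports whether the caller's
// and the stream's contexts are done afterwards.
func cancelStates() (callerDone, streamDone bool) {
	caller, cancelCaller := context.WithCancel(context.Background())
	stream, cancelStream := detachedContext(caller)
	defer cancelStream() // always release the detached context

	cancelCaller()
	return caller.Err() != nil, stream.Err() != nil
}

func main() {
	callerDone, streamDone := cancelStates()
	fmt.Println(callerDone, streamDone) // true false
}
```

One trade-off is that a context derived from `context.Background()` also drops the caller's values, such as tracing metadata. Since Go 1.21, `context.WithoutCancel` keeps those values while still ignoring the caller's cancellation.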
AI Infrastructure
Model Spec
The Dragonfly community is collaboratively defining the OCI Model Specification. OCI Model Specification aims to provide a standard way to package, distribute and run AI models in a cloud native environment. The goal of this specification is to package models in an OCI artifact to take advantage of OCI distribution and ensure efficient model deployment, refer to CloudNativeAI/model-spec for more details.
Support accelerated distribution of AI models in Hugging Face Hub(Git LFS)
Distribute larg...
v2.1.67
v2.1.66
Changelog
- d0e8039 chore(deps): bump actions/cache from 4.1.2 to 4.2.0 (#3694)
- a7e986b chore(deps): bump actions/setup-go from 5.1.0 to 5.2.0 (#3707)
- 6116513 chore(deps): bump codecov/codecov-action from 5.0.7 to 5.1.1 (#3696)
- 9b648fc chore(deps): bump docker/build-push-action from 6.7.0 to 6.10.0 (#3693)
- 53f2ce4 chore(deps): bump github.com/bits-and-blooms/bitset from 1.16.0 to 1.18.0 (#3692)
- b6814f3 chore(deps): bump github.com/bits-and-blooms/bitset from 1.18.0 to 1.19.1 (#3701)
- 46d3d17 chore(deps): bump github.com/go-redsync/redsync/v4 from 4.8.1 to 4.13.0 (#3702)
- 39fb5f3 chore(deps): bump github.com/onsi/ginkgo/v2 from 2.20.1 to 2.22.0 (#3688)
- 83b3135 chore(deps): bump github.com/onsi/gomega from 1.35.1 to 1.36.1 (#3704)
- 2587a3c chore(deps): bump github/codeql-action from 3.27.5 to 3.27.6 (#3695)
- c1a6da3 chore(deps): bump github/codeql-action from 3.27.6 to 3.27.9 (#3708)
- 11974d9 chore(deps): bump go.opentelemetry.io/otel/sdk from 1.29.0 to 1.32.0 (#3689)
- 4e2c25b chore(deps): bump go.opentelemetry.io/otel/trace from 1.32.0 to 1.33.0 (#3705)
- 50cbd6f chore(deps): bump golang.org/x/crypto from 0.28.0 to 0.30.0 (#3691)
- 5511c07 chore(deps): bump golang.org/x/crypto from 0.30.0 to 0.31.0 (#3710)
- 6ea7c0b chore: update client version to v0.1.125 (#3687)
- 6e4b473 chore: update console verison to v0.1.41 (#3700)
- 62b6c37 feat: add AllSeedPeersScope for preheating (#3698)
- a370da1 feat: add client version for MakeSchedulersKeyForPeerInManager (#3711)
- 79a845e feat: load empty return false in persistentcache (#3697)
- 063a20d feat: when the redis is disabled, AnnounceHost need to skip store redis (#3712)
v2.1.65
Changelog
- b83a4c9 chore(deps): bump codecov/codecov-action from 5.0.2 to 5.0.7 (#3671)
- fd51bee chore(deps): bump github.com/bits-and-blooms/bitset from 1.13.0 to 1.16.0 (#3667)
- 68c957d chore(deps): bump github.com/schollz/progressbar/v3 from 3.17.0 to 3.17.1 (#3665)
- 95d5370 chore(deps): bump github/codeql-action from 3.27.4 to 3.27.5 (#3670)
- c91cfb2 chore: update rust client version (#3685)
- d0e41b5 enhance: support syncpeers by service and optimize the merge logic (#3637)
- a97584a feat: delete jobs in batches (#3682)
- aa78396 feat: optimize implement of the sync peers (#3677)
- c220a60 feat: remove deploy without docker compose (#3672)
- 49ae448 feat: reuse connections and limit the number of connections for preheating (#3683)
- 49f52a0 feat: support CRC-32-Castagnoli algorithm (#3664)
v2.1.64
Changelog
- 1b1014f chore(deps): bump codecov/codecov-action from 4.6.0 to 5.0.2 (#3659)
- c05bff9 chore(deps): bump github.com/gammazero/deque from 0.2.1 to 1.0.0 (#3657)
- 3aea1e6 chore(deps): bump github.com/swaggo/swag from 1.16.3 to 1.16.4 (#3655)
- 34e343a chore(deps): bump github/codeql-action from 3.27.0 to 3.27.1 (#3646)
- 2a3e846 chore(deps): bump github/codeql-action from 3.27.1 to 3.27.4 (#3658)
- 97e73cb chore(deps): bump go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin from 0.56.0 to 0.57.0 (#3643)
- 11d6564 chore(deps): bump go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc from 0.56.0 to 0.57.0 (#3644)
- 1737eaa chore(deps): bump golang.org/x/oauth2 from 0.23.0 to 0.24.0 (#3647)
- 55a3bd0 chore(deps): bump golang.org/x/sync from 0.8.0 to 0.9.0 (#3654)
- 0125102 chore(deps): bump google.golang.org/api from 0.199.0 to 0.205.0 (#3645)
- fb74a23 chore(deps): bump google.golang.org/protobuf from 1.35.1 to 1.35.2 (#3653)
- aca4d14 chore(deps): bump goreleaser/goreleaser-action from 6.0.0 to 6.1.0 (#3648)
- b370415 chore(deps): bump k8s.io/component-base from 0.29.2 to 0.31.2 (#3649)
- 386c91b chore: update client submodule (#3661)
- b2c8e76 feat: add disk bandwidth information for host (#3652)
- 555a132 feat: add garbage collection for persistent cache host (#3642)
- 9376c5d feat: optimize api for shceduling (#3660)
- 8711108 feat: store persistent cache host by announce host api (#3640)
v2.1.63
Changelog
- 83fec0b chore(deps): bump actions/cache from 4.1.0 to 4.1.2 (#3616)
- c1ed128 chore(deps): bump actions/checkout from 4.2.1 to 4.2.2 (#3618)
- 39dbfe2 chore(deps): bump actions/setup-go from 5.0.2 to 5.1.0 (#3615)
- 5040c1e chore(deps): bump github.com/golang-jwt/jwt/v4 from 4.5.0 to 4.5.1 (#3632)
- a9301db chore(deps): bump github.com/onsi/gomega from 1.34.1 to 1.35.1 (#3631)
- b19bed5 chore(deps): bump github.com/schollz/progressbar/v3 from 3.14.6 to 3.17.0 (#3610)
- bd08063 chore(deps): bump github/codeql-action from 3.26.12 to 3.27.0 (#3617)
- 5a3e70c chore(deps): bump go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin from 0.53.0 to 0.56.0 (#3612)
- a718fd7 chore(deps): bump go.uber.org/mock from 0.4.0 to 0.5.0 (#3630)
- d5ab25a chore(deps): bump golang.org/x/time from 0.6.0 to 0.7.0 (#3627)
- 10062d7 chore: update console and rust client version (#3639)
- 071ab91 chore: update console submodule (#3641)
- 4a791c7 chore: update golang version to v1.23.0 (#3609)
- 19b38a1 feat: add filtered query params of the containerd (#3621)
- b31e5be feat: add rate limit for job open api by cluster (#3638)
- e826d72 feat: implement delete peer and task in persistent cache (#3623)
- 58959be feat: implement delete persistent cache task in scheduler (#3619)
- 4f8fb8f feat: implement upload persistent cache task (#3620)
- ab11fbf feat: optimize error message of preheating (#3622)
- da8eab8 fix: generate wrong sql with gorm (#3626)
v2.1.62
Changelog
- dc10763 Add tests for ListHosts() and DeleteHost() (#3604)
- d6981cd chore(deps): bump actions/checkout from 4.2.0 to 4.2.1 (#3600)
- e0e0c47 chore(deps): bump actions/upload-artifact from 4.3.6 to 4.4.3 (#3599)
- 0fc7304 chore(deps): bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5 (#3598)
- 37178c6 chore(deps): bump go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc from 0.54.0 to 0.56.0 (#3594)
- 4c50a0d chore: update client version to v0.1.113 (#3593)
- 2793851 feat: add CreateJobCount and CreateJobSuccessCount metrics (#3588)
- 90d2017 feat: add DefaultFilteredQueryParams for job (#3608)
- 071072f feat: add E2E tests for cases that peers going offline (#3524)
- eb4e101 feat: add interface of the persistent cache resource (#3602)
- 2e1a6b5 feat: add peer manager for persistent cache task (#3592)
- 713d5aa feat: change job gc interval in manager (#3591)
- 770d6c9 feat: implement StatPersistentCachePeerRequest and StatPersistentCacheTaskRequest for persistent cache (#3603)
- 5a51a61 feat: support searching task by url for GetTask and DeleteTask (#3607)
v2.1.61
Changelog
- da799c6 chore(deps): bump actions/cache from 4.0.2 to 4.1.0 (#3561)
- 6c363eb chore(deps): bump actions/checkout from 4.1.7 to 4.2.0 (#3549)
- d14c7f7 chore(deps): bump codecov/codecov-action from 4.5.0 to 4.6.0 (#3559)
- 385a813 chore(deps): bump github.com/docker/docker from 27.1.1+incompatible to 27.3.1+incompatible (#3556)
- 734c7e6 chore(deps): bump github/codeql-action from 3.26.8 to 3.26.12 (#3565)
- fa03cde chore(deps): bump google.golang.org/api from 0.197.0 to 0.199.0 (#3554)
- 1bad8a8 chore: generate SBOM for release artifacts (#3585)
- 030f337 chore: generate sbom for release artifacts (#3587)
- 5ab6450 chore: update go version to v1.22.4 (#3580)
- da10972 chore: update rust client and console submodule (#3567)
- 1198c98 feat: add fsm for persistent cache peer (#3563)
- 5c52a02 feat: add self-signed certs for mTLS (#3583)
- e3b8583 feat: support set self-signed cert for service (#3568)
v2.1.60
Changelog
- 8b6e40f chore: update client-rs version (#3562)
- 4a7ae85 feat: add auto switch scheduler e2e test (#3486)
- 4d2e929 feat: add downloadRate and uploadRate for host (#3548)
- bd8ecfb feat: add host manager for persistent cache (#3546)
- 53f5e9c feat: add persistent cache task for scheduler (#3545)
- 7253f0f feat: increase interval of the preheat polling (#3544)
- 8d956eb feat: removed network topology (#3547)
- 688b9d7 feat: rename `scheduler/resource` to `scheduler/resource/standard` (#3542)
- 9cd6f41 feat: update new task type(TaskType_STANDARD, TaskType_PERSISTENT, TaskType_PERSISTENT_CACHE) (#3540)
v2.1.59
Changelog
- 4de427a chore(deps): bump actions/checkout from 4.1.1 to 4.1.7 (#3529)
- 6d75fb9 chore(deps): bump github.com/prometheus/client_golang from 1.19.0 to 1.20.4 (#3532)
- cb30b04 chore(deps): bump github/codeql-action from 3.26.2 to 3.26.8 (#3530)
- e5440fb chore(deps): bump sigstore/cosign-installer from 3.5.0 to 3.6.0 (#3528)
- c9180af chore: update api version to v2.0.158 and update helm chart (#3527)
- 1afe79e feat: add metrics for grpc api of the cache task (#3539)
- b1875df feat: fixed lint in manager sync_peers.go (#3536)
- 3e73231 feat: seed max concurrent (#3482)
- ea850f7 feat: support preheat with self-signed certs (#3541)
- 61c3cf4 feat: support set max threads (#3537)
- b226996 fix(dfget): Change file path (#3519)
- 5056506 fix: make e2e test (#3487)
- 80717c7 fix: update get and delete task unit test and e2e test. (#3525)
- 820e719 select all peers in one scheduler_cluster (#3503)