Issue 421: Adding manifests for various cluster installation flavours #422
Merged
@@ -43,11 +43,11 @@ controller:
   replicas: 1
   resources:
     requests:
-      cpu: 1000m
+      cpu: 500m
       memory: 1Gi
     limits:
-      cpu: 2000m
-      memory: 3Gi
+      cpu: 1000m
+      memory: 2Gi
   ## service type and annotations are ignored if external access is disabled
   service:
     ## used to override the service type for controller
@@ -64,16 +64,16 @@ segmentStore:
   resources:
     requests:
       cpu: 1000m
-      memory: 3Gi
+      memory: 4Gi
     limits:
       cpu: 2000m
-      memory: 5Gi
+      memory: 4Gi
   ## service type and annotations are ignored if external access is disabled
   service:
     ## used to override the service type for segmentStore
     type:
     annotations: {}
-  jvmOptions: []
+  jvmOptions: ["-Xmx2g", "-XX:MaxDirectMemorySize=2g"]

 storage:

@@ -92,10 +92,10 @@ storage:
     ## ecs is used to configure a Dell EMC ECS system as the long term storage backend
     ## considered only if storage.longtermStorage.type = ecs
     ecs: {}
-      # configUri: ""
-      # bucket: ""
-      # prefix: ""
-      # credentials: ""
+      # configUri: https://object.ecstestdrive.com?namespace=namespace%26identity=user%26secretKey=password
+      # bucket: "bucket"
+      # prefix: "prefix"
+      # credentials: ecs-credentials

     ## hdfs is used to configure an HDFS system as long term storage backend
     ## considered only if storage.longtermStorage.type = hdfs
@@ -109,4 +109,58 @@ storage:
     className: standard

 options:
-  # bookkeeper.ack.quorum.size: "3"
+  ## If you add more Bookies to the cluster and you want to exploit them for IO, then you need to increase this number. The Bookkeeper ensemble defines
+  ## the number of Bookies that should be available or healthy for Bookkeeper to work. In production, we recommend this value to be:
+  ## bookkeeper.ensemble.size = bookieReplicas - F,
+  ## where F is the number of Bookie failures we can tolerate and keep working. For instance, if we instantiate 6 Bookies and we want to tolerate
+  ## up to F = 2 failures, then bookkeeper.ensemble.size = 4.
+  bookkeeper.ensemble.size: "3"
+  ## By default, we keep 3 replicas of the data in Bookkeeper to ensure durability.
+  bookkeeper.write.quorum.size: "3"
+  ## By default, we set bookkeeper.write.quorum.size == bookkeeper.ack.quorum.size, so we wait for the ACK of all the Bookies on a write before proceeding to the next one.
+  ## This has proven to make Pravega much more stable in the long run for IO-heavy workloads, at the cost of some performance.
+  ## If we set bkWriteQuorumSize > bkAckQuorumSize, we can improve performance, but a temporarily "slow" Bookie may lead to OOM errors on both the
+  ## Segment Store and Bookkeeper sides, given that Pravega continues to write to the "slow" Bookie at the same pace as to the "fast" ones. For more
+  ## context, please see: https://github.com/pravega/pravega/issues/4058 and https://github.com/pravega/pravega/issues/4057
+  bookkeeper.ack.quorum.size: "3"

(Inline review comment on the explanations above: "Same here, unnecessary explanations." Author reply: "done")

+  ## We need to increase this timeout to tolerate environments in which Tier 1 storage gets heavily saturated and Bookkeeper exhibits high latency.
+  ## Otherwise, an IO workload that saturates Bookkeeper can drive the Segment Store into an unstable state.
+  bookkeeper.write.timeout.milliseconds: "60000"
+  ## This parameter starts inducing the maximum delay on Tier 1 operations once the Segment Store accumulates 32MB of queued data waiting to be written to
+  ## Bookkeeper. We have verified that on slow IO infrastructure such as vSAN, this setting helps protect the system from instability.
+  bookkeeper.write.outstanding.bytes.max: "33554432"
+  ## In Pravega 0.7+ we use an in-memory cache. We need to make sure that the cache size is always lower than the direct memory set in the JVM options.
+  pravegaservice.cache.size.max: "1073741824"
+  pravegaservice.cache.time.seconds.max: "600"
+  hdfs.block.size: "67108864"
+  ## Although the AWS support team has suggested setting 1MB block sizes for Tier 2 writes to improve performance when using NFS, we found that a larger block
+  ## size significantly improves the throughput of a single segment against EFS.
+  writer.flush.threshold.bytes: "67108864"
+  writer.flush.size.bytes.max: "67108864"
+  ## The number of containers depends on the number of Segment Stores in the cluster (or vice versa). To distribute the load of Segment Containers across Segment
+  ## Stores, we recommend setting 4 to 8 Segment Containers per Segment Store. Ultimately, determining this number is a provisioning decision.
+  ## Note that both containerCount values should always be the same for the Segment Store and the Controller service.
+  pravegaservice.container.count: "8"
+  controller.container.count: "8"
+  ## Increase the default Controller bucket count to distribute load across multiple instances.
+  controller.retention.bucket.count: "4"
+  controller.service.asyncTaskPool.size: "20"
+  controller.retention.thread.count: "4"
+  log.level: "INFO"
+  ## The following parameters are only useful if you are going to deploy metrics in this cluster.
+  # metrics.dynamicCache.size: "100000"
+  # metrics.statistics.enable: "true"
+  # metrics.statsD.reporter.enable: "false"
+  # metrics.statsD.connect.host: "telegraph.default"
+  # metrics.statsD.connect.port: "8125"
+  # metrics.influxDB.reporter.enable: "true"
+  # metrics.output.frequency.seconds: "10"
+  # metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
+  # controller.metrics.dynamicCache.size: "100000"
+  # controller.metrics.statistics.enable: "true"
+  # controller.metrics.statsD.reporter.enable: "false"
+  # controller.metrics.statsD.connect.host: "telegraph.default"
+  # controller.metrics.statsD.connect.port: "8125"
+  # controller.metrics.influxDB.reporter.enable: "true"
+  # controller.metrics.output.frequency.seconds: "10"
+  # controller.metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
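As a quick cross-check of the sizing in this flavour (my arithmetic, not part of the chart): the Segment Store heap plus direct memory adds up to the 4Gi pod limit, and the cache stays below the direct memory as the comments require. The second block applies the ensemble formula from the comments to the hypothetical 6-Bookie example given there.

## Sizing cross-check for this flavour (illustrative only):
##   segmentStore memory limit        = 4Gi
##   -Xmx2g heap + 2g direct memory   = 4Gi, equal to the pod limit
##   pravegaservice.cache.size.max    = 1073741824 bytes = 1Gi, below the 2g direct memory
##
## Hypothetical quorum settings for 6 Bookies tolerating F = 2 failures,
## following bookkeeper.ensemble.size = bookieReplicas - F:
# options:
#   bookkeeper.ensemble.size: "4"
#   bookkeeper.write.quorum.size: "3"
#   bookkeeper.ack.quorum.size: "3"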
@@ -0,0 +1,92 @@ (new file)
controller:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi
  ## service type and annotations are ignored if external access is disabled
  service:
    ## used to override the service type for controller
    type:
    annotations: {}
  jvmOptions: []

segmentStore:
  replicas: 6
  secret: {}
  # name:
  # path:
  env:
  resources:
    requests:
      cpu: 2000m
      memory: 16Gi
    limits:
      cpu: 4000m
      memory: 16Gi
  ## service type and annotations are ignored if external access is disabled
  service:
    ## used to override the service type for segmentStore
    type:
    annotations: {}
  jvmOptions: ["-Xmx4g", "-XX:MaxDirectMemorySize=12g"]

options:
  ## If you add more Bookies to the cluster and you want to exploit them for IO, then you need to increase this number. The Bookkeeper ensemble defines
  ## the number of Bookies that should be available or healthy for Bookkeeper to work. In production, we recommend this value to be:
  ## bookkeeper.ensemble.size = bookieReplicas - F,
  ## where F is the number of Bookie failures we can tolerate and keep working. For instance, if we instantiate 6 Bookies and we want to tolerate
  ## up to F = 2 failures, then bookkeeper.ensemble.size = 4.
  bookkeeper.ensemble.size: "3"
  ## By default, we keep 3 replicas of the data in Bookkeeper to ensure durability.
  bookkeeper.write.quorum.size: "3"
  ## By default, we set bookkeeper.write.quorum.size == bookkeeper.ack.quorum.size, so we wait for the ACK of all the Bookies on a write before proceeding to the next one.
  ## This has proven to make Pravega much more stable in the long run for IO-heavy workloads, at the cost of some performance.
  ## If we set bkWriteQuorumSize > bkAckQuorumSize, we can improve performance, but a temporarily "slow" Bookie may lead to OOM errors on both the
  ## Segment Store and Bookkeeper sides, given that Pravega continues to write to the "slow" Bookie at the same pace as to the "fast" ones. For more
  ## context, please see: https://github.com/pravega/pravega/issues/4058 and https://github.com/pravega/pravega/issues/4057
  bookkeeper.ack.quorum.size: "3"
  ## We need to increase this timeout to tolerate environments in which Tier 1 storage gets heavily saturated and Bookkeeper exhibits high latency.
  ## Otherwise, an IO workload that saturates Bookkeeper can drive the Segment Store into an unstable state.
  bookkeeper.write.timeout.milliseconds: "60000"
  ## This parameter starts inducing the maximum delay on Tier 1 operations once the Segment Store accumulates 32MB of queued data waiting to be written to
  ## Bookkeeper. We have verified that on slow IO infrastructure such as vSAN, this setting helps protect the system from instability.
  bookkeeper.write.outstanding.bytes.max: "33554432"
  ## In Pravega 0.7+ we use an in-memory cache. We need to make sure that the cache size is always lower than the direct memory set in the JVM options.
  pravegaservice.cache.size.max: "11811160064"
  pravegaservice.cache.time.seconds.max: "600"
  hdfs.block.size: "67108864"
  ## Although the AWS support team has suggested setting 1MB block sizes for Tier 2 writes to improve performance when using NFS, we found that a larger block
  ## size significantly improves the throughput of a single segment against EFS.
  writer.flush.threshold.bytes: "67108864"
  writer.flush.size.bytes.max: "67108864"
  ## The number of containers depends on the number of Segment Stores in the cluster (or vice versa). To distribute the load of Segment Containers across Segment
  ## Stores, we recommend setting 4 to 8 Segment Containers per Segment Store. Ultimately, determining this number is a provisioning decision.
  ## Note that both containerCount values should always be the same for the Segment Store and the Controller service.
  pravegaservice.container.count: "48"
  controller.container.count: "48"
  ## Increase the default Controller bucket count to distribute load across multiple instances.
  controller.retention.bucket.count: "10"
  controller.service.asyncTaskPool.size: "20"
  controller.retention.thread.count: "4"
  log.level: "INFO"
  ## The following parameters are only useful if you are going to deploy metrics in this cluster.
  # metrics.dynamicCache.size: "100000"
  # metrics.statistics.enable: "true"
  # metrics.statsD.reporter.enable: "false"
  # metrics.statsD.connect.host: "telegraph.default"
  # metrics.statsD.connect.port: "8125"
  # metrics.influxDB.reporter.enable: "true"
  # metrics.output.frequency.seconds: "10"
  # metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
  # controller.metrics.dynamicCache.size: "100000"
  # controller.metrics.statistics.enable: "true"
  # controller.metrics.statsD.reporter.enable: "false"
  # controller.metrics.statsD.connect.host: "telegraph.default"
  # controller.metrics.statsD.connect.port: "8125"
  # controller.metrics.influxDB.reporter.enable: "true"
  # controller.metrics.output.frequency.seconds: "10"
  # controller.metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
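The numbers in this flavour follow the rules spelled out in the comments above; a brief worked check (my arithmetic, not additional guidance from the PR):

## Sizing cross-check for this flavour (illustrative only):
##   segmentStore memory limit         = 16Gi
##   -Xmx4g heap + 12g direct memory   = 16Gi, equal to the pod limit
##   pravegaservice.cache.size.max     = 11811160064 bytes = 11Gi, below the 12g direct memory
##   pravegaservice.container.count    = 48 = 6 Segment Store replicas x 8 containers per store
##   controller.container.count        = 48, kept equal to pravegaservice.container.count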
@@ -0,0 +1,92 @@ (new file)
controller:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 2Gi
  ## service type and annotations are ignored if external access is disabled
  service:
    ## used to override the service type for controller
    type:
    annotations: {}
  jvmOptions: []

segmentStore:
  replicas: 2
  secret: {}
  # name:
  # path:
  env:
  resources:
    requests:
      cpu: 2000m
      memory: 8Gi
    limits:
      cpu: 4000m
      memory: 8Gi
  ## service type and annotations are ignored if external access is disabled
  service:
    ## used to override the service type for segmentStore
    type:
    annotations: {}
  jvmOptions: ["-Xmx4g", "-XX:MaxDirectMemorySize=4g"]

options:
  ## If you add more Bookies to the cluster and you want to exploit them for IO, then you need to increase this number. The Bookkeeper ensemble defines
  ## the number of Bookies that should be available or healthy for Bookkeeper to work. In production, we recommend this value to be:
  ## bookkeeper.ensemble.size = bookieReplicas - F,
  ## where F is the number of Bookie failures we can tolerate and keep working. For instance, if we instantiate 6 Bookies and we want to tolerate
  ## up to F = 2 failures, then bookkeeper.ensemble.size = 4.
  bookkeeper.ensemble.size: "3"
  ## By default, we keep 3 replicas of the data in Bookkeeper to ensure durability.
  bookkeeper.write.quorum.size: "3"
  ## By default, we set bookkeeper.write.quorum.size == bookkeeper.ack.quorum.size, so we wait for the ACK of all the Bookies on a write before proceeding to the next one.
  ## This has proven to make Pravega much more stable in the long run for IO-heavy workloads, at the cost of some performance.
  ## If we set bkWriteQuorumSize > bkAckQuorumSize, we can improve performance, but a temporarily "slow" Bookie may lead to OOM errors on both the
  ## Segment Store and Bookkeeper sides, given that Pravega continues to write to the "slow" Bookie at the same pace as to the "fast" ones. For more
  ## context, please see: https://github.com/pravega/pravega/issues/4058 and https://github.com/pravega/pravega/issues/4057
  bookkeeper.ack.quorum.size: "3"
  ## We need to increase this timeout to tolerate environments in which Tier 1 storage gets heavily saturated and Bookkeeper exhibits high latency.
  ## Otherwise, an IO workload that saturates Bookkeeper can drive the Segment Store into an unstable state.
  bookkeeper.write.timeout.milliseconds: "60000"
  ## This parameter starts inducing the maximum delay on Tier 1 operations once the Segment Store accumulates 32MB of queued data waiting to be written to
  ## Bookkeeper. We have verified that on slow IO infrastructure such as vSAN, this setting helps protect the system from instability.
  bookkeeper.write.outstanding.bytes.max: "33554432"
  ## In Pravega 0.7+ we use an in-memory cache. We need to make sure that the cache size is always lower than the direct memory set in the JVM options.
  pravegaservice.cache.size.max: "3221225472"
  pravegaservice.cache.time.seconds.max: "600"
  hdfs.block.size: "67108864"
  ## Although the AWS support team has suggested setting 1MB block sizes for Tier 2 writes to improve performance when using NFS, we found that a larger block
  ## size significantly improves the throughput of a single segment against EFS.
  writer.flush.threshold.bytes: "67108864"
  writer.flush.size.bytes.max: "67108864"
  ## The number of containers depends on the number of Segment Stores in the cluster (or vice versa). To distribute the load of Segment Containers across Segment
  ## Stores, we recommend setting 4 to 8 Segment Containers per Segment Store. Ultimately, determining this number is a provisioning decision.
  ## Note that both containerCount values should always be the same for the Segment Store and the Controller service.
  pravegaservice.container.count: "16"
  controller.container.count: "16"
  ## Increase the default Controller bucket count to distribute load across multiple instances.
  controller.retention.bucket.count: "10"
  controller.service.asyncTaskPool.size: "20"
  controller.retention.thread.count: "4"
  log.level: "INFO"
  ## The following parameters are only useful if you are going to deploy metrics in this cluster.
  # metrics.dynamicCache.size: "100000"
  # metrics.statistics.enable: "true"
  # metrics.statsD.reporter.enable: "false"
  # metrics.statsD.connect.host: "telegraph.default"
  # metrics.statsD.connect.port: "8125"
  # metrics.influxDB.reporter.enable: "true"
  # metrics.output.frequency.seconds: "10"
  # metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
  # controller.metrics.dynamicCache.size: "100000"
  # controller.metrics.statistics.enable: "true"
  # controller.metrics.statsD.reporter.enable: "false"
  # controller.metrics.statsD.connect.host: "telegraph.default"
  # controller.metrics.statsD.connect.port: "8125"
  # controller.metrics.influxDB.reporter.enable: "true"
  # controller.metrics.output.frequency.seconds: "10"
  # controller.metrics.influxDB.connect.uri: "http://INFLUXDB-IP:8086"
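A brief worked check of this flavour (my arithmetic): -Xmx4g plus 4g of direct memory matches the 8Gi pod limit, the 3221225472-byte (3Gi) cache stays below the 4g direct memory, and 16 containers correspond to 2 Segment Stores with 8 containers each. If one were to scale this flavour up, the rules in the comments would hypothetically translate as follows (a sketch, not part of the PR):

# segmentStore:
#   replicas: 4                           # hypothetical: grow from 2 to 4 Segment Stores
# options:
#   pravegaservice.container.count: "32"  # 4 stores x 8 containers per store
#   controller.container.count: "32"      # keep equal to pravegaservice.container.count
#   ## with more Bookies, bookkeeper.ensemble.size could also be raised following
#   ## bookkeeper.ensemble.size = bookieReplicas - F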
@@ -0,0 +1,26 @@ (new file)
controller:
  replicas: 1
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi
  jvmOptions: []

segmentStore:
  replicas: 1
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi
  jvmOptions: []

options:
  bookkeeper.ack.quorum.size: "1"
  bookkeeper.write.quorum.size: "1"
  bookkeeper.ensemble.size: "1"
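For context (my note, not stated in the PR): with a single Bookie all three BookKeeper quorum values collapse to 1, which still respects the usual ordering between them but provides no data replication, so this flavour only makes sense for development or functional testing. A minimal sketch of that constraint using the values above:

options:
  ## BookKeeper expects ensemble size >= write quorum >= ack quorum; here 1 >= 1 >= 1,
  ## i.e. every write lands on a single Bookie and is acknowledged by that Bookie alone.
  bookkeeper.ensemble.size: "1"
  bookkeeper.write.quorum.size: "1"
  bookkeeper.ack.quorum.size: "1"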
Review comment: Please remove some of these explanations; I don't think they are needed.
Author reply: done