-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 312 making ss cache volumes optional #342
Conversation
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
err = r.syncSegmentStoreSize(p) | ||
if err != nil { | ||
return err | ||
/*this condition is to stop syncSegmentstore version from running when we are updapting the ss from version below 07 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just say in the comment " We skip calling syncSegmentStoreSize() during upgrade/rollback from version 07"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
err = r.syncStatefulSetPvc(sts) | ||
if err != nil { | ||
return fmt.Errorf("failed to sync pvcs of stateful-set (%s): %v", sts.Name, err) | ||
// this check is to avoid calling syncStatefulSetPvc() over ss with version above 07 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
stsAbove07 := &appsv1.StatefulSet{} | ||
name := util.StatefulSetNameForSegmentstoreAbove07(p.Name) | ||
err := r.client.Get(context.TODO(), types.NamespacedName{Name: name, Namespace: p.Namespace}, stsAbove07) | ||
if err != nil && errors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if client.Get() returns error, but it is not a "NotFound" error? Please add a check similar to :
https://github.com/pravega/pravega-operator/blob/master/pkg/controller/pravegacluster/pravegacluster_controller.go#L92
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All errors must be logged with log level "Error". We're not logging anything here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have changed it
if (r.IsClusterRollbackingFrom07(p) && newsts.Status.ReadyReplicas == *newsts.Spec.Replicas) || (oldsts.Status.ReadyReplicas+newsts.Status.ReadyReplicas == p.Spec.Pravega.SegmentStoreReplicas && newsts.Status.ReadyReplicas == *newsts.Spec.Replicas) { | ||
//this check is run till the value of old sts replicas is greater than 0 and will increase two replicas of the new sts and delete 2 replicas of the old sts | ||
if *oldsts.Spec.Replicas > 2 { | ||
*newsts.Spec.Replicas = *newsts.Spec.Replicas + 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Encapsulate the changes to newSTS in a function and invoke that here.
Ditto for old STS, so its easier to follow what is happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} else { | ||
//here we remove the pvc's attached with the old sts and deleted it when old sts replicas have become 0 | ||
*newsts.Spec.Replicas = p.Spec.Pravega.SegmentStoreReplicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again encapsulate changes to old STS in a method and invoke the same....
One more thing, we need to increment the newSTS first and then do stuff for deleting the old sts, so that we err on side of caution.... and can have more redundancy than less....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/util/pravegacluster.go
Outdated
if ver == "" { | ||
return true | ||
} | ||
first3 := strings.Trim(ver, "\t \n")[0:3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we use CompareVersions here? Lets try to reuse existing methods wherever possible, Its is not good to use only way of doing things here and another way in the webhook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it
pkg/util/pravegacluster.go
Outdated
return false | ||
} | ||
|
||
//this function will return true only in case of upgrading from a version below 0.7 to a version above 0.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the comment, "to pravega version 0.7 or later"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address all review comments. I still see some earlier comments open and unaddressed.
} | ||
|
||
//TO detect upgade/rolback faiure | ||
if oldsts.Status.ReadyReplicas+newsts.Status.ReadyReplicas != p.Spec.Pravega.SegmentStoreReplicas { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!= check should be changed to < as discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have changed it
} | ||
|
||
//checking if sts below07 exsists | ||
stsBelow07 := &appsv1.StatefulSet{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we dealing with situations where one STS is deleted and other is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that I have dealt with as one the above07 sts is there and below07 sts is not there I am checking the version, so in case the below07sts is not there and the version is below 07, in that case, I am calling the syncSegmentStoreVersionTo07() which will remove the above07 pods and add the below07 sts
in case the below07sts is there and above07 sts is not there I am returning false and syncSegmentStoreVersion() will get called which will handle this situation
} | ||
|
||
//this function will increase two replicas of the new sts and delete 2 replicas of the old sts everytime it's called | ||
func (r *ReconcilePravegaCluster) incrementWhenReplicasMoreThan2(p *pravegav1alpha1.PravegaCluster, newsts *appsv1.StatefulSet, oldsts *appsv1.StatefulSet) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we call this method scaleSegmentStoreSTS()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it
} | ||
|
||
//This function will remove the pvc's attached with the old sts and deleted it when old sts replicas have become 0 | ||
func (r *ReconcilePravegaCluster) incrementWhenReplicasLessThan2(p *pravegav1alpha1.PravegaCluster, newsts *appsv1.StatefulSet, oldsts *appsv1.StatefulSet) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a better method name that is indicative of what the method does..... Perhaps transitionToNewSTS() ???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it
Signed-off-by: prabhaker24 <[email protected]>
…are versions Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
Signed-off-by: prabhaker24 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the earlier comments are not addressed still and I have provided new comments.
if err != nil { | ||
return fmt.Errorf("failed to sync pvcs of stateful-set (%s): %v", sts.Name, err) | ||
/*We skip calling syncStatefulSetPvc() during upgrade/rollback from version 07*/ | ||
if !util.IsClusterUpgradingTo07(p) && !r.isRollbackTriggered(p) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Why can't we use "IsClusterRollbackingFrom07()"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed it
if err != nil { | ||
return err | ||
/*We skip calling syncSegmentStoreSize() during upgrade/rollback from version 07*/ | ||
if !util.IsClusterUpgradingTo07(p) && !r.isRollbackTriggered(p) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be checking if rollback from 07 is triggered. Here we're checking if any rollback is triggered. Why can't we use "IsClusterRollbackingFrom07()"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed it
return true | ||
} | ||
|
||
//To handle upgrade/rollback from Pravega version < 0.7 to Pravega Version >= 0.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong comment on method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
if util.IsClusterUpgradingTo07(p) || r.IsClusterRollbackingFrom07(p) { | ||
return r.syncSegmentStoreVersionTo07(p) | ||
} | ||
//for all other cases of upgrades and rollback this function is called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment not removed yet.
controllerutil.SetControllerReference(p, newsts, r.scheme) | ||
err = r.client.Get(context.TODO(), types.NamespacedName{Name: newsts.Name, Namespace: p.Namespace}, newsts) | ||
//this check is to see if the newsts is present or not if it's not present it will be created here | ||
if err != nil && errors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don;t see the error being logged????
Signed-off-by: prabhaker24 <[email protected]>
return false, err | ||
} | ||
} | ||
if !errors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if is not needed, please remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: prabhaker24 <[email protected]>
…into issue-312-making-ss-cache-volume-optional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
//if version is above or equals to 0.7 this name will be assigned | ||
func StatefulSetNameForSegmentstoreAbove07(name string) string { | ||
return fmt.Sprintf("%s-pravega-segment-store", name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change just broke the upgrade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.7.0
pravega clusters historically were deployed with 0.4.x
p-operator and had the stateful sets named using the pravega-segmentstore
substring.
Signed-off-by: prabhaker24 [email protected]
Change log description
From Pravega 0.7, the Segment Store cache is completely in-memory. This means that the PVC that the operator creates is not necessary anymore. We need to make the deployment of this volume optional, depending on whether the version of Pravega is older or newer than 0.7, thus these changes will deploy as well as upgrade the version accordingly so as to not include pvc's incase we are deploying or upgrading to a Pravega version 0.7 or above and remove the old ones deployed by a version below 0.7 in case of an upgrade.
In Case of a rollback when an upgrade from a version above 0.7 to a version below 0.7 fails, it should remove the pods of the segment store above version 0.7 and should deploy all segment store pods of version below 0.7 along with their pvc's
Purpose of the change
Fixes #312
What the code does
These changes will deploy the segment store with the pvc's for a version below 0.7 and will deploy the segment store without pvc's for a version 0.7 or above, it will also take care of upgrades in case the upgrades are from a version below 0.7 to a version above or equal to 0.7 and in this case it will remove the pvc's created by the older version while deploying the segment store for the version above 0.7.
It will also handle the rollback in case of an upgrade failure from a version below 0.7 to version above 0.7 and will display appropriate upgrade failed message and when the user changes the version to the last version(below 0.7 from which he was upgrading) it will rollback it to this below 0.7 version and remove the old pods of the segment store having version above 0.7.
It will also display appropriate user message incase the above rollback also fails.
How to verify it