Fix for Premature iSCSI logout #39202. #41196
Hi @cristianpop. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.
/approve
@k8s-bot ok to test
@cristianpop have you signed the CLA yet?
pkg/volume/iscsi/iscsi_util.go
Outdated
	return path.Join(host.GetPluginDir(iscsiPluginName), portal+"-"+iqn+"-lun-"+lun)
// make a directory like /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface_name/portal-some_iqn-lun-lun_id
func makePDNameInternal(host volume.VolumeHost, portal string, iqn string, lun string, iface string) string {
	return path.Join(host.GetPluginDir(iscsiPluginName), iface, portal+"-"+iqn+"-lun-"+lun)
this could be a problem for kubelet upgrade
I thought it might be problematic, but I needed some way to obtain the iSCSI iface in DetachDisk so that it logs out only on the given iface when there are multiple connections to the same target using different ifaces. Using multiple connections on different ifaces might not be common, but that's the setup I have. The servers are booted using iSCSI, and if the persistent volumes end up on the same target as the boot image, the disk detaches would end up closing the iSCSI connection that imported the boot image as well.
I understand you need to make the code more robust, but it should work out of the box when updated from older Kubernetes release without killing pods. So you should "adopt" iSCSI LUNs mounted to the old directory too and unmount+"logoff" them when the pods are deleted.
Looking at the code, it seems it could work, however I'd like someone to confirm this.
I think it would work but it would not log out of any interface, supposing the directory name is not a legit iSCSI interface.
I could use the plugin directory to determine whether the path is an old one or not and return the "default" interface or an empty string in extractIface and act accordingly, either log out on the default iface or do not log out at all to avoid logging out of sessions that may be used by other pods.
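The idea in the comment above can be sketched roughly as follows. This is only an illustration, not the code from the PR: `ifaceForMount` is a hypothetical helper, and the portal and IQN values are made up. It assumes an old-style mount path sits directly under the plugin directory, while a new-style path has one extra directory level named after the iface.

```go
package main

import (
	"fmt"
	"path"
)

// ifaceForMount is a hypothetical helper, not code from the PR.
// It returns the iface directory name for new-style mount paths, and
// legacy=true for old-style paths that sit directly under pluginDir.
func ifaceForMount(pluginDir, mntPath string) (iface string, legacy bool) {
	parent := path.Dir(path.Clean(mntPath))
	if parent == path.Clean(pluginDir) {
		// Old layout: <pluginDir>/<portal>-<iqn>-lun-<lun>
		return "", true
	}
	// New layout: <pluginDir>/<iface>/<portal>-<iqn>-lun-<lun>
	return path.Base(parent), false
}

func main() {
	pluginDir := "/var/lib/kubelet/plugins/kubernetes.io/iscsi"
	for _, p := range []string{
		pluginDir + "/10.0.0.1:3260-iqn.2017-01.com.example:storage-lun-0",
		pluginDir + "/eth0/10.0.0.1:3260-iqn.2017-01.com.example:storage-lun-0",
	} {
		iface, legacy := ifaceForMount(pluginDir, p)
		fmt.Printf("iface=%q legacy=%v\n", iface, legacy)
	}
}
```

The caller would then decide what a legacy path means: log out on the "default" iface, or skip the logout entirely so that sessions used by other pods are left alone.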
@@ -185,14 +192,20 @@ func (util *ISCSIUtil) DetachDisk(c iscsiDiskUnmounter, mntPath string) error {
	refCount, err := getDevicePrefixRefCount(c.mounter, prefix)
also see this #41041 (comment)
I thought about changing the HasPrefix call to Contains, but I wanted to account for the iface as well and log out only if no other pods use that specific iface. I know it's not foolproof, especially when there are multiple ifaces using the same transport to the same target, as the devices that get mounted are not necessarily the ones that were imported on the given iface, but I have not found a better solution so far.
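A minimal sketch of the per-iface reference count described above, under stated assumptions: `countRefs` is a hypothetical stand-in for the plugin's reference counting, the mount list is hard-coded instead of being read from the mount table, and the paths are invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// countRefs is a hypothetical stand-in for the plugin's reference count:
// it returns how many mount paths start with the given prefix.
func countRefs(mountPaths []string, prefix string) int {
	n := 0
	for _, m := range mountPaths {
		if strings.HasPrefix(m, prefix) {
			n++
		}
	}
	return n
}

func main() {
	base := "/var/lib/kubelet/plugins/kubernetes.io/iscsi"
	target := "10.0.0.1:3260-iqn.2017-01.com.example:storage"
	// Assumed mount table: two LUNs on eth0, one on eth1, same portal and target.
	mounts := []string{
		base + "/eth0/" + target + "-lun-0",
		base + "/eth0/" + target + "-lun-1",
		base + "/eth1/" + target + "-lun-0",
	}
	// Because the prefix includes the iface directory, each iface is counted separately.
	fmt.Println(countRefs(mounts, base+"/eth0/"+target+"-lun-")) // 2
	fmt.Println(countRefs(mounts, base+"/eth1/"+target+"-lun-")) // 1
}
```

With a count like this, a logout on eth1 would be considered only once its single mount is gone, regardless of what is still mounted through eth0.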
This seems like a solid solution to me, at least for the moment. In the long-term, I'm concerned about cases where sessions that exist independently of Kubernetes get logged off because a pod used a volume connected to the target using the same interface (I think this is the same concern @cristianpop cited just now, but I'm not totally sure). Solving that might be doable with something like resolving all of the symlinks in /dev/disk/by-path to their devices and then checking whether any of the devices that belong to the session are still mounted; I don't know if this would behave well with multipathing, though.
At any rate, this is a better fix than just changing HasPrefix to Contains.
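The longer-term idea mentioned above (resolve the /dev/disk/by-path symlinks to their devices, then check whether any device of the session is still mounted) might look roughly like the sketch below. Everything here is an assumption for illustration: `sessionDevices` and `anyMounted` are hypothetical helpers, the set of mounted devices is hard-coded rather than read from the mount table, and the sketch does not address the multipathing question raised in the comment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// sessionDevices resolves every symlink in byPathDir whose name starts with
// sessionPrefix to the device node it points at. Hypothetical helper.
func sessionDevices(byPathDir, sessionPrefix string) ([]string, error) {
	entries, err := os.ReadDir(byPathDir)
	if err != nil {
		return nil, err
	}
	var devices []string
	for _, e := range entries {
		if !strings.HasPrefix(e.Name(), sessionPrefix) {
			continue
		}
		dev, err := filepath.EvalSymlinks(filepath.Join(byPathDir, e.Name()))
		if err != nil {
			return nil, err
		}
		devices = append(devices, dev)
	}
	return devices, nil
}

// anyMounted reports whether at least one device is in the mounted set.
func anyMounted(devices []string, mounted map[string]bool) bool {
	for _, d := range devices {
		if mounted[d] {
			return true
		}
	}
	return false
}

func main() {
	// The prefix value is made up; it selects all LUNs behind one portal.
	devices, err := sessionDevices("/dev/disk/by-path", "ip-10.0.0.1:3260-iscsi-")
	if err != nil {
		fmt.Println("cannot inspect by-path directory:", err)
		return
	}
	// Assumed input; in practice this would come from the mount table.
	mounted := map[string]bool{"/dev/sdb": true}
	fmt.Println("session still in use:", anyMounted(devices, mounted))
}
```

A logout would be attempted only when `anyMounted` returns false for the devices of that session.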
I haven't signed the CLA yet, but I'm working on it (the company I'm working for has to sign it).
@k8s-bot gci gce e2e test this
@cristianpop do you know when you will be able to get the CLA signed? I know that it can be difficult with many companies. Are you willing to say that the contribution was created in whole by you and that you have the right to submit it under the Apache License Version 2.0? Do you also understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information you submitted with it) is maintained indefinitely and may be redistributed consistent with this project or the license involved?
@eparis I'm sorry for the delay. The management is handling it and I've been told it will be signed today or tomorrow at the latest. Yes, the contribution is mine, I have the right to submit it and I understand the contribution is public.
@cristianpop can you rebase it? @saad-ali PTAL
…ect the changes brought by the kubernetes#39202 fix.
@rootfs I've rebased it.
The CLA has been signed. Sorry for the delay.
@k8s-bot gci gce e2e test this
LGTM
pkg/volume/iscsi/iscsi_util.go
Outdated
	return device, prefix, nil
}

func extractIface(mntPath string) (string, error) {
	ind := strings.LastIndex(mntPath, "/")
use path.Dir()?
I wanted to keep it consistent with the existing code but I can change it.
If I use path.Base and path.Dir to extract the interface, I would lose the malformed mount path checks. I could keep the checks as they are and use path.Base and path.Dir but it would not make any sense to not use the indexes, or I could rewrite them to use path.Match or check for empty strings and ".", or scrap the checks. I would prefer to keep the checks, although they may end up not failing ever. What do you prefer?
Let's get the checks working correctly as the first priority.
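One way to use path.Dir and path.Base while keeping a malformed-path check, as discussed in this thread, is sketched below. `extractIfaceSketch` is a hypothetical name and this is not the version that was merged; it assumes that a valid mount path always has at least one real directory component above the final element.

```go
package main

import (
	"fmt"
	"path"
)

// extractIfaceSketch is a hypothetical variant of extractIface built on
// path.Dir and path.Base. It rejects paths that have no parent directory
// to take the iface name from.
func extractIfaceSketch(mntPath string) (string, error) {
	parent := path.Dir(path.Clean(mntPath))
	// path.Dir yields "." for a bare name (or an empty path) and "/" for a
	// path directly under the root; neither carries an iface component.
	if parent == "." || parent == "/" {
		return "", fmt.Errorf("iscsi detach disk: malformatted mnt path: %s", mntPath)
	}
	return path.Base(parent), nil
}

func main() {
	for _, p := range []string{
		"/var/lib/kubelet/plugins/kubernetes.io/iscsi/eth0/10.0.0.1:3260-iqn.2017-01.com.example:storage-lun-0",
		"lun-0",
	} {
		iface, err := extractIfaceSketch(p)
		fmt.Printf("iface=%q err=%v\n", iface, err)
	}
}
```

Checking the result of path.Dir for "." and "/" plays the role of the index checks from the LastIndex-based version, so the error path still exists even if it rarely triggers.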
pkg/volume/iscsi/iscsi_util.go
Outdated
	return "", fmt.Errorf("iscsi detach disk: malformatted mnt path: %s", mntPath)
}

iface := baseMntPath[(ind + 1):]
path.Dir()?
Here are my 2 cents on dealing with backward compatibility. In In
Sounds good to me. Better than searching for the volume plugin directory in the path.
…t upgrade. The detachDisk behavior is now preserved for pods that were created before the kubelet upgrade.
[APPROVALNOTIFIER] This PR is APPROVED The following people have approved this PR: CristianPop, rootfs Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
@k8s-bot test this
/lgtm
@k8s-bot kops aws e2e test this
Automatic merge from submit-queue
Would it be possible to get this cherry-picked to 1.5?
CC @mwielgus 1.5 release branch manager for cherry pick approval
@cristianpop: Once this is approved for cherry pick, execute
@saad-ali Anything that needs to be done to start the cherry pick approval process?
What this PR does / why we need it:
Modifies the iSCSI volume plugin code to prevent premature iSCSI logouts and the establishment of multiple iSCSI connections to the same target in certain cases.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #39202, fixes #41041, fixes #40941
Special notes for your reviewer:
The existing iSCSI connections are now rescanned on every AttachDisk call to discover newly created LUNs.
The disk mount points now contain an additional directory in the path corresponding to the disk iface that is later used for iSCSI logout.
The device prefixes used to count the existing references to the portal-target pair now contain the whole path, including the mount point, up to the LUN index.
Release note: