Problems with "Smart remove" in "Run in background on remote Host. EXPERIMENTAL!" when to much snapshots to delete #1359

Open
buhtz opened this issue Nov 8, 2022 · 5 comments
Assignees
Labels
Bug Discussion decision or consensus needed External depends on others/upstream

Comments

@buhtz
Member

buhtz commented Nov 8, 2022

Thanks to @aryoda comment in PR #1351 I found that issue.

BIT has an "EXPERIMENTAL" feature where the "smart remove" is done in background via SSH.

Usually it isn't done in the background. And based on my observations, SSH isn't used for it even when an SSH snapshot profile is used. The remote is mounted and then a simple rm (more precisely, shutil.rmtree()) is done. No SSH is involved.
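A minimal sketch of that default path, assuming the remote is already mounted at a local directory. The function name `remove_snapshot` is hypothetical and not taken from the BIT code base; only `shutil.rmtree()` is mentioned in this issue.

```python
import shutil
from pathlib import Path


def remove_snapshot(snapshot_dir: Path) -> bool:
    """Delete one snapshot directory below a locally mounted remote.

    Returns False if the directory does not exist (nothing to do).
    Hypothetical helper for illustration only.
    """
    if not snapshot_dir.is_dir():
        return False
    # Plain recursive delete on the mounted file system; no SSH command is built.
    shutil.rmtree(snapshot_dir)
    return True
```

Because everything goes through the mount, no command line is assembled here, so the length problem described below cannot occur on this path.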

The problem is that the feature really does seem to be experimental. In my test scenario it creates a command more than 10,000 characters long, which seems to be too long for SSH. Here is just a snipped example.

['ssh', '-o', 'ServerAliveInterval=240', '-o', 'LogLevel=Error', '-o', 'IdentityFile=/home/user/.ssh/id_rsa', '-p', '22', 'user@localhost', 'screen -d -m bash -c "(TMP=\\$(mktemp -d); test -z \\"\\$TMP\\" && exit 1; test -n \\"\\$(ls \\$TMP)\\" && exit 1; logger -t \\"backintime smart-remove [$BASHPID]\\" \\"start\\"; flock -x 9; logger -t \\"backintime smart-remove [$BASHPID]\\" \\"got exclusive flock\\"; test -e \\"/home/user/dest blank/backintime/localhost/user/1/20221106-214507-663\\" && (logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221106-214507-663 still exist\\"; sleep 1; rsync -a --delete -s \\"\\$TMP/\\" /home/user/dest blank/backintime/localhost/user/1/20221106-214507-663; rmdir \\"/home/user/dest blank/backintime/localhost/user/1/20221106-214507-663\\"; logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221106-214507-663 remove done\\"); test -e \\"/home/user/dest blank/backintime/localhost/user/1/20221105-214500-483\\" && (logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221105-214500-483 still exist\\"; sleep 1; rsync -a --delete -s \\"\\$TMP/\\" /home/user/dest blank/backintime/localhost/user/1/20221105-214500-483; rmdir \\"/home/user/dest blank/backintime/localhost/user/1/20221105-214500-483\\"; logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221105-214500-483 remove done\\"); test -e \\"/home/user/dest blank/backintime/localhost/user/1/20221104-221056-108\\" && (logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221104-221056-108 still exist\\"; sleep 1; rsync -a --delete -s \\"\\$TMP/\\"

SNIPPED

&& (logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221019-230601-797 still exist\\"; sleep 1; rsync -a --delete -s \\"\\$TMP/\\" /home/user/dest blank/backintime/localhost/user/1/20221019-230601-797; rmdir \\"/home/user/dest blank/backintime/localhost/user/1/20221019-230601-797\\"; logger -t \\"backintime smart-remove [$BASHPID]\\" \\"snapshot 20221019-230601-797 remove done\\"); rmdir \\$TMP) 9>\\"/home/user/dest blank/backintime/localhost/user/1/smartremove.lck\\""']
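To illustrate why the command grows so quickly, here is a simplified sketch that mirrors the per-snapshot fragment visible in the output above (test, logger, rsync, rmdir, logger). The function names are hypothetical, the nested escaping is left out, and the sketch quotes every path, so the real command is longer than what this computes.

```python
from typing import Iterable


def snapshot_fragment(path: str, sid: str) -> str:
    """Build the shell fragment for one snapshot (simplified, no nested escaping)."""
    tag = "backintime smart-remove [$BASHPID]"
    target = f"{path}/{sid}"
    return (
        f'test -e "{target}" && ('
        f'logger -t "{tag}" "snapshot {sid} still exist"; sleep 1; '
        f'rsync -a --delete -s "$TMP/" "{target}"; '
        f'rmdir "{target}"; '
        f'logger -t "{tag}" "snapshot {sid} remove done"); '
    )


def command_length(path: str, sids: Iterable[str]) -> int:
    """Total length of all per-snapshot fragments, without head and tail."""
    return sum(len(snapshot_fragment(path, sid)) for sid in sids)
```

Every snapshot adds one fragment of several hundred characters, so the total length grows linearly with the number of snapshots that smart remove wants to delete in one run.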

The problem is obvious, and several solutions are possible. But once this problem is solved, other problems with the experimental feature may come up. I suspect that this will take much more time and testing resources.

My first suggestion was to "deactivate" that feature for the 1.3.3 release. And after that release we have more time to dive into it.

Maybe @Germar can give us more background info about that feature.

One complication: the feature was introduced with 1.1.6 in 2015 (see the CHANGELOG entry "add option to run Smart Remove in background on remote host (https://launchpad.net/bugs/1457210)" and #257). So it is 7 years old and could be more stable than "experimental" suggests.

@buhtz
Member Author

buhtz commented Nov 9, 2022

Thanks for your research. After sleeping on it, I wouldn't categorize that code as "experimental" in the sense of "new and unstable". It has aged like a good wine, and several issues with it have already been fixed. So deactivating it for the next release would be a bad choice.

I assume I triggered a rare case with my test scripts. Via faketimelib and other helper scripts I can generate snapshots for the last N days. Activating smart remove afterwards will of course cause a lot of snapshots to be deleted at once.

I will investigate that further but treat the "too many snapshots at once" problem in a separate PR.

EDIT: The question is whether we should touch that before the next release or rather after it. Apart from my special testing use case, it seems that no one else is having problems with that feature. The last issue about it was in 2017.

@buhtz changed the title on Nov 9, 2022, from 'Decide about experimental "Smart remove" feature "Run in background on remote Host"' to 'Problems with "Smart remove" in "Run in background on remote Host" when to much snapshots to delete'
@buhtz self-assigned this on Nov 9, 2022
@buhtz added Discussion decision or consensus needed Bug External depends on others/upstream labels on Nov 9, 2022
@Germar
Member

Germar commented Nov 10, 2022

This feature is working fine as far as I can tell. But I called it experimental because it needs to control commands running on the remote without having full control over the remote (if that makes any sense 🤔)

The command is split into parts that fit into valid ssh commands (BiT tests the max possible length for ssh commands for this)
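A rough sketch of what such splitting could look like, not the actual BiT implementation. The function `split_commands` and its parameters are hypothetical: `head` and `tail` stand for the fixed wrapper around the per-snapshot fragments, and `max_len <= 0` is treated as "no limit" (an assumption for this sketch).

```python
from typing import Iterable, Iterator, List


def split_commands(head: str, fragments: Iterable[str], tail: str,
                   max_len: int) -> Iterator[str]:
    """Yield commands of the form head + fragments + tail.

    Each yielded command is at most max_len characters long.
    max_len <= 0 disables the limit (assumption for this sketch).
    """
    base = len(head) + len(tail)
    current: List[str] = []
    size = base
    for frag in fragments:
        # A fragment that cannot fit even on its own is an error.
        if max_len > 0 and base + len(frag) > max_len:
            raise ValueError("a single fragment exceeds max_len")
        # Flush the current command before it would grow past the limit.
        if max_len > 0 and current and size + len(frag) > max_len:
            yield head + "".join(current) + tail
            current, size = [], base
        current.append(frag)
        size += len(frag)
    if current:
        yield head + "".join(current) + tail
```

For example, `list(split_commands("H(", ["aa;", "bb;", "cc;"], ")", 9))` packs the first two fragments into one command and the third into a second one, while a limit of 0 produces a single command containing everything.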

@buhtz
Member Author

buhtz commented Nov 11, 2022

> The command is split into parts that fit into valid ssh commands (BiT tests the max possible length for ssh commands for this)

Maybe there is potential to optimize that maxlength feature. I'll investigate that further and first try to reproduce my problem again.

@Germar I haven't tested it yet, but have you ever heard of getconf ARG_MAX? If so, what was the reason you didn't use it instead of creating sshMaxArg.py? And what does the name mid stand for? I understand what that variable does in `sshMaxArg.py` but don't understand its name. 😄
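For reference, a small sketch of how the value behind `getconf ARG_MAX` can be read from Python on the local machine, and how the equivalent remote query could be assembled. The function names and the `user@localhost` target are hypothetical; the second function only builds the argument list and does not run anything.

```python
import os
from typing import List


def local_arg_max() -> int:
    """Return the local ARG_MAX value (same value `getconf ARG_MAX` prints).

    os.sysconf is available on Unix-like systems only.
    """
    return os.sysconf("SC_ARG_MAX")


def remote_arg_max_command(user_host: str) -> List[str]:
    """Build (but do not run) an ssh call asking the remote for its ARG_MAX."""
    return ["ssh", user_host, "getconf", "ARG_MAX"]
```

Note that this only reports what each system announces for itself; it says nothing yet about which command length actually survives the whole ssh round trip.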

@Germar
Member

Germar commented Nov 14, 2022

I'm not sure if I knew getconf ARG_MAX back in the day. But I don't think it would be helpful, because the limit is about the combination of the local ssh USER@HOST COMMAND length and the remote COMMAND length.
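Since the effective limit depends on both ends, one way to find it is to probe empirically. The following is a generic bisection sketch under stated assumptions, not a description of what sshMaxArg.py actually does: `accepts` is a hypothetical callable that reports whether a command of a given length made it through the full ssh round trip, and acceptance is assumed to be monotonic (if length n works, every shorter length works too).

```python
from typing import Callable


def max_accepted_length(accepts: Callable[[int], bool],
                        low: int, high: int) -> int:
    """Return the largest n in [low, high] for which accepts(n) is True.

    Returns 0 if no length in the range is accepted.
    Assumes accepts() is monotonic: True up to some limit, False beyond it.
    """
    best = 0
    while low <= high:
        middle = (low + high) // 2
        if accepts(middle):
            best = middle        # middle works, try longer commands
            low = middle + 1
        else:
            high = middle - 1    # middle fails, try shorter commands
    return best
```

With a real probe, each call to `accepts` would cost one ssh connection, so the logarithmic number of probes is the main reason to bisect instead of counting up step by step.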

Maybe you ran Smart-remove manually without using the Settings dialog? That way you bypassed the check and config.sshMaxArgLength was 0 -> no limit

@buhtz added this to the 1.3.5 or 1.4.0 milestone on Mar 7, 2023
@buhtz changed the title on Oct 5, 2024, from 'Problems with "Smart remove" in "Run in background on remote Host" when to much snapshots to delete' to 'Problems with "Smart remove" in "Run in background on remote Host. EXPERIMENTAL!" when to much snapshots to delete'
@buhtz added a commit to buhtz/backintime that referenced this issue on Oct 11, 2024