Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault when using cd in a subshell, when current directory cannot be determined #153

Closed
hvdijk opened this issue Jan 9, 2021 · 14 comments
Labels
bug Something is not working

Comments

@hvdijk
Copy link

hvdijk commented Jan 9, 2021

On a build of ksh with -fsanitize=undefined to help diagnose problems:

$ mkdir deleted
$ cd deleted
$ rmdir ../deleted
$ ksh -c '(cd /; (cd /)); :'
/home/harald/ksh/src/cmd/ksh93/sh/subshell.c:561:22: runtime error: null pointer passed as argument 1, which is declared to never be null
Segmentation fault (core dumped)

Note that it segfaults the same with default compilation flags, but it does not print out the useful extra message. The code assumes that pwd is non-null and passes it to strcmp without checking, but it will be null if the current directory cannot be determined, for instance because it has been deleted.

@McDutchie
Copy link

Thanks for the report. Could you check that this fixes it?

diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index e8d7853..2ba184c 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -558,7 +558,7 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 #if _lib_fchdir
 		for(xp=sp->prev; xp; xp=xp->prev) 
 		{
-			if(xp->pwdfd>0 && strcmp(xp->pwd,shp->pwd)==0)
+			if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,shp->pwd)==0)
 			{
 				sp->pwdfd = xp->pwdfd;
 				break;

@hvdijk
Copy link
Author

hvdijk commented Jan 9, 2021

That fixes the segmentation fault, but silently fails to restore the current directory: it leaves the current directory as /.

@McDutchie McDutchie added the bug Something is not working label Jan 10, 2021
@McDutchie
Copy link

Good point. The problem is inherent to non-forking subshells. Once you cd away, it's impossible to return to a nonexistent or inaccessible PWD in the same process. This is one of many reasons why non-forking subshells are actually a bad idea (see also att#480). Getting rid of them is not happening, though.

Below is a diff that attempts to mitigate the issue. cd can check if the PWD is accessible and fork a subshell if it's not. Of course that doesn't cover it if the PWD becomes inaccessible after cd; in that case, all we can do is throw an error and give up. See also issue #141 and commit dd9bc22. Could you test the diff?

Of course it would be safer to make cd always fork a subshell, unconditionally. That would cause the sort of performance hit that has caused people to protest in the past, though. I'm open to opinions.

diff --git a/src/cmd/ksh93/bltins/cd_pwd.c b/src/cmd/ksh93/bltins/cd_pwd.c
index 753aa19..baeb704 100644
--- a/src/cmd/ksh93/bltins/cd_pwd.c
+++ b/src/cmd/ksh93/bltins/cd_pwd.c
@@ -37,6 +37,7 @@
 #include	"name.h"
 #include	"builtins.h"
 #include	<ls.h>
+#include	<dirent.h>
 
 /*
  * Invalidate path name bindings to relative paths
@@ -58,6 +59,7 @@ int	b_cd(int argc, char *argv[],Shbltin_t *context)
 	int rval,flag=0;
 	char *oldpwd;
 	Namval_t *opwdnod, *pwdnod;
+	DIR *testdir;
 	if(sh_isoption(SH_RESTRICTED))
 		errormsg(SH_DICT,ERROR_exit(1),e_restricted+4);
 	while((rval = optget(argv,sh_optcd))) switch(rval)
@@ -91,7 +93,12 @@ int	b_cd(int argc, char *argv[],Shbltin_t *context)
 		dir = nv_getval(opwdnod);
 	if(!dir || *dir==0)
 		errormsg(SH_DICT,ERROR_exit(1),argc==2?e_subst+4:e_direct);
-#if !_lib_fchdir
+#if _lib_fchdir
+	if(testdir = opendir("."))
+		closedir(testdir);
+	else if(shp->subshell && !shp->subshare)
+		sh_subfork();	/* PWD inaccessible; fork to avoid the need to restore it on subshell exit */
+#else
 	/*
 	 * If sh_subshell() in subshell.c cannot use fchdir(2) to restore the PWD using a saved file descriptor,
 	 * we must fork any virtual subshell now to avoid the possibility of ending up in the wrong PWD on exit.
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index e8d7853..b2f87ee 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -558,6 +558,11 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 #if _lib_fchdir
 		for(xp=sp->prev; xp; xp=xp->prev) 
 		{
+			if(!xp->pwd)
+			{
+				fatalerror = 2;
+				break;
+			}
 			if(xp->pwdfd>0 && strcmp(xp->pwd,shp->pwd)==0)
 			{
 				sp->pwdfd = xp->pwdfd;

@hvdijk
Copy link
Author

hvdijk commented Jan 10, 2021

Good point. The problem is inherent to non-forking subshells. Once you cd away, it's impossible to return to a nonexistent or inaccessible PWD in the same process.

On POSIX systems (this excludes anything Linux-based), if you are able to open a directory with O_SEARCH, POSIX explicitly specifies that checks are not performed on use for the *at family of functions, and I think it would be common sense that a later fchdir should similarly also not check for existence or access again, but I do not know whether the systems that implement O_SEARCH share my idea of common sense.

In my specific case of a deleted directory, on Linux, fchdir would succeed even without anything like O_SEARCH, but ksh never tries. Only in the case an inaccessible directory does fchdir fail.

Below is a diff that attempts to mitigate the issue. cd can check if the PWD is accessible and fork a subshell if it's not.

This does not help in my original test. Since opening . succeeds, nothing changes. If I change the test to restrict access to the directory, then your patch does help.

Of course it would be safer to make cd always fork a subshell, unconditionally. That would cause the sort of performance hit that has caused people to protest in the past, though. I'm open to opinions.

A possible idea: always make cd fork a subshell, except on systems where O_SEARCH and fchdir are supported and where fchdir is guaranteed to succeed on fds opened with O_SEARCH, if such systems are available and sufficiently widespread. This means you have no performance hit on those systems, and you have slower but correct results on other systems.

@McDutchie
Copy link

McDutchie commented Jan 19, 2021

This does not help in my original test. Since opening . succeeds, nothing changes. If I change the test to restrict access to the directory, then your patch does help.

OK, another idea. Instead of opening ., let's open $PWD... that's an absolute path, so it should fail if it no longer exists or became inaccessible, even if we're still in it. Same diff as before, except change "." to nv_getval(pwdnod) in the opendir call. This works for me on macOS, Linux, and FreeBSD. Can you confirm?

A possible idea: always make cd fork a subshell, except on systems where O_SEARCH and fchdir are supported and where fchdir is guaranteed to succeed on fds opened with O_SEARCH, if such systems are available and sufficiently widespread.

I don't think O_SEARCH is widespread at all. Neither Linux nor macOS support it. On FreeBSD, it's an alias for O_EXEC.

@McDutchie
Copy link

Hmm. The fix for the segfault in subshell.c caused a few regression test failures in tests/subshell.sh as it was not expected to throw an error there. This fix is probably more correct:

diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index e8d7853..2ba184c 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -558,7 +558,7 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 #if _lib_fchdir
                for(xp=sp->prev; xp; xp=xp->prev)
                {
-                       if(xp->pwdfd>0 && strcmp(xp->pwd,shp->pwd)==0)
+                       if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,shp->pwd)==0)
                        {
                                sp->pwdfd = xp->pwdfd;
                                break;

@McDutchie
Copy link

McDutchie commented Jan 19, 2021

New diff. I can't get the version below to break on any OS, so I'm going to commit this and see what happens. Please do comment if you still find something amiss.

diff --git a/src/cmd/ksh93/bltins/cd_pwd.c b/src/cmd/ksh93/bltins/cd_pwd.c
index 753aa19..c483229 100644
--- a/src/cmd/ksh93/bltins/cd_pwd.c
+++ b/src/cmd/ksh93/bltins/cd_pwd.c
@@ -37,6 +37,7 @@
 #include	"name.h"
 #include	"builtins.h"
 #include	<ls.h>
+#include	<ast_dir.h>
 
 /*
  * Invalidate path name bindings to relative paths
@@ -91,14 +92,20 @@ int	b_cd(int argc, char *argv[],Shbltin_t *context)
 		dir = nv_getval(opwdnod);
 	if(!dir || *dir==0)
 		errormsg(SH_DICT,ERROR_exit(1),argc==2?e_subst+4:e_direct);
-#if !_lib_fchdir
 	/*
 	 * If sh_subshell() in subshell.c cannot use fchdir(2) to restore the PWD using a saved file descriptor,
 	 * we must fork any virtual subshell now to avoid the possibility of ending up in the wrong PWD on exit.
 	 */
 	if(shp->subshell && !shp->subshare)
-		sh_subfork();
-#endif /* !lib_fchdir */
+	{
+#if _lib_fchdir
+		DIR *testdir;
+		if(testdir = opendir(nv_getval(pwdnod)))
+			closedir(testdir);
+		else
+#endif
+			sh_subfork();
+	}
 	/*
 	 * Do $CDPATH processing, except if the path is absolute or the first component is '.' or '..'
 	 */
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index e8d7853..2ba184c 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -558,7 +558,7 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 #if _lib_fchdir
 		for(xp=sp->prev; xp; xp=xp->prev) 
 		{
-			if(xp->pwdfd>0 && strcmp(xp->pwd,shp->pwd)==0)
+			if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,shp->pwd)==0)
 			{
 				sp->pwdfd = xp->pwdfd;
 				break;
diff --git a/src/cmd/ksh93/tests/subshell.sh b/src/cmd/ksh93/tests/subshell.sh
index 7837a06..03fdb21 100755
--- a/src/cmd/ksh93/tests/subshell.sh
+++ b/src/cmd/ksh93/tests/subshell.sh
@@ -893,5 +893,17 @@ got=$( { "$SHELL" "$tmp/crash_rhbz1117404.ksh"; } 2>&1)
 ((!(e = $?))) || err_exit 'crash while handling subshell trap' \
 	"(got status $e$( ((e>128)) && print -n / && kill -l "$e"), $(printf %q "$got"))"
 
+# ======
+# Segmentation fault when using cd in a subshell, when current directory cannot be determined
+# https://github.com/ksh93/ksh/issues/153
+cd "$tmp"
+mkdir deleted
+cd deleted
+tmp=$tmp "$SHELL" -c 'cd /; rmdir "$tmp/deleted"'
+exp="PWD=$PWD"
+got=$( { "$SHELL" -c '(cd /; (cd /)); print -r -- "PWD=$PWD"'; } 2>&1 )
+((!(e = $?))) && [[ $got == "$exp" ]] || err_exit 'failed to restore nonexistent PWD on exiting a virtual subshell' \
+	"(got status $e$( ((e>128)) && print -n / && kill -l "$e"), $(printf %q "$got"))"
+
 # ======
 exit $((Errors<125?Errors:125))

@hvdijk
Copy link
Author

hvdijk commented Jan 19, 2021

OK, another idea. Instead of opening ., let's open $PWD... that's an absolute path, so it should fail if it no longer exists or became inaccessible, even if we're still in it. Same diff as before, except change "." to nv_getval(pwdnod) in the opendir call. This works for me on macOS, Linux, and FreeBSD. Can you confirm?

The most important thing is probably that it fixes the segfault, which it does.

After a directory has been removed though, it is valid to create a new directory with the same name, which can cause things to still break:

$ mkdir test
$ cd test
$ rmdir $PWD
$ mkdir $PWD
$ ksh -c "(cd /); pwd"
/

Ideally this would detect ahead of time that $PWD does not refer to the current directory, and fork, but if it doesn't, should it not print the "Failed to restore PWD upon exiting subshell" message you'd normally get if the directory couldn't be restored after leaving a subshell?

I don't think O_SEARCH is widespread at all. Neither Linux nor macOS support it. On FreeBSD, it's an alias for O_EXEC.

It's fine for O_EXEC and O_SEARCH to have the same values, as O_EXEC has unspecified results if applied to a directory, and O_SEARCH has unspecified results if applied to a non-directory. I don't have a FreeBSD system set up to experiment on, but I'll be happy to check when I have a bit more free time whether the approach I suggested would work. (Even if it would work, if it only works on FreeBSD, completely understand if you don't think that's enough to be worth the cost of implementing it.)

@McDutchie McDutchie reopened this Jan 20, 2021
@McDutchie
Copy link

After a directory has been removed though, it is valid to create a new directory with the same name, which can cause things to still break:

Urgh. Good point.

Ideally this would detect ahead of time that $PWD does not refer to the current directory, and fork

Thanks for the idea. Turns out there's already a test_inode function that can help with that -- it implements test foo -ef bar. Also note that e_dot is ".". The diff below (against current HEAD) should do it, I hope.

diff --git a/src/cmd/ksh93/bltins/cd_pwd.c b/src/cmd/ksh93/bltins/cd_pwd.c
index 10b8f71..801d7f6 100644
--- a/src/cmd/ksh93/bltins/cd_pwd.c
+++ b/src/cmd/ksh93/bltins/cd_pwd.c
@@ -37,7 +37,6 @@
 #include	"name.h"
 #include	"builtins.h"
 #include	<ls.h>
-#include	<ast_dir.h>
 
 /*
  * Invalidate path name bindings to relative paths
@@ -99,10 +98,7 @@ int	b_cd(int argc, char *argv[],Shbltin_t *context)
 	if(shp->subshell && !shp->subshare)
 	{
 #if _lib_fchdir
-		DIR *testdir;
-		if(testdir = opendir(nv_getval(pwdnod)))
-			closedir(testdir);
-		else
+		if(!test_inode(nv_getval(pwdnod),e_dot))
 #endif
 			sh_subfork();
 	}
diff --git a/src/cmd/ksh93/tests/subshell.sh b/src/cmd/ksh93/tests/subshell.sh
index 03fdb21..d287b80 100755
--- a/src/cmd/ksh93/tests/subshell.sh
+++ b/src/cmd/ksh93/tests/subshell.sh
@@ -904,6 +904,15 @@ exp="PWD=$PWD"
 got=$( { "$SHELL" -c '(cd /; (cd /)); print -r -- "PWD=$PWD"'; } 2>&1 )
 ((!(e = $?))) && [[ $got == "$exp" ]] || err_exit 'failed to restore nonexistent PWD on exiting a virtual subshell' \
 	"(got status $e$( ((e>128)) && print -n / && kill -l "$e"), $(printf %q "$got"))"
+cd "$tmp"
+mkdir recreated
+cd recreated
+tmp=$tmp "$SHELL" -c 'cd /; rmdir "$tmp/recreated"; mkdir "$tmp/recreated"'
+exp="PWD=$PWD"
+got=$( { "$SHELL" -c '(cd /); print -r -- "PWD=$PWD"'; } 2>&1 )
+((!(e = $?))) && [[ $got == "$exp" ]] || err_exit 'failed to restore re-created PWD on exiting a virtual subshell' \
+	"(got status $e$( ((e>128)) && print -n / && kill -l "$e"), $(printf %q "$got"))"
+cd "$tmp"
 
 # ======
 exit $((Errors<125?Errors:125))

but if it doesn't, should it not print the "Failed to restore PWD upon exiting subshell" message you'd normally get if the directory couldn't be restored after leaving a subshell?

Yes, it should, and I'm finding it difficult to figure out why it isn't doing that. Would appreciate another pair of eyes there as well, if you've got the time and inclination. This should be corrected if at all possible -- ksh supports custom builtins and nothing stops them from calling chdir(2)...

It's fine for O_EXEC and O_SEARCH to have the same values, as O_EXEC has unspecified results if applied to a directory, and O_SEARCH has unspecified results if applied to a non-directory. I don't have a FreeBSD system set up to experiment on, but I'll be happy to check when I have a bit more free time whether the approach I suggested would work. (Even if it would work, if it only works on FreeBSD, completely understand if you don't think that's enough to be worth the cost of implementing it.)

I'm always happy to accept a good patch. I also happen to run a FreeBSD 12.2 VPS. If you email me your public SSH key, I'll make you an account to experiment on.

@McDutchie McDutchie reopened this Jan 20, 2021
@hvdijk
Copy link
Author

hvdijk commented Jan 20, 2021

Thanks for the idea. Turns out there's already a test_inode function that can help with that -- it implements test foo -ef bar. Also note that e_dot is ".". The diff below (against current HEAD) should do it, I hope.

Nice, after pulling the latest changes, the only way I can get it to fail is by modifying the file system after the shell has started.

Note that the latest changes include an unrelated bug that I needed to fix before I could build: commit a75d0df contains ${base.c} where ${base}.c is meant.

Yes, it should, and I'm finding it difficult to figure out why it isn't doing that. Would appreciate another pair of eyes there as well, if you've got the time and inclination. This should be corrected if at all possible -- ksh supports custom builtins and nothing stops them from calling chdir(2)...

Will try to take another look over the weekend.

I'm always happy to accept a good patch. I also happen to run a FreeBSD 12.2 VPS. If you email me your public SSH key, I'll make you an account to experiment on.

Thanks, may just set up something on qemu or may take you up on that, will let you know.

@McDutchie
Copy link

Note that the latest changes include an unrelated bug that I needed to fix before I could build: commit a75d0df contains ${base.c} where ${base}.c is meant.

Well, that's embarrassing. I've fixed the typo. Also added a missing #include "test.h" to cd_pwd.c (see #158).

@hvdijk
Copy link
Author

hvdijk commented Jan 22, 2021

but if it doesn't, should it not print the "Failed to restore PWD upon exiting subshell" message you'd normally get if the directory couldn't be restored after leaving a subshell?

Yes, it should, and I'm finding it difficult to figure out why it isn't doing that. Would appreciate another pair of eyes there as well, if you've got the time and inclination. This should be corrected if at all possible -- ksh supports custom builtins and nothing stops them from calling chdir(2)...

sh/subshell.c contains

        if(sp->shpwd)   /* restore environment if saved */
        {
            ...
        }

This assumes that if the path of the previous directory could not be determined, there is no environment to restore, but that assumption did not hold in this case: the environment to restore simply did not include that path, and it did not need to include it as it had a fd that could and would be used instead.

This might no longer be a problem with the last changes, as not being able to determine the current working directory now forces a subshell to fork.

And unfortunately, a test on FreeBSD on qemu showed that O_SEARCH did not have the effect that I had hoped for: even after opening a directory with O_SEARCH, fchdir checks permissions again.

@McDutchie
Copy link

McDutchie commented Jan 28, 2021

Related OpenSUSE patch: https://build.opensuse.org/package/view_file/shells/ksh/ksh93-subshellpwd.dif (not tried yet not applicable)

@McDutchie
Copy link

McDutchie commented Apr 18, 2021

sh/subshell.c contains

        if(sp->shpwd)   /* restore environment if saved */
        {
            ...
        }

This assumes that if the path of the previous directory could not be determined, there is no environment to restore, but that assumption did not hold in this case: the environment to restore simply did not include that path, and it did not need to include it as it had a fd that could and would be used instead.

This might no longer be a problem with the last changes, as not being able to determine the current working directory now forces a subshell to fork.

It is still a problem. Whole aspects of the virtual subshell, such as the subshell function tree, are not freed/restored, with quite disastrous consequences. A separate bug is that cd .. doesn't update PWD or OLDPWD when run from a nonexistent directory.

Here is one way to trigger that:

$ mkdir test
$ cd test
$ ksh -c '(subfn() { BAD; }; cd ..; echo subPWD==$PWD); typeset -f subfn; echo mainPWD==$PWD'               
subPWD==/usr/local/src/ksh93/ksh/test
subfn() { BAD; };mainPWD==/usr/local/src/ksh93/ksh/test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working
Projects
None yet
Development

No branches or pull requests

2 participants