Support all CLI inputs in PHP interactive shell #118

Open
adamziel opened this issue Jan 19, 2023 · 7 comments

@adamziel
Collaborator

adamziel commented Jan 19, 2023

What is the problem?

Running the PHP CLI in interactive mode (-a) does not support certain keys:

  • Alphanumeric symbols all work
  • TTY sequences like backspace, ctrl+c, ctrl+z, ctrl+w all work
  • XTERM control sequences like ctrl+a, ctrl+e are printed as ^A or ^E
  • ANSI Escape Codes like the arrow keys are printed as, e.g., ^[[D

Presumably, it's because PHP.wasm does not have the same kind of access to the input and output streams as a native build does. Notably, PHP.wasm is compiled with libedit and ncurses and uses the xterm TERMINFO database.
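
The `^A` and `^[[D` output above is caret notation for the raw bytes those keys send. A minimal sketch of that mapping, assuming ctrl+a sends byte 0x01 and the left arrow sends ESC [ D (the function name is illustrative and not part of PHP.wasm):

```javascript
// Render raw input bytes the way a terminal echoes unhandled control
// characters: bytes below 0x20 become ^ followed by (byte + 0x40), and
// DEL (0x7f) becomes ^?.
function toCaretNotation(bytes) {
  let out = '';
  for (const byte of bytes) {
    if (byte < 0x20) {
      out += '^' + String.fromCharCode(byte + 0x40);
    } else if (byte === 0x7f) {
      out += '^?';
    } else {
      out += String.fromCharCode(byte);
    }
  }
  return out;
}

const CTRL_A = [0x01];
const LEFT_ARROW = [0x1b, 0x5b, 0x44]; // ESC [ D

console.log(toCaretNotation(CTRL_A));     // ^A
console.log(toCaretNotation(LEFT_ARROW)); // ^[[D
```

Seeing this notation on screen suggests the bytes were echoed back as-is rather than interpreted as editing commands.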

Investigation

The PHP runtime is connected to the host's stdin through Emscripten's NODERAWFS.

At the same time, the wrapping node.js process seems to buffer the stdin input and handle the key inputs on its own. Nothing gets sent to the PHP process until the enter key is pressed – I confirmed this by adding console.log() calls to the NODERAWFS.read function.

Once the relevant key codes reach PHP, they are somewhat processed. Here I pressed 123456789←←←←ab<ctrl+a>1_ and 12345ab1_ was printed:

php > echo <<<TTY
<<< > 123456789^[[D^[[D^[[D^[[Dab^A1_
<<< > TTY;
12345ab1_

The arrow keys clearly worked, but ab overwrote 67 instead of being inserted before it. Also, the ctrl+a key combination did not return to the beginning of the line.
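
One way to reason about that output is a toy model that replays the same keys in two modes. This is only a sketch of one possible reading, not libedit's or PHP's actual logic: the "overwrite" mode treats ESC [ D as cursor-left and lets printable characters replace what is under the cursor, while the "insert" mode assumes emacs-style editing where ctrl+a jumps to the start of the line.

```javascript
// Toy line model. With `overwrite` true, ESC [ D moves the cursor left and
// printable characters replace the character under the cursor; ctrl+a is
// ignored. With `overwrite` false, characters are inserted and ctrl+a
// (0x01) moves the cursor to the start of the line.
function replayKeys(input, { overwrite }) {
  const line = [];
  let cursor = 0;
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === '\x1b' && input[i + 1] === '[' && input[i + 2] === 'D') {
      cursor = Math.max(0, cursor - 1);
      i += 2; // skip the rest of the escape sequence
    } else if (ch === '\x01') {
      if (!overwrite) cursor = 0;
    } else {
      line.splice(cursor, overwrite ? 1 : 0, ch);
      cursor++;
    }
  }
  return line.join('');
}

const keys = '123456789' + '\x1b[D'.repeat(4) + 'ab' + '\x01' + '1_';
console.log(replayKeys(keys, { overwrite: true }));  // 12345ab1_
console.log(replayKeys(keys, { overwrite: false })); // 1_12345ab6789
```

The overwrite mode reproduces the observed `12345ab1_`, while the insert mode shows what a working line editor would be expected to produce for the same keys.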

Manually sending raw bytes

I thought maybe something was wrong with reading stdin, so I manually sent raw bytes through stdin:

const stdIn = new Uint8Array([
	// send echo "a":
	...new TextEncoder().encode('echo "a";'),
	// Press left arrow
	27, 91, 68,
	// Press "d", "e", "f"
	100, 101, 102
]);
let stdInIterator = 0;

// Provide PHP with the raw bytes through the `stdin` callback:
async function main() {
	const php = await startPHP(phpLoaderModule, 'NODE', {
		stdin: () => {
			const ord = stdIn[stdInIterator++];
			return ord === undefined ? null : ord;
		},
	});

[Screenshot: CleanShot 2023-01-17 at 12 59 57@2x]

It did correctly move the caret to the left and the input buffer seemed to contain echo "a"def;. Unfortunately, that's not what got rendered in the terminal.
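
For repeated experiments like this one, the byte arrays can be assembled from named keys. A small sketch under stated assumptions: `KEY`, `keySequence`, and `createStdinCallback` are hypothetical helpers, and the callback shape (one byte per call, `null` when exhausted) is copied from the `stdin` option in the snippet above.

```javascript
// Hypothetical helpers; names and structure are illustrative only.
const KEY = {
  LEFT: [27, 91, 68], // ESC [ D, as in the snippet above
  ENTER: [10],
};

// Flatten strings and byte arrays into one Uint8Array.
function keySequence(...parts) {
  const encoder = new TextEncoder();
  const bytes = [];
  for (const part of parts) {
    bytes.push(...(typeof part === 'string' ? encoder.encode(part) : part));
  }
  return Uint8Array.from(bytes);
}

// Build a callback that returns one byte per call, then null once the
// input is exhausted.
function createStdinCallback(bytes) {
  let position = 0;
  return () => (position < bytes.length ? bytes[position++] : null);
}

const stdin = createStdinCallback(keySequence('echo "a";', KEY.LEFT, 'def'));
```

The resulting `stdin` function could then be passed wherever the inline callback was used before.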

stdin.setRawMode(true)

Node.js can be told to stop buffering the input lines and just pass through any bytes it receives as follows:

process.stdin.setRawMode(true);
process.stdin.resume();

Weirdly, that stops printing the line buffer to stdout:

$fp = fopen('php://stderr', 'w'); fwrite($fp, "STDERR!\n"); fclose($fp);
$fp = fopen('php://stdout', 'w'); fwrite($fp, "STDOUT!\n"); fclose($fp);

[Screenshot: CleanShot 2023-01-17 at 13 30 48@2x]

I tried manually printing characters to process.stdout but didn't get anywhere – PHP CLI does a lot of work there to, e.g., replace the current line buffer with the last history entry when the up arrow is pressed.
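
When experimenting with raw mode, it helps to guarantee the terminal is restored afterwards. The helper below is a sketch, not part of PHP.wasm: it assumes `stdin` is a Node.js `tty.ReadStream`-like object exposing `isTTY`, `isRaw`, and `setRawMode`. Node's documentation notes that raw mode also disables the terminal's own echoing of input characters, which may be part of what is observed here; that connection is an assumption and was not verified in this thread.

```javascript
// Run `fn` with raw mode enabled on a TTY-like stream, then restore the
// previous mode. For non-TTY streams (pipes, tests) it simply calls `fn`.
function withRawMode(stdin, fn) {
  if (!stdin.isTTY || typeof stdin.setRawMode !== 'function') {
    return fn();
  }
  const wasRaw = Boolean(stdin.isRaw);
  stdin.setRawMode(true);
  try {
    return fn();
  } finally {
    // Restore whatever mode was active before, even if `fn` throws.
    stdin.setRawMode(wasRaw);
  }
}
```

This version only covers synchronous callbacks; an asynchronous `fn` would need the restore step moved to where its promise settles.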

@adamziel adamziel changed the title Support ANSI codes Support all CLI inputs in PHP interactive terminal Jan 19, 2023
@adamziel adamziel changed the title Support all CLI inputs in PHP interactive terminal Support all CLI inputs in PHP interactive shell Jan 19, 2023
@adamziel
Collaborator Author

Passing the following raw bytes to stdin:

	const stdIn = new Uint8Array([
		// Send: echo "abcde;
		...new TextEncoder().encode('echo "abcde;'),
		// Press the left arrow four times
		27, 91, 68,
		27, 91, 68,
		27, 91, 68,
		27, 91, 68,
		...new TextEncoder().encode('___";     '),
		10,
	]);

Outputs: ab___;

@adamziel
Collaborator Author

     try {
      bytesRead = fs.readSync(process.stdin.fd, buf, 0, BUFSIZE, -1);
     } catch (e) {
      if (e.toString().includes("EOF")) bytesRead = 0; else throw e;
     }

bytesRead is only populated after I press enter, which means readline doesn't get to work with the buffer directly.

It might in RAW mode, but then backspace and other ASCII control characters would get passed – and PHP doesn't seem to handle them correctly:

Parse error: syntax error, unexpected character 0x1B, expecting end of file in php shell code on line 1

There may be no easy fix here. I'm now thinking the problem is having node.js between terminal and php – it's probably not forwarding all the required syscalls and signals both ways. Maybe compiling PHP to WASI will solve the problem?
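
As a stopgap against the parse error above, escape sequences and stray control characters could be dropped before a line reaches PHP. The sketch below is an assumption about one possible workaround, not something tried in this thread. It discards the keys instead of applying them, so it does not make line editing work.

```javascript
// Matches CSI sequences (ESC [, parameter bytes, intermediate bytes, final
// byte), which covers the arrow keys, e.g. ESC [ D.
const CSI_SEQUENCE = /\x1b\[[0-?]*[ -\/]*[@-~]/g;
// Remaining C0 control characters and DEL, except tab, LF, and CR.
const CONTROL_CHARS = /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/g;

// Hypothetical helper: remove terminal input bytes that PHP would
// otherwise see as unexpected characters.
function stripTerminalInput(line) {
  return line.replace(CSI_SEQUENCE, '').replace(CONTROL_CHARS, '');
}

console.log(stripTerminalInput('echo "a";\x1b[Ddef')); // echo "a";def
```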

@adamziel
Collaborator Author

adamziel commented Jan 25, 2023

Actually – since it's fine to only send full lines to PHP, maybe it's okay to wrap that prompt in the Node.js readline module:

https://gist.github.com/dpyro/d94bb85d284cd91ed156db0404f76e7e

@adamziel
Collaborator Author

adamziel commented Jan 26, 2023

libedit seems to be calling el_set(e, EL_UNBUFFERED, 1) which does the following:

EL_UNBUFFERED , int
Sets or clears unbuffered mode. In this mode, el_gets() will return immediately after processing a single character.

This sounds like node.js stdin raw mode after all. The implementation does this:

if ((el->el_flags & (UNBUFFERED|EDIT_DISABLED)) == UNBUFFERED)
		tty_rawmode(el);
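
The condition above is true only when UNBUFFERED is set and EDIT_DISABLED is not. A quick sketch of the same bitmask test, with illustrative flag values (the real values live in libedit's headers and are not quoted in this thread):

```javascript
// Illustrative flag values, assumed for this example only.
const EDIT_DISABLED = 0x04;
const UNBUFFERED = 0x08;

// Mirrors: (el->el_flags & (UNBUFFERED|EDIT_DISABLED)) == UNBUFFERED
function shouldEnterRawMode(flags) {
  return (flags & (UNBUFFERED | EDIT_DISABLED)) === UNBUFFERED;
}

console.log(shouldEnterRawMode(UNBUFFERED));                 // true
console.log(shouldEnterRawMode(UNBUFFERED | EDIT_DISABLED)); // false
console.log(shouldEnterRawMode(0));                          // false
```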

@adamziel
Collaborator Author

adamziel commented Jan 26, 2023

Here are a few more excerpts from libedit's code:

libedit_private void
read_prepare(EditLine *el)
{
	if (el->el_flags & HANDLE_SIGNALS)
		sig_set(el);
	if (el->el_flags & NO_TTY)
		return;
	if ((el->el_flags & (UNBUFFERED|EDIT_DISABLED)) == UNBUFFERED)
		tty_rawmode(el);

	/* This is relatively cheap, and things go terribly wrong if
	   we have the wrong size. */
	el_resize(el);
	re_clear_display(el);	/* reset the display stuff */
	ch_reset(el);
	re_refresh(el);		/* print the prompt */

	if (el->el_flags & UNBUFFERED)
		terminal__flush(el);
}

libedit_private int
tty_rawmode(EditLine *el)
{

	if (el->el_tty.t_mode == ED_IO || el->el_tty.t_mode == QU_IO)
		return 0;

	if (el->el_flags & EDIT_DISABLED)
		return 0;

	if (tty_getty(el, &el->el_tty.t_ts) == -1) {
#ifdef DEBUG_TTY
		(void) fprintf(el->el_errfile, "%s: tty_getty: %s\n", __func__,
		    strerror(errno));
#endif /* DEBUG_TTY */
		return -1;
	}
	/*
         * We always keep up with the eight bit setting and the speed of the
         * tty. But we only believe changes that are made to cooked mode!
         */
	el->el_tty.t_eight = tty__geteightbit(&el->el_tty.t_ts);
	el->el_tty.t_speed = tty__getspeed(&el->el_tty.t_ts);

	if (tty__getspeed(&el->el_tty.t_ex) != el->el_tty.t_speed ||
	    tty__getspeed(&el->el_tty.t_ed) != el->el_tty.t_speed) {
		(void) cfsetispeed(&el->el_tty.t_ex, el->el_tty.t_speed);
		(void) cfsetospeed(&el->el_tty.t_ex, el->el_tty.t_speed);
		(void) cfsetispeed(&el->el_tty.t_ed, el->el_tty.t_speed);
		(void) cfsetospeed(&el->el_tty.t_ed, el->el_tty.t_speed);
	}
	if (tty__cooked_mode(&el->el_tty.t_ts)) {
		int i;

		for (i = MD_INP; i <= MD_LIN; i++)
			tty_update_flags(el, i);

		if (tty__gettabs(&el->el_tty.t_ex) == 0)
			el->el_tty.t_tabs = 0;
		else
			el->el_tty.t_tabs = EL_CAN_TAB ? 1 : 0;

		tty__getchar(&el->el_tty.t_ts, el->el_tty.t_c[TS_IO]);
		/*
		 * Check if the user made any changes.
		 * If he did, then propagate the changes to the
		 * edit and execute data structures.
		 */
		for (i = 0; i < C_NCC; i++)
			if (el->el_tty.t_c[TS_IO][i] !=
			    el->el_tty.t_c[EX_IO][i])
				break;

		if (i != C_NCC) {
			/*
			 * Propagate changes only to the unlibedit_private
			 * chars that have been modified just now.
			 */
			for (i = 0; i < C_NCC; i++)
				tty_update_char(el, ED_IO, i);

			tty_bind_char(el, 0);
			tty__setchar(&el->el_tty.t_ed, el->el_tty.t_c[ED_IO]);

			for (i = 0; i < C_NCC; i++)
				tty_update_char(el, EX_IO, i);

			tty__setchar(&el->el_tty.t_ex, el->el_tty.t_c[EX_IO]);
		}
	}
	if (tty_setty(el, TCSADRAIN, &el->el_tty.t_ed) == -1) {
#ifdef DEBUG_TTY
		(void) fprintf(el->el_errfile, "%s: tty_setty: %s\n", __func__,
		    strerror(errno));
#endif /* DEBUG_TTY */
		return -1;
	}
	el->el_tty.t_mode = ED_IO;
	return 0;
}

It does look like it needs to interface directly with the terminal, or at the very least, have a Node.js implementation of these termios functions for all the features to work predictably:

speed_t cfgetispeed(const struct termios *);
speed_t cfgetospeed(const struct termios *);
int     cfsetispeed(struct termios *, speed_t);
int     cfsetospeed(struct termios *, speed_t);
int     tcgetattr(int, struct termios *);
int     tcsetattr(int, int, const struct termios *);
int     tcdrain(int) __DARWIN_ALIAS_C(tcdrain);
int     tcflow(int, int);
int     tcflush(int, int);
int     tcsendbreak(int, int);

#if !defined(_POSIX_C_SOURCE) || defined(_DARWIN_C_SOURCE)
void    cfmakeraw(struct termios *);
int     cfsetspeed(struct termios *, speed_t);
#endif /* (!_POSIX_C_SOURCE || _DARWIN_C_SOURCE) */

It doesn't help that it defines POSIX signal handlers that Emscripten does not support at this time:

libedit_private int
tty_get_signal_character(EditLine *el, int sig)
{
#ifdef ECHOCTL
	tcflag_t *ed = tty__get_flag(&el->el_tty.t_ed, MD_INP);
	if ((*ed & ECHOCTL) == 0)
		return -1;
#endif
	switch (sig) {
#if defined(SIGINT) && defined(VINTR)
	case SIGINT:
		return el->el_tty.t_c[ED_IO][VINTR];
#endif
#if defined(SIGQUIT) && defined(VQUIT)
	case SIGQUIT:
		return el->el_tty.t_c[ED_IO][VQUIT];
#endif
#if defined(SIGINFO) && defined(VSTATUS)
	case SIGINFO:
		return el->el_tty.t_c[ED_IO][VSTATUS];
#endif
#if defined(SIGTSTP) && defined(VSUSP)
	case SIGTSTP:
		return el->el_tty.t_c[ED_IO][VSUSP];
#endif
	default:
		return -1;
	}
}

TL;DR: some groundwork would be required for full readline/libedit support, and wrapping in Node.js readline might be much easier.

On the other hand, the only problem with process.stdin.setRawMode(true) is that nothing gets rendered on the screen in response to inputs – my guess is that WebAssembly libedit doesn't think it's in the raw mode and does not print anything on the screen. Consider this function:

void
rl_callback_read_char(void)
{
	int count = 0, done = 0;
	const char *buf = el_gets(e, &count);
	char *wbuf;

	el_set(e, EL_UNBUFFERED, 1);
	if (buf == NULL || count-- <= 0)
		return;
	if (count == 0 && buf[0] == e->el_tty.t_c[TS_IO][C_EOF])
		done = 1;
	if (buf[count] == '\n' || buf[count] == '\r')
		done = 2;

	if (done && rl_linefunc != NULL) {
		el_set(e, EL_UNBUFFERED, 0);
		if (done == 2) {
			if ((wbuf = strdup(buf)) != NULL)
				wbuf[count] = '\0';
			RL_SETSTATE(RL_STATE_DONE);
		} else
			wbuf = NULL;
		(*(void (*)(const char *))rl_linefunc)(wbuf);
	}
	_rl_update_pos();
}

It sets the raw mode and then returns back to the regular mode! PHP.wasm currently doesn't have this kind of control over stdin and so it would need much better terminal support before readline can do its job.

@adamziel
Collaborator Author

adamziel commented Jan 26, 2023

One more thing, raw mode seems to be just a flag that changes how the input bytes are internally handled:

// This sets the raw mode
static tcflag_t
tty_update_flag(EditLine *el, tcflag_t f, int mode, int kind)
{
	f &= ~el->el_tty.t_t[mode][kind].t_clrmask;
	f |= el->el_tty.t_t[mode][kind].t_setmask;
	return f;
}
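
The clear-then-set pattern in tty_update_flag can be tried out in isolation. In this sketch the flag values match common Linux termios values, but that is an assumption (other platforms differ), and the masks chosen for "raw" are hypothetical rather than the ones libedit actually uses.

```javascript
// Illustrative termios local-mode flags (assumed values).
const ISIG = 0x01;
const ICANON = 0x02;
const ECHO = 0x08;

// Same two steps as tty_update_flag: clear the bits in clearMask, then set
// the bits in setMask.
function updateFlag(flags, { clearMask, setMask }) {
  flags &= ~clearMask;
  flags |= setMask;
  return flags;
}

const cooked = ISIG | ICANON | ECHO;
const raw = updateFlag(cooked, { clearMask: ICANON | ECHO, setMask: ISIG });

console.log(raw === ISIG);         // true: only ISIG is left
console.log((raw & ICANON) === 0); // true: line buffering bit cleared
```

Computing the new flag word is the easy part; it only takes effect once it is handed to tcsetattr, which is where the termios wrappers discussed next come in.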

However, there are other functions that call termios directly:

/* tty_getty():
 *	Wrapper for tcgetattr to handle EINTR
 */
static int
tty_getty(EditLine *el, struct termios *t)
{
	int rv;
	while ((rv = tcgetattr(el->el_infd, t)) == -1 && errno == EINTR)
		continue;
	return rv;
}

/* tty_setty():
 *	Wrapper for tcsetattr to handle EINTR
 */
static int
tty_setty(EditLine *el, int action, const struct termios *t)
{
	int rv;
	while ((rv = tcsetattr(el->el_infd, action, t)) == -1 && errno == EINTR)
		continue;
	return rv;
}

termios.c then deals with ioctl(), which is somewhat supported by Emscripten, but not to the point where it can actually change parameters of process.stdin. In particular, it would need to support args like TCIFLUSH, TCOFLUSH, TCIOFLUSH, and more – see tty_ioctl(4) and tcflush(3).

Node.js TTYWrap, however, doesn't support setting these options directly but only through wrappers. TTYWrap::setRawMode calls libuv's uv_tty_set_mode:

https://github.com/nodejs/node/blob/8ba54e50496a6a5c21d93133df60a9f7cb6c46ce/src/tty_wrap.cc#L117

It also uses termios and ioctl, but the nuances very likely differ from what libedit does:

https://github.com/libuv/libuv/blob/ee206367d4e8ebb70455665de78b8309220fd7d0/src/unix/tty.c#L281

The Node.js readline module seems to implement many of libedit's features using mostly JavaScript. It consumes input bytes, parses xterm escape codes, and even does the same raw mode switching as we've seen in libedit code earlier in this thread:

https://github.com/nodejs/node/blob/8ba54e50496a6a5c21d93133df60a9f7cb6c46ce/lib/internal/readline/interface.js#L1191-L1205

Therefore, I can only see three solutions:

  1. Find a way to handle WASM libedit TTY operations using only Node.js process.stdin and readline implementations. I'm almost sure this will fail due to some mismatched nuance or a missing syscall.
  2. Implement a native Node.js C module to pass all libedit syscalls through to their native counterparts. I don't like that – compiling native modules misses the point of having a WASM implementation in the first place. Besides, there's no easy way to port it to the browser.
  3. Implement a Node-native prompt using the readline module, pass the inputs to PHP, and don't rely on libedit at all. I'd like to move forward with this one.
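
The third option could look roughly like the sketch below. It is not the implementation that was eventually adopted: `startPrompt` and `runPhp` are hypothetical names, and `runPhp` stands in for whatever call hands one complete line to PHP.wasm and returns its output as a string.

```javascript
const readline = require('readline');

// Let Node.js readline own the terminal (arrow keys, ctrl+a, history) and
// forward only complete lines to PHP.
function startPrompt(runPhp, { input = process.stdin, output = process.stdout } = {}) {
  const rl = readline.createInterface({ input, output, prompt: 'php > ' });
  rl.on('line', (line) => {
    // `runPhp` is a hypothetical callback; wiring it to PHP.wasm is left out.
    const result = runPhp(line);
    if (result) output.write(result + '\n');
    rl.prompt();
  });
  rl.prompt();
  return rl;
}
```

Calling `startPrompt((line) => 'you typed: ' + line)` in a terminal gives a quick feel for the editing behaviour before any PHP is involved. Multi-line constructs such as heredocs would need extra handling that this sketch leaves out.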

This was referenced Jan 27, 2023
adamziel added a commit that referenced this issue Jan 30, 2023
### Description

Adds support for CLI SAPI and networking in node.js:

```
> npm run build

> node ./build-cli/php-cli.js -r 'echo "Hello from PHP !";'
Hello from PHP !

> node ./build-cli/php-cli.js -r 'echo substr(file_get_contents("https://wordpress.org"), 0, 16);'
<!DOCTYPE html>

> node ./build-cli/php-cli.js -r 'echo phpversion();'
8.2.0-dev

> PHP=5.6 node ./build-cli/php-cli.js -r 'echo phpversion();'
5.6.40
```

### Highlights:

* Networking is supported (including MySQL and HTTPS)
* Switching PHP versions is supported.
* [Most WordPress PHPUnit tests pass](#111). The failures are caused by missing extensions and a few misconfigured settings
* PHP Interactive mode is supported but [the arrow keys don't work](#118)
* `wp-cli` works

### In broad strokes:

* CLI SAPI is compiled with libedit (readline replacement) and ncurses. 
* Network calls are asynchronous. Emscripten's Asyncify enables calling asynchronous code from synchronous code. TCP sockets are shimmed with a WebSocket connection to a built-in proxy server running on localhost. It supports data transfer, arbitrary connection targets, and setting a few TCP socket options.
* PHP's OpenSSL uses the same CA certs as Node.js
* PHP 5.6 is patched to work with OpenSSL 1.1.0 and many other small patches are introduced. For more details, see [patches overview](#119 (comment)), Dockerfile, and `phpwasm-emscripten-library.js`

### Future work:

* PHP Interactive server isn't supported yet. Adding support is a matter of making the incoming connection polling non-blocking using Asyncify.
* Use a more recent OpenSSL version
* [Better support for CLI interactive mode](#118)
@mcsf

mcsf commented Jun 9, 2023

Hey @adamziel, per our conversation, who knows if this could be of use:

https://www.npmjs.com/package/blessed

Pookie717 added a commit to Pookie717/wordpress-playground that referenced this issue Oct 1, 2023
@adamziel adamziel added this to the PHP Feature Parity milestone Feb 29, 2024