Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reader: Fixed race condition #4070

Closed

Conversation

alex-huff
Copy link

reader: Fixed race condition where reader would be terminated before it was properly initialized.

Currently, when reloading input very quickly (happens when updating the prompt quickly with 'reload' bound to 'change') it is possible for the reader's terminate method to be called before readFromCommand properly spawns the new reader process. This means the process is never killed and input is not reloaded until the process completes naturally or another 'EvtSearchNew' is triggered.

An example where this becomes a problem is when using fzf with rg integration and searching a large directory. Occasionally, the query being processed by rg is not the same as the query displayed by fzf and only after further modifying the query or waiting until rg finishes will the input reload to the correct state. This commit introduces a 'started' condition that the terminate method will wait for before trying to kill the process. The 'started' condition is signalled only when the reader reaches a state where it can be safely terminated.

was properly initialized.

Currently, when reloading input very quickly (happens when updating the
prompt quickly with 'reload' bound to 'change') it is possible for the
reader's terminate method to be called before readFromCommand properly
spawns the new reader process. This means the process is never killed
and input is not reloaded until the process completes naturally or
another 'EvtSearchNew' is triggered.

An example where this becomes a problem is when using fzf with rg
integration and searching a large directory. Occasionally, the query
being processed by rg is not the same as the query displayed by fzf and
only after further modifying the query or waiting until rg finishes will
the input reload to the correct state. This commit introduces a
'started' condition that the terminate method will wait for before
trying to kill the process. The 'started' condition is signalled only
when the reader reaches a state where it can be safely terminated.
@junegunn
Copy link
Owner

junegunn commented Oct 30, 2024

Is there any way I can easily reproduce the problem without having to rely on ripgrep and a large directory? Maybe using a contrived command with sleep for an artificial delay.

@alex-huff
Copy link
Author

alex-huff commented Oct 30, 2024

fzf --disabled --bind 'change:reload:echo Current query: {q} && sleep 1d'

When running this command, sometimes when you stop typing a query you may see that the item list does not update for the latest version of the query. You either have to wait 1 day or modify the query for the item list to update.

fzf-race-2-20fps.mp4

Notice at the end of the video the query is "i like to use fzf" but the item in the list of results says "i like to use fz".

@junegunn junegunn added the bug label Oct 31, 2024
@junegunn
Copy link
Owner

Thanks, I can reproduce the problem. I believe your patch will fix the problem, but the fundamental reason we have this race in the first place is that we "restart" the process asynchronously in a go routine.

fzf/src/core.go

Line 302 in 82ebcd9

go reader.restart(command, environ)

So I think a more direct way to fix it is to not do it asynchronously, i.e. remove go. I'll restructure the code so that we can call restart synchronously.

@junegunn junegunn closed this in dcb4c3d Oct 31, 2024
@junegunn
Copy link
Owner

Please see dcb4c3d and let me know if it doesn't fix the problem, or you find any flaw in the patch. Thanks.

@alex-huff
Copy link
Author

alex-huff commented Oct 31, 2024

Please see dcb4c3d and let me know if it doesn't fix the problem, or you find any flaw in the patch. Thanks.

Thank you! dcb4c3d does fix the problem I was encountering.

I did notice 3 minor issues when reading through reader.go/core.go.

  1. In restart, if startCommand returns nil, nil then fin is never called. This means the coordinator routine never receives a EvtReadFin event. So, if exec.Cmd.Start fails, no future reload commands will ever be run. I verified this in a contrived way by entering a query, moving my shell, attempting to modify the query (input did not reload as expected), moving my shell back, and then further modifying the query (still the input did not reload even with valid shell). exec.Cmd.Start is unlikely to fail but in the case that it does I think that fin should be called to properly shutdown the event poller and allow future input reloading.

  2. There is still a race condition with ReadSource when !streamingFilter.

    fzf/src/core.go

    Lines 170 to 176 in d938fdc

    if !streamingFilter {
    reader = NewReader(func(data []byte) bool {
    return chunkList.Push(data)
    }, eventBox, executor, opts.ReadZero, opts.Filter == nil)
    go reader.ReadSource(opts.Input, opts.WalkerRoot, opts.WalkerOpts, opts.WalkerSkip, initialReload, initialEnv)
    }

    This race condition is much less likely to cause problems then the original one, but still exists. It can be simulated by adding a time.Sleep(time.Second) to ReadSource and then entering a query when fzf launches. When ReadSource wakes up, the reader's termFunc will have already been set to nil. fzf is then in a state where the item list will not be updated until either the initial command completes or the query is modified.

In action:

fzf --disabled --bind "start:reload:echo start command; sleep 1d" --bind "change:reload:echo reloaded input with query {q}"
read-source-race-20fps.mp4
  1. When piping data to fzf and then trying to reload the input, the Read call in feed will not return until more data is sent to stdin. This means that when piping data to fzf from a long running program that only occasionally (or never) produces output, attempting to reload input will fail for potentially long periods of time.

Here is a contrived example:

(while true; do echo stdin data; sleep 10; done) | fzf --disabled --bind "change:reload:echo reloaded with: {q}"
fzf-stdin-reload-20fps.mp4

This is because the stdin file is in blocking mode. Calling util.SetNonblock(os.Stdin, true) before readFromStdin and then appropriately handling syscall.EAGAIN errors from Read seemed to fix the issue on my system.

@junegunn
Copy link
Owner

junegunn commented Nov 1, 2024

Thank you very much for the detailed analysis. I'll address the issues and let you know of the progress.

junegunn added a commit that referenced this pull request Nov 3, 2024
@junegunn
Copy link
Owner

junegunn commented Nov 3, 2024

I believe 19495eb addresses the first two issues you mentioned. It restored the previous code structure, but added a barrier using a channel to ensure that the process has started.

I'm not going to think too much about the third issue for now, as I'm afraid trying to fix it might cause other side effects (e.g. affecting the ingestion performance) and we can work around it using a start:reload binding.

fzf --disabled \
    --bind 'change:reload:echo reloaded with: {q}' \
    --bind 'start:reload:while true; do echo stdin data; sleep 10; done'

Thanks for your feedback!

@junegunn
Copy link
Owner

junegunn commented Nov 3, 2024

Hmm, the last commit has a bug that can freeze fzf when a process failed to start. Let me see..

fzf --bind change:reload:ls --with-shell xxx

EDIT: Fixed in acdf265

junegunn added a commit that referenced this pull request Nov 3, 2024
@alex-huff
Copy link
Author

Thank you!

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Nov 22, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [junegunn/fzf](https://github.com/junegunn/fzf) | minor | `v0.55.0` -> `v0.56.3` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>junegunn/fzf (junegunn/fzf)</summary>

### [`v0.56.3`](https://github.com/junegunn/fzf/releases/tag/v0.56.3): 0.56.3

[Compare Source](junegunn/fzf@v0.56.2...v0.56.3)

-   Bug fixes in zsh scripts
    -   fix(zsh): handle backtick trigger edge case ([#&#8203;4090](junegunn/fzf#4090))
    -   revert(zsh): remove 'fc -RI' call in the history widget ([#&#8203;4093](junegunn/fzf#4093))
    -   Thanks to [@&#8203;LangLangBart](https://github.com/LangLangBart) for the contributions

### [`v0.56.2`](https://github.com/junegunn/fzf/releases/tag/v0.56.2): 0.56.2

[Compare Source](junegunn/fzf@v0.56.1...v0.56.2)

-   Bug fixes
    -   Fixed abnormal scrolling behavior when `--wrap` is set ([#&#8203;4083](junegunn/fzf#4083))
    -   \[zsh] Fixed warning message when `ksh_arrays` is set ([#&#8203;4084](junegunn/fzf#4084))

### [`v0.56.1`](https://github.com/junegunn/fzf/releases/tag/v0.56.1): 0.56.1

[Compare Source](junegunn/fzf@v0.56.0...v0.56.1)

-   Bug fixes and improvements
    -   Fixed a race condition which would cause fzf to present stale results after `reload` ([#&#8203;4070](junegunn/fzf#4070))
    -   `page-up` and `page-down` actions now work correctly with multi-line items ([#&#8203;4069](junegunn/fzf#4069))
    -   `{n}` is allowed in `SCROLL` expression in `--preview-window` ([#&#8203;4079](junegunn/fzf#4079))
    -   \[zsh] Fixed regression in history loading with shared option ([#&#8203;4071](junegunn/fzf#4071))
    -   \[zsh] Better command extraction in zsh completion ([#&#8203;4082](junegunn/fzf#4082))
-   Thanks to [@&#8203;LangLangBart](https://github.com/LangLangBart), [@&#8203;jaydee-coder](https://github.com/jaydee-coder), [@&#8203;alex-huff](https://github.com/alex-huff), and [@&#8203;vejkse](https://github.com/vejkse) for the contributions

### [`v0.56.0`](https://github.com/junegunn/fzf/releases/tag/v0.56.0): 0.56.0

[Compare Source](junegunn/fzf@v0.55.0...v0.56.0)

-   Added `--gap[=N]` option to display empty lines between items.
    -   This can be useful to visually separate adjacent multi-line items.
        ```sh
        ```

### All bash functions, highlighted

      declare -f | perl -0777 -pe 's/^}\n/}\0/gm' |
        bat --plain --language bash --color always |
        fzf --read0 --ansi --reverse --multi --highlight-line --gap
      ```
      <img width="855" alt="image" src="https://github.com/user-attachments/assets/b3d2eaf2-bf44-4e3a-8d7b-9878629dd9be">
    - Or just to make the list easier to read. For single-line items, you probably want to set `--color gutter:-1` as well to hide the gutter.
      ```sh
      fzf --info inline-right --gap --color gutter:-1
      ```
      <img width="855" alt="image" src="https://github.com/user-attachments/assets/113757a1-ccfd-42a6-b946-83533f408e69">

-   Added `noinfo` option to `--preview-window` to hide the scroll indicator in the preview window
-   Bug fixes
    -   Thanks to [@&#8203;LangLangBart](https://github.com/LangLangBart), [@&#8203;akinomyoga](https://github.com/akinomyoga), and [@&#8203;charlievieth](https://github.com/charlievieth) for fixing the bugs

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants