Pluto.jl support #176

Open
fonsp opened this issue Nov 20, 2024 · 1 comment

fonsp commented Nov 20, 2024

Hi! Thanks for this cool package!

Using MLDatasets.jl in Pluto does not always work 😵‍💫 which is a shame because it's really useful for our Julia-beginner target audience! Running iris = Iris() will leave the cell stuck running forever. This is because DataDeps.jl tries to ask the user for permission:

Do you want to download the dataset from ["https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"] to "/Users/fons/.julia/scratchspaces/124859b0-ceae-595e-8997-d05f6a7a8dfe/datadeps/Iris"?
[y/n]

But Pluto does not have a stdin terminal interface, so users cannot enter y/n. In Pluto, Base.isinteractive() returns false to tell packages about this (previous discussion with good foresight in #12 (comment) 🌟), but it looks like DataDeps instead uses the "CI" env variable to make this distinction.

Ideally, running MLDatasets.Iris() in a non-interactive session should throw an error saying that you should set DATADEPS_ALWAYS_ACCEPT, perhaps with an example code snippet.
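
For reference, a minimal sketch of the kind of snippet such an error message could point to. It assumes the variable is set in the same Julia session before the first dataset is requested; the MLDatasets lines are commented out so the snippet runs without the package installed.

```julia
# Tell DataDeps.jl to accept download prompts without asking.
# This must run before the first dataset is requested in the session.
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

# In a later cell (requires MLDatasets.jl to be installed):
# using MLDatasets
# iris = Iris()
```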

Ideas

Perhaps Base.isinteractive() can be used instead of env_bool("CI")?

Or a bit more low-level: function better_readline can throw when stream === stdin and isinteractive is false? Since this is guaranteed to block forever.
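
A rough sketch of what this second idea might look like. The name readline_or_throw is hypothetical and this is not DataDeps' actual code; it only illustrates the proposed guard. Note that it would also throw for any script run non-interactively, since isinteractive() is false there too.

```julia
# Hypothetical variant of better_readline: refuse to wait on stdin
# when the session reports that it is not interactive.
function readline_or_throw(stream::IO = stdin)
    if stream === stdin && !isinteractive()
        error("Cannot prompt for input in a non-interactive session. " *
              "Set ENV[\"DATADEPS_ALWAYS_ACCEPT\"] = \"true\" to accept downloads without a prompt.")
    end
    return readline(stream)
end

# Streams other than stdin are read as usual.
answer = readline_or_throw(IOBuffer("y\n"))
```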

Let me know what you think, thank you!

PS I knowwww that Pluto should "just" support terminal input, but it's really really complicated! And by not supporting it, we have a nice side effect that people will author notebooks that never require user input to re-run on another computer.

oxinabox (Owner) commented

We can't use Base.isinteractive (as discussed in #12) because it returns false even when you are just running a normal script from the command line with julia myscript.jl, where you do have stdin.

In general, better_readline has gone through many iterations of how to detect that no input is possible because we are on CI.

DataDeps.jl/src/util.jl

Lines 54 to 61 in 8fdc7ce

function better_readline(stream = stdin)
    if !isopen(stream)
        Base.reseteof(stream)
        isopen(stream) || throw(Base.IOError("Could not open stream.", -1))
    end
    return readline(stream)
end

This is because different Julia versions have behaved differently here:
the stream being closed; the stream being closed and opening it being a no-op;
the stream being open but reading from it always immediately returning an empty string.
Which is why we ended up giving up and just checking ENV["CI"], because it kept changing.
Possibly we should just check if stdin is closed, and if so throw an error (which could then result in a useful instruction like you say).
Can Pluto make sure that stdin is closed?
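
A minimal sketch of that check, under the assumption that the environment really does report stdin as closed (which is the open question here). The name readline_if_open is hypothetical, and unlike better_readline above it does not try Base.reseteof first.

```julia
# Hypothetical variant: throw immediately if the stream is closed,
# instead of blocking or returning an empty string.
function readline_if_open(stream::IO = stdin)
    if !isopen(stream)
        throw(Base.IOError(
            "Input stream is closed, cannot prompt. " *
            "Set ENV[\"DATADEPS_ALWAYS_ACCEPT\"] = \"true\" to accept downloads without a prompt.",
            -1))
    end
    return readline(stream)
end

# An open stream is read normally.
io = IOBuffer("y\n")
answer = readline_if_open(io)

# A closed stream raises an error instead of hanging.
close(io)
threw = try
    readline_if_open(io)
    false
catch err
    err isa Base.IOError
end
```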

_Pluto should "just" support terminal input, but it's really really complicated!_

This has long been my position. I feel like we have talked about this before.
Jupyter does this via popping open a little text input. Why can't Pluto?
