Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use predicate function in lstrip etc. #27232

Closed
wants to merge 8 commits into from
Closed

Use predicate function in lstrip etc. #27232

wants to merge 8 commits into from

Conversation

simonbyrne
Copy link
Contributor

@simonbyrne simonbyrne commented May 23, 2018

Fixes #27211. Should be non-breaking.

@simonbyrne simonbyrne added the strings "Strings!" label May 23, 2018
end
SubString(s, e+1, e)
end
lstrip(s::AbstractString) = lstrip(isspace, f)
lstrip(s::AbstractString, chars::Chars) = lstrip(c -> c in chars, s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in(chars)

@fredrikekre
Copy link
Member

Perhaps the asymmetry between strip(this_decides_what_to_strip, str) (strip(f, str)) and strip(str, this_decides_what_to_strip) (strip(str, chars)) is a bit unfortunate? Can we deprecate the latter in favor of the former?

@simonbyrne
Copy link
Contributor Author

Yes: since chars can't be a string, we could safely switch arguments.

We could even get rid of it altogether? It isn't that hard to write lstrip(in(chars), str)

@ararslan ararslan added the needs news A NEWS entry is required for this change label May 23, 2018
@simonbyrne simonbyrne removed the needs news A NEWS entry is required for this change label May 23, 2018
@simonbyrne
Copy link
Contributor Author

I'll leave the decision on the chars argument to triage.

@simonbyrne simonbyrne added the triage This should be discussed on a triage call label May 23, 2018
end
SubString(s, 1, 0)
end
rstrip(s::AbstractString) = rstrip(isspace, f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A typo?
rstrip(s::AbstractString) = rstrip(isspace, f) should be rstrip(s::AbstractString) = rstrip(isspace, s)?

Copy link
Contributor

@KDr2 KDr2 May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems lstrip and strip also have this typo.

@JeffBezanson
Copy link
Member

Changing this to always use a function would be in line with the find changes we've done.

@StefanKarpinski
Copy link
Member

Triage is in favor so let's merge if this passes CI. Also in favor of stripping all Unicode whitespace.

@simonbyrne
Copy link
Contributor Author

Should we deprecate chars?

@simonbyrne
Copy link
Contributor Author

Ah, I assume that is what you were implying.

@JeffBezanson
Copy link
Member

Eh, deprecating it would be ok but doesn't seem urgent.

@simonbyrne
Copy link
Contributor Author

Note that split/rsplit uses split(string, predicate).

@ararslan
Copy link
Member

Looks like there's a still a use of the deprecated rstrip method. This should do it:

--- a/base/mpfr.jl
+++ b/base/mpfr.jl
@@ -932,7 +932,7 @@ function _prettify_bigfloat(s::String)::String
     if !occursin('.', mantissa)
         mantissa = string(mantissa, '.')
     end
-    mantissa = rstrip(mantissa, '0')
+    mantissa = rstrip(==('0'), mantissa)
     if endswith(mantissa, '.')
         mantissa = string(mantissa, '0')
     end

@ararslan
Copy link
Member

Another one:

--- a/stdlib/Markdown/src/GitHub/GitHub.jl
+++ b/stdlib/Markdown/src/GitHub/GitHub.jl
@@ -9,7 +9,7 @@ function fencedcode(stream::IO, block::MD)
         skip(stream, -1)
         ch = read(stream, Char)
         trailing = strip(readline(stream))
-        flavor = lstrip(trailing, ch)
+        flavor = lstrip(==(ch), trailing)
         n = 3 + length(trailing) - length(flavor)
 
         # inline code block

@ararslan
Copy link
Member

Looks like this broke some doctests, and will also need to be updated in Documenter.

@simonbyrne
Copy link
Contributor Author

Maybe this is a sign that we should leave it alone?

@ararslan
Copy link
Member

Nah. Just need to adjust the rstrip usage in a couple places in the docs.

@simonbyrne
Copy link
Contributor Author

Can you point to the broken docstrings? I can't see any.

simonbyrne added a commit to simonbyrne/Documenter.jl that referenced this pull request May 25, 2018
@simonbyrne
Copy link
Contributor Author

Looking at JuliaDocs/Documenter.jl#731, I'm not 100% convinced it's a strict improvement.

Additionally, considering that the behaviour will be different than split, I'm wondering if we should switch back.

@ararslan
Copy link
Member

Can you point to the broken docstrings? I can't see any.

They're showing up in the Travis log: https://travis-ci.org/JuliaLang/julia/jobs/383852269#L2958.

Additionally, considering that the behaviour will be different than split, I'm wondering if we should switch back.

Does there need to be a 1-1 correspondence with split? They're related functions but not that related.

@nalimilan
Copy link
Member

I agree it sounds a bit weird to require using == or isequal here. These are string-specific functions, so it makes sense to support more succinct syntaxes for stripping a character or substring. (Actually we could even allow this syntax with find* functions in 1.x, I didn't propose them because the priority was to get the generic API right.)

@JeffBezanson JeffBezanson added the deprecation This change introduces or involves a deprecation label May 28, 2018
@StefanKarpinski
Copy link
Member

I have to agree that there doesn't seem to be much harm in leaving the old strip signatures. Having the character, character set or regex to strip as an optional second argument makes sense; having a predicate as a first argument also makes sense. As long as the argument to be stripped must be an AbstractString (which seems sensible), having a little variation in argument order doesn't seem like the worst thing.

@simonbyrne
Copy link
Contributor Author

Okay, should we switch the argument order then, so that it is strip(chars, string)?

While characters and character sets are fine, I'm not sure how regexes would work, since it operates on a character basis.

@nalimilan
Copy link
Member

The current strip(string, chars) sounds fine. I'm not sure why, but I find strip(chars, string) unnatural; probably because strip transforms string and chars is only a secondary argument (which is indeed optional).

@StefanKarpinski
Copy link
Member

I also don't think we should switch the order. Is there much to do here then?

@ararslan
Copy link
Member

I've also come around to thinking that this change may not actually be worthwhile... lstrip(==('x'), "xxf") doesn't really provide that much clarity over lstrip("xxf", 'x'), for example, and I think simply adding the characters for which isspace is true to _default_delims should just be fine. I'm not against the change, but I'm no longer as supportive of it as I originally was. 🙂

@simonbyrne
Copy link
Contributor Author

I think we should still allow predicate functions, since they can be much more efficient (e.g. checking ranges, short-circuiting if there are no non-ASCII chars).

I've opened a simpler PR: #27309

@simonbyrne
Copy link
Contributor Author

Closed in favour of #27309 (for now...)

@simonbyrne simonbyrne closed this Jun 15, 2018
@simonbyrne simonbyrne deleted the sb/strip branch June 15, 2018 22:32
@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Jun 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecation This change introduces or involves a deprecation strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants