Custom infix operators #16985

Open
jamesonquinn opened this issue Jun 17, 2016 · 65 comments
Labels
parser (Language parsing and surface syntax), speculative (Whether the change will be implemented is speculative)

Comments

@jamesonquinn
Contributor

jamesonquinn commented Jun 17, 2016

There is a discussion at https://groups.google.com/forum/#!topic/julia-dev/FmvQ3Fj0hHs about creating a syntax for custom infix operators.

...

Edited to add note: @johnmyleswhite has pointed out that the comment thread below is an invitation to bikeshedding. Please refrain from new comments unless you have something truly new to add. There are several proposals below, marked by "hooray" emoticons (exploding cone). You can use those icons to skip discussion and just read the proposals, or to find the different proposals so you can vote "thumbs up" or "thumbs down".

Up/downvotes on this bug as a whole are about whether you think that Julia should have any custom infix idiom. Up/downvotes for the specific idea below should go on @Glen-O's first comment. (The bug had 3 downvotes and 1 upvote before that was clarified.)

...

Initial proposal (historical interest only):

The proposal that seems to have won out is:

    a |>op<| b #evaluates (in the short term) and parses (in the long term) to `op(a,b)`

In order to have this work, there are only minor changes necessary:

  • Put the precedence of <| above that of |>, instead of being the same.
  • Make <| group left-to-right.
  • Make the function <|(a,b...)=(i...)->a(i...,b...). (as pointed out in the discussion thread, this would have standalone uses, as well as its use in the above idiom)
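A minimal sketch of the idiom described above, in present-day Julia. The name `rcurry` is a hypothetical stand-in for the proposed `<|` (it is not a Base function), and `op` is an illustrative two-argument function; the sketch uses a named function and explicit parentheses so that it does not depend on any particular precedence for `<|`.

```julia
# Hypothetical stand-in for the proposed `<|`:
# fix the trailing arguments `b...` of `f`, returning a new function.
rcurry(f, b...) = (i...) -> f(i..., b...)

# Illustrative two-argument function (an assumption, not from the thread).
op(x, y) = x - y

# `|>` already exists in Base: `a |> g` evaluates to `g(a)`.
# So `a |> rcurry(op, b)` evaluates to `op(a, b)`.
result = 7 |> rcurry(op, 2)
```

With the precedence and grouping changes listed above, the parentheses and the named helper would be replaced by the bare `a |>op<| b` spelling.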

Optional:

  • create new functions >|(a...,b)=(i...)->b(a...,i...) and |<(a,b...)=a(b...) with appropriate precedences and grouping.
    • Pipe first means evaluation, and pipe last maintains it as a function, while the > and < indicate which one is the function.
  • create new functions >>|(a...,b)=(i...)->b(i...,a...) and <<|(a,b...)=(i...)->a(b...,i...) with appropriate precedence and grouping.
  • create synonyms », , and(/or) pipe for |>; «, , and(/or) rcurry for <|; and(/or) lcurry for <<|; with the single-character synonyms working as infix operators.
  • create an @infix macro in base which does the first parser fix below.

Long term:

  • teach the parser to change a |>op<| b to op(a,b), so there's no extra overhead involved when running the code, and so that operators can actually be defined in infix position. (This is similar to how the parser currently treats the binary a:b and the ternary a:b:c differently. For maximum customizability, it should do this for matched synonyms, but not for unmatched synonyms, so that e.g. a |> b « c would still be treated as two binary operators.)
  • teach the parser to understand commas and/or spaces so that the ellipses in the above definitions work as expected without extra parentheses.

(relates to #6946)

@yuyichao added the speculative label Jun 17, 2016
@johnmyleswhite
Member

Echoing the julia-dev thread, I think it would be useful to quote Stefan's main comment on this proposal:

Just to set expectations here, I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0. (The only exception I can think of is the new f.(v) vectorized calling syntax.) While having some way of making arbitrary functions behave as infix operators might be nice, it's just not a pressing issue in the language.

As someone who's participated in a good proportion of the history of Julia development, I think it would be better to focus energy on semantic changes rather than syntactic ones. There are lots of extremely important semantic problems left to solve before Julia reaches 1.0.

Note in particular that implementing this feature isn't simply a one-off diff that only the author needs to think about: everyone will have to think about how their work interacts with this feature going forward, so the change actually increases the long-term workload of every person who works on the parser.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 17, 2016

I think that johnmyleswhite's comments are very apropos regarding the "long term" parser changes suggested. But the "minor changes" and "optional" groups are, as far as I can see, pretty self-contained and low-impact.

That is: the parser changes needed to enable the minimal version of this proposal involve only precedence and grouping for normal binary operators, the kind of changes that are more-or-less routine in other cases. A parser developer working on something unrelated would no more need to keep track of this than they need to keep track of the meaning of all of the numerous already-existing operators.

@JeffBezanson
Member

Personally I find this syntax quite ugly and difficult to type. But I do agree it would be good to have more general infix syntax.

I think the right way to think about this is as a syntax-only issue: what you want is to use op with infix syntax, so defining other functions and operators to get that is roundabout. In other words it should all be done in the parser.

I would actually consider reclaiming | for this, and using a |op| b. Arguably general infix syntax is more important than bitwise or. (We've talked about reclaiming bitwise operators before; they do seem like a bit of a waste of syntax as it is.)

@StefanKarpinski
Member

a f b is available outside of array concatenation and macro call syntaxes.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 17, 2016

a f b might work, but it seems pretty fragile. Imagine trying to explain to somebody why a^2 f b^2 f c^2 is legal but a f b c and a+2 f b+2 f c+2 aren't. (I know, that last one assumes that the precedence is prec-times, but no matter what the precedence is, this general kind of thing is a concern).

As to a |op| b: initially I favored a similar proposal, a %op% b, as you can see in the google groups thread. But the nice thing about the proposed |> and <| is that they are each individually useful as binary operators, and they naturally combine to work as desired (given the right precedence and grouping, that is.) This means that you can implement this in the short term using existing parser mechanisms, and thus avoid creating headaches for parser developers in the future, as I said in my response to johnmyleswhite above.

So while I like a |op| b and certainly wouldn't oppose it, I think we should look for a way to have two different operators to simplify the required parser changes. If we're going for maximum typeability and not opposed to having | mean "pipe" rather than "bitwise or", then what about a |op\\ b or a |op& b?

@StefanKarpinski
Member

"headaches for parser developers" is the lowest possible concern.

@JeffBezanson
Member

"headaches for parser developers" is the lowest possible concern.

As a parser developer, I unequivocally agree with this.

|> and <| are both perfectly good infix operators, but there is zero benefit to implementing general operator syntax using two other operators. And much more needs to be said on just how verbose and unappealing that syntax is.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 17, 2016

there is zero benefit to implementing general operator syntax using two other operators.

To be clear, the long term vision here is that there would be binary f <| y, binary x |> f, and ternary x |> f <| z, where the first one is just a function but the second two are implemented as transformations in the parser.

The idea that this could be implemented using two ordinary functions |> and <| is just a temporary bridge to that vision.

And much more needs to be said on just how verbose and unappealing that syntax is.

That's a fair point. How about replacing |> and <| with | and &? They make sense both as a pair and individually, although they might be a bit jarring to a bit hockey player.

@JeffBezanson
Member

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

If people want a x |> f <| y ternary operator for other reasons, that's fine, but I think it should be considered separately. I'm not sure the parser should transform |> to a flipped <|. Other similar operators like < don't work that way. But that's also a separate issue.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 17, 2016

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

OK.

I understand that > and < are hard to type. In terms of symmetry and typability on a standard keyboard, I guess the easiest might be something like &% and %&, but that's seriously ugly, R parallel or no. /| and |/ might be worth considering too.

...

I'm not sure the parser should transform |> to a flipped <|

I think you've misunderstood. a |> b should parse to b(a). (The version without special parsing would be ((x,y)->y(x))(a,b), which evaluates to the same thing, but with more overhead.)
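To make the distinction concrete, here is a small sketch, assuming a present-day Julia where `|>` is an ordinary function: the expression `a |> sqrt` parses to a call of `|>`, not directly to `sqrt(a)`, although all three spellings evaluate to the same value.

```julia
a = 4.0

direct = sqrt(a)                      # plain prefix call
piped  = a |> sqrt                    # goes through the `|>` function
viafn  = ((x, y) -> y(x))(a, sqrt)    # the "no special parsing" form

# The parsed expression is a call to `|>`, not to `sqrt`.
ex = Meta.parse("a |> sqrt")
callee = ex.args[1]
```

The proposal in this comment is that the parser would instead emit the `sqrt(a)` form directly.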

@JeffBezanson
Member

a |> b should parse to b(a)

Ah, ok, got it.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 18, 2016

I think that we could bikeshed about which characters to use for years. I'd trust @StefanKarpinski (as the most senior person in this conversation so far) to make a ruling, and I'd be fine with that. Even if it's something I've argued against (such as a f b.)

Here's some options to see what appeals:
a |>op<| b (leaving current |> unchanged)
a |{ op }| b (nearby and same shift state on many common keyboards, not too ugly. A bit strange as standalones.)
a \| op |\ b or a /| op |/ b or combinations thereof
a $% op %$ b (relatively typable, R-inspired. But kinda ugly.)
a |% op %| b
a |- op -| b
a |: op :| b
a | op \\ b
a | op ||| b
a op b

@JeffBezanson
Member

Stefan is not more senior than me.

@jamesonquinn
Contributor Author

Looks as if you just nominated yourself, then, for BDFL powers on this issue! ;)

@rfourquet
Member

a @op@ b ?

@jamesonquinn
Contributor Author

I guess my vote is to use all 4 of \|, |\, /|, and |/. Down for evaluation, up for currying; bar towards the function. So:
a \| f (or f |/ a) -> f(a)
a /| f (or f |\\ a) -> (b...)->f(a,b...)
f |\ b (or b //| f) -> (a...)->f(a...,b)
and thus:
a \| f |\ b (or a /| f |/ b) -> f(a,b)
a \| f |\ b |\ c (or a /| b /| f |/ c) -> f(a,b,c)

Each of the 4 main operators, except perhaps |/, is useful on its own. The redundancy would certainly be un-Pythonic, but I think that the logical neatness is Julian. And as a practical matter, you can use whichever version of the infix idiom you find easier to type; they are both equally readable, in that once you've learned one you naturally understand both.

Obviously, it would make equal sense if you swapped all slashes, so that up arrows were for evaluation and down for currying.

I'm still waiting for word from On High (and I apologize for my newbie clumsiness in guessing what that meant). But if anybody taller than this bikeshed makes a ruling, for this or any other version with at least two new symbols, I'd be happy to write a short term patch (using functions) and/or a proper one (using transformations).

@JeffBezanson
Member

We try to avoid having a BDFL to the extent possible :)

@Glen-O

Glen-O commented Jun 19, 2016

I just thought I'd note a few quick things.

First, the other benefit (the "standalone uses") of the notation that is being proposed is that <| can be used in other contexts, in a way that improves readability. For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.
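For comparison, the two spellings can be tried side by side. This is only a sketch: `padby` is a hypothetical helper name playing the role of the proposed `<|`, and the array contents are made up for illustration.

```julia
# Hypothetical helper standing in for `f <| b`: fix the trailing arguments of `f`.
padby(f, b...) = (i...) -> f(i..., b...)

A = ["foo", "quux", "a"]

with_lambda = map(i -> lpad(i, 10), A)   # current syntax
with_curry  = map(padby(lpad, 10), A)    # what `map(lpad<|10, A)` would mean
```

Both calls produce the same array of strings, each left-padded to width 10.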

Second, the idea behind this is to keep the notation consistent. There's already a |> operator, which exists to change the "fix" of a function call from prefix to postfix. This just extends the notation.

Third, the possibility of using direct infix as a f b has a bigger problem. a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.

@StefanKarpinski
Member

For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

@JeffBezanson
Member

a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.

@StefanKarpinski
Member

Yes, we already do this for the 1+2 in 1+2 syntax, and it hasn't been a problem.
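The point that precedence follows from the operator's name can be checked from the parser alone, without defining anything. The sketch below assumes a present-day Julia, where `Meta.parse` returns the parsed expression and `Base.operator_precedence` (an unexported Base function) reports the precedence level for an operator symbol.

```julia
# Precedence is decided at parse time from the operator's name.
ex = Meta.parse("a + b * c")
outer = ex.args[1]            # the outermost call
inner = ex.args[3].args[1]    # the nested call on the right

# `in` sits at comparison precedence, below `+`,
# so both sums are grouped before `in` is applied.
ex_in = Meta.parse("1 + 2 in 1 + 2")
in_callee = ex_in.args[1]
in_lhs = ex_in.args[2]

plus_below_times = Base.operator_precedence(:+) < Base.operator_precedence(:*)
```

Any function bound to the name `+` is parsed with the same grouping, since the parser never looks at the binding.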

@Glen-O

Glen-O commented Jun 19, 2016

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.

I didn't mean it's difficult to write the parser to make it work. I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion". Among other things, consider that ¦ and don't look all that different in concept, yet one is a predefined infix operator, while the other is not.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

Perhaps I should be clearer on what I'm saying. I'm saying that being able to describe the operation as "lpad by 10" is a lot clearer than i->lpad(i,10) makes it. And in my view, lpad<|10 is the nearest you can get to that, in a non-context-specific form.

Maybe it would help if I describe where I'm coming from. I'm a mathematician and mathematical physicist, first and foremost, and "lambda syntax", while sensible from a programming standpoint, isn't the clearest for those who are less experienced in programming. Julia is, as I understand it, primarily aimed at being a scientific computing language, hence the strong resemblance to MATLAB.

I must ask - how is lpad<|10 any more "ASCII salad" than, say, x|>sin|>exp? Yet the |> notation was added. Compare with, say, bash scripting, where | is used to pass the argument on the left to the command on the right - if you know it's called "pipe", it makes a little more sense, but if you're not skilled in programming, it's not going to make sense. In that regard, |> actually makes more sense, as it looks vaguely like an arrow. And then <| is a natural extension to the notation.

Compare with some of the other suggestions, such as %func%, which does have a precedent in another language, but which is completely opaque for people who don't have extensive knowledge of programming in the language.

Mind you, I looked back a bit at one of the older discussions, and I see that there HAS been a notation used in another language that would be quite nice, in theory. Haskell apparently uses a |> b c d to represent b(a,c,d). If spaces following a function name allowed you to specify "parameters" in this way, it would work nicely - map(lpad 10,A). The only problem arises with the unary operators - map(+ 10,A) would produce an error, for instance, as it would interpret it as "+10" instead of i->+(i,10).

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 19, 2016

On a f b: the precedence issues may not be as bad as Glen-O suggested, but unless user-defined infix functions have the very lowest precedence, they do exist. Say, for the sake of argument we give them prec-times. In that case,
a^2 f b^2 => f(a^2,b^2)
a+2 f b+2 => a+f(2,b)+2
a^2 f^2 b^2 => (f^2)(a^2,b^2)
a f+2 b => syntax error?

This is all a natural consequence of how you'd write the parser, so it's not particularly a headache in that sense. But it's not particularly intuitive for the casual user of the idiom.

On the usefulness of a curry idiom
I agree with Glen-O that (i)->lpad(i,10) is simply worse than lpad<|10 (or, if we so choose, lpad |\ 10, or whatever). The i is an entirely extraneous cognitive burden and potential source of errors; in fact, I swear that when I was typing that just now, I unintentionally typed (i)->lpad(x,10) initially. So, having an infix curry operation seems to me like a good idea.
However, if that's the intention, then whatever infix idiom we settle on, we can create our own curry operation. If it's a f b, then something like lpad rcurry 10 would be fine. The point is readability, not keystrokes. So I think this is only a weak argument for <|.

On a |> b c d
I like this proposal a lot. I think that we could make it so that |> accepted spaces on either side, so a b |> f c d => f(a,b,c,d).

(Note: If both my suggestion of a b |> f c d and Glen-O's of map(lpad 10,A) were adopted, this would create a corner case: (a b) |> f c d => f((x)->a(x,b),c,d). But I think that's tolerable.)

This still has similar issues in terms of operator precedence as a f b. But somehow I think they're more tolerable if you can at least talk about them in terms of the precedence of the operator |>, rather than being the precedence of the ternary operator of with .

@tkelman
Contributor

tkelman commented Jun 19, 2016

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

@jamesonquinn
Contributor Author

@tkelman: I see the issue, but what's your point? You think we should fix the existing |> before we add extra uses for it? If so, how?

@tkelman
Contributor

tkelman commented Jun 19, 2016

I personally think we should get rid of the existing |>.

@Glen-O

Glen-O commented Jun 19, 2016

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

I think you've missed the point. Yes, the func.() notation is nice, and bypasses the issue in some situations. But I use the map function as a simple demonstration. Any function that takes a function as argument would be benefited by this setup. As an example, purely to demonstrate my point, you might want to sort some numbers based on their least common multiple with some reference number. Which looks neater and easier to read: sort(A,by=i->lcm(i,10)) or sort(A,by=lcm 10)?
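As a point of comparison for the `sort` example, Julia releases from 1.0 onward ship `Base.Fix2`, which fixes the second argument of a two-argument function. It postdates this thread and is not the syntax proposed here, so treat the following only as a sketch of the same partial application with an existing tool; the array values are made up.

```julia
A = [4, 3, 25, 5]

# Current lambda spelling from the comment above.
by_lambda = sort(A, by = i -> lcm(i, 10))

# `Base.Fix2(lcm, 10)` behaves like `i -> lcm(i, 10)`.
by_fix2 = sort(A, by = Base.Fix2(lcm, 10))
```

The keys are lcm(4,10)=20, lcm(3,10)=30, lcm(25,10)=50 and lcm(5,10)=10, so both calls order the elements as 5, 4, 3, 25.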

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 19, 2016

I'd like to note once again that any way to define infix operators will allow creating an operator that does what Glen-O wants <| to do, so that at worst he'll be able to write something like sort(A,by=lcm |> currywith 10). The point of this page is to discuss how to make some a...f...b => f(a,b). I understand that whether the existing |> or the proposed <| are worthwhile operators has some relationship to that point, but let's try not to get too sidetracked.

Personally, I think the a |> b c proposal is the best one so far. It follows an existing convention from Haskell; it is logically related to the existing |> operator; it is both reasonably readable and reasonably easy-to-type. The fact that I feel that it naturally extends to other uses is secondary. If you disagree, please at least mention your feelings on the core idiom, not just the proposed secondary uses.

@JeffBezanson
Member

I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion".

I agree it's difficult to decide on the precedence for a f b. For example, in (as in a in b) clearly benefits from comparison precedence, but it's quite likely many functions used as infix would not want comparison precedence. However, I don't see any consistency issue. Different operators have different precedence. Adding a f b doesn't force our hand to give + and * the same precedence.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 23, 2016

that amount of activity is generally out of proportion

I'm sorry. I'm probably guiltiest of getting into back-and-forth.

On the other hand, I think this thread has clearly made "useable" progress. Either of the latest suggestions (a f b) or [a @> f b, with a @f b definable as a shortcut] is clearly superior in my view to the earlier suggestions like a %f% b or a |> f <| b.

Still, I think that further back-and-forth comments are probably not going to make any further progress, and I'd encourage people to use thumbs-up or thumbs-down from now on unless they have something truly new to suggest (that is, not just an orthographic change to an existing proposal). I've added "hooray" emoticons (exploding cone) to the "votable proposals". If you believe that we should not have a specialized syntax for arbitrary functions in infix position, then downvote the bug as a whole.

...

ETA: I think that this discussion is now mature enough to get a decision tag.

@oxinabox
Contributor

oxinabox commented Jun 23, 2016

For reference (and I expected someone else to point it out):
If you want to embed SQL-like syntax, the right tool for the job is Nonstandard String Literals, I think.
Like all macros, they have access to all variables in scope when called,
and they allow you to specify your own DSL, with your own choice of priority, and they run at compile time.

select((((:emp_id, :last_name) from employee_tbl) where (:city, == ,"indianapolis")) orderby :emp_id);

Is better written

sql"SELECT emp_id, last_name FROM employee_tbl WHERE city == 'indianapolis' ORDER BY emp_id"

Nonstandard string literals are a seriously powerful bit of syntax.
I can't find any good examples of them being used for embedding a DSL.
But they can do it.

And in this case I think the result is a lot cleaner than any infix operation that can be defined.
Though it does have the overhead of having to write your own microparser/tokenizer.
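A minimal sketch of how such a literal is wired up, assuming a present-day Julia: defining a macro named `sql_str` makes the `sql"..."` syntax available, and the macro body receives the raw string when the code is expanded. The whitespace-only "tokenizer" here is a toy placeholder, not a real SQL parser, and the `sql` prefix is just the hypothetical name from the example above.

```julia
# Defining `sql_str` enables the `sql"..."` literal syntax.
# The macro runs at expansion time and sees the raw string contents.
macro sql_str(s)
    tokens = String.(split(s))   # toy tokenizer: split on whitespace only
    return tokens                # spliced into the code as a literal value
end

query = sql"SELECT emp_id FROM employee_tbl"
```

A real DSL would replace the `split` call with its own tokenizer and precedence rules, which is the overhead mentioned above.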


I really don't see the need for a decision tag.
This has no implementation as a PR, nor any usable prototype
that lets people test it out.
Contrast with #5571 (comment), with its 8 usable prototypes.

My feelings towards this go up and down every time I read the thread. I don't think I'll really know until I try it. And right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 24, 2016

SQL-like syntax, the right tool for the job is Nonstandard String Literals

Whether or not SQL is best done with NSLs, I think there is a level of DSL that is complex enough that inline macros would be very helpful, but not so complex that it's worth writing your own microparser/tokenizer.

right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

The inline macro proposal would enable people to, among other things, roll their own |>-like or <|-like macros, so you could use it for whatever you've done in F#.

(I don't want to get into back-and-forth bikeshedding arguments, but I was responding anyway because of the below, and I do think that the inline-macro proposal kills multiple birds with one relatively-smooth stone.)

I really don't see the need for a decision tag.

I asked earlier if it was appropriate for me to create a parser patch, and nobody answered. The only word on that so far is:

I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0.

Which would seem to argue against making a patch now, as it might just sit around and bit-rot. However, now you're saying that it's not worth making a decision on this (including the decision not to decide right now?) unless we have an "implementation as a PR [or] usable prototype".

What does that mean? (What is a PR?) Would a macro that used the character '@' instead of the token @ do the job, so that @testinline a '@'f b=>@f(a, b)? Or should I submit a patch to julia-parser.scm? (I've actually begun initial looking at writing such a patch, and it looks as if it should be simple, but my Scheme is very rusty.) Do I need to create test cases?

Right now, there are 13 participants in this bug. There are a total of 5 people who have voted on one or more of the proposals and/or downvoted the bug itself, and only one of those (me) did so after the inline macro proposal was on the table. That doesn't make me confident that it's time for prototyping yet. When the number of people who have voted since the last serious proposal is more like half the number of participants, I hope some kind of rough consensus will be becoming clear, and then it will be time for prototyping and testing and deciding (or, as the case may be, giving up on the idea).

@oxinabox
Contributor

oxinabox commented Jun 24, 2016

By "implementation as a PR [or] usable prototype",
I mean something that can be played with,
so it can be seen how it feels in practice.

A PR is a pull request; "patch" is the term you've been using for it.

If you made a PR, it could be downloaded and tested.
More simply, though, if you implemented it with macros
or nonstandard string literals,
it could be tested without having to build Julia.

It ain't my call, but I doubt I'll be able to make up my own mind without something I can play with.

Also +1 to not going back and forth bikeshedding.

@diegozea
Contributor

...or maybe an Infix.jl package with macros and nonstandard string literals.

@StefanKarpinski
Member

We have definitely reached the "working code or GTFO" point in this conversation.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 24, 2016

OK, here's working code then: https://github.com/jamesonquinn/JuliaParser.jl

ETA: Should I reference a specific commit, or is the above link to the latest master OK?

...

(That does not have any of the convenience macros I'd expect you'd want, such as the equivalents for |>, <|, ~, and the @defineinfix from my example above. Nor does it remove or deprecate the now-useless special case logic for ~ or the |> operator. It's just the parser changes to get it working. I've tested basic functionality but not all corner cases.)

...

I think that the current ugly hack with ~ shows that there's a clear use case for this kind of thing. Using this patch, you'd say @~ when you needed macro behavior; much cleaner, with no special case. Or does anyone seriously believe that ~ is utterly unique and nobody will ever want to do that again?

Note that the patch (it's not a PR yet because it targets the native bootstrapped parser, but for now the scheme one should come first in terms of PRs) is more generally useful than the issue name here. The issue name is "custom infix operators"; the patch gives infix macros, with infix operators only coming as a side effect of that.

The patch as it stands is not a breaking change, but I expect that if this became the plan the next step would be to deprecate the currently-existing ~ and |>, which would eventually lead to breaking changes.

...

Some simple tests added.

@tkelman
Contributor

tkelman commented Jun 25, 2016

#11608 was closed with a pretty clear consensus that many of us do not want infix macros and the one current case of ~ parsing was a mistake (made early on for R compatibility and no other especially good reason). We intend to deprecate and eventually get rid of it, just haven't done it (along with the work of modifying the API for the formula interface in JuliaStats packages) yet.

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are. Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 25, 2016

(Sorry I posted prematurely; fixed now.)

In #11608, I see several negative arguments:

===

What would the following transform into?
...
y = 0.0 @in@ x == 1.0 ? 1 @in@ 2 : 3 @in@ 4

This was dealt with in the thread:

Cases like that are why I always use parenthesis...

and

same precedent ... apply without being macros: 0.0 in 1 == 1.0 ? 2 in 2 : 3 in 4

===

more functionality to Julia that people have to implement, maintain, test, learn to use, etc.

which is (partially) answered (and seconded) here by:

"headaches for parser developers" is the lowest possible concern.

===

is there no way for 2 packages to simultaneously have definitions for the same macro-operator that could be used together unambiguously in a single user code base?

This is an interesting point. Obviously, if the macro just calls a function, then we have all the dispatch power of the function. But if it is a true macro, as with ~, then it's more complicated. Yes, you could imagine hackish workarounds, like attempting to call it as a function, and catching any errors to use it as a macro... but that kind of ugliness should not be encouraged.

Still, this is just as much of an issue for any macro. If two packages both export a macro, you simply can't have both with "using".

Is this likely to be more of a problem with infix macros? Well, it depends what people end up using them for:

  • Just a way to have user-defined infix functions. In that case, they're no worse than any other function; dispatch works fine.
  • As a way to use other programming styles, using operators like the |> and <| that @Glen-O discusses above. In that case, I think there will quickly develop common conventions about what macro means what, with little chance of collision.
  • As a way to make special-purpose DSLs, like the SQL example above. I think these will be used in specific contexts and the chance of collision is not too bad.
  • For things like R's ~. At first, this looks the most problematic; in R, ~ is used for several different things. However, I think that even there, it's manageable, with something like:

macro ~(a,b) :(~(:$a, quote($b))) end

Then, the function ~ could dispatch based on the type of the LHS, but the RHS would always be an Expr. This kind of thing would allow the principal uses it has in R (regression and graphing) to coexist, that is, to dispatch correctly despite coming from different packages.
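A minimal sketch of that dispatch idea in present-day Julia, written as a prefix macro because the infix form is only proposed here. The names `@tilde`, `tilde`, `RegressionData`, and `PlotData` are hypothetical, and this sketch assumes the left-hand side is evaluated normally while the right-hand side is handed over unevaluated:

```julia
# Hypothetical generic function that different packages would extend.
function tilde end

macro tilde(lhs, rhs)
    # The LHS is evaluated in the caller's scope; the RHS is passed as a quoted expression.
    :(tilde($(esc(lhs)), $(QuoteNode(rhs))))
end

# Stand-ins for types owned by two different packages.
struct RegressionData end
struct PlotData end

# Each "package" adds its own method; dispatch picks one from the LHS type.
tilde(::RegressionData, rhs) = (:regression, rhs)
tilde(::PlotData, rhs) = (:plot, rhs)

r = @tilde(RegressionData(), age + treatment)
p = @tilde(PlotData(), age + treatment)
```

Both calls go through the same macro, but the method that runs depends only on the type of the left-hand side, so the two hypothetical packages do not collide.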

(note: the above has been edited. Initially, I thought that an R expression like a ~ b + c used the binding of b and c through R's lazy evaluation. But it doesn't; b and c are the names of columns in a data frame passed explicitly, not names of variables in local scope that are thus passed in implicitly.)

===

The only way forward here would be to develop an actual implementation.

Which I have done.

===

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are.

This relates to the point above. Insofar as an infix macro calls a specific function, that function is still extensible through dispatch in the normal way. Insofar as it doesn't call a specific function, it is doing something structural/syntactic (such as what |> does now) that should not be extended or redefined. Note that even if it calls a function, the fact that it is a macro can still be useful; for instance, it can quote some of its arguments, or process them into callbacks, or even interact simultaneously with the name and the binding of a variable, in a way that a direct function call cannot.
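As a small illustration of the last point, here is a sketch of a macro that sees both the name and the binding of a variable, which a plain function call cannot do. The macro name `@named` is hypothetical:

```julia
macro named(var)
    # QuoteNode(var) keeps the variable's name as a Symbol;
    # esc(var) evaluates its binding in the caller's scope.
    :(Pair($(QuoteNode(var)), $(esc(var))))
end

score = 42
pair = @named score
```

A function receiving `score` would only ever see the value 42; the macro can also report that the value came from a variable called `score`.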

===

Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.

As was pointed out in the referenced thread:

[Infix is] easier to parse (for English and most western speakers), because our language works that way. (The same thing generally holds for operators.)

For example, which is more readable (and writeable):

select((:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id)

or

send(orderby((@where selectfrom((:emp_id, :last_name), employee_tbl) city == 'NYC'), :emp_id))

?

===

Finally:

#11608 was closed with a pretty clear consensus

Looks pretty evenly split to me, with "who's gonna do the work" casting the deciding vote. Which is now at least partly moot; I've done the work in JuliaParser and I'd be willing to do it in Scheme if people like this idea.

@jamesonquinn
Contributor Author

jamesonquinn commented Jun 26, 2016

This is my last post in this thread, unless there's positive reaction to my hacked JuliaParser. It is not my intention to impose my will; just to present my point of view.

I'm arguing in favor of infix macros (a @m b=>@m a b). That doesn't mean I'm not aware of the arguments against. Here's how I'd summarize the best argument against:

Language features start at -100. What do infix macros offer that could possibly overcome that? By their very nature, there is nothing you could accomplish with infix macros that couldn't be accomplished with prefix macros.

My response is: Julia is first of all a language for STEM programmers. Mathematicians, engineers, statisticians, physicists, biologists, machine learning people, chemists, econometricians... And one thing that I think most of those people realize is the usefulness of a good notation. To take an example I'm familiar with in statistics: adding independent random variables is equivalent to convolving PDFs, or even to convolving derivatives of CDFs, but often expressing something using the former can be an order of magnitude more concise and understandable than the latter.

Infix versus prefix versus postfix is, to some degree, a matter of taste. But there are also objective reasons to prefer infix in many cases. Whereas prefix and postfix lead to indigestible precipitates of back-to-back operators like the ones that make Forth programmers sound like German politicians, or the ones that make Lisp programmers sound like a Chomskian caricature, infix puts the operators in what's often the cognitively most natural place, as near to all their operands as possible. There's a reason nobody writes math papers in Forth, and why even German mathematicians use infix operators when writing equations.

Yes, infix macros could be used to write obfuscated code. But existing prefix macros are just as prone to abuse. If not abused, infix macros can lead to much clearer code.

  • (a+b @choose b) beats binomial(a+b,b);
  • score ~ age + treatment beats linearDependency(:score, :(age + treatment));
  • domSelect("#logo") @| css "color" "red" @| fadeIn "slow" @thenApply addClass "dummy" beats the holy hell out of addOneTimeEventListener(fadeIn(css(domSelect("#logo"),"color","red"),"slow"),"done",(obj,evt)->addClass(obj,"dummy")).

I realize that these are just toy examples but I think the principle is valid.

Could the above be done with nonstandard string literals? Well, the second and third examples would work as NSLs. But the problem with NSLs is that they give you too much freedom: unless you're familiar with the particular grammar, there's no way to be sure even what the tokens of an NSL are, let alone its order of operations. With infix macros, you have enough freedom to do all of the above examples, but not so much that it isn't clear on reading the "good" code what the tokens are and where the implied parentheses go.

@StirlingNewberry

StirlingNewberry commented Jun 26, 2016

Then it needs certain things to be moved from unknown unknowns to known unknowns. And unfortunately, there is not a mechanism to do this. Your arguments need a structure which does not exist.

@stevengj
Member

stevengj commented Nov 2, 2017

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

@Ismael-VC
Contributor

I have made a package for the hack mentioned by Steven in #24404 (comment):

@cscherrer

I'm not sure how many potential infix operators this affects, but I'd really like to use <~. The parser won't cooperate: even if I space things carefully, it wants a <~ b to mean a < (~b).

<- has a similar problem.

Sorry if this is already covered by this or another issue, but I couldn't find it.

@JeffBezanson
Member

We could potentially require spaces in a < ~b; we've added rules like that before. Then we could add <- and <~ as infix operators.

@cscherrer

Thanks @JeffBezanson, that would be great! Would this be a special case, or a more general rule? I'm sure there are some details in what the rule should be to allow more infix operators, give clear and predictable code, and break as little existing code as possible. Anyway, I appreciate the help and the quick response. Happy new year!

@Liso77

Liso77 commented Dec 31, 2017

If a <~ b is going to mean something different from a < ~b, I would like a =+ 1 to be an error (or at least a warning).

@Glen-O

Glen-O commented Nov 7, 2019

I know this is quite an old discussion, and the question asked was asked quite some time ago, but I thought it was worth answering:

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

No, unfortunately, |> still gets the precedence. The update done makes it so that, if you define <|(a,b)=a(b), then you can successfully do a<|b<|c to obtain a(b(c))... but this is a different concept.
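A minimal sketch of the behaviour described here, assuming a Julia version where `<|` parses as a right-associative infix operator (after #24153). The helper functions `double` and `inc` are hypothetical:

```julia
# User-level definition from the comment: apply the left operand to the right one.
<|(f, x) = f(x)

double(x) = 2x
inc(x) = x + 1

# Right-associativity groups this as double <| (inc <| 3), i.e. double(inc(3)).
result = double <| inc <| 3
```

With left-to-right grouping the same line would first try `double <| inc`, calling `double` on a function, which is why the associativity change matters for this chaining but does not by itself make `a |>op<| b` work.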

@o314
Contributor

o314 commented Nov 9, 2019

Frozen for 2 years, then a comment and a commit 2 and 5 days ago!

See Document customizable binary operators f45b6be
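As an illustration of the kind of user-definable binary operator that commit title suggests (an assumption; the commit itself is not quoted here), Julia's parser already accepts a fixed set of Unicode symbols such as `⊕` as infix operators, and a user can give them methods. The meaning assigned to `⊕` below is hypothetical, borrowed from the `@choose` toy example earlier in the thread:

```julia
# ⊕ has no methods by default; here it is given a hypothetical "choose" meaning.
⊕(a, b) = binomial(a + b, b)

# Infix use is equivalent to the prefix call ⊕(3, 2).
paths = 3 ⊕ 2
```

This covers the "user-defined infix function" case with ordinary dispatch, but only for symbols the parser already knows; it does not provide infix macros or arbitrary operator names.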
