Rename @linq macro? #177
I actually want to separate it out #173 |
That seems a little excessive. Then we would have DataFramesMetaMeta.jl. 😄 |
To me the linq macro is just a distraction. There are too many things to maintain anyway. |
On the contrary, I think a begin block form for a |
|
It's true that
@> df begin
    @transform(c = a + b)
    @groupby(g)
    @combine(c_mean = mean(c))
end
But I'd rather write this:
@query df begin
    transform(c = a + b)
    groupby(g)
    combine(c_mean = mean(c))
end
The second form would require a custom macro in DataFramesMeta.jl. |
That doesn't compose well with other functions. Also, it's an extra burden on dev which is already stretching very thin. I vote for deprecating Also, what's stopping DataFramesMeta.jl from just re-exporting |
It composes just as well as
I'm willing to contribute a |
I see. That sounds like a good idea. I also support making the |
Maybe instead of special-casing @pdeffebach latest PR has come quite close to this already AFAICT. |
Yes, this is basically exactly what we do here (DataFramesMeta.jl/src/DataFramesMeta.jl, line 100 in e0bfa26). The current implementation throws an error if an input is not a Symbol or an expression of the form y = f(:x). If we simply comment out that error, then we will fully support any expression that could be passed to DataFrames.transform.
With regards to re-naming. I would prefer That said, one feature I would like to implement for
@CameronBieganek I don't think we need a An important conflict that has not come up yet here is the tension between
would work. But then, we would likely need some way for
The mix of |
Sorry, I wrote The idea of allowing optional underscores to pipe into an argument other than the first one is interesting. |
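A rough sketch of the optional-underscore idea, using only Base. The names @thread, thread_step and replace_underscore are hypothetical and are not part of any package discussed here. The sketch assumes every step is a function call and ignores keyword-argument edge cases such as f(x; k = 1). Each step receives the previous result at the `_` placeholder if one is present, otherwise as its first argument.

```julia
# Replace every `_` in `ex` with `val`; also report whether any `_` was found.
function replace_underscore(ex, val)
    ex === :_ && return (val, true)
    ex isa Expr || return (ex, false)
    found = false
    new_args = Any[]
    for arg in ex.args
        new_arg, hit = replace_underscore(arg, val)
        found |= hit
        push!(new_args, new_arg)
    end
    return (Expr(ex.head, new_args...), found)
end

# Thread `val` into one step: at `_` if present, otherwise as the first argument.
function thread_step(step, val)
    new_step, found = replace_underscore(step, val)
    found && return new_step
    if step isa Expr && step.head == :call
        return Expr(:call, step.args[1], val, step.args[2:end]...)
    end
    error("cannot thread a value into $step")
end

# Hypothetical block-piping macro: nests the steps of a begin block.
macro thread(first, block)
    result = first
    for step in block.args
        step isa LineNumberNode && continue
        result = thread_step(step, result)
    end
    return esc(result)
end

total = @thread [1, 2, 3] begin
    map(v -> 2v, _)   # `_` marks where the previous value goes
    sum()             # no `_`, so the value becomes the first argument
end
```

Here the block expands to sum(map(v -> 2v, [1, 2, 3])), so a step without an underscore behaves like ordinary first-argument piping.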
I think it's relevant because the main benefit of As you know, we want to emulate dplyr's |
I agree that
That would work fairly well, except then we wouldn't be able to add other arbitrary functions to the chain/pipe. For instance, I'd like to be able to write the following:
@chain df begin
    transform(c = a + b)
    where(a < 100)
    tail(30)
    repeat(outer = 3)
end
If we implemented your proposal, then repeat(..., [] => (() -> 3) => :outer) (or something like that), which would cause an error. Personally, my main desire for a
Well, we would need a macro if we want to allow
I like this idea.
To ameliorate this situation, perhaps
|
The new PipelessPipes.jl seems to provide almost everything I want. The only thing it doesn't let me do is write |
I was thinking the same thing. I really do like writing Additionally, I think any But PipelessPipes is promising and maybe they will be able to accept PRs when this task moves higher up on the development priorities list. |
Things like
What's the advantage of using |
Sorry, I meant that the ambiguity wouldn't be that important inside the chaining block. Not in general. Though in theory a user could, post 1.0, replace every call to
Copying and pasting into the REPL. With |
I guess, if we allow users to omit the |
The other reason to support both block piping and It would be nice if there were a way to overload |
I don't know! maybe. I hardly understand how |
If shadowing is not possible, DataFramesMeta could probably monkey patch the PipelessPipes module. 🙈 😂 |
Perhaps the most reasonable approach would be for PipelessPipes to have a separate macro like |
Why would |
Ah I see, because of the additional symbol transformations, I didn't read the previous posts carefully enough |
@jkrumbiegel To clarify, we would like
@df_ mydata begin
    transform(y = 2 * :x)
end
to be equivalent to
@_ mydata begin
    @transform(y = 2 * :x)
end |
Aha, well that shouldn't be difficult to do, I can make a mockup macro with that behavior. The functions would be all the DataFramesMeta macros one can use within |
What's tricky is that while DataFramesMeta macros should be added the |
Yes I wouldn't build this into |
Yeah I'm actually a bit bummed @CameronBieganek suggested the renaming for |
OTOH it would make sense for a DataFramesMeta macro to use a more specific name (like |
It seems to me like it would be fine to have the DataFramesMeta-specific macro live in Chain.jl, rather than DataFramesMeta. For the sake of argument, let's call it
To be clear, in I think it's reasonable for this to live in Chain.jl. It does not require introducing a DataFramesMeta.jl dependency for Chain.jl. And then @xiaodaigh would get his wish of separating Other piping packages could do something similar if they wished. For example, Pipe.jl could add a macro called Overall it seems like a benefit to remove all piping functionality from DataFramesMeta.jl and rely solely on piping packages like Chain.jl and Pipe.jl. |
What would be the advantage? That macro wouldn't make any sense without DataFramesMeta, and versions would have to be kept in sync with DataFramesMeta if we add new macros. Moving things to other packages doesn't reduce the maintenance burden -- it increases it as then you need to handle interactions between packages and their version bounds. |
I'd say given that the Chain.jl code is like 70 lines, the easiest thing would be to just copy it and make your edits in DataFramesMeta. It's not like there are going to be big changes you'll have to merge in later. The package is pretty much done and I can't see what should change about it |
Ok @nalimilan, if you really don't want to have
module Chain
    function_to_macro(ex) = ex
    # other stuff
end
module DataFramesMeta
    using Chain
    function Chain.function_to_macro(ex)
        # rewrite `transform` to `@transform`, etc
    end
    # other stuff
    export @chain # re-export the overridden @chain
end
I don't know if that sort of pattern would be considered admissible, but I think it would work. |
I don't even think it needs to be in a separate module. The current implementation of |
Well, I didn't mean anything particular about using modules. I just meant that Chain.jl could define an internal function called But I guess that would be a monkey patch, since it would be overwriting an existing method rather than adding a new method. 🤷♂️ |
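A minimal sketch of a variant that adds a new method instead of overwriting an existing one, assuming the piping package passes a marker type to its hook. MiniChain, MiniMeta, Flavor, Plain, Tabular, rewrite_step and rewrite_block are all hypothetical names for illustration; none of them exist in Chain.jl or DataFramesMeta.jl.

```julia
module MiniChain   # hypothetical stand-in for a piping package
    abstract type Flavor end
    struct Plain <: Flavor end

    # Default hook: leave every step untouched.
    rewrite_step(::Flavor, ex) = ex

    # Apply the hook to every entry of a begin block.
    rewrite_block(flavor::Flavor, block::Expr) =
        Expr(block.head, (rewrite_step(flavor, a) for a in block.args)...)
end

module MiniMeta    # hypothetical stand-in for a data frame macro package
    import ..MiniChain

    struct Tabular <: MiniChain.Flavor end
    const VERBS = (:transform, :select, :where, :combine, :orderby)

    # A new, more specific method: nothing in MiniChain is overwritten.
    function MiniChain.rewrite_step(::Tabular, ex)
        if ex isa Expr && ex.head == :call && ex.args[1] in VERBS
            # Turn `transform(...)` into `@transform(...)`.
            return Expr(:macrocall, Symbol("@", ex.args[1]), nothing, ex.args[2:end]...)
        end
        return ex
    end
end

blk = quote
    transform(y = 2 * :x)
    first(5)
end

plain = MiniChain.rewrite_block(MiniChain.Plain(), blk)
tabular = MiniChain.rewrite_block(MiniMeta.Tabular(), blk)
```

With this layout the default behavior stays in the piping package, and the data frame package only contributes a method for its own type, so no existing method is replaced.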
FWIW, if you do want to keep the DataFramesMeta piping macro as a completely independent implementation, I think it would be ok to call it |
@jkrumbiegel I didn't realize until now that you were the owner of I think the best course of action is to wait a little bit and see how I seem to be the only person who likes |
I just played a little with this again and made a
using DataFrames, Chain, DataFramesMeta
macro dfchain(first, block)
    dataframesmeta_symbols = [:transform, :combine, :select, :where, :orderby]
    b = copy(block)
    last_linenumbernode = LineNumberNode(0)
    for line in b.args
        if line isa LineNumberNode
            last_linenumbernode = line
        elseif line isa Expr && line.head == :call
            symbol = line.args[1]
            symbol isa Symbol && symbol in dataframesmeta_symbols || continue
            macrosymbol = Symbol("@" * String(symbol))
            line.head = :macrocall
            line.args[1] = macrosymbol
            insert!(line.args, 2, last_linenumbernode)
        end
    end
    esc(quote
        @chain $first $b
    end)
end
You can use it like this:
df = DataFrame(id = rand(1:100, 1000), weight = randn(1000), name = [String(rand('a':'z', 5)) for _ in 1:1000])
@dfchain df begin
    where(:id .> 50, .!startswith.(:name, "a"))
    groupby(:id)
    combine(x = sum(:weight))
    orderby(-:id)
end
Just thought this was a good place to save it while we're thinking about the options for DataFramesMeta. Interestingly, it might point again to :col being a good choice for DataFramesMeta because it mixes more naturally with non-DataFramesMeta expressions. |
Okay I think we should move forward and deprecate Now that there is
I propose to add |
I am OK, we would just need to resolve jkrumbiegel/Chain.jl#19 (but I hope it will be resolved soon) |
Closing, since |
I'm not sure if the current plan is to keep @linq or not, but if it is kept, I think a different name would make more sense. @linq only makes sense if you're a former C# programmer. Perhaps @query instead?
Also, I'd like to argue in favor of keeping @linq, or rather @query. Here's a query from the README that uses Pipe.jl (except I've changed :a to a, etc).
What I would prefer is a begin block syntax like this:
I think the begin-block syntax is much easier to read. The second example avoids the noise from all the @, _, and |> symbols.
Actually, I already have an open issue for begin-block syntax for the @linq macro: #136