-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support arbitrary tuples #34
Conversation
Ordering::Less => { | ||
slice1 = gallop(slice1, |x| x.0 < slice2[0].0); | ||
slice1 = gallop(slice1, |x| accessor1(x) < accessor2(&slice2[0])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To join efficiently, tuples need to be in sorted order. This is a precondition of gallop
/binary_search
. In the current implementation, both slice1
and slice2
are in sorted order because they come from a Relation
, but the mapped values (once accessor
is applied) may not be.
It would be possible to make this a contract of the accessors (probably checked at runtime?), but that would be a pretty big footgun, and you would still have to re-index variables when their elements are in the wrong order. Accessors only help when tuple elements aren't grouped properly (e.g. (A, B, C)
vs. ((A, B), C)
). However, there are type-safe ways of handling that particular case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no, I feared I'd be missing something like that. 😟
I'm familiar with h-lists, but not exactly sure how one would apply them here? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datafrog
expects leapers to use an interface like (Key, Value)
, which becomes awkward when Key
or Value
contain multiple elements themselves. That's why you see variables with types like ((Origin, Point), Origin)
or ((Loan, Point), ())
over in Polonius. However, the only prerequisite for an efficient join is that variables/relations have a common prefix: You should be able to join (Loan, Origin, X)
with (Loan, Origin, Y, Z)
directly, without having to re-index them as ((Loan, Origin), ...)
.
I think it would be simplest to express this constraint on top of h-lists (with the typical ordering reversed, so (((A), B), C)
instead of (A, (B, (C)))
), since you can take a reference to a valid h-list representing any prefix of that type. You could also do something similar with extension traits on top of tuples (impl Prefix<(A, B)> for (A, B, C)
), but everything would have to be Copy
. This is fine for Polonius I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation!
This PR aims to make
datafrog
less dependent on(Key, Value)
tuples, lessening the burden of having to create lots and lots of intermediary variables that do little more than move an element to the front of the tuple, so it can be used as key.Before
As such where you currently have to introduce auxiliary variables of
(Key, Value)
tuples:After
… with this PR you simply extract the key from an arbitrary tuple index using a closure:
This cuts down the maintenance burden of having to keep several additional auxiliary variables and relations up-to-date, reducing the mental overhead and brings your datafrog Rust code a little bit closer to their corresponding datalog rules.
@frankmcsherry @nikomatsakis I hope that these changes are not in conflict with the soundness of the API of datafrog or the semantics of Datalog?
Performance
I did some performance testing with criterion, but did not find any consistent and noteworthy performance regressions for existing APIs and found the overhead of the
…_by
method variants minimal.Known limitations
Unfortunately this PR only allows lenses to return references to arbitrary elements of the tuple (via
Accessor1: Fn(&Tuple1) -> &Key
), but no owned ad-hoc values.I had initially hoped to be able to allow for arbitrary Key values, such as ad-hoc tuples or other computed values. This would have allowed for a returning
(&X, &Y)
as key for(X, Y, Z)
tuple, rather than just single individual elements of it. Support for arbitrary tuples as keys would have been very convenient for scenarios with complex n-ary keys such as borrow_check.rs.Changing
Accessor1: Fn(&Tuple1) -> &Key,
toAccessor1: Fn(&Tuple1) -> Key,
unfortunately causes accessors that return a reference to the tuple or elements of it, such as|tuple| &tuple
or|tuple| &tuple.0
to trigger this language limitation. Unfortunately those are the most commonly used accessors and I'm not aware of a workaround that would allow both, arbitrary keys and borrowed tuple element keys.Minimal reproducible snippet
Error:
This is part #3 of a stacked pull-request and adds upon #33: