-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Placeholder
and replace_grammar_node
in favor of pointer-like container object
#1007
Conversation
ping @mmoskal in case you have a better idea than serializing this as a singleton Side note: I suppose we don't even need a special type for this, as we could just use an empty |
guidance/_grammar.py
Outdated
@@ -243,6 +243,28 @@ def match_byte(self, byte): | |||
def max_tokens(self): | |||
return 1000000000000 | |||
|
|||
class Box(Terminal): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably wants a different name. Could re-use the old Placeholder
name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd only use Placeholder
for something which is actually going to be replaced (and not via that fun Python thing of resetting self.__class__
which Guidance used to do). I'd say Box
is a bit nondescript, though. DeferredGrammar
perhaps? Or DeferredReference
|
||
@value.setter | ||
def value(self, value): | ||
if self._resolved: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure we know we're shooting ourselves in the foot if we try to set the same Box's value more than once. Not strictly necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Not strictly necessary" only applies to now, when you're writing this. In six months' time when you're hunting some obscure bug, this could make it much less obscure :-)
guidance/_grammar.py
Outdated
if self._resolved: | ||
return self._value | ||
else: | ||
raise ValueError("Box is not yet resolved") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: better error messages here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, but regular guidance users shouldn't be going here
@@ -1133,6 +1125,14 @@ def process(self, node: GrammarFunction): | |||
"literal": "", | |||
} | |||
} | |||
elif isinstance(node, Box): | |||
if node.value is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually unnecessary as accessing .value
will raise an exception if it's not set...
Singleton Join is good. They are removed in Rust grammar optimization stage. |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1007 +/- ##
==========================================
- Coverage 70.21% 61.25% -8.97%
==========================================
Files 62 62
Lines 4422 4426 +4
==========================================
- Hits 3105 2711 -394
- Misses 1317 1715 +398 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not certain on the name, but this isn't in the public API yet.
guidance/_grammar.py
Outdated
if self._resolved: | ||
return self._value | ||
else: | ||
raise ValueError("Box is not yet resolved") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, but regular guidance users shouldn't be going here
guidance/_grammar.py
Outdated
@@ -243,6 +243,28 @@ def match_byte(self, byte): | |||
def max_tokens(self): | |||
return 1000000000000 | |||
|
|||
class Box(Terminal): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd only use Placeholder
for something which is actually going to be replaced (and not via that fun Python thing of resetting self.__class__
which Guidance used to do). I'd say Box
is a bit nondescript, though. DeferredGrammar
perhaps? Or DeferredReference
guidance/_grammar.py
Outdated
def __init__(self): | ||
super().__init__(temperature=-1, capture_name=None) | ||
self._resolved = False | ||
self._value = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing _value
is a GrammarFunction
?
|
||
@value.setter | ||
def value(self, value): | ||
if self._resolved: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Not strictly necessary" only applies to now, when you're writing this. In six months' time when you're hunting some obscure bug, this could make it much less obscure :-)
Essentially reopening #638 with some major simplifications allowed by the rust re-write (the serialized grammar now directly supports "references").
Reopening because:
#995 still takes far too long to process the large JSON schema. Line-profiling revealed that ~80% of the time spent constructing the
GrammarFunction
is due toreplace_grammar_node
. In particular, this function has to traverse the entire grammar tree, and we potentially have to call it many times if there is a lot of mutual recursion.Simply adding a node type that acts as a container (which we can fill after we decide what its contents should be) side-steps this problem entirely.