I have a question: how should I calculate the correct path rate from the tool execution log? For example, take the following query and golden solution path:
{
"query": "give me the number of movies directed by Sofia Coppola",
"solution": [
"GET /search/person",
"GET /person/{person_id}/movie_credits"
]
},
I have the following execution log:
Query: give me the number of movies directed by Sofia Coppola
Planner: search for the person with the name "Sofia Coppola"
API response: Sofia Coppola's person_id is 4567
collect the list of movies directed by Sofia Coppola whose person_id is 4567
API response: The list of movies directed by Sofia Coppola includes The Virgin Suicides, Lost in Translation, Marie Antoinette, Somewhere, The Bling Ring, A Very Murray Christmas, and The Beguiled.
count the number of movies in the list directed by Sofia Coppola
API response: The number of movies directed by Sofia Coppola is 7.
Thought: I am finished executing a plan and have the information the user asked for or the data the user asked to create
Final Answer: Sofia Coppola has directed a total of 7 movies.
API Selector: GET /search/person to search for Sofia Coppola
API response: Sofia Coppola's person_id is 4567
Instruction: Continue. Collect the list of movies directed by Sofia Coppola whose person_id is 4567.
GET /person/4567/movie_credits to collect the list of movies directed by Sofia Coppola.
API response: The list of movies directed by Sofia Coppola includes The Virgin Suicides, Lost in Translation, Marie Antoinette, Somewhere, The Bling Ring, A Very Murray Christmas, and The Beguiled.
Instruction: Continue. Count the number of movies in the list directed by Sofia Coppola.
No API call needed.
Final Answer: Sofia Coppola has directed a total of 7 movies.
So how should I determine whether the actual execution path matches the golden solution path?
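To make the question concrete, here is a minimal sketch of one way such a check might be done. It rests on assumptions that are not confirmed by the repository: that a path counts as correct when the executed API calls contain the golden solution path as an in-order (not necessarily contiguous) subsequence, and that API calls show up in the log as lines starting with an HTTP method, optionally prefixed by `API Selector:`. The function names (`extract_calls`, `template_to_regex`, `contains_gold_path`, `correct_path_rate`) are hypothetical.

```python
import re

# Assumption: an API call is a log line that starts with an HTTP method,
# optionally preceded by "API Selector:".
CALL_RE = re.compile(
    r"^\s*(?:API Selector:\s*)?(GET|POST|PUT|DELETE|PATCH)\s+(/\S*)"
)


def extract_calls(log: str) -> list[str]:
    """Return the API calls found in an execution log, in order."""
    calls = []
    for line in log.splitlines():
        m = CALL_RE.match(line)
        if m:
            # Drop any query string and trailing punctuation from the path.
            path = m.group(2).split("?")[0].rstrip(".,;:")
            calls.append(f"{m.group(1)} {path}")
    return calls


def template_to_regex(template: str) -> re.Pattern:
    """Turn 'GET /person/{person_id}/movie_credits' into a regex in which
    each {placeholder} matches exactly one path segment."""
    method, path = template.split(" ", 1)
    literal_parts = re.split(r"\{[^/{}]+\}", path)
    pattern = "[^/]+".join(re.escape(p) for p in literal_parts)
    return re.compile(rf"^{re.escape(method)} {pattern}/?$")


def contains_gold_path(calls: list[str], gold: list[str]) -> bool:
    """True if the gold templates match the calls as an ordered subsequence
    (extra calls in between are allowed under this assumption)."""
    patterns = [template_to_regex(t) for t in gold]
    remaining = iter(calls)  # shared iterator enforces the ordering
    return all(any(p.match(c) for c in remaining) for p in patterns)


def correct_path_rate(samples: list[tuple[str, list[str]]]) -> float:
    """samples: (execution_log, gold_solution) pairs."""
    if not samples:
        return 0.0
    hits = sum(contains_gold_path(extract_calls(log), gold)
               for log, gold in samples)
    return hits / len(samples)
```

With the log above, `extract_calls` would yield `GET /search/person` followed by `GET /person/4567/movie_credits`, which matches the golden solution under these assumptions. Note that a placeholder such as `{person_id}` matches any single path segment, so templates that differ only in a literal versus a placeholder segment may need extra handling.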
Hi, @Yifan-Song793
Thanks for releasing this dataset and code.