Replies: 2 comments 3 replies
-
This is really powerful, @abrichr, thanks for opening this to discussion. As far as the actual architecture is concerned, I have no major comments to add. Congratulations on such an elaborate piece! In terms of driving development progress and adoption, I would add a few comments, if that helps:
-
Thanks a lot for putting all of this together. What an effort! I will try to review this in multiple parts, as it is too complex to cover in one pass. Overall: the decoupling of steps makes total sense. In a previous discussion a long time ago, I mentioned that maybe we need a "chain of actions" for each step; now, with the "chain-of-code" logic, this makes sense. I think, reasoning in a "divide-and-conquer" mindset, the problem can be split into small steps that:
I think the problem above is the nucleus of the whole chain. If we manage to solve it, then the whole action chain can be composed little by little. This is also what you mean, if I understand the graph and the description correctly. Now, focusing on this base case, we need to discuss the input in detail. The abstract formula is: given a screen, represent it so that the model can understand it. Then the model can "reason" in Chain of Code to produce the next action. The key point to discuss here is:
If we fail on the 2 points above, we will get hallucination, as the LLM can give us anything. So my thought is that we must somehow find a way to "translate" the UI using our own ontology and feed that as input into the Chain-of-Thought models. With this solved, we can start chaining things together and think about more complex cases such as fine-tuning (where the decision on the next action depends not only on the current view, but also on the previous views and all the orders that represent the "intent" of the user).
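To make the "translate the UI using our own ontology" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the `UIElement` and `Action` types, the `ALLOWED_ACTIONS` table, and the helper names are hypothetical and are not taken from OpenAdapt or the architecture draft. It shows two things: serializing a screen into a compact text form a model could read, and rejecting any proposed action that does not fit the ontology, which is one possible way to limit hallucinated output.

```python
from dataclasses import dataclass

# Hypothetical ontology: which actions each UI role supports.
ALLOWED_ACTIONS = {
    "button": {"click"},
    "textbox": {"click", "type"},
    "checkbox": {"click"},
}


@dataclass(frozen=True)
class UIElement:
    element_id: str  # stable identifier assigned by our own representation
    role: str        # a key of ALLOWED_ACTIONS
    label: str       # human-readable text shown on screen


@dataclass(frozen=True)
class Action:
    name: str        # e.g. "click" or "type"
    target_id: str   # must refer to an element on the current screen
    text: str = ""   # payload for "type" actions


def describe_screen(elements):
    """Serialize the screen into one line per element for a model prompt."""
    return "\n".join(f"[{e.element_id}] {e.role}: {e.label}" for e in elements)


def is_valid_action(action, elements):
    """Accept only actions that target a known element with a permitted verb."""
    by_id = {e.element_id: e for e in elements}
    target = by_id.get(action.target_id)
    if target is None:
        return False  # the model referred to an element that does not exist
    return action.name in ALLOWED_ACTIONS.get(target.role, set())


screen = [
    UIElement("e1", "textbox", "Username"),
    UIElement("e2", "button", "Sign in"),
]

print(describe_screen(screen))
print(is_valid_action(Action("type", "e1", "alice"), screen))  # textbox accepts typing
print(is_valid_action(Action("type", "e2", "alice"), screen))  # button does not
print(is_valid_action(Action("click", "e9"), screen))          # unknown element
```

The point of the validation step is that whatever the model proposes is checked against a closed vocabulary before it is executed, so an action on a non-existent element or an unsupported verb can be discarded or retried instead of being replayed.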
-
We are inviting the community to comment on our proposed approach to AI-First Process Automation:
https://github.com/OpenAdaptAI/OpenAdapt/wiki/OpenAdapt-Architecture-(draft)
Please feel free to point out limitations and/or suggest alternatives. Thank you for your contributions!
(Also, please note that this is a living document and is subject to ongoing changes.)