
Assist Intents #93

Open
JosephAbbey opened this issue Nov 17, 2024 · 1 comment
JosephAbbey commented Nov 17, 2024

The Extended OpenAI Conversation integration works really well, but the true Home Assistant way is to use an Intent. This allows any conversation agent (Assist or any AI) to use the features.

Warning

Creating custom intents in a custom_component is still slightly annoying, as users have to manually copy the custom_sentences file (which defines how the default conversation agent recognises the intents).
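For context, a custom_sentences file is a YAML file placed under the Home Assistant config directory; a minimal sketch of what users would have to copy (the filename, intent name, and sentences here are illustrative, not the project's actual definitions):

```yaml
# config/custom_sentences/en/llmvision.yaml  (path and intent name are examples)
language: "en"
intents:
  LLMVisionDescribeCamera:
    data:
      - sentences:
          - "who is on the {name}"
          - "what is happening on the {name}"
```

Here `{name}` is the sentence-template slot for an exposed entity's name, so Assist can match any camera the user has exposed.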

This does not apply to AI conversation agents.

Tip

I have some example custom intents: JosephAbbey/ha_custom_sentences

Ask about events

The Intent

This one is easy: it just takes a start and an end time as input and returns the calendar events. In fact, I have an intent in my examples that does just this; however, it is optimised for future events and relative times, so a dedicated tool would still be very useful.
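The core of such a tool is just a time-window filter over events. A minimal, framework-free sketch (the `CalendarEvent` structure and `events_between` helper are assumptions for illustration, not the Home Assistant calendar API):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CalendarEvent:
    """Simplified stand-in for a calendar event (illustrative)."""
    summary: str
    start: datetime
    end: datetime


def events_between(events: list[CalendarEvent],
                   start: datetime, end: datetime) -> list[CalendarEvent]:
    """Return events whose time range overlaps the [start, end] window."""
    return [e for e in events if e.start < end and e.end > start]
```

The intent handler would call something like this with the slot values and format the matching events into the spoken response.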

The Sentences

This would have to be an AI-only intent, as there is no standard format for such queries.

Ask about the current state

The Intent

The general format of the intent is quite straightforward:

  • Accepts an entity (camera or image), a prompt and possibly a provider.
  • Has a description to explain to AI conversation agents what the tool does.
  • Has custom_sentences that allow Assist to recognise the intent.

However, the specifics are interesting:

  • A provider and a model need to be specified:
    • This could be a global configuration.
    • Or it could be passed as part of the query (e.g. "Ask Groq Llama who is on the doorbell camera.")
  • All of the configuration options (definitely a global setting):
    • remember
    • duration
    • frames
    • width
    • detail
    • temperature
    • max tokens
    • expose images

The best way I can see of allowing a user to specify a provider in the request is to create some sort of vision.* entity that stores the provider, model, and all of the configuration. The intent would then accept the vision entity as input (this also benefits the YAML mode for service calls, as an entity id can be specified instead of a provider id). That, however, is a large breaking change and restructure for the project.

The other option is just to have global configuration options for the intent.
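Under the global-configuration approach, the options listed above could live in a single settings object. A sketch as a dataclass, where the field names mirror the list and every default is illustrative rather than taken from the project:

```python
from dataclasses import dataclass


@dataclass
class VisionIntentConfig:
    """Global settings for the intent (all defaults are illustrative)."""
    provider: str = "openai"       # which vision provider to call
    model: str = "gpt-4o-mini"     # model name for that provider
    remember: bool = False         # whether to store the result
    duration: int = 5              # seconds of video to analyse
    frames: int = 3                # number of frames to extract
    width: int = 1280              # frame width in pixels
    detail: str = "low"            # image detail level sent to the model
    temperature: float = 0.3
    max_tokens: int = 100
    expose_images: bool = False    # whether to save analysed frames
```

A vision.* entity, if the project went that route, would essentially hold one instance of this per configured provider.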

The Sentences

I think that the general format of the sentence will be:

Who is on the door bell camera?

Here we match the phrase "on the" and the camera entity "door bell camera". Then either the whole sentence or just the "Who is" part is used as the prompt; I prefer the former.

I don't have a good way for built-in intent recognition to process a sentence like:

Who is at the door?

This is easy with an AI agent as the AI can recognise that the user has a door bell camera and call the intent as required.

Happy to help

I have written a bunch of intents before and am happy to help implement this if you would like to add this to your project.

@valentinfrlch
Owner

Thanks a lot, this sounds like an amazing addition! I don't have any experience writing intents for HA Assist and have limited time atm. But if you'd like to send a PR, I'd be happy to review and help if you have any questions with llm vision.

Looking forward to collaborating!
