Set Up Voice Control
In this section we will configure your Satellite1 so it can intuitively control the smart devices in your home. By the end of this section you should be able to say:
- "Hey Jarvis, are any doors unlocked in the home?"
- "Hey Jarvis, lock those doors please."
- "Hey Jarvis, what is the difference between a black hole and a white hole?"
- "Hey Jarvis, close the garage door and turn on the TV then tell me a joke."
What is a Voice Pipeline?
Think about your interaction with any voice assistant:
- Wake Word - You say a special phrase, like "Hey, Jarvis!"
- Speech-to-Text - Your voice command is recorded and converted to a text transcription.
- Conversation Agent - The text transcription is processed by a rules-based engine (or perhaps an LLM) which executes your command and generates a text response.
- Text-to-Speech - The text response is converted into a synthetic voice response that is played back through the speaker.
That's a voice pipeline. It's the backbone of any voice assistant. Each step in a voice pipeline can be modified and customized to fit your needs. What wake word do you want? What language are you speaking? Do you want a standard voice response or to hear Arnold Schwarzenegger speak back to you? Do you want Home Assistant, Google, or OpenAI to process and execute your command? Follow the steps below to set up a voice pipeline for your Satellite1.
Create a Voice Pipeline:

- Name your pipeline. Select your preferred Conversation Agent, Speech-to-Text, and Text-to-Speech engines.
Standard Conversation Agents
There are two standard voice pipelines we recommend trying out to get your feet wet:

- Home Assistant's Cloud Assist Pipeline (Requires a paid Home Assistant Cloud account; response times are fast!)
- Home Assistant's Local Assist Pipeline (Free and completely private; response times depend on your hardware.)
Local AI Conversation Agents
The FutureProofHomes team is working on a Local AI Base Station that "just works". Therefore, these docs will avoid sending you down a deep local AI rabbit hole. :) Thanks for your patience & stay tuned!
Once you have one of the standard pipelines operational, you can upgrade to a Generative AI conversation agent. These agents allow the voice assistant to respond to more natural, conversational commands like, “It’s dark in here,” to turn on the lights, instead of requiring specific phrases like, “Turn on the living room lights.” Implementing completely local Generative AI LLMs, however, is not for the faint of heart. Your results may vary depending on factors such as the model you choose (e.g., Llama, Qwen, etc.), the number of entities exposed, how those entities and their aliases are named, the GPU’s capabilities (more VRAM is better), and many other variables. If you’re ready to take on this challenge, here are some tutorial videos to help you get started. Good luck!
- Ollama AI Powered Conversation Agent (Free, requires a GPU, and can be hard to set up with semi-reliable function calling.)
Cloud AI Conversation Agents
Warning: You are entrusting control of your home to a cloud-based artificial intelligence that does not protect your privacy. Proceed with caution!
- Google AI Conversation Agent (Free, but will collect your data.)
- OpenAI ChatGPT Conversation Agent (Expensive, will collect your data, and not open at all despite the marketing name.)
NOTE: The following prompt has performed well with both OpenAI's and Google's conversation agents:
Your name is Jarvis and you are a voice assistant for Home Assistant.
Answer questions about the world truthfully.
Answer in plain text without markdown language.
Keep responses simple and to the point.
Always use 12hr time formats.
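For context, chat-style AI agents combine a system prompt like the one above with your transcribed command on every request. The sketch below shows that message structure in Python (the `build_messages` helper is hypothetical and for illustration; it is not part of Home Assistant or any vendor SDK):

```python
# How a system prompt and a spoken command are typically combined for a
# chat-style LLM API. The helper below is a hypothetical illustration.

SYSTEM_PROMPT = (
    "Your name is Jarvis and you are a voice assistant for Home Assistant. "
    "Answer questions about the world truthfully. "
    "Answer in plain text without markdown language. "
    "Keep responses simple and to the point. "
    "Always use 12hr time formats."
)

def build_messages(command: str) -> list[dict]:
    # Chat APIs take the system prompt first, then the user's transcribed command.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": command},
    ]

messages = build_messages(
    "What is the difference between a black hole and a white hole?"
)
```

Because the system prompt rides along with every command, tweaking it (the assistant's name, tone, or formatting rules) immediately changes how all responses sound.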
Combine Conversation Agents
Combine a standard conversation agent with an AI conversation agent (It's like magic!)
Requires Home Assistant version 2024.12.1 or later
A "fall back" pipeline will first use a non-AI conversation agent to process your request, and if that fails it will fall back to your preferred AI conversation agent. Combining these two conversation agents results in an almost magical voice experience and is highly recommended.
Simply toggle on the "Prefer handling commands locally" switch underneath your Generative AI conversation agent:
[Read more about this feature release](https://www.home-assistant.io/blog/2024/12/04/release-202412/#let-your-voice-assistant-fall-back-to-an-llm-based-agent){ .md-button .md-button--primary }
Assign a Voice Pipeline
- Go to Settings -> Devices & Services -> ESPHome and click "1 device" under your Satellite1 device.
- You can also set your preferred wake word. (NOTE: If you want to build your own custom wake word, then read here.)
Congratulations! You've created your own voice pipeline for your Satellite1.
Exposing Entities
Your Home Assistant instance likely has hundreds, if not thousands, of entities. If you want your voice assistant to control them, you need to expose them to it. Here's how: