
Smart Voice Control

Voice Local Privacy

With Home Assistant Assist, you can control your smart home by voice, entirely locally and without the cloud. No data leaves your home, and it keeps working when the internet is down.


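| Component | Function | Local Options |
|-----------|----------|---------------|
| Wake Word | Listens for activation word | openWakeWord, microWakeWord |
| STT | Speech to text | Whisper, Speech-to-Phrase |
| Intent | Understands commands | HA Conversation, LLM |
| TTS | Text to speech | Piper (Home Assistant Cloud is also available, but it is not local) |
| Satellite | Microphone/speaker | ESP32, Voice PE |
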
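
| Device | Price | Display | Microphone | Speaker | Wake Word |
|--------|-------|---------|------------|---------|-----------|
| ATOM Echo | ~$13 | | ✅ | (small) | On-device |
| S3-BOX-3 | ~$50 | ✅ Touch | ✅ | ✅ | On-device |
| Voice PE | ~$59 | | ✅✅ | ✅✅ | On-device |
| CoreS3SE | ~$70 | ✅ Touch | ✅ | ✅ | On-device |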

# Settings → Add-ons → Add-on Store
# Search "Whisper" → Install
# Configuration (Settings tab):
model: small     # tiny, base, small, medium, large
language: en     # Your language
# Start add-on
# Wait for model download (can take time)

# Settings → Add-ons → Add-on Store
# Search "Piper" → Install
# Configuration:
voice: en_US-lessac-medium     # English voice
# Start add-on

# Settings → Add-ons → Add-on Store
# Search "openWakeWord" → Install → Start
# Supported wake words:
#   - "Ok Nabu"
#   - "Hey Jarvis"
#   - "Alexa"
#   - "Hey Mycroft"
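
The add-on steps above assume an installation that has the Add-on Store. On a setup without it (for example Home Assistant Container), the same three services can be run as Wyoming containers instead. The following is a minimal Docker Compose sketch, not a tested deployment: it assumes the `rhasspy/wyoming-*` images and their default ports, and the local data directory names are hypothetical. The model name `tiny-int8` follows the image's own naming and may differ from the add-on's option names.

```yaml
# docker-compose.yml (sketch; directory names under ./ are assumptions)
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en
    volumes:
      - ./whisper-data:/data      # downloaded models are stored here
    ports:
      - "10300:10300"             # Wyoming speech-to-text
    restart: unless-stopped

  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    volumes:
      - ./piper-data:/data        # downloaded voices are stored here
    ports:
      - "10200:10200"             # Wyoming text-to-speech
    restart: unless-stopped

  openwakeword:
    image: rhasspy/wyoming-openwakeword
    command: --preload-model 'ok_nabu'
    ports:
      - "10400:10400"             # Wyoming wake word detection
    restart: unless-stopped
```

Each service would then be added in Home Assistant through the Wyoming Protocol integration, using the host's address and the matching port.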

Price: ~$13

The cheapest voice satellite:

  • ✅ ESP32 based
  • ✅ Built-in microphone + speaker
  • ✅ On-device wake word (microWakeWord)
  • ✅ LED status indicator
  • ✅ Easy web setup
  • ⚠️ Small speaker (low volume)
  • ❌ No display

# 1. Go to: https://www.home-assistant.io/voice_control/thirteen-usd-voice-remote/
# 2. Click "Connect" in Chrome/Edge
# 3. Select COM port
# 4. Click "Install Voice Assistant"
# 5. Enter WiFi credentials
# 6. Device appears in HA

substitutions:
  name: living-room-voice
  friendly_name: "Living Room Voice Assistant"
  micro_wake_word_model: hey_jarvis

packages:
  m5stack.atom-echo:
    url: https://github.com/esphome/firmware
    files:
      - voice-assistant/m5stack-atom-echo.yaml
    refresh: 0s

esphome:
  name: ${name}
  friendly_name: ${friendly_name}

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

Buy: M5Stack, AliExpress


| Wake Word | Language | Model |
|-----------|----------|-------|
| "Ok Nabu" | Multi | openWakeWord, microWakeWord |
| "Hey Jarvis" | English | microWakeWord |
| "Alexa" | Multi | microWakeWord |
| "Hey Mycroft" | English | microWakeWord |

# 1. Go to: https://www.home-assistant.io/voice_control/create_wake_word/
# 2. Choose a unique word (3-4 syllables)
#    - Avoid common words
#    - Only English supported currently
# 3. Generate training data with Piper
# 4. Train model (may take several attempts)
# 5. Download and install

automation:
  # Broadcast to all speakers
  - alias: "Voice - Good Morning Broadcast"
    trigger:
      - platform: time
        at: "07:00:00"
    condition:
      - condition: state
        entity_id: binary_sensor.workday
        state: "on"
    action:
      - service: tts.speak
        target:
          entity_id: tts.piper
        data:
          media_player_entity_id:
            - media_player.living_room_speaker
            - media_player.kitchen_speaker
          # int(0) falls back to 0 when the rain sensor is unavailable
          message: >
            Good morning! It's 7 o'clock.
            The temperature outside is {{ states('sensor.outdoor_temperature') }} degrees.
            {% if states('sensor.rain_probability') | int(0) > 50 %}
            Remember your umbrella - there's a chance of rain.
            {% endif %}

  # Voice reminder
  - alias: "Voice - Washing Machine Done"
    trigger:
      - platform: state
        entity_id: binary_sensor.washing_machine_running
        to: "off"
    action:
      - service: tts.speak
        target:
          entity_id: tts.piper
        data:
          media_player_entity_id: media_player.kitchen_speaker
          message: "The washing machine is done. Don't forget to empty it."

  # Welcome home
  - alias: "Voice - Welcome Home"
    trigger:
      - platform: state
        entity_id: person.brian
        to: "home"
    action:
      # Wait 30 seconds so the message plays once the person is inside
      - delay: "00:00:30"
      - service: tts.speak
        target:
          entity_id: tts.piper
        data:
          media_player_entity_id: media_player.hallway_speaker
          message: >
            Welcome home!
            It's {{ states('sensor.indoor_temperature') }} degrees inside.
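
The automations above speak on a time or state change. An automation can also start from a spoken sentence, using a sentence (conversation) trigger. The sketch below reuses the washing machine sensor from above; the alias and the example phrases are assumptions chosen for illustration, and `set_conversation_response` requires a reasonably recent Home Assistant release.

```yaml
automation:
  # Answer a spoken question about the washing machine
  - alias: "Voice - Laundry Status"   # hypothetical alias
    trigger:
      - platform: conversation
        command:
          - "is the laundry done"
          - "is the washing machine [still] running"   # [still] is optional
    action:
      # The reply is spoken by whichever satellite heard the question
      - set_conversation_response: >
          {% if is_state('binary_sensor.washing_machine_running', 'on') %}
          The washing machine is still running.
          {% else %}
          The washing machine is done.
          {% endif %}
```

Because the response goes back through the Assist pipeline, no `tts.speak` call or media player target is needed here.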

# Lights
"Turn on the living room light"
"Turn off all lights"
"Set the kitchen brightness to 50 percent"
"Change the bedroom color to blue"
# Climate
"What's the temperature?"
"Set the thermostat to 72 degrees"
"Turn on the bathroom heater"
# Devices
"Turn on the TV"
"Start the vacuum"
"Lock the front door"
# Information
"What's the weather today?"
"When does the sun set?"
"Is anyone home?"
# Scenes
"Activate movie night"
"Good night"
"I'm leaving home"

# configuration.yaml or via UI
intent_script:
  CustomWelcome:
    speech:
      text: "Welcome! What can I help with?"

conversation:
  intents:
    CustomWelcome:
      - "hey [assistant]"
      - "hello"
      - "what can you do"
type: vertical-stack
cards:
  # Voice status
  - type: entities
    title: "🎤 Voice Control"
    entities:
      - entity: assist_satellite.living_room_voice
        name: "Living Room Satellite"
      - entity: assist_satellite.bedroom_voice
        name: "Bedroom Satellite"
      - entity: binary_sensor.whisper_running
        name: "Whisper Status"
      - entity: binary_sensor.piper_running
        name: "Piper Status"
  # Test button
  - type: button
    name: "Test Voice"
    tap_action:
      action: call-service
      service: tts.speak
      target:
        entity_id: tts.piper
      data:
        media_player_entity_id: media_player.living_room_speaker
        message: "Voice control is working!"
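
The test button above plays through a media player. To check that a satellite itself can speak, a second button could call `assist_satellite.announce` instead. This is a sketch that assumes the satellite entity shown in the card above exists and that the installed Home Assistant version provides the announce action; the button name is hypothetical.

```yaml
type: button
name: "Test Satellite"   # hypothetical button name
tap_action:
  action: call-service
  service: assist_satellite.announce
  target:
    entity_id: assist_satellite.living_room_voice
  data:
    message: "Voice control is working!"
```

The announcement uses the text-to-speech engine of the pipeline assigned to that satellite, so it also confirms that the pipeline's TTS stage is reachable.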

Frequently Asked Questions

How much RAM does local voice require?
Whisper's 'small' model uses about 2 GB of RAM, and 'medium' uses about 5 GB. A Raspberry Pi 4 with 4 GB can run 'small'. For better performance, a mini PC or dedicated server is recommended.
Can I use multiple wake words?
Yes! Since HA 2025.10 you can have up to 2 wake words per satellite. You can use 'Ok Nabu' for one language and 'Hey Jarvis' for another pipeline.
Why is the response slow?
Usually the Whisper model is too large for your hardware. Try the 'tiny' or 'small' model. Also check that Home Assistant is running on fast storage (an SSD, not an SD card).
Can I use this with LLMs like ChatGPT?
Yes! You can configure your Assist pipeline to use OpenAI, Ollama (local), or other LLMs as the conversation agent for more natural interactions.
What's the difference between Whisper and Speech-to-Phrase?
Whisper is a general-purpose speech-to-text model that transcribes arbitrary speech. Speech-to-Phrase is optimized for smart home commands and is faster and lighter, but it only recognizes a limited set of predefined phrases.


Last updated: December 2025

