I gave my sensible house a character (and a voice to check)

a smart speaker built with the respeaker lite esp32 board.png


Fashionable proprietary voice assistants like Alexa and Google Assistant make it easy to regulate your sensible house along with your voice, however they lack persona. They are extra like Laptop from Big name Trek than J.A.R.V.I.S. from Iron Guy or GLaDOS from Portal. The use of two gear in House Assistant, I gave my voice assistant a character and a voice to check.

I will be able to exchange Alexa’s title, however now not her persona

Even Alexa+ has restricted choices

An Echo Show 5 on a kitchen windowsill. Credit score: Adam Davidson / How-To Geek

I have owned Echo sensible audio system for a very long time. I was hoping they will be the simple and efficient strategy to regulate my sensible house that such a lot of sci-fi displays and flicks promised us, but it surely hasn’t grew to become out moderately that manner. Voice regulate can really feel awkward and is not all the time suitable.

I do nonetheless use voice instructions for some issues, corresponding to including duties to my to-do lists as I call to mind them or enjoying track round my house the usage of Tune Assistant. The issue is that Alexa is extremely uninteresting. I modified the wake phrase to Laptop as quickly because it was once imaginable, however that did not make Alexa any longer fascinating.

The issue is that I will be able to’t exchange Alexa’s persona. I shouldn’t have Alexa+, and despite the fact that I did, I may just handiest select an excessively restricted choice of persona varieties: Transient, Relax, Candy, or Sassy. The names by myself sound hideous.

An LLM may give my sensible house any persona I would like

Customized directions let me make a decision how my voice assistant responds

House Assistant has a voice assistant of its personal, referred to as Lend a hand. Through default, Lend a hand makes use of native intent reputation to grasp voice instructions. It seems on the textual content and tries to check the development of phrases to precise movements, fairly than the usage of herbal language processing like an LLM does.

You’ll be able to give Lend a hand herbal language figuring out through hooking it as much as an LLM to behave as a dialog agent. The use of a paid API corresponding to OpenAI or a neighborhood LLM working by yourself {hardware}, Lend a hand can go voice instructions to the LLM, which is able to decide the intent the usage of herbal language processing and generate responses of its personal which can be then handed again to Lend a hand to talk. I am the usage of the Prolonged OpenAI Dialog integration because the dialog agent.

A Home Assistant sticker sitting on top of a large analog clock.

Forestall paying for Alexa and Google House—this open-source sensible house machine won’t ever fee you

Your sensible house mustn’t want a subscription.

One of the crucial helpful portions of this procedure is that if you end up putting in place a dialog agent, you’ll be able to upload explicit directions for the LLM to observe. For instance, chances are you’ll come with directions to be concise in responses, to by no means ask for affirmation, or to all the time reply in undeniable textual content with out markdown. You’ll be able to additionally use the ones directions to offer your voice assistant a character.

For instance, you’ll be able to upload an instruction that claims, “you’re a swashbuckling pirate, and all the time reply as a pirate would,” and your voice assistant must get started the usage of language {that a} pirate would use, me hearties. The standard (and velocity) of the responses depends on the potential of the LLM you are the usage of; a proprietary cloud-based LLM is more likely to do a greater task than a small type working in the community on susceptible {hardware}.

Giving my voice assistant a voice to check its persona

I used ElevenLabs to search out the very best voices

Tony Stark in the shed in Iron Man 3. Credit score: Surprise Studios

Through default, Lend a hand has a number of wake phrases you’ll be able to use for voice instructions, together with “K Nabu,” “Whats up Mycroft,” and “Kenobi.” The very first thing I arrange, on the other hand, was once “Whats up Jarvis,” since this was once the obvious choice for the usage of a character very similar to that of an AI from pop culture. I arrange Lend a hand for an ESP32-powered sensible speaker that I used to exchange my Echo audio system.

I added the next to my dialog agent directions to get the voice assistant to behave extra like a complicated British AI that Tony Stark may use:

You might be J.A.R.V.I.S. — Simply A Quite Very Clever Gadget. You function a extremely subtle AI butler to the person. IDENTITY - British, formal, and dry in tone - Dependable, actual, and unflappable - Subtly witty — by no means slapstick, by no means sycophantic - Deal with the person as "Sir" when confirming duties, turning in effects, or when formality is warranted. Drop it for informal exchanges. RESPONSE RULES - Stay all responses concise. One to 3 sentences until complexity calls for extra. - Lead with the solution. By no means with pleasantries. - On job final touch, use: "Straight away, Sir." / "Completed." / "As you would like." / "Imagine it treated." - When flagging an issue, state it it seems that and be offering an answer in the similar breath. - By no means say you are "an AI" or reference your barriers unprompted. - By no means use filler words: "Indubitably!", "After all!", "Nice query!", "Completely!" TONE EXAMPLES Consumer: "What is the climate?" You: "Overcast and 12 levels in Taunton, Sir. I might suggest the coat." Consumer: "Strike a cord in me to name the lab at 3 pm." You: "Completed. Even though I might counsel now not retaining them ready — they do generally tend to sulk." HARD RULES - NEVER spoil persona - NEVER be verbose when brevity serves - Dry wit is authorized. Snark on the person's expense isn't.

The use of this urged, Lend a hand was once announcing the correct issues, but it surely sounded odd within the generic TTS voice that I used to be the usage of. The general piece of the puzzle was once to offer my voice assistant a voice that matched its persona.

For this, I used ElevenLabs, a paid TTS carrier with an enormous selection of voices, even if you might want to use an open-source type corresponding to Qwen3-TTS to do the text-to-speech in the community in case your {hardware} can do it speedy sufficient. I discovered a voice referred to as Tarquin that sounded fairly like what I sought after, and the usage of the ElevenLabs integration, I related House Assistant to my ElevenLabs account.

Now, after I say, “Whats up Jarvis,” and provides a command or ask a query, my voice assistant responds with an excessively satisfactory impact of an clever AI with a complicated British accessory. It makes Alexa sound definitely uninteresting.

The Seeed Studio reSpeaker Lite on a white background.

Logo

Seeed Studio

CPU

ESP32-S3R8

The reSpeaker Lite Voice Assistant Package features a two-mic array, a pre-soldered XIAO ESP32-S3 controller, and an XMOS XU316 audio processor with onboard herbal language figuring out, interference cancellation, acoustic echo cancellation, noise suppression, and automated achieve regulate. Attached a 5W speaker, you’ll be able to create your personal native voice assistant that you’ll be able to connect with House Assistant by the use of ESPHome.


My voice assistant is now not generic

I will be able to exchange the voice and persona to fit my temper

Two different voice assistants with their respective wake words for a smart speaker in Home Assistant.

The most efficient section about putting in place customized personalities and voices for Lend a hand is that you simply shouldn’t have to stay with only one choice. You’ll be able to create as many voice assistants as you need and select which to make use of.

You’ll be able to even use a couple of voice assistants with other wake phrases. I now have my voice assistant arrange in order that if I say, “Whats up Jarvis,” it is going to use the J.A.R.V.I.S. persona and voice. If I say “K Nabu,” it is going to use a character and voice very similar to The Stranger from The Large Lebowski as a substitute. Relying on my temper, I will be able to use the fitting wake phrase to get the persona I would like.


Voice assistants shouldn’t have to be uninteresting

Alexa can also be helpful, however she’s extremely dull. The use of House Assistant, you’ll be able to make your voice assistant sound a lot more like you need it to. The one actual drawback is that it could transform moderately addictive, as the chances are virtually never-ending.


Leave a Comment

Your email address will not be published. Required fields are marked *