Agentic AI Voice Mode

Duration:

7 months (2025-26)

Teams:

CXD Noida & CRI Bangalore

Role:

Benchmarking, concept exploration, user flows, prototyping, UI design, motion design, UI validation

What is it?

Context

India's smart home market is growing fast, but voice control hasn't caught up. Most users rely on Alexa or Google Home, platforms that require setup, speak primarily in English, and have no native understanding of the Havells device ecosystem. For the many households that have already invested in Havells appliances, controlling them by voice means jumping between apps, linking accounts, and learning a new command syntax that doesn't match how they naturally speak.

Why it exists?

Problem Statement

The brief for Voice Mode was to build an AI-powered voice agent that feels native to the Havells One app, works with zero configuration, and speaks the way Indian households actually speak. No jumping between apps. No Alexa. No Google Home. No account linking. Just talk.

Control screens of a smart switch plate, an air purifier, and a fan

Who is it for?

Consumer Study

Before touching a single frame, we spent two weeks auditing how existing voice assistants behave in a smart home context. We looked at Alexa+, Google Home, LG, and a handful of regional players like Atomberg to understand where the experience falls apart for Indian users.

The Multitasking Parent is walking in with a sleeping toddler, hands literally full. She needs to turn on the AC and dim the lights without unlocking her phone or navigating an interface. A direct voice command resolves the whole situation in one breath.

The Elderly User, a grandfather, is frustrated by small text and complex menus. The breakthrough insight here was language: when he says "bahar ki light on kar do" ("turn on the outside light"), the system should respond in kind. "Ji, light on kar di hai" ("Yes, the light is on") isn't just a translation; it's a signal that the technology understands him on his terms.

The Allergy Sufferer is returning home after a day in a dusty city and needs clean air immediately. From "Ghar ki hawa saaf kar do, bahut dhool mitti hai aaj" ("Clean the air in the house, it's very dusty today"), the system infers that what's needed is the air purifier set to fast speed, without requiring the user to name the device or the setting.

Insights

These user stories drove concrete product decisions. Multilingual support, contextual inference, preset orchestration, and the conversational tone of AI responses all trace back directly to a specific persona's need.

What we built?

Solution

Voice Mode introduces an agentic AI assistant within the Havells One app that allows users to interact with their smart home using natural, conversational language. 

Instead of manually controlling devices or creating automations, users can simply express what they want to achieve. The system interprets user intent, understands contextual cues, and orchestrates multiple devices seamlessly to deliver the desired outcome. 
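The intent-to-device orchestration described above can be sketched as a lookup from a parsed intent to a fan-out of device commands. This is a minimal illustration only; every name here (the intents, devices, and settings) is a hypothetical assumption, not the production Havells One API.

```typescript
// Hypothetical sketch: one spoken intent fans out to one or more device commands.
type DeviceCommand = { device: string; setting: string; value: string };

// Assumed intent keys for illustration; a real system would derive these
// from natural-language understanding, not a fixed table.
const intentMap: Record<string, DeviceCommand[]> = {
  clean_air: [
    { device: "air_purifier", setting: "speed", value: "fast" },
  ],
  coming_home: [
    { device: "ac", setting: "power", value: "on" },
    { device: "lights", setting: "brightness", value: "30%" },
  ],
};

// Resolve an intent into the set of commands to execute; unknown intents
// resolve to no action rather than an error.
function orchestrate(intent: string): DeviceCommand[] {
  return intentMap[intent] ?? [];
}
```

The point of the sketch is the fan-out: the user names an outcome ("clean the air"), and the system, not the user, decides which devices and settings achieve it.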

Discovery

Adding Widget/Happy flow

Feedback

Error cases

Motion Design

Static screens are easy to design. What Voice Mode actually needed was a system of motion that communicated three distinct states (idle, active, processing) through changes in the oval's feel without ever being distracting. The principle we kept returning to was "motion with intent." Every transition had to serve a communicative purpose: state change, feedback, or emotional tone.

Listening

Working on it

Start-Up
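The three motion states behave like a small finite-state machine, where each allowed transition corresponds to one deliberate animation. The sketch below is illustrative only; the state and event names are assumptions, not the shipped implementation.

```typescript
// Illustrative voice-orb state machine: idle -> listening -> processing -> idle.
type OrbState = "idle" | "listening" | "processing";
type OrbEvent = "micTapped" | "speechEnded" | "responseReady";

// Each entry is one legal transition; each transition is one animation.
// "Motion with intent": if the state doesn't change, nothing moves.
const transitions: Record<OrbState, Partial<Record<OrbEvent, OrbState>>> = {
  idle: { micTapped: "listening" },
  listening: { speechEnded: "processing" },
  processing: { responseReady: "idle" },
};

function next(state: OrbState, event: OrbEvent): OrbState {
  // Events that don't apply leave the orb where it is,
  // rather than triggering a distracting glitch animation.
  return transitions[state][event] ?? state;
}
```

Modeling motion this way keeps the animation vocabulary small: three states, three transitions, and nothing in between.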

What we chose?

Design Trade-offs

Controlled Beta Rollout vs Full Release

We launched Voice Mode as a controlled beta with a limited set of users rather than a full release.


This allowed us to iterate quickly based on feedback before scaling, rather than risking a suboptimal experience across the entire user base.

Tap-to-Speak vs Continuous Listening

Instead of a continuous speech model, the first version used a tap-to-speak interaction, requiring users to press the mic for each command.

While this introduced an extra step, it ensured better control and reduced false triggers in the early stages. A continuous, hands-free model is planned for the future.
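The tap-to-speak model can be sketched as a simple gate: recognition only runs between an explicit mic press and the end of a single utterance. The class and method names below are hypothetical, used purely to illustrate why this model eliminates false triggers.

```typescript
// Hypothetical tap-to-speak gate: one press opens the mic for exactly
// one command, then it closes again. No always-on wake word.
class TapToSpeakMic {
  private open = false;

  press(): void {
    // The user explicitly opts in for each command.
    this.open = true;
  }

  hear(utterance: string): string | null {
    // Ambient speech while the mic is closed is ignored entirely,
    // which is what removes false triggers in the early rollout.
    if (!this.open) return null;
    this.open = false; // mic closes after a single command
    return utterance;
  }
}
```

The trade-off is visible in the code: the extra `press()` step is the cost, and the guaranteed-closed mic between commands is the benefit.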

What changed?

Key metrics

Voice Mode activation rate (% of eligible sessions where Voice Mode was opened): [X]%

Task completion rate (commands resulting in a device state change): [X]%

Multi-device command rate (commands affecting 2+ devices in a single session): [X]%

Repeat activation rate (users returning to Voice Mode in a second session): [X]%