Agentic AI Voice Mode

Duration:

7 months (2025-26)

Teams:

CXD Noida & CRI Bangalore

Role:

Benchmarking, concept exploration, user flows, prototyping, UI design, motion design, UI validation

What is it?

Context

India's smart home market is growing fast, but voice control hasn't caught up. Most users rely on Alexa or Google Home - platforms that require setup, speak primarily in English, and have no native understanding of the Havells device ecosystem. For the many households that have already invested in Havells appliances, controlling them by voice means jumping between apps, linking accounts, and learning a new command syntax that doesn't match how they naturally speak.

Why it exists?

Problem Statement

The brief for Voice Mode was to build an AI-powered voice agent that feels native to the Havells One app, works with zero configuration, and speaks the way Indian households actually speak. No jumping between apps. No Alexa. No Google Home. No account linking. Just talk.

Control screens of a smart switch plate, an air purifier, and a fan

Who is it for?

Consumer Study

Before touching a single frame, we spent two weeks auditing how existing voice assistants behave in a smart home context. We looked at Alexa+, Google Home, LG, and a handful of regional players like Atomberg to understand where the experience falls apart for Indian users.

The Multitasking Parent walks in with a sleeping toddler, hands full - navigating an app isn't an option.

The Elderly User is frustrated by small text and complex menus, and needs the system to respond in Hindi when he speaks in Hindi.

The Allergy Sufferer says "ghar ki hawa saaf kar do" ("clean the air in the house") and expects the right device to activate at the right setting, without naming the device or the room they are in.

Insights

These user stories drove concrete product decisions. Multilingual support, contextual inference, preset orchestration, and the conversational tone of AI responses all trace back directly to a specific persona's need.

What we built?

Solution

Voice Mode introduces an agentic AI assistant within the Havells One app that allows users to interact with their smart home using natural, conversational language. 

Instead of manually controlling devices or creating automations, users can simply express what they want to achieve. The system interprets user intent, understands contextual cues, and orchestrates multiple devices seamlessly to deliver the desired outcome. 
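The step from an utterance to a set of device actions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the intent names, the keyword lists, the device identifiers, and the settings are hypothetical, and simple keyword matching stands in for the AI model that interprets intent in the actual feature.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DeviceAction:
    """One instruction sent to one device (hypothetical shape)."""
    device: str
    setting: str


# Hypothetical keyword lists; a real system would use a language model here.
INTENT_KEYWORDS = {
    "clean_air": ("hawa saaf", "clean the air", "purify"),
    "cool_room": ("garmi", "too hot", "cool down"),
}

# Hypothetical mapping from an intent to the device actions that fulfil it.
INTENT_ACTIONS = {
    "clean_air": [DeviceAction("air_purifier", "auto")],
    "cool_room": [DeviceAction("fan", "high")],
}


def resolve_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def plan_actions(utterance: str) -> List[DeviceAction]:
    """Turn an utterance into device actions; empty list if not understood."""
    intent = resolve_intent(utterance)
    if intent is None:
        return []
    return list(INTENT_ACTIONS[intent])


if __name__ == "__main__":
    print(plan_actions("Ghar ki hawa saaf kar do"))
```

The point of the sketch is that the user never names a device: the utterance maps to an outcome, and the outcome maps to devices. An empty result would correspond to one of the error cases the interface has to handle.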

Discovery

Understanding the intent

Adding Widget

Feedback

Error cases

Motion Design

Static screens are easy to design. What Voice Mode actually needed was a system of motion that communicated three distinct states - idle, active, processing - through changes in the oval's feel without ever being distracting. The principle we kept returning to was "motion with intent." Every transition had to serve a communicative purpose: state change, feedback, or emotional tone.
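One way to make "motion with intent" concrete is to treat the oval as a small state machine in which every allowed transition must declare its purpose. The sketch below is illustrative only: the three states come from the description above, but the specific transitions and the purpose assigned to each are assumptions, not the shipped motion spec.

```python
from enum import Enum


class OvalState(Enum):
    IDLE = "idle"
    ACTIVE = "active"          # listening to the user
    PROCESSING = "processing"  # working on the request


class Purpose(Enum):
    STATE_CHANGE = "state change"
    FEEDBACK = "feedback"
    EMOTIONAL_TONE = "emotional tone"


# Assumed transition table: a transition exists only if it has a purpose.
ALLOWED_TRANSITIONS = {
    (OvalState.IDLE, OvalState.ACTIVE): Purpose.STATE_CHANGE,
    (OvalState.ACTIVE, OvalState.PROCESSING): Purpose.FEEDBACK,
    (OvalState.PROCESSING, OvalState.IDLE): Purpose.FEEDBACK,
    (OvalState.ACTIVE, OvalState.IDLE): Purpose.STATE_CHANGE,
}


def transition(current: OvalState, target: OvalState) -> Purpose:
    """Return the purpose of a transition, or reject it if it has none."""
    try:
        return ALLOWED_TRANSITIONS[(current, target)]
    except KeyError:
        raise ValueError(
            f"No purposeful transition from {current.value} to {target.value}"
        ) from None


if __name__ == "__main__":
    print(transition(OvalState.IDLE, OvalState.ACTIVE).value)
```

Framing it this way gives a simple review question for every animation: if a transition cannot be given a purpose in the table, it probably should not exist.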

Start-Up

Listening

Working on it

What we chose?

Design Trade-offs

Controlled Beta Rollout vs Full Release

A beta rollout to 1,000 users allowed us to iterate quickly on feedback before scaling, rather than risking a suboptimal experience across the entire user base.

Tap-to-Speak vs Continuous Listening

We chose tap-to-speak for the beta. While it introduced an extra step, it gave users better control and reduced false triggers in the early stages. A continuous, hands-free model is planned for the future.

What changed?

Key metrics

Voice Mode achieved a 74% activation rate and an 89% task completion rate in beta, validating both the zero-setup architecture and the multilingual AI approach. My colleagues and I demoed the feature to Havells leadership, and it was subsequently showcased at an internal Dealer's Meet event, marking its first public-facing presentation before the official rollout on the Play Store and App Store.
