Voice in the Metaverse
How a design tool for crafting rich voice experiences in apps on Meta devices created a new developer community.
Joining Meta in 2021, I was excited to contribute to the metaverse!
Meta's Quest VR headset made the metaverse accessible to a wider audience through its affordability. That widespread adoption pushed Meta to expand its immersive offerings, including tools that let third-party developers craft voice experiences and voice commands for their apps.
Embracing the Metaverse Challenge
Team: Me (lead product designer), conversation designer, tech lead, and three engineers—later expanding to include a project manager, product manager, UX researcher, and content writer.
We had two primary goals:
Empower creators to design, test, and launch complete voice experiences on Meta devices
Help Meta's internal design teams craft engaging conversational experiences
Elements of a typical conversational flow
Basic Natural Language Processing (NLP) framework
Resilience Amidst Challenges
Despite the absence of official UX research support and a dedicated project manager, we rallied together and shared management responsibilities. Through grassroots user research, including conversations with designers, engineers, and product managers, we unearthed invaluable insights into how they worked and what they wanted in a conversation design tool. Our top findings included the need for:
Reusable assets (both singular and grouped)
Testing and debugging baked into the tool
An approval system for collaborative teams (determined to be a stretch goal)
A system usable across a wide spectrum of domain knowledge (from beginners to experts)
The biggest pain point was the difficulty of quickly parsing details when collaborating to understand a scenario's flow of actions.
Wit.ai state of design
Reviving Wit.ai for Modern Challenges
We needed a platform to host our feature, and we had two choices: build a new third-party platform or revive Wit.ai, an existing platform that had been largely unsupported since 2015 and served as the foundation of the internal version of the tool I also led. Despite the lapse in support, Wit.ai, a Natural Language Processing platform, was still actively used by many developers to integrate voice experiences. We therefore decided to support Wit.ai once again and announce our new feature at Connect 2022.
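For a sense of what that integration looks like on the developer side, here is a minimal sketch of querying a Wit.ai app's /message endpoint from TypeScript. The token, utterance, and response handling are illustrative assumptions rather than a prescribed pattern.

```typescript
// Minimal sketch of calling Wit.ai's /message endpoint to extract the
// intent and entities from a single utterance. The token below is a
// placeholder for an app's server access token.
const WIT_TOKEN = "<YOUR_SERVER_ACCESS_TOKEN>";

async function understand(utterance: string) {
  const url = `https://api.wit.ai/message?q=${encodeURIComponent(utterance)}`;
  const res = await fetch(url, {
    headers: { Authorization: `Bearer ${WIT_TOKEN}` },
  });
  if (!res.ok) throw new Error(`Wit.ai request failed: ${res.status}`);

  const data = await res.json();
  // The response lists candidate intents with confidence scores, plus any
  // entities extracted from the utterance.
  const topIntent = data.intents?.[0];
  return {
    intent: topIntent?.name,
    confidence: topIntent?.confidence,
    entities: data.entities,
  };
}

// Example: route a voice command in an app.
understand("turn on the living room lights").then(console.log);
```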
Top themes that emerged from our research
Understanding More with Official UX Research
After advocating for new resources, we dove deeper, conducting remote 1:1 interviews with 8 conversation designers at Meta in collaboration with our dedicated UX researcher. These designers, experienced with platforms like Alexa, provided valuable insights into creating voice experiences. From the research, we identified six key themes but focused on two due to time constraints:
Simplicity: Emphasis on visual simplicity and ease of use in tools.
Versioning: Need for a unified source of truth for versioning interactions, especially during testing.
Collaborative sketches for the Simplicity theme
Collaborative sketches for the Versioning theme
A novel idea to leverage an electrical engineering learning game’s feedback system for our troubleshooting experience
An Electrifying Breakthrough
I facilitated design charrette workshops to understand where the team saw significance in each theme and to ensure we moved forward with a shared vision. During these sessions, participants highlighted:
Simplicity: Importance of standardized patterns in conversational flows, simplified interactions amidst complex data, and user education in tool exploration.
Versioning: Strong demand for multiplayer support, clear statuses of logical components, and the ability to select content for publication to facilitate iterative updates.
A new idea, inspired by a children's electrical engineering (EE) learning game, was proposed during one of our feedback sessions. This idea provided an intuitive feedback model that communicates when a connection passes or fails, which the team enthusiastically embraced.
Conversational flow concept leveraging Unified Modeling Language (UML)
I tested our new design language against organizational artifacts collected from conversation designers during our research
Designing for Clarity and Utility
Our research and collaboration with our conversation designers led to many explorations in design language. We experimented with traditional Unified Modeling Language (UML) and variations that grouped logic to create more visually compact flows. However, after testing with our team, we discovered that the conversational field is still relatively new and lacks a canonical design language. This pushed us to refine our approach, opting for a streamlined design with distinct color identifiers for enhanced usability.
Introducing Composer, a dynamic tool for crafting engaging voice experiences across Meta's devices. With its flexible platform, users can easily map out logical flows, enabling intelligent responses to user queries and enhancing interactions.
Basic tools are provided at the top left of the interface
Domain/document actions are provided in the middle of the interface
Canvas/file actions are provided at the top right of the interface
Keyboard shortcuts are displayed on the bottom of the interface
Composer Canvas
When designing Composer, we aimed to craft an intuitive tool for conversation designers. By integrating zones for basic actions and typical keyboard shortcuts, we ensured designers could quickly acclimate to the interface, allowing them to focus on mastering new elements.
Experimenting with different design languages led us to conclude that a color-coded component library was the most user-friendly.
To enhance usability, we standardized the width of all components, simplifying information parsing and organization. For complex functions, users can add rows of actions, expanding components vertically without sacrificing clarity. To ensure that we met our goals, I tested our components by rebuilding conversational flows gathered as artifacts during my research.
Additionally, the snap-to-grid feature promotes better organization and prevents accidental component overlap, ensuring a cleaner, more efficient workspace, a characteristic that also derived from our EE learning game idea.
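As a rough illustration of those layout rules (not Composer's actual implementation), fixed-width components that grow vertically with added action rows and snap to a grid might be modeled like this; the grid size, component width, and row height are assumed values.

```typescript
// Illustrative sketch only: fixed-width components that grow vertically
// with added action rows, placed on a snap-to-grid canvas.
const GRID = 8;              // grid cell size in px (assumed)
const COMPONENT_WIDTH = 240; // standardized component width (assumed)
const ROW_HEIGHT = 32;       // height of one action row (assumed)

interface CanvasComponent {
  id: string;
  x: number;
  y: number;
  rows: string[]; // action rows; more rows make the component taller
}

// Snap a free-form drop position to the nearest grid cell so components
// stay aligned and never sit at fractional offsets.
function snapToGrid(value: number, grid: number = GRID): number {
  return Math.round(value / grid) * grid;
}

function place(component: CanvasComponent, dropX: number, dropY: number): CanvasComponent {
  return { ...component, x: snapToGrid(dropX), y: snapToGrid(dropY) };
}

// All components share the same width; only height varies with content.
function sizeOf(component: CanvasComponent) {
  return { width: COMPONENT_WIDTH, height: ROW_HEIGHT * Math.max(1, component.rows.length) };
}
```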
Validation is enabled when connecting a branch of a conversation to the power source
Semantic colors and toast messages are used to provide feedback from validation
Troubleshooting
Our validation interaction draws inspiration from an EE learning game where users connect a power source to various devices using conductive strips. This concept, born from a collaborative brainstorming session, resonated well with our team.
Inspired by this idea, we designed the system so users select conversational flows for validation by connecting branches of a conversation to a power source. To validate a flow, users connect the starting branches they wish to validate and press a button to initiate the process. The system scans for common errors or conflicts and provides guidance on resolving them. Semantic colors indicate pass or fail statuses, ensuring clear and immediate visual feedback.
Once validation is completed and any issues are resolved, users can confidently publish their conversational flows to their applications.
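To make the mechanic concrete, here is an illustrative sketch, not the production validator, of what a validation pass over a flow graph could look like under assumed data structures: walk every branch connected to the power source, flag dangling connections, and mark anything the walk never reaches.

```typescript
// Illustrative sketch of a "power source" validation pass (assumed data
// model, not production code): traverse every branch connected to the
// power source and collect common issues.
interface FlowNode {
  id: string;
  next: string[]; // ids of downstream components
}

type Flow = Map<string, FlowNode>;

interface ValidationIssue {
  nodeId: string;
  message: string;
}

function validate(flow: Flow, connectedToPower: string[]): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const visited = new Set<string>();
  const stack = [...connectedToPower];

  // Follow every branch reachable from the power source.
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (visited.has(id)) continue;
    visited.add(id);

    const node = flow.get(id);
    if (!node) {
      issues.push({ nodeId: id, message: "Connection points to a missing component" });
      continue;
    }
    stack.push(...node.next);
  }

  // Anything on the canvas that the walk never reached is unreachable.
  for (const id of flow.keys()) {
    if (!visited.has(id)) {
      issues.push({ nodeId: id, message: "Component is not connected to a validated branch" });
    }
  }
  return issues; // an empty list maps to the green "pass" state
}
```

In this sketch, an empty issue list would correspond to the green pass state, while each issue would surface through the semantic error colors and toast messages described above.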
Test each turn in a conversation and view the rendered functions
Conversational testing can be done within the tool in a familiar SMS experience
Conversational Testing
We envisioned a robust version control system; however, while developing this feature, we implemented an interim solution to meet immediate needs.
Published flows can be tested directly within the console. Users can access the production model and use the right panel to conduct conversational tests in an SMS-style interface. Inspired by the EE learning game idea, the system renders the processed functions in the conversational flow as the user tests each turn. As users interact with the bot and receive responses, the conversational graph dynamically highlights the active pathways, letting users easily review and adjust their logic and responses. Voice dictation and the ability to test unpublished conversational flows are planned for a future release, further enhancing our testing capabilities.
This solution ensures that our users can effectively test and refine their conversational flows while awaiting the full implementation of our planned features.
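A rough sketch of that turn-by-turn loop is below; sendTurn and highlightPath are simplified stand-ins for the console internals (assumptions, not real APIs), used only to show how each rendered response pairs with the highlighted pathway.

```typescript
// Rough sketch of the turn-by-turn test loop. The two helpers below are
// simplified stand-ins, not the real console internals: one would send an
// utterance to the published model, the other would light up the canvas
// components the conversation passed through.
interface TurnResult {
  reply: string;          // bot response shown in the SMS-style panel
  visitedNodes: string[]; // components the flow traversed this turn
}

// Stand-in: in the real tool this would call the production model.
async function sendTurn(utterance: string): Promise<TurnResult> {
  return { reply: `(echo) ${utterance}`, visitedNodes: [] };
}

// Stand-in: in the real tool this would highlight the active pathway.
function highlightPath(nodeIds: string[]): void {
  console.log("active path:", nodeIds.join(" -> "));
}

async function runTestScript(utterances: string[]): Promise<void> {
  for (const utterance of utterances) {
    const result = await sendTurn(utterance);
    // Pair each rendered response with the pathway it activated so the
    // designer can compare the output against the logic on the canvas.
    highlightPath(result.visitedNodes);
    console.log(`You: ${utterance}`);
    console.log(`Bot: ${result.reply}`);
  }
}

runTestScript(["turn on the lights", "set them to 50 percent"]);
```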
Triumph Amidst Global Reception
Our beta testing phase garnered enthusiastic responses from existing Wit.ai customers, sparking excitement within Meta's developer community worldwide. The crowning moment arrived at Meta Connect 2022, where Composer made its patent-pending debut alongside the new Quest Pro, marking a monumental achievement for our team.
Humbled by Creative User Outcomes
The growth of voice experiences since the launch of Composer has been remarkable, with many AAA titles for the Quest and various apps across Meta’s device ecosystem recognizing its value. These tools continue to be leveraged for purpose-built voice experiences in conjunction with Meta AI. Being part of a project that empowers creators to explore limitless possibilities has been incredibly humbling and rewarding. Witnessing the innovative ways users utilize Composer reaffirms the impact of thoughtful design and the boundless potential of user creativity.