June 2017

Alexa / Voice / AI Chat


What does user experience look like without a mouse and keyboard or easy-to-enter form fields? How do people talk to AI systems to solve problems? These systems can be reached through devices like Alexa or through virtual systems like automated chatbots. Understanding psychology and human behavior becomes just as valuable as business knowledge when designing these systems. Because of this, we will see new software architectures come into play that deal with these variations and support multiple, guided user flows.

There are two main components in developing an app for Alexa: intent schemas and utterances.

  • Intents: An intent represents an action that fulfills a user’s spoken request.
  • Utterances: A likely set of spoken phrases that are mapped to intents.

Here’s a sample intent schema for a few commands I’ve used to control my house.

{
  "intents": [
    {
      "slots": [
        {
          "name": "Command",
          "type": "LIST_OF_COMMANDS"
        },
        {
          "name": "OnOff",
          "type": "LIST_OF_ONOFFS"
        }
      ],
      "intent": "SendJarvisCommand"
    },
    {
      "intent": "AMAZON.HelpIntent"
    }
  ]
}

Intents also have slots, and each slot type lists the values it accepts. For example:

LIST_OF_COMMANDS --> lights | porch | yard | backyard | frontroom | hallway | garage | basement | workbench | wemo | bedroom | tv room
LIST_OF_ONOFFS --> on | off | dim

The list of utterances, or the ways users can phrase a request that fills these slots, might look like:

SendJarvisCommand turn the {Command} {OnOff}
SendJarvisCommand switch the {Command} {OnOff}
SendJarvisCommand {Command} {OnOff}
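To make the template idea concrete, here is a minimal Python sketch that matches a typed phrase against the three utterances above. It is only an illustration: Alexa does its own speech recognition and intent resolution, and the `template_to_regex` and `match_utterance` helpers are hypothetical names I made up for this example.

```python
import re

# Slot values copied from the lists above. The matcher itself is a
# hypothetical illustration, not how Alexa resolves speech internally.
SLOT_VALUES = {
    "Command": ["lights", "porch", "yard", "backyard", "frontroom", "hallway",
                "garage", "basement", "workbench", "wemo", "bedroom", "tv room"],
    "OnOff": ["on", "off", "dim"],
}

UTTERANCES = [
    ("SendJarvisCommand", "turn the {Command} {OnOff}"),
    ("SendJarvisCommand", "switch the {Command} {OnOff}"),
    ("SendJarvisCommand", "{Command} {OnOff}"),
]


def template_to_regex(template):
    """Build a regex with one named group per {Slot} in the template."""
    parts = []
    for token in re.split(r"(\{\w+\})", template):
        if token.startswith("{") and token.endswith("}"):
            name = token[1:-1]
            # Try longer values first so multi-word values are not cut short.
            values = sorted(SLOT_VALUES[name], key=len, reverse=True)
            alternatives = "|".join(re.escape(v) for v in values)
            parts.append(f"(?P<{name}>{alternatives})")
        else:
            parts.append(re.escape(token))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


def match_utterance(text):
    """Return (intent, slots) for the first matching template, else (None, {})."""
    for intent, template in UTTERANCES:
        match = template_to_regex(template).match(text.strip())
        if match:
            return intent, {k: v.lower() for k, v in match.groupdict().items()}
    return None, {}
```

With this sketch, `match_utterance("turn the porch on")` yields the `SendJarvisCommand` intent with `Command` set to `porch` and `OnOff` set to `on`, while a phrase that fits no template yields `(None, {})`.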

This text is then passed to a Lambda function, which needs to parse it and then return a response.
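A minimal sketch of such a Lambda handler in Python might look like the following. It assumes the standard custom-skill request JSON (an `IntentRequest` carrying the intent name and slot values) and returns a plain-text speech response. `send_to_house` is a hypothetical placeholder for whatever actually talks to the home automation, and the spoken phrases are made up for the example.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the response envelope a custom skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def send_to_house(command, on_off):
    """Hypothetical stand-in for the code that drives the actual devices."""
    return f"Okay, {command} {on_off}."


def lambda_handler(event, context):
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        # For example a LaunchRequest: greet and keep the session open.
        return build_response("Welcome. Try saying: turn the porch on.",
                              end_session=False)

    intent = request.get("intent", {})
    name = intent.get("name")
    # Flatten {"Command": {"name": ..., "value": "porch"}} to {"Command": "porch"}.
    slots = {key: slot.get("value")
             for key, slot in intent.get("slots", {}).items()}

    if name == "SendJarvisCommand":
        command, on_off = slots.get("Command"), slots.get("OnOff")
        if not command or not on_off:
            # A slot was not heard, so ask again instead of guessing.
            return build_response("Which device, and on or off?",
                                  end_session=False)
        return build_response(send_to_house(command, on_off))
    if name == "AMAZON.HelpIntent":
        return build_response("You can say: turn the porch on.",
                              end_session=False)
    return build_response("Sorry, I did not understand that.")
```

The handler never sees audio, only the resolved intent name and slot values, which is why the parsing step is mostly a matter of pulling values out of nested JSON.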

So your traditional web form maps over like this:

  • Form target – Becomes the Intent (SendJarvisCommand), which is mapped to code.
  • Drop-down lists & form fields – Become Slots.
  • Utterances – Have no equivalent in traditional apps.
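The mapping in this list can be sketched in a few lines of Python. The form description used here (`target`, `fields`, `slot_type`) is a made-up structure for illustration, not a real framework's format. The output follows the shape of the intent schema shown earlier.

```python
import json


def form_to_intent_schema(form):
    """Map a (hypothetical) form description onto an intent schema.

    The form target becomes the intent name and each field becomes a slot.
    Utterances have no counterpart in the form, so they still have to be
    written by hand.
    """
    slots = [{"name": field["name"], "type": field["slot_type"]}
             for field in form["fields"]]
    return {"intents": [{"slots": slots, "intent": form["target"]}]}


# Hypothetical form description matching the house-control example.
house_form = {
    "target": "SendJarvisCommand",
    "fields": [
        {"name": "Command", "slot_type": "LIST_OF_COMMANDS"},
        {"name": "OnOff", "slot_type": "LIST_OF_ONOFFS"},
    ],
}

print(json.dumps(form_to_intent_schema(house_form), indent=2))
```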

Because utterances don’t match up to anything in existing form fields, most businesses and user experiences have holes that need to be filled. You can also chain intents to one another, creating user flows that happen out of order.

So whether it’s voice or AI chat, you need architectures that deal with this dynamic workflow: getting some of the data at unexpected points during a conversation, then re-prompting the user in a dynamic way to solicit the input required to complete the task.
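One way to picture that re-prompting loop is a small slot-filling step function, sketched here in Python. The required slots come from the house-control example. The prompt wording and the `next_step` helper are assumptions of mine, and a real skill would also need to persist the collected values between turns (for example in session attributes).

```python
REQUIRED_SLOTS = ["Command", "OnOff"]

# Hypothetical prompt text for each slot that may still be missing.
PROMPTS = {
    "Command": "Which device do you want to control?",
    "OnOff": "Do you want it on, off, or dim?",
}


def next_step(collected, new_slots):
    """Merge newly heard slot values, then re-prompt or finish.

    Returns (merged_slots, prompt). The prompt is None once every
    required slot has a value, whatever order the values arrived in.
    """
    merged = dict(collected)
    merged.update({k: v for k, v in new_slots.items() if v})
    missing = [slot for slot in REQUIRED_SLOTS if not merged.get(slot)]
    if missing:
        return merged, PROMPTS[missing[0]]
    return merged, None


# The user says "on" first, then names the device on the next turn.
state, prompt = next_step({}, {"OnOff": "on"})
state, prompt = next_step(state, {"Command": "porch"})
```

After the first call the prompt asks for the device, and after the second call the prompt is `None` because both slots are filled, even though they arrived in the "wrong" order for a form.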

Whether it’s a shopping assistant for retail or a chatbot to help you reset your password, it’s a really fun time because we need to invent something new!

3 Trends about to take off

2017 is halfway over, and while some of these trends started a few years ago, I think we’re about to see a few of them really catch on, and by catch on I mean go mainstream and get funded.

Over the next week I’m going to discuss these three trends in a little more detail and provide some code or projects to help you get started!

Trend 1 – Alexa / Voice / AI Chat

What does user experience look like without a mouse and keyboard or easy-to-enter form fields? How do people talk to AI systems to solve problems? These systems can be reached through devices like Alexa or through virtual systems like automated chatbots. Understanding psychology and human behavior becomes just as valuable as business knowledge when designing these systems. Because of this, we will see new software architectures come into play that deal with these variations and support multiple, guided user flows.

Trend 2 – JS/UI Architecture

SPAs (Single Page Applications) built on frameworks like Angular and React changed the way we think about building software. But for the longest time it was code, code, code without much thought about how it’s going to scale and how it’s going to be supported in the enterprise. These systems need to hang around and be maintainable for 10 years; refactoring your apps should not have to entail complete system rewrites. Having architectures for UIs that allow for modular building and refactoring is crucial for adoption of these technologies that change every 9 months.

Trend 3 – Augmented Reality

There’s more and more data about the world around us. Why not provide more ways for people to interact with this data? Why make everyone look something up in a browser or an app? Voice combined with computer vision begins to bridge this gap. Google Glass still might not be the right use case, but camera augmentation, heads-up displays in cars, smart kiosks and push voice are where this will begin to find applications.
