Alexa / Voice / AI Chat -

What does user experience look like without a mouse & keyboard or easy to enter form fields? How do people talk to AI systems to solve problems? These systems can be through devices like Alexa or virtual systems like automated chatbots. Understanding psychology and human behavior become just as valuable as business knowledge when designing these systems. Because of this we will see new software architectures come into play that deal with these variations and have multiple and guided user flows.

There are two main components of developing an apps for Alexa, intent schemas and utterances.

Intents: An intent represents an action that fulfills a users spoken request.
Utterances: A likely set of spoken phrases that are mapped to intents.

Here’s a sample intent for a few commands I’ve used to control my house.

{
  "intents": [
    {
      "slots": [
        {
          "name": "Command",
          "type": "LIST_OF_COMMANDS"
        },
        {
          "name": "OnOff",
          "type": "LIST_OF_ONOFFS"
        }
      ],
      "intent": "SendJarvisCommand"
    },
    {
      "intent": "AMAZON.HelpIntent"
    }
  ]
}

Intents also have slots. For example:

LIST_OF_COMMANDS --> lights | porch | yard | backyard | frontroom | hallway | garage | basement | workbench | wemo | bedroom | tv room	
LIST_OF_ONOFFS --> on | off | dim

The list of utterances, or ways users can interact with this data might look like:

SendJarvisCommand turn the {Command} {OnOff}
SendJarvisCommand switch the {Command} {OnOff}
SendJarvisCommand {Command} {OnOff}

This text is then passed to a lambda function where it needs parsed and then returns a response.

So your traditional web form is converted like this.

Form Target – Becomes Intent (SendJarvisCommand) which is mapped to code.
Droplist & form fields – Becomes Slots
Utterances – not in traditional apps.

Because utterances don’t match up to anything in existing form fields most businesses and user experiences have holes that need filled. You can also chain intents to one another creating user flows that are out of order.

So whether it’s voice or AI chat you need architectures that deal with this dynamic workflow, getting some of the data at unexpected points during a conversation, then re-prompting the user in a dynamic way to solicit input required to complete the task.

Whether it’s retail for a shopping assistant or a chatbot to help you reset your password it’s really fun time because we need to invent something new!