
I work with Emerging Tech – Build and Deploy an Amazon Lex Chatbot and Convert it to an Alexa Skill

Hey everybody, welcome to this talk on building and deploying an Amazon Lex chatbot and converting it to an Alexa Skill. My name is Sohan Maheshwar, and I'm a Developer Advocate for AWS working in the Benelux region. Let's dive straight in.

I'm so excited to be here today to talk to you about conversational interfaces, and that's what the focus of today's talk will be. We will talk a bit about Amazon Lex and the technology behind it, how it works, and what you need to do to build a Lex chatbot. We will talk a little bit about a few concepts related to chatbots, like intents, utterances, and slots. We will talk a little bit about Amazon Alexa and how it is the next major disruption in computing, and then we'll end with a small demo.

Now, I'm sure all of you have in some capacity or another worked with personal computing, or else you probably wouldn't be watching this talk today. And all of the personal computing that we have interacted with so far has looked like what you're looking at on the screen right now. It's been either a mobile phone or a laptop or a desktop computer, or you could even have interacted with a thermostat or a remote control or a car entertainment system.

Now, if you really notice, the interfaces on all of these have been sort of similar. To give you a small history lesson on interfaces: it all started in the '70s, when you had these small black and white monitors like the one that you see on the screen, and you had just a keyboard, and you entered text-based commands to do something. Things evolved a little later on in the mid-'80s, when for the first time you had graphical user interfaces. You could actually see a file on the desktop, and you could move this piece of hardware around and that movement would translate onto the screen. I'm talking, of course, about a computer mouse. And things evolved a lot more in the 2000s, when for the first time you had touchscreens in your pocket, and that was mind-blowing back then. You had this amazing device in your pocket that could give you access to the wealth of information in the world, and you could use a touchscreen, so you could zoom with that pinch-to-zoom gesture. All of these are interactions we have on a day-to-day basis.

But, guess what, these interactions and these interfaces are interfaces that we have actually learned over a period of time. It's not like any of these interactions really came naturally to us; we learnt them to be able to communicate with computing. But thanks to all the advancements in tech, especially in the fields of machine learning, speech recognition, and natural language understanding, we are at a stage now where we can actually communicate with computing via a conversation, and we call these conversational interfaces.

Now, conversational interfaces are set to change how we think and operate and behave, because, for the first time, as humans, we can communicate with computing by chatting or talking, which was obviously not possible earlier. And this topic is very dear to me, because I worked in the Alexa team for over two years, and now I'm working in the AWS team. So I've worked very closely in that space, and I know the cutting-edge technology that really goes into this. One thing I should always say is that this paradigm really excites me, because I think it lowers the barrier to accessing technology.
Right now I’m surewe’ve all gone through this phase where we’ve helpedeither a parent or a grandparent or an elderly person with tech, and usually the conversation goessomething like this where you say, hey, click this, do this, swipe here,and this is what would happen. You’re sort of teaching thema series of steps to do something, but with conversation interfaces,all they need to know is to sort of know the language. And they’ll be able to accesstechnology that would have probably beena little difficult earlier.When it comes to conversationalaccess and tech, I think the most important tenantis for it to be natural. We as humanshave to be able to speak naturally, and it’s the job of the computerto really understand what’s going on, and it’s the job of youas a developer or a technologist to design the chatbot or to design the Alexa experiencein such a way that the skill or the chatbot’sable to understand what the human is really saying. Now, a line I often useis that I think so far we as humans have been forcedto think like computers.We shouldn’t be thinkingin terms of drop down menus and radio buttons, but, you know, that’show we’ve interacted with our tech. But with conversational interfaces, I think computers are being forcedto think like humans, which I think is very powerful. You also want your conversationalaccess to be on demand, especially in caseslike customer support or informational chatbots. You want it to be on demand. You don’t want it to be like, say,email where you send an email, you wait for two to four daysfor a response back. Conversation allows that instantsort of transfer of information that you really want. You also of coursewant it to be accessible, so you don’t want it to just be on, say, a platform or a website where you needlike five levels of access to get to. You need it to bewhere the people really are, and I think all of this entails in the factthat it has to be efficient. It has to be efficientfrom a tech point of view and from a designpoint of view as well.So Amazon Lex is a service that Amazon has built for building conversation interfacesusing voice and text, and we will talka little bit about how Lex works, its benefits, its features, and we’ll talk a little bit abouthow you can actually build something using Amazon Lex. Amazon Lex basicallyoffers the complete solution when it comes to buildinga conversational interface, so you don’t need to havea machine learning background to be into speech recognitionor natural language processing. All you need to knowis to know a little code so that you can build a really niceconversational experience, so Lex takes care of thingslike speech to text and speech recognition and so onand also the dialog management. You can actually deploy itto different places, and we’ll talka bit about that in a while. Lex is completely scalable and connects to a whole bunchof AWS technologies as well, so you don’t have to worry about, okay, catering to one userversus a million users.It has a lot of security servicesthat it works with, so you don’t have to worryabout things like personalisationor even authentication, and of course you get greatanalytics about how people are actually chatting or how peopleare talking to your interface or your chatbot,so you can really iterate on making it a lot better. 
So let’s go to someof the features of Lex, and I think one of the best features is the fact that you can buildthis chatbot just once, right, the logic of how the front endand the back end works, and, again,we’ll talk about that later, and you can deploy itto multiple platforms. So you can deploy it to mobile, you can deploy it to web, also popular messaging platformslike Slack and Facebook and Kik and Twilio SMS as well.And the demo’sgoing to show you just that. Lex is of coursedesigned for builders, it’s efficient, it’s intuitive, in fact, very recently,the Flemish government actually built a chatbot using Lex to answer questions from the citizensabout the COVID-19 situation. So, as you can see,it’s designed for builders. you can really use it to reallyscalable chatbots very quickly. Also, Lex is Enterprise ready, so, if you have a bunch of SaaS toolsthat you want connected, maybe you wanta chatbot that’s internal that just connects to sort of metricswithin your organisation, or maybe it is something to helpnew joinees in your organisation, it can automatically connect them.You’ll have to connect it of course,but it has options to connect it to other SaaS systems as well. And of course there iscontinuous learning, so, as more people use the chatbot and more people use this experiencethat you’ve build, the better it gets. So over time you can makea really, really powerful chatbot that really understandsyour customers really well. All right, so that was about why you needto think about using Lex. So let’s talk a little bitabout their design workflow and how Lex actually works. So on one side youhave your customers, your customers could be on mobile,they could be on the web, maybe it’s via IoT, or they could be on any popularmessaging platforms such as Twilio and Kik and Slack. Now, when they use a Lex chatbot,a few things happen. One is as a developeryou can actually choose whether you want to authenticatethem before they start the bot.Now, sometimes youmay want to authenticate them, especially if there isan account associated with what informationthey’re trying to get. But a lot of informational chatbotsdon’t need authentication, so this is somethingthat you can absolutely choose to do. Cognito is an Amazon service that actually takes care of a lotof this authentication and identity, so you can use Cognitoto authenticate your users, you can even useCloudWatch to get metrics on how peopleare actually using your chatbot. Now, here comes the interesting part. So, when someone talks to your chatbot, Lex basically uses two technologies, both of which are based on the samedeep learning that powers Amazon Alexa. The first one is if your useruses speech to talk, then there is somethingcalled speech recognition or automatic speech recognition, ASR, which converts speech to text.Now, it just doesn’t convertspeech to text, but it does so with a lot of context, because when it comes to speech, two words can sort of sound the same. You really sort of needcontextual speech recognition to make your speech recognitionthat much more accurate. Once you’ve convertedthat speech to text or if someone speaksto your chatbot only via text, how does the bot actuallyunderstand what you’ve said? The bedrockof all conversation interface, right, chatbotand Alexa and what have you is something callednatural language understanding. Now, this is a pretty popular fieldin computer science right now, because it really sort of drives the whole conversationalexperience home. 
So what natural language understanding does (and this is really simplifying it) is convert unstructured conversational data into a structure that computers can understand. Human conversation is not structured: there is grammar, but it's unstructured. How it does that is by converting a sentence that I say, or an utterance, as we call it, into things called intents and slots, and we'll get to those in a bit. So Lex uses this NLU, natural language understanding, to convert what the user said into a structure, which is the intent and the slots, and that is sent somewhere to fulfil whatever service you're providing. Typically that is a Lambda function; this is essentially your back end. In your back end you can choose what you want to do: you can hit an API, you can hard-code some data, you can query a database. All you have to do is send some structured data back, and Lex takes care of showing the output to the user on the platform that they have chosen to interact with you on.

So that was the workflow behind how a chatbot actually works. Now, let's take a closer look at a couple of terms that I mentioned earlier: what is an intent, what is an utterance, and what are slots? For the purpose of this presentation and the demo, I built a simple flower-ordering bot. I'm based out of Amsterdam, and it's tulip season here, so the tulips are blooming all over the city, and I built a simple bot that simulates a conversation where a user or a customer wants to order flowers.

So here's how the conversation between the customer and the bot goes. The customer finds the bot and says something like 'I would like to buy flowers.' Now, let me break this down. What the customer says is essentially what an utterance is. The thing with conversational interfaces is that a customer can say the same thing in many different ways. A customer could say something like 'I would like to buy flowers', 'I want to buy some flowers', 'Hey, can you help me buy some flowers?', 'Hey, can you help me purchase some flowers?' There are so many different ways of saying the same thing. To give you another example, take something as simple as asking for the weather. A customer can say anything from 'what's the weather' and 'tell me the weather' to 'is it hot outside?' or 'do I need a coat today?' Just different ways of saying the same thing.

Now, in any conversational interface, all these different utterances are matched to something called an intent. An intent basically performs an action in response to an utterance. Why do you really need this intent? Well, like I said, there are so many different ways of saying the same thing, so every utterance has to be mapped to an intent.

All right, so the customer said 'I would like to buy flowers.' The bot responds with, hey, what type of flowers would you like to order? The customer says tulips, and the bot continues the conversation with, hey, what day do you want the tulips to be picked up? What time? Can you confirm? So on and so forth. You'll see here that the chatbot is asking for certain pieces of information. I've underlined 'tulips' there. Tulip is the type of flower; a user could say something like rose or lily. Similarly, the bot is also asking for the day and the time at which they want the flowers to be delivered. These pieces of information are called slots. Think of slots as variables within an utterance: any piece of data that can change from user to user within an utterance.
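To make that concrete, here is a rough sketch of the structured event that Lex (V1) hands to your Lambda function once the NLU has done its work: the free-form utterance has become an intent plus slot values. The field names follow the Lex V1 Lambda event format; the intent and slot names are from the flower demo, and the values are made up.

```python
# What your back end receives after NLU, instead of raw text (a sketch).
event = {
    "messageVersion": "1.0",
    "invocationSource": "FulfillmentCodeHook",
    "inputTranscript": "I would like to buy flowers",  # what the user typed or said
    "currentIntent": {
        "name": "OrderFlowers",        # the matched intent
        "slots": {                     # the variables pulled out of the dialog
            "FlowerType": "tulips",
            "PickupDate": "2020-06-20",
            "PickupTime": "10:00",
        },
    },
    "sessionAttributes": {},           # context carried between turns
}
```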
For example, going back to the weather example that I spoke about, I could ask for the weather in Amsterdam, but someone could ask for the weather in Paris or in Delhi or in New York. In that case, the city (Paris, Delhi, Amsterdam, etcetera) becomes the slot, and the slot value is whatever city the user mentioned. So that is what a slot really is.

You'll also see that this is how the conversation would typically end: the bot says something like, hey, your tulips will be ready for pickup at 9:00 A.M., is that correct? The user says yes, and the bot says, thank you, your order is placed, and there's a nice little emoji there as well. This is what we call the fulfilment, where, after all of this conversation, your back end fulfils the user's request. And again, your back end could typically be a Lambda function which talks to the services that you've already built in your startup or your enterprise.

All right, so the core of any conversational experience is basically how the conversation and the dialogue are handled. We have something called slot elicitation, which is how Lex asks for pieces of information. For something like the flower delivery bot to work, it needs a few pieces of information: the type of flower, the city, the date, and the time. These four pieces of information it absolutely needs. You would hope that all your users and all your customers would talk like this: your bot says, hey, what type of flower do you want, and the customer says, 'I would like tulips that I can pick up tomorrow in Amsterdam at 9:00 A.M.' In one shot they're giving you all four pieces of information, which is great. But, unfortunately or fortunately, this is not how we talk or communicate as humans.

Imagine if you were to call up your local flower seller and they say, hey, what flowers do you want, or, hey, how can I help you? You wouldn't give them all these pieces of information in one shot. Typically there is a back and forth between the user and the person providing that service, so we've given you the option to model a similar paradigm with the bot as well. In this case you can see the bot is asking questions like, hey, what type of flowers would you like? What day do you want them? What time of day, and so on. Essentially what's happening is what we call slot elicitation, where the bot is asking the user for the pieces of information it needs to complete the request, and it's doing so by using something we call a prompt, which is a spoken or typed phrase that elicits that piece of information.

Like I said, the core of providing a great conversational experience is managing this dialogue. What you saw earlier is what we call a multi-turn dialogue, where there are multiple turns between the user and the bot: the bot says something, the user responds; the bot says something, the user responds. The other style is what we call a single-turn conversation, where the bot says something, the user says something, and the conversation is done. Most conversations that we've seen are actually multi-turn conversations, and these lead to better user experiences and happier customers, so Lex takes care of this multi-turn conversation and what we call dialogue management.
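From the code side, slot elicitation looks something like this: a minimal dialog code hook, sketched against the Lex V1 Lambda response format. The intent and slot names come from the demo; treat it as an illustration, not production code.

```python
def lambda_handler(event, context):
    """If the flower type is missing, prompt for it; otherwise let Lex
    continue its configured dialog (Lex V1 response shapes)."""
    intent = event["currentIntent"]
    slots = intent["slots"]
    session = event.get("sessionAttributes") or {}

    if not slots.get("FlowerType"):
        # Ask for the one piece of information we still need.
        return {
            "sessionAttributes": session,
            "dialogAction": {
                "type": "ElicitSlot",
                "intentName": intent["name"],
                "slots": slots,
                "slotToElicit": "FlowerType",
                "message": {
                    "contentType": "PlainText",
                    "content": "What type of flowers would you like to order?",
                },
            },
        }

    # All required slots are filled: hand control back to Lex.
    return {
        "sessionAttributes": session,
        "dialogAction": {"type": "Delegate", "slots": slots},
    }
```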
So, as you can see, there are a few slots here: the type of flower, a pickup date, and a pickup time, and each of these slots has a corresponding prompt as well. If a user doesn't mention the type of flower, the Lex bot will remind them: hey, what type of flower would you like? Or if they don't mention the date or the time, the corresponding prompt will be: hey, what time would you like the flowers to be picked up? The great thing about this is that it's not completely linear either. Suppose a user says something like, 'I would like the tulips to be picked up tomorrow at 9:00 A.M.' The user has answered two questions there. So the next question from the bot is not going to be, hey, what time should we pick up the tulips, because the user has already answered that question. There is enough intelligence there for the dialogue to be managed and for each of those slot values to be picked up.

All right, so now we're going to talk a little bit about how you can customise conversations. Conversations are not easy if you really think about it. As humans, it comes naturally to us, but for a computer, mimicking a conversation is not as easy. And the reason, I think, is that what we as humans do really well in conversation is hold context. We hold context amazingly well. I can meet an old friend of mine and refer to some good times we had maybe 10 years in the past, and that reference is immediately picked up, because both of us have that shared context. So we have tried to give that sort of contextualisation to Lex, and to bots being built on Lex as well. Having that contextualisation and personalisation is the key to building a good conversational experience.

So, for instance, if a user says something like 'I would like to buy flowers', and they have been doing that over the past week, the bot's next response could be: hey, would you prefer to buy tulips again? You're giving added contextualisation to that particular user. There's a good chance that if a user has bought tulips on four consecutive days and comes back on the fifth day, they're probably going to buy tulips again, which is why we've given you the option to do that. Similarly, you can also validate the user's input. Maybe you don't have tulips available on the day the user asked for. You can say, hey, sorry, I don't have availability, would a later day work for you?

Like I said, conversation is not easy; you need context, and you sometimes need to store that context to have a nice conversational experience with the bot. For that, you've been given the option of storing something called a session attribute. A session attribute is a piece of data that helps with storing context. This could be the context of a session, where a session is defined from when a user starts using the bot to when they stop. So maybe the user logs in and says, hey, I need help with my order, this is my reference number, and that reference number is stored as a session attribute, so that until the bot has actually helped the user, the reference number is retained. Sometimes session attributes can be more permanent as well. Maybe this is a bot which requires some sort of login and authentication, and the user logs in, so you know the user's name. You can store the user's name in a session attribute, so that the next time the user logs in, you can welcome them with a 'welcome back', which gives you that nice user experience at the end of the day.
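A minimal sketch of that idea in a fulfilment hook, again using Lex V1 shapes; attribute names like customerName and lastFlowerType are hypothetical.

```python
def lambda_handler(event, context):
    """Carry context across turns (and sessions) via session attributes."""
    session = event.get("sessionAttributes") or {}

    # Greet returning users if an earlier turn (or a login flow) stored a name.
    name = session.get("customerName")
    greeting = f"Welcome back, {name}!" if name else "Welcome!"

    # Remember the order so a later session can offer "tulips again?".
    slots = event["currentIntent"]["slots"]
    if slots.get("FlowerType"):
        session["lastFlowerType"] = slots["FlowerType"]

    return {
        "sessionAttributes": session,  # echoed back so the context survives
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {
                "contentType": "PlainText",
                "content": f"{greeting} Your order has been placed.",
            },
        },
    }
```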
As you can see, Lex maintains this context by storing data throughout a conversation. This data could be anything from a slot value to a confirmation, and all of these are stored in session attributes. We give you, the developer, the flexibility to decide what to do with the session attributes.

Now, conversation is not always linear. If conversations were always linear, they'd be boring, but they'd be easier to build for when you're building a chatbot. Typically a conversation with a chatbot could go one way. So, if you were building a bot that helped with buying flowers, maybe the user is almost done with the entire process, and at the end the bot says something like: would you like a small or a large bouquet? Now, at this point, the user doesn't know how many flowers are in a small bouquet (how small is small, or how large is large?), so the user says something like: how many flowers are in a large bouquet? This is a different intent at this point in time. When a user says something like this, it would match a whole new intent. It wouldn't be part of the order-flowers intent that we spoke about earlier. So, in this case, you'll have to switch context, store the current context in a session attribute, and answer the user's question. Maybe you say, hey, 30 flowers in a large bouquet. Then, if the user says something like, oh great, place my order, or confirm, and you've stored the context in the session attributes, you don't have to ask the previous questions again, because that context is maintained, and, again, Lex gives you the option to do so. This way you can switch seamlessly between intents and come back without a bad experience for your user. If you didn't store the session attributes there, and the user asked a question like 'how many flowers in a large bouquet', your bot would answer, hey, 30 flowers in a large bouquet, and then when they came back, you'd have to go through the entire process again, which is not a good look. So you need to store those session attributes when you're switching context.

You can also chain different intents together sometimes. For instance, say a user has gone through the entire flow to buy flowers, and the bot says something like, hey, anything else today, and the user says, hey, you know what, I want to update my address, this is for someone else. You can chain the update-address intent onto the order-flowers flow and have a seamless conversation so that the user doesn't feel confused. This also leads to a good user experience and gives you flexibility as a developer.

All right. When it comes specifically to text bots on Lex, you can take advantage of the medium, and it's always good to do that and use rich message formatting. There will be a lot of times when you want to show visual feedback to your users. For instance, this is an example from a car rental bot, where a user is looking at three cars before they choose which one they want to rent. Or maybe a user is buying t-shirts via your amazing chatbot, and they want to look at the three different colours that you offer before they place the order. So you can take advantage of the rich messaging formats on different platforms to give a better experience to your user. Now, each platform has its own way of handling formatting, so Slack versus Facebook versus Kik will look different.
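In Lex V1 that rich feedback is done with a response card attached to your Lambda response, which platforms like Slack and Facebook Messenger render natively as images and buttons. A sketch, with a placeholder image URL:

```python
# Attached alongside the prompt, e.g.
# {"dialogAction": {"type": "ElicitSlot", ..., "responseCard": response_card}}
response_card = {
    "version": 1,
    "contentType": "application/vnd.amazonaws.card.generic",
    "genericAttachments": [{
        "title": "Which flowers would you like?",
        "imageUrl": "https://example.com/tulips.jpg",  # placeholder
        "buttons": [
            {"text": "Tulips", "value": "tulips"},
            {"text": "Roses", "value": "roses"},
            {"text": "Lilies", "value": "lilies"},
        ],
    }],
}
```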
Similarly, you can customise the experience on your mobile app and on your web app as well, but make sure you use the medium well to provide a good experience for your user.

When it comes to the fulfilment of what your chatbot is doing, there are a couple of ways you can do it. Most people will choose to use a Lambda function, so, again, a call is made to your Lambda function, and you can choose what you want to do with it. Maybe you make an API call, maybe you hit a database, maybe you hard-code some data. It's really up to you, and your Lambda will return whatever text your user is going to see. A lot of times you can return the output to the client for its own processing too, and the dialogue part is largely taken care of by Lex: until the slots are elicited, all the prompts are thrown in, and then the fulfilment call is made to your Lambda function.

Like I said, Lex takes care of the entire life cycle of the chatbot. You can save your bot, and it preserves the current state on the server; you can build your bot, which trains the model; you can have test, dev, and prod versions; and you can test it out right in the Lex console. And, once you think it's ready, you can publish it out to different platforms: to the messaging platforms as well as to mobile and web.

I think one of the greatest things about using chatbots is the fact that you can implement continuous learning on the chatbot. This helps a lot, especially in cases like customer service bots. Just to give you an example: for most customer service channels, not just bots but even a call centre, 70 to 80% of the queries come from the same bank of questions. Typically customers have the same bunch of queries. With a chatbot, there could be a query that's outside of those usual ones, and, at that point, you can choose to switch to a human at the back end, where you're really augmenting this experience with the human. The human answers the question, and that question and answer are fed back into your system. The next time someone asks that question, the bot answers it, and the process becomes a lot more efficient over time.

You can also use CloudWatch to monitor your metrics, so you can get really good metrics on how people are using your bot, when they're using it, which intents are being hit the most, which responses you're getting the most, and so on. Another thing you can do is manually look at the utterances that were missed. And this brings us to an important point about conversational interfaces: testing becomes so much more important. For all the artificial intelligence in testing and automated testing, when it comes to conversational interfaces you really want a lot of beta testing and manual testing for your bot. The reason is that you might have thought of ten different ways someone could say a certain thing, but there could be a legitimate eleventh way of saying it. So you really want to look at your missed utterances to make your bot that much more understanding and that much more efficient.

Like we mentioned earlier, Lex is multi-platform, so you can build that one bot just the one time, and you can deploy it to a mobile app. You can deploy it to Android, iOS, all of that.
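If your own client (a web page, a mobile app, even a CLI) talks to Lex directly, there is one runtime call per user message. A minimal sketch with boto3; the bot name and alias are from the demo, and the user ID is arbitrary but should be stable per user. The JavaScript SDK's postText call is the browser-side equivalent.

```python
import boto3

lex = boto3.client("lex-runtime")

response = lex.post_text(
    botName="OrderFlowers",   # names from the demo
    botAlias="prod",
    userId="demo-user-42",    # stable per end user
    inputText="I would like to buy flowers",
)

print(response["dialogState"])  # e.g. "ElicitSlot" while Lex gathers info
print(response["message"])      # the next prompt to show the user
```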
You can also deploy it to a host of messaging platforms, mainly Slack, Kik, and Facebook Messenger, and to SMS as well. You can deploy it on the web using the commonly used SDKs, like JavaScript (including React) and Python and so on, and you can also integrate with AWS IoT.

As for use cases, I'm sure you already have amazing ideas of how you can implement Lex in your organisation or your startup, but right now we're seeing a lot of popularity in contact centre and informational bots. These are extremely popular, and they help make the system so much more efficient, especially when it comes to things like asking for information: when you have a lot of information on your website, sometimes a bot is just that much easier. You can also use it to build applications; I know of people who have built great bots, even for things like their DevOps, so it's contributing to enterprise productivity. And, like we mentioned earlier, you can integrate with AWS IoT, so you can build IoT bots as well.

All right. Like I mentioned earlier, I worked for more than two years in the Alexa team, and it was an exciting time for me. I want to talk a bit about Amazon Alexa, how it ties into this, and then show you the demo. So, in case you don't know (and I'm guessing a lot of you watching this right now have a device at home), Alexa is the cloud-based service that powers devices such as the Amazon Echo, which you see on the screen. Now, these devices are very, very powerful in the sense that they each have a microphone array, so you get far-field recognition; I can speak to a device from almost 20 feet away. And these devices really help with everything from day-to-day tasks to entertainment to weather to news to music, and there's so much more. And the real vision for Alexa is this whole 'Alexa everywhere' vision, where people are interacting using their voice, because it's such a natural interface, not just at home but also on the go, at their workplace, in their car, and so on.

Now, when it comes to Alexa, there is something called a skill. Think of it as similar to how the mobile ecosystem has apps: Alexa has skills. Anybody can build a voice-based experience, called an Alexa skill, and upload it to the skill store. Right now there are 100,000+ skills worldwide, and that number is just growing, and, as you can see, most of the big brands in the world have amazing Alexa skills that they have published so that users can use them. The skill store is very popular, and anybody can build a skill for free and upload it to the skill store; I will show you a demo of how you can do that.

So, once you build a Lex bot, it's actually fairly simple to take the front end of that Lex bot (I'm talking about the intents, the utterances, the slots, and the prompts), because there is an option to export them. Just hit the export button, and make sure you choose Alexa Skills Kit, which is the framework for building a skill. So make sure you choose Alexa Skills Kit as the platform before exporting your bot. Once you have done that, you can go to developer.amazon.com and create a new skill. It will ask you for a few pieces of information, like the name of your skill and the language model, because Alexa is available in multiple languages, like German and French and Italian, etcetera. So it will ask you that.
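What's inside that exported file is the skill's interaction model: the same intents, utterances, and slots we defined in Lex, in Alexa Skills Kit form. It's roughly this shape (a hand-written sketch for the flower bot, not the literal export):

```python
# Sketch of an ASK interaction model; slot and type names are illustrative.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "order flowers",
            "intents": [{
                "name": "OrderFlowers",
                "samples": [
                    "I would like to buy flowers",
                    "I want to order flowers",
                ],
                "slots": [
                    {"name": "FlowerType", "type": "FlowerTypeValues"},
                    {"name": "PickupDate", "type": "AMAZON.DATE"},
                    {"name": "PickupTime", "type": "AMAZON.TIME"},
                ],
            }],
            "types": [{
                "name": "FlowerTypeValues",  # the custom slot type
                "values": [
                    {"name": {"value": "tulips"}},
                    {"name": {"value": "roses"}},
                    {"name": {"value": "lilies"}},
                ],
            }],
        }
    }
}
```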
And then, as you can see, there is a drag-and-drop JSON option, so you just drag and drop the file that you've downloaded, and it builds the front end of the Alexa skill for you from the bot that you created in Lex earlier.

Okay, so let's do a quick demo of how Lex actually works: how you build the front end for your bot, how you deploy it to a web UI and then to a messaging platform, and then how you export it to a JSON file which you can import into an Alexa skill. So I've just built a simple chatbot here, and, as you can see, this is a simple bot that has just the one intent, which is 'web UI order flowers'. Now, every intent has a bunch of utterances associated with it. These are utterances that you, as a developer, have to enter. Essentially you put yourself in your customer's shoes and think: okay, what are the different ways they could, in this case, ask to buy flowers? And, as you can see, there's a fairly comprehensive list of different ways to buy flowers: everything from 'may I get flowers' to 'I want to order flowers', 'I want to buy flowers', 'I want to put in an order'. Just different ways of saying the same thing. All of these utterances map to this particular intent that you see here.

Now, we also have slots in this particular chatbot, and there are three specific ones: the type of flower, the pickup date, and the pickup time. As you can see, the flower type is the first slot, and its slot type is something that I have defined on my own. It's what I call a custom slot type, and I've called it the 'web UI flower type'. There are two kinds of slot types. There's a custom slot type, for when the data set is custom to your particular chatbot. In this case, I can limit the values of that slot to the types of flowers that I sell, and you can also add validations. So, for example, if a customer asks for hydrangeas and the seller doesn't sell those, I can throw in a response saying, hey, I don't sell that, but these are the types of flowers that I actually sell. Every slot is associated with a prompt as well; you can see the prompt, which says 'what type of flowers would you like to order'. So if your customer doesn't mention the type of flower they want, the bot throws that particular prompt and says, hey, what type of flower would you like to order?

Now, the other two slots are pickup date and pickup time, and you'll notice that both of them start with 'AMAZON.'. These are built-in slot types, which are slot types provided by Amazon to make building your bots easier. These slot types are well tested, they're very comprehensive, and they exist for lots of common data sets that you might use while building a bot, things like date, time, currency, city names, place names, and so on. The two built-in slot types also have their corresponding prompts, so you have 'what day do you want the flower type to be picked up', and you'll notice that we have referenced another slot within the prompt, which is something that you absolutely can do. You'll also see that there is something called a confirmation prompt, which you see at the end of the process: once the bot has got all the pieces of information, it will confirm, saying, hey, is this what you wanted? Do you confirm? This is a good practice to have when you're asking for a lot of pieces of information from your user.
If it’s just the one shot, then probablynot a good idea to have that.All right, so nowwe’re going to actually publish the bot. I can choose the alias,and I can actually publish this bot. I can choosewhere I want to publish it to as well. I can choose to publish itto either web or mobile, or I can go to all the differentchannels that Lex has connections to, which is namely Kik,Facebook, Slack, and SMS. Now publishing it to these channelsis very simple. You have to create an app or a bot on those platforms and then just enter some detailslike a client ID verification token and a couple of other detailswhich just links the two together. So in this case I have createdan app on Slack, and I have publishedthis bot to Slack as well. So first let’s see how the botworks on the UI framework that I’ve built here as the bot, and as you can see there isa chat window here, so I’m just going to chat with itand say I want to buy flowers.That is the rich messaging typethat we spoke about, so there’s a photoof some flowers there, and I’m going to clickon the tulips button. So what day do you wantthe tulips to be picked up? The great thingabout this conversation experience is I don’t have to just enterlike a date and a DDMMYY format. I can specify a date, or I can even say two daysfrom now or tomorrow. For now thoughI’m just going to say June 20th, it asks me for what time.I say 10:00 A.M., and there’s a confirmation prompt,so I say, yeah, sure, let’s do it. Thanks for your order.And it was literally that easy. Of course this goes to your Lambdaas a structured JSON, and your Lambda doesthe part where it fulfils this order. It maybe talks to your API and so on.Now, I’ve done the sameon the Slack channel as well. I’ve added the app order flowers, which is of course a bot. And I can havethe same conversation here. So I’m going to say buy flowers, and, as you can see, it is the buttons are nativeto how it would look on Slack. If I gave you this demoon Facebook Messenger, it would look native to how it would lookon Facebook Messenger, and Lex actuallytakes care of all of this, which is another plus pointabout why you should use Lex. So I’m just going to click on tulips. It says what date, I say June 20th, okay, I say 10:00 A.M., looks like there’s going to bea lot of flower orders on June 20th, and I say yes. And this is done. So this was an exampleof the web UI and on Slack.Now, I’m going back to my bot, and, yeah, this is where all the botsthat you created would be listed, and, if I just choose thatand click on actions, there is an optionto actually export this bot. So I’m going to click on that,choose the version. When it comes to platform,make sure you check Alexa skills kit. This is the framework on which youneed to use to build Alexa skills. So you can actually exportthe front end of your bot, which is the utterances,the intents, the prompts, etcetera, as a JSON file, which youcan import into your Alexa skill. So I’m just going to export this. And this gives you a downloadablewhich is a ZIP file. I can just download that. You can then go to developer.amazon.comand create a skill. 
Back in the developer console, you have to enter a skill name and what we call a skill invocation name, which is the phrase a user says to start talking to your skill. So for a customer to start talking to us, they have to say 'open order flowers'. In the JSON editor pane, I can just drag and drop that same JSON file, which gives you a JSON representation of all that we built earlier. As you can see, all the sample utterances are here, and the slot types and the prompts are here as well. So it's essentially just a simple JSON file that I've imported into Alexa, which makes it so easy to start building. I'm just going to click on 'save and build model', which actually builds and trains the model as you go.

Now, with Alexa, I recommend building a separate back end, in the sense that you build a different Lambda function. Alexa gives you the option to host a Lambda function within your skill itself, so the 'Code' option you see here is precisely for that: it's basically a Lambda function that is just for the skill. And you have the ability to test your skill in the browser as well, so, even if you don't have a device, you can test your skill in the browser, which is what we're going to do as soon as our skill is built. And it's built, so I'm just going to test the skill now. Yeah, there we go. I can use my voice or text, but I'm just going to use my voice.

'Open order flowers and buy flowers.' 'What type of flowers would you like to order?' 'Tulips.' 'What day do you want the tulips to be picked up?' 'June 20th.' 'Pick up the tulips at what time on the 20th of June, 2020?' '10:00 A.M.' 'Okay, your tulips will be ready for pickup by 10:00 on the 20th of June, 2020. Does this sound okay?' 'Yes.' 'Thanks for using Order Flowers.'

And that was it. So, as you can see, we used the same bot that we built on Lex, we exported that bot, and we imported it into Alexa to start building the skill. Of course, I built my own back end for the skill, but it's a great way to convert that Lex bot you've been building into an Alexa skill. So you saw in the demo how we built this one chatbot on Lex, we deployed it to a web page, we deployed it to Slack, and then we took the JSON file from the Lex bot and imported it into Alexa to build an Alexa skill.

For this demo, I built two different back ends for the Alexa skill and the Lex bot. Theoretically you can use the same back end, but it becomes a little easier if you have two separate back ends, because the main difference between building a Lex bot and an Alexa skill is that with a Lex bot you actually get access to full transcriptions of what the user said, but with an Alexa skill all you get is the intents, the slots, and so on. You don't get the entire transcription of what the user said, so there is that subtle difference, which probably makes it more practical for you to have two different Lambda functions for your Lex chatbot and your Alexa skill.

The one thing to really keep in mind when building a chatbot and then building the same conversational experience on Alexa is that there is a certain difference between designing for text and designing for voice, and this is purely a design conversation we're having right now. So, just to illustrate those differences: when you're designing a chatbot, you're designing for reading and writing.
You’re designed for peopleto be reading something, and whereas for Alexa you’re designedfor people to be listening to it and for speaking out loud,and that makes a huge difference.Just to give you an example,take the differences between how the Harry Potter books read when you’re reading the bookversus actually looking at the film. You’ll see big differencesin the dialogues that are spoken out, because, one,you’re actually reading out another reading, and the otheryou’re actually listening to, so make sure you keep that in mindwhile designing for the two.When it comes to text in a chatbot, you can personalise your brandby the use of rich messages and emojis. You saw that we usedlike a nice flower emoji when the order was complete, and we also used richmessaging which of course with voice you can doin different ways. You can use somethingcalled speech cons, which are basicallythings like sound effects or words that are commonto a certain area that you hear a lot, like hurrah or congratulationsand things like that. Alexa gives you the optionof using these phrases to make it soundthat much more interesting.You also get accessto a huge sound library on Alexa that you can actually useas you saw in the demo that we created. Like I said earlier, designingfor reading and writing means that people havethe ability to skim read. I think we’ve all read textbooks, especially when there’s a lot of text,where we’re able to skim read and just pick out the pieceof information that we want. So, when you’re buildinga Lex experience, it’s okay to bea little more informational so that people can pick out the thingsthat they really ask you for. When it comes to voice,it’s not quite the same.People don’t have the we don’t have the option to sortof skip through what Alexa is saying. So try to be as brief as possible,and this is imperative. In fact, in the Alexa teamwe should say the thumb rule was the one breath rule. If it takes longer than one breathto say that entire sentence out, it means it’s too long.So keep it as brief as possible. And lastly, and this is very subtle,but when presenting choices, with something like Lex in text basically you can presentmultiple options just fine, so you can say something like, hey, would you like friesor salad? And people would replywith either of the two. In voice, make surethose choices are definite, because saying something like, hey,would you like fries or salad informally especially and we often hearthe response to that is yes, because people don’t knowif that is a choice in itself or they have to choose between the two.So make sure you havevery definitive choices like, hey, which one would you like,fries or salad? These are the subtle differencesbetween designing for text versus designing for voice. I think keeping in mindsome of these differences actually leadsto really strong experiences that your customers will come back to, and it is all about buildingsuch engaging experiences, so keep that in mindwhile you’re actually building it out. And that was it for my talk, so first of all thank you so muchfor attending this conference.It is a Virtual Summit, of course, but do check out the discovery zone. We have some machinelearning competency partners, like Accenture, Deloitte,and Snowflake, so go have a chat with them, see maybe if you can work with themand get to learn something as well. I had a great time doing this.Thank you so much. I would love to hearthe sort of conversational experiences you are building. 
So hit me up on Twitter. That's my Twitter handle right there, and I'd love to hear from you. Thanks again, and bye.
