How to design an Amazon Alexa skill, part 3


Missed part two? Find out the fundamentals to launching an Alexa skill.


Good afternoon, RocketMill, and Merry Christmas/Happy New Year internet, depending on when you are watching this. It is still 2017 in the RocketMill office, and that means we need to start thinking about how to make organic work in the New Year.

And so, I hand over to this man, John Truong, who’s been on our team for nearly six months now. He is a brilliant technical SEO executive on my team, and I’m delighted to say he has written a blog post, which I very much recommend to you. It’s got the top five SEO trends for 2018. If you’d like to learn more about all of these topics, I heartily recommend reading through John’s post and making sure you apply them to your campaigns in the New Year.

But for now, we’ll focus on the third and final instalment in our series of videos on building an Alexa skill for your business. Now, since our last presentation, Amazon have somewhat moved the goalposts and introduced some new features for business Alexa users, but we’ll have more on that towards the end.

Recap: designing an Amazon Alexa Skill

But now, let’s just recap last month’s presentation, where we looked at creating a skill via the Amazon Developer Services website, then using the Alexa Skills Kit Interaction Model Builder to design the voice user interface and map utterances to intents.

We covered one way of hosting our skill behind the scenes: hosting the source code on Amazon Web Services as a Lambda function. You can also do it as an HTTPS endpoint somewhere else on the web if you fancy.

We also looked at how to test and publish our skill, and that pretty much rounds it off. But that was only one type of skill, and there are four others. So, today I just want to talk about the Flash Briefing, which is the one that our client base, and potentially you, might want to think about.

The Flash Briefing skill

The flash briefing skill. Well, you probably expect me to start with a slide like this: how to build a flash briefing skill. But actually, I’m going to throw that idea away and tell you how not to build a flash briefing skill. Or at least some things you should consider if you’re thinking about doing so.

What is an Alexa Flash Briefing?

So, first and foremost, what is an Alexa Flash Briefing? Well, you can think of one as a bit like a personalised radio news bulletin. Users choose the sources they want to hear when they ask for their flash briefing, and they can call this up on demand. In the morning, say, you might listen to major news publishers, local news sites, industry sources, weather forecasts. Basically, it’s your daily personal news bulletin.

How do Flash Briefings work?

And, actually, how do Alexa Flash Briefings work? They’re actually pretty simple. Firstly, they have a predefined interaction model. So, you’ll recall with custom interaction skills we had to define how users interact with our skill. We had to define the utterances that would map to the intent of the functions behind the scenes. But, actually, with a flash briefing, you’ve only really got two ways of calling upon it. You’ve got: “What’s my flash briefing?” or “What’s in the news?” So, they’re predefined, so in that sense they’re a bit easier.

They’re also quite easy to build because all they do is pull in content from a feed. So, it could be a JSON feed, which is Amazon’s preference, or you can use an RSS feed.

There are two types of content you can put into your feed. One is text content, which will be read in Alexa’s voice; and one is audio, which will be read in a voice of your choice. That rhymes.

And there are some key elements to a feed, which need to be there. You have the preamble, which basically introduces the feed. So, that could be: “From the RocketMill press centre”. The name of your feed, which is how users will distinguish that feed from another. The update frequency, which can be hourly, daily or weekly, and needs to be fairly regular. If you don’t publish often, this probably isn’t a skill type for you. Content type, which is again just text or audio: it’s the type of content you’re going to be putting into your feed. And finally, a custom error message, which is kind of like a maintenance message if your skill isn’t available because of downtime or because of maintenance or whatever the case might be.
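To make the two content types concrete, here is a minimal Python sketch of what individual items in a JSON feed might look like. The field names (uid, updateDate, titleText, mainText, streamUrl, redirectionUrl) are the ones I understand Amazon's Flash Briefing feed format to use, so check them against the current documentation. The helper name make_feed_item, the example URLs and the HTTPS check are my own assumptions, not anything shown in the talk.

```python
import json
from datetime import datetime, timezone


def make_feed_item(uid, title, published, text=None, audio_url=None, link=None):
    """Build one feed item as a dict (hypothetical helper, not an Amazon API).

    `published` should be a timezone-aware datetime.
    """
    if text is None and audio_url is None:
        raise ValueError("an item needs text, audio, or both")
    item = {
        "uid": uid,
        # Timestamp in UTC; the exact format is an assumption to verify.
        "updateDate": published.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.0Z"
        ),
        "titleText": title,
        # Text items are read in Alexa's voice; left empty for audio-only items.
        "mainText": text or "",
    }
    if audio_url is not None:
        # Assumption: audio has to be served over HTTPS.
        if not audio_url.startswith("https://"):
            raise ValueError("audio must be served over HTTPS")
        item["streamUrl"] = audio_url
    if link is not None:
        item["redirectionUrl"] = link
    return item


if __name__ == "__main__":
    when = datetime(2017, 12, 1, 9, 0, tzinfo=timezone.utc)
    feed = [
        make_feed_item("urn:uuid:text-1", "Text item", when,
                       text="A short story written to be read aloud."),
        make_feed_item("urn:uuid:audio-1", "Audio item", when,
                       audio_url="https://example.com/briefing.mp3"),
    ]
    print(json.dumps(feed, indent=2))
```

The preamble, feed name, update frequency and error message described above are set when you configure the skill, so they do not appear in the items themselves.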

Text content does not support SSML

A key thing to take away here is if you’re choosing between the content types, text content in an Alexa Flash Briefing Skill does not support SSML, which is speech synthesis mark-up language. Now to show you why that’s important, I’ve got a short clip of Alexa on my desk.

Alexa, start RocketMill company meeting.

[Alexa: Ladies and gentlemen, it is the one, the only, Chris Philpot.]

So, you may recognise this as part of Alexa’s introduction to my last presentation. Now, besides stroking my burgeoning ego, it also demonstrates three SSML tags, which you can use to control Alexa’s voice in a custom interaction skill.

So, if we have a look at the source code, I’ve highlighted the three tags in question. Firstly, there is the audio tag, which allows you to play back low quality audio hosted at an HTTPS URL somewhere on the web. We have the break tag, which allows us to trigger Alexa to pause. You can either specify how strong the pause should be or you can give a duration. And we have the phoneme tag, which we can use to control how Alexa pronounces words by specifying a phonemic pronunciation.

So, because I’m pedantic, and I didn’t like my surname being mispronounced – any more than I like it being misspelt, thank you guys – I specified the IPA pronunciation, using the symbols you’ll see in pretty much any good dictionary. You can also use X-SAMPA. There’s also similarly a ‘say as’ tag, which you can use to tell Alexa whether a word should be spelled out or interpreted as a cardinal number, an ordinal number, digits, a phone number. You can even beep out expletives, which we haven’t had to do just yet.
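As a rough illustration of the tags just described, the Python sketch below assembles an SSML string using audio, break, phoneme and say-as, then checks that it is well-formed XML. The tag and attribute names are standard Alexa SSML. The function names, the jingle URL, the position of the pause and the IPA spelling of the surname are my own assumptions for the example, not the source code shown on the slide.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def say_as(text, interpret_as):
    """Wrap text in a say-as tag, e.g. 'spell-out', 'ordinal' or 'expletive'."""
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(text)}</say-as>"


def build_intro_ssml(jingle_url, surname, surname_ipa):
    """Build an introduction similar to the one in the clip (illustrative only)."""
    parts = [
        "<speak>",
        # Short audio clip hosted at an HTTPS URL.
        f"<audio src={quoteattr(jingle_url)}/>",
        "Ladies and gentlemen, ",
        # A pause, given by strength here; time="500ms" would set a duration.
        '<break strength="strong"/>',
        "it is the one, the only, Chris ",
        # Phonemic pronunciation in IPA; alphabet="x-sampa" is the alternative.
        f'<phoneme alphabet="ipa" ph={quoteattr(surname_ipa)}>'
        f"{escape(surname)}</phoneme>.",
        "</speak>",
    ]
    return "".join(parts)


if __name__ == "__main__":
    ssml = build_intro_ssml("https://example.com/jingle.mp3", "Philpot", "ˈfɪlpɒt")
    ET.fromstring(ssml)  # raises ParseError if the markup is not well formed
    print(ssml)
    print(say_as("SEO", "spell-out"))
```

Remember that this only applies to custom interaction skills: the same markup placed in a Flash Briefing text item would not be interpreted as SSML.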

Crucially, Flash Briefings take away the ability to control exactly what Alexa sounds like because they don’t support SSML. But you can use far longer and higher quality audio recordings than you can with a Custom Interaction Skill. To put a number on that, with a Custom Interaction Skill you can play back up to 90 seconds of audio at a bit rate of 48 kilobits per second. Compare that to a Flash Briefing, where you get 10 minutes at 256 kilobits per second. For those of you who don’t talk kbps like me, you can think about that like the difference between an old ’90s American sitcom, where it’s kind of a square picture, a bit blocky, a bit low quality, and a 4K Netflix stream. We’re talking absolute chalk and cheese in terms of audio quality.
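If you want to sanity-check a recording before publishing it, the figures above can be turned into a small check. This is a sketch only: the limits are the ones quoted in this talk (they may have changed since), both are treated as upper bounds, and the names AudioLimit, LIMITS and audio_problems are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioLimit:
    max_seconds: int
    max_kbps: int


# Figures as quoted in the talk; verify against current Amazon documentation.
LIMITS = {
    "custom": AudioLimit(max_seconds=90, max_kbps=48),
    "flash_briefing": AudioLimit(max_seconds=600, max_kbps=256),
}


def audio_problems(skill_type, seconds, kbps):
    """Return a list of human-readable problems; an empty list means it fits."""
    limit = LIMITS[skill_type]
    problems = []
    if seconds > limit.max_seconds:
        problems.append(f"too long: {seconds}s > {limit.max_seconds}s")
    if kbps > limit.max_kbps:
        problems.append(f"bit rate too high: {kbps}kbps > {limit.max_kbps}kbps")
    return problems


if __name__ == "__main__":
    # A two-minute, 128 kbps bulletin fits a Flash Briefing but not a custom skill.
    print(audio_problems("flash_briefing", 120, 128))
    print(audio_problems("custom", 120, 128))
```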

So, while you can add audio to a Custom Interaction Skill, the quality is poor, and it’s best suited to short soundbites and audio effects. And so, therefore, if you are building a Flash Briefing, you should be using audio for maximum control with non-synthesized voices, or just choose your words carefully so that Alexa pronounces them in the best way possible. You can get good results with a text skill, as I’m about to demonstrate with some flash briefing examples.

Flash briefing examples

So, first up, here’s a really good audio stream from the BBC.

[Alexa: In headlines from BBC News.]

[Journalist: The world’s largest lithium ion battery has started delivering power into the electricity grid in Australia. The 100-megawatt battery is connected to a wind farm. As Hywel Griffith explains from Sydney.]

As you’d expect, the BBC feed is just right. High quality audio clips of human presenters taken from their existing audio reports or radio bulletins. If you don’t have the means to create your own audio recordings on a regular basis, then a text-fed Alexa skill can sound pretty darn good too. So, here’s our local news.

[Alexa: From Brighton and Hove News, a Brighton builder replaced hundreds of pounds worth of lead stolen from a church roof free of charge so the Christmas fair could still go on last weekend. Lee Broughton of Modernist Carpentry had been fixing the leaky roof of St. Andrew’s Church Hall in Hillside, Moulsecoomb, when the lead….]

Now that’s pretty good. Alexa was being fed with text from a website. Did you pick up on the mistake at the end? “When the lead…”. You can see that it’s truncated in the feed, and so Alexa just read out a truncated message. Because it’s very easy to build a Flash Briefing by connecting Alexa to an existing feed of data, it’s very easy to forget to tailor that content to this new, quote-unquote, audience. To give you a demo of that, here’s what happened when I connected the RocketMill blog to an Alexa Flash Briefing.
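One way to guard against that kind of cut-off is to trim each excerpt back to its last complete sentence before it goes into the feed. The Python sketch below is a deliberately naive illustration with a hypothetical function name: it treats any full stop, question mark or exclamation mark followed by a space as the end of a sentence, so an abbreviation such as “St.” can fool it, and a production version would need something smarter.

```python
import re

# A trailing ellipsis ("...", "…" or "….") marks an excerpt as truncated.
_ELLIPSIS_TAIL = re.compile(r"\s*(?:\.{2,}|…)+\.?\s*$")
# Naive sentence boundary: terminal punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"[.!?](?=\s)")


def trim_to_last_sentence(text):
    """Drop a trailing, unfinished sentence so it is never read aloud."""
    stripped = text.strip()
    truncated = _ELLIPSIS_TAIL.search(stripped) is not None
    if not truncated and stripped.endswith((".", "!", "?")):
        return stripped  # already ends cleanly
    body = _ELLIPSIS_TAIL.sub("", stripped)
    ends = [match.end() for match in _SENTENCE_END.finditer(body)]
    if not ends:
        return body  # no earlier boundary found; leave for a human to review
    return body[: ends[-1]]


if __name__ == "__main__":
    excerpt = (
        "A builder replaced the stolen lead free of charge. "
        "He had been fixing the leaky roof when the lead…."
    )
    print(trim_to_last_sentence(excerpt))
```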

[Alexa: In digital marketing news, have you ever considered how important alt text is to those using screen readers? I explain how adding a short line of descriptive text to your images can make your content more accessible.]

So, this is the text content of one of our blog posts before the ‘read more’ link. And it’s been written exclusively for the web, as you could hear. If we wanted to adapt this for a new audience, we could either make sure our intros were kind of device agnostic and written with Alexa somewhat in mind, or we could use a magic field in WordPress to add a text description of the post that was designed to be read aloud. But, hey, this was just an experiment working with data which we already had, a feed which already existed. You wouldn’t get a major tech publisher doing that, right?

Let’s listen to a clip from the Alexa flash briefing of TechRadar, a technology media website which boasts 20 million monthly visitors and which calls itself the market leading authority on all things tech. Here’s their skill.

[Alexa: In tech news, the best VR headset for your money. From Oculus and HTC Vive to PlayStation VR and Google Cardboard. Your cash back offer is available on select Micromax smartphones. Bharat 2 plus, Bharat 3, Bharat 4.]

You can hear how poorly this content has been optimised for an Alexa Flash Briefing feed. Now at first, I thought it was reading out bar ad placements from the source code of the post. It transpires it’s actually a story about a cash back deal on the Bharat range of smartphones. You probably haven’t heard of them because they’re sold in India. This is the TechRadar News UK feed. So, you can see this is completely irrelevant for our audience, and even if it wasn’t, it’s really poorly written as a news bulletin. It’s not a news bulletin at all really.

And users have made it very clear that that’s what they think of the skill and, to be fair to TechRadar, they’re far from alone. If this is an experiment, it shouldn’t be public. It reflects really poorly on a tech brand. Users come to them expecting news about technology.

I could say the same for CNET, and you can see what this user has said about their Flash Briefing, which just reads out the post title in each case. CNET is a technology news brand from CBS Interactive. They’re roughly the 220th most popular website on the internet according to SimilarWeb, and between this and its apps, they produce and distribute plenty of high quality audio/visual content. So why, with reviews like this, not pull that through to the Flash Briefing feed?

The key takeaway, guys, is that it’s pretty darned easy to build a Flash Briefing for Alexa. You just need a feed of regular newsworthy content. But optimise your content for the platform, because if you’re lazy, users will vote with their feet.

What type of Skill is right for your business?

And so, I’ll conclude by just covering off that question I asked very early on. What type of skill is right for your business? As far as I’m concerned, if you are a business with small frequent interactions, so Uber, National Rail, Jamie Oliver, anything where you can guide users through a simple step-by-step process or a conversion funnel, a Custom Interaction Skill with your brand might well work pretty well.

Similarly, if you are a publisher with regular editorial output, you should launch a good quality, well considered text- or audio-based Flash Briefing Skill, but make sure you optimise for the platform.

If you’re an e-commerce brand with fewer purchases than, say, an Uber-type brand where you expect users to dip in quite often, if you’re selling products to one person occasionally, realistically a skill might not be for you. But you should certainly be looking to appear on voice-friendly platforms like Amazon, of course.

And ultimately, if a Skill isn’t going to make your customers’ lives easier, if it’s just a gimmick, then invest the time in your existing web content, where you’ve already got an audience. Remember how when the App Store first launched on iOS, every Tom, Dick and Harry decided: “Hey, we need our own app”? How many were actually useful? How many were installed once by someone who worked in the marketing team at that business and then immediately uninstalled because they were taking up space? Don’t waste your time and money, because your users aren’t going to invest their time in a really poorly designed skill if it doesn’t benefit their lives. So instead, improve your website and push for voice results in regular web search, which will be a big focus for us in the first part of the New Year.

And just finally, just to cover this off. While I was putting this presentation together, Alexa for Business launched. Now this might be of interest because it allows you to publish Alexa skills purely for internal use; they don’t go on the public Alexa skills store. So, you can think of it almost like an Alexa intranet. These skills can do things like control your conference room bookings, manage meetings and so on, and so for many businesses this might be where an Alexa Skill and voice technology can benefit you. Not in terms of your customer engagement, but in terms of streamlining internal processes.

That’s all for this month. Thank you very much.