Postcode Recognition Telephone Service with Twilio


This guide assumes you already know a little Python and Twilio. Using Python with the Flask framework, I’ll show you how I handle speech recognition specifically for postcodes in Twilio. The goal of this code is to let a caller say their postcode, either normally or phonetically, tidy up the transcription of that postcode, and then direct them to a local agent based on their location.

If you want the really “interesting” part (I find it interesting, I don’t know how many other people will), skip down to the Is It Correct Phonetic section.

Code Explained

Imports

These are all the modules needed to make the code work. textblob gives us a simple Naive Bayes classifier, which we train to decide whether a caller’s reply is positive (“yes”) or negative (“no”). text2digits does what it says and converts number words in text to digits.

import re
from flask import Flask, request, session
from twilio.twiml.voice_response import VoiceResponse, Gather
from twilio.rest import Client
import os
from textblob.classifiers import NaiveBayesClassifier
from datetime import datetime
from text2digits import text2digits

Setting up

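Next we need to set up a few things before we can proceed. The secret key is needed because the app keeps the retry count and the postcode in Flask’s session, and Flask will not store session data without one. Then we add in some postcode hints and phonetic hints, which Twilio uses when doing speech recognition later. Finally we create the Twilio client and train the yes/no classifier.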

dateTimeObj = datetime.now()
now = dateTimeObj.strftime("%d/%m/%Y, %H:%M:%S")

app = Flask(__name__)
app.secret_key = "xxxxxxxxxxxxxxxxxx"

postcode_hints = "insert postcode hints here like RG12 7HT"

phonetic_hints = "insert phonetic hints here like Romeo Golf one two seven Hotel Tango"

# TWILIO ACCOUNT DETAILS

# Set them as environment variables when you start
account_sid = os.environ['TWILIO_ACCOUNT_SID']
auth_token = os.environ['TWILIO_AUTH_TOKEN']
client = Client(account_sid, auth_token)

# Training data for yes/no sentiment analysis
train = [
    ("Yes", "pos"),
    ("Yeah", "pos"),
    ("Yep", "pos"),
    ("yes", "pos"),
    ("yes that's correct", "pos"),
    ("that's right", "pos"),
    ("no that is incorrect", "neg"),
    ("no that's incorrect", "neg"),
    ("no that's not right", "neg"),
    ("that's not correct", "neg"),
    ("no that's wrong", "neg"),
    ("nope", "neg"),
    ("No.", "neg"),
    ("no", "neg"),
    ("nah", "neg")
]

# Classifier takes the training data for use later
cl = NaiveBayesClassifier(train)
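
The phonetic hints can be tedious to type out by hand, so here is a minimal sketch of one way to generate them from ordinary postcodes. The names NATO, DIGIT_WORDS, to_phonetic and build_hints are my own for illustration and are not part of Twilio or of the code above. It assumes hints are joined with commas, in the same way as the yes/no hints used later in this guide, and "AB1 2CD" is just a made-up sample postcode.

```python
# Spelling alphabet used for the phonetic hints (illustrative helper, not a Twilio API)
NATO = {
    "A": "Alpha", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliet",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "Xray", "Y": "Yankee",
    "Z": "Zulu",
}

DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def to_phonetic(postcode):
    """Spell a postcode out one character at a time, skipping spaces."""
    words = []
    for character in postcode.upper():
        if character in NATO:
            words.append(NATO[character])
        elif character in DIGIT_WORDS:
            words.append(DIGIT_WORDS[character])
    return " ".join(words)


def build_hints(postcodes):
    """Join the spelled-out postcodes into one comma-separated hint string."""
    return ", ".join(to_phonetic(postcode) for postcode in postcodes)


sample_hints = build_hints(["RG12 7HT", "AB1 2CD"])
print(sample_hints)
```

The resulting string could then be assigned to phonetic_hints in place of the placeholder text.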

Welcome

The welcome section plays a message and sets a count to zero. We’ll use this count variable later so we know how many times a caller has tried to speak their postcode.

# Welcome
@app.route("/welcome_b", methods=["GET", "POST"])
def welcome_b():
    resp = VoiceResponse()

    # Setting count = 0 so we know how many times they retry their postcode
    session["count"] = 0
    print("count = " + str(session["count"]))

    resp.say("Welcome")

    resp.redirect("/get_postcode_b")

    return str(resp)

Get Postcode

The first real part of the call is to collect the caller’s postcode. We use Twilio’s gather verb to do this. You can see we’ve set the language to en-GB and have used our postcode_hints variable to help Twilio transcribe what the caller might be saying. After Twilio has finished gathering the speech, it will go to the gather_get_postcode section.

# Get Postcode
@app.route("/get_postcode_b", methods=["GET", "POST"])
def get_postcode_b():
    resp = VoiceResponse()

    gather = Gather(input="speech",
                    speechTimeout="auto",
                    language="en-GB",
                    hints=postcode_hints,
                    action="/gather_get_postcode_b")

    # Ask for postcode
    gather.say("Please say your full postcode now.")
    resp.append(gather)

    # Didn't hear anything, so go to /no_b, which checks the retry count
    resp.redirect("/no_b")

    return str(resp)
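
It can help to see roughly what this route sends back to Twilio. The sketch below rebuilds an equivalent document using only Python’s standard library instead of the Twilio helper, so treat it as an approximation: the helper’s real output may differ in small details such as the XML declaration. The function name gather_twiml is my own.

```python
import xml.etree.ElementTree as ET


def gather_twiml(prompt, hints, action, fallback):
    """Build a Response containing a speech Gather followed by a Redirect."""
    response = ET.Element("Response")
    gather = ET.SubElement(response, "Gather", {
        "input": "speech",
        "speechTimeout": "auto",
        "language": "en-GB",
        "hints": hints,
        "action": action,
    })
    # The prompt is nested inside the Gather so the caller can answer while it plays
    say = ET.SubElement(gather, "Say")
    say.text = prompt
    # Only reached if the Gather finishes without any speech
    redirect = ET.SubElement(response, "Redirect")
    redirect.text = fallback
    return ET.tostring(response, encoding="unicode")


twiml = gather_twiml(
    "Please say your full postcode now.",
    "RG12 7HT",
    "/gather_get_postcode_b",
    "/no_b",
)
print(twiml)
```

The Redirect comes after the Gather, which is why it only runs when the caller stays silent.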

Gather Get Postcode

This section is just used to see if the caller said anything or not by checking Twilio’s SpeechResult. If they said something, we’ll process it and ask if that’s the right postcode. If we didn’t hear anything, we’ll go to our no fallback section.

# Gather Get Postcode
@app.route("/gather_get_postcode_b", methods=["GET", "POST"])
def gather_get_postcode_b():
    resp = VoiceResponse()

    # If the caller said something
    if "SpeechResult" in request.values:
        # Set the postcode to be what they said
        session["postcode"] = request.values["SpeechResult"]
        resp.redirect("/is_it_correct_b")
    else:
        # Didn't hear anything, so go to /no_b, which checks the retry count
        resp.redirect("/no_b")

    return str(resp)

Is It Correct

If the caller said something, we need to process what they said and then ask if we recognised it correctly. First of all we need to remove any weird symbols from the transcription. Then we remove any spaces because we want to be left with just a simple string. Finally we remove the last 3 characters from the postcode because we only really care about the first half.

Once we’ve done all that processing, we ask the caller if their postcode starts with whatever we’ve gathered and ask them to say yes or no, again with Twilio’s gather verb.

# Is It Correct
@app.route("/is_it_correct_b", methods=["GET", "POST"])
def is_it_correct_b():
    resp = VoiceResponse()

    # Remove any weird symbols from the postcode
    full_postcode = session["postcode"].replace("-", "").replace(".", "").replace("/", "").replace("?", "").replace(",", "").replace("#", "").replace(" for ", "4")
    # Remove any spaces from postcode
    postcode_no_spaces = full_postcode.replace(" ", "")
    # Remove last 3 characters from postcode
    session["postcode_start"] = postcode_no_spaces[:-3]
    print(session["postcode_start"])

    gather = Gather(input="speech",
                    speechTimeout="auto",
                    language="en-GB",
                    speechModel="numbers_and_commands",
                    hints="yes, no, yeah, nope",
                    action="/gather_is_it_correct_b")

    # Ask if correct
    gather.say("Is the first part of your postcode " + session["postcode_start"] + "? Please say yes or no.")
    resp.append(gather)

    # Didn't hear anything, so go to /no_b, which checks the retry count
    resp.redirect("/no_b")

    return str(resp)
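
To make the tidying steps easier to try outside of a call, here is the same idea pulled out into a small function with no Flask or Twilio involved. The function name postcode_start and the sample transcriptions are my own; the real transcriptions Twilio returns may look different.

```python
import re


def postcode_start(transcription):
    """Return the first half of a spoken postcode, following the steps above."""
    # Remove the symbols that sometimes appear in the transcription
    cleaned = re.sub(r"[-./?,#]", "", transcription)
    # A spoken "four" is sometimes transcribed as the word "for"
    cleaned = cleaned.replace(" for ", "4")
    # Remove the spaces so we are left with one simple string
    cleaned = cleaned.replace(" ", "")
    # Drop the last 3 characters, which are the second half of the postcode
    return cleaned[:-3]


print(postcode_start("RG12 7HT"))
print(postcode_start("R G 1 2 7 H.T."))
print(postcode_start("RG12"))
```

The first two calls give "RG12". The third gives "R", which shows that this approach relies on the caller saying their full postcode: if they only say the first half, the slice still removes 3 characters.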

Gather Is It Correct

Whenever we use a gather verb in Twilio, we then check if the caller said anything by checking the SpeechResult. We’re hoping they said yes, so we use our sentiment analyser to see if the speech result was “pos” or “neg”. If the sentiment is positive, happy days! We recognised their postcode and we can now transfer them to a local agent.

If the caller answered anything other than yes, we’re going to go to our no section which is going to act as a fallback.

# Gather Is It Correct
@app.route("/gather_is_it_correct_b", methods=["GET", "POST"])
def gather_is_it_correct_b():
    resp = VoiceResponse()

    # If the caller said something
    if "SpeechResult" in request.values:
        # Set what the caller said
        answer = request.values["SpeechResult"]
        # Do sentiment analysis on the text to return pos or neg
        sentiment = cl.classify(answer)

        # If they said something like "yes"
        if sentiment == "pos":
            resp.redirect("/local_area_transfer_b")
        # If they said something like "no"
        else:
            # They didn't confirm, so go to /no_b, which checks the retry count
            resp.redirect("/no_b")

    else:
        # Didn't hear anything, so go to /no_b, which checks the retry count
        resp.redirect("/no_b")

    return str(resp)

What is the count?

The no section takes the count variable that we set at the very start and adds 1 to it. We end up here whenever the caller says “no” or says nothing at all. If this is the first time, the count will be equal to 1 and we go to section no_1. If it’s equal to 2, we go to section no_2. In reality, these just loop back round and ask the caller for their postcode again.

If this is the third time, we go to a new section that lets us collect the postcode phonetically as one last attempt.

# What is the count
@app.route("/no_b", methods=["GET", "POST"])
def no_b():
    resp = VoiceResponse()

    session["count"] += 1
    print("count = " + str(session["count"]))
    if session["count"] == 1:
        resp.redirect("/no_1_b")
    elif session["count"] == 2:
        resp.redirect("/no_2_b")
    else:
        resp.redirect("/get_phonetic_postcode_b")

    return str(resp)
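
The routing decision is easy to check on its own if you separate it from Flask’s session. This is only a sketch: next_route is a name I’ve made up, and it returns the route as a string rather than building a VoiceResponse.

```python
def next_route(count):
    """Return the route to redirect to after the given number of failed attempts."""
    if count == 1:
        return "/no_1_b"
    if count == 2:
        return "/no_2_b"
    # Third attempt onwards: fall back to the phonetic postcode
    return "/get_phonetic_postcode_b"


# Simulate a caller failing four times in a row
routes = [next_route(attempt) for attempt in range(1, 5)]
print(routes)
```

Notice that the fourth failure goes to the phonetic section again, because the count is never capped or reset. You may want to add a final route, such as a transfer to a general agent, for callers who get that far.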

Get Phonetic Postcode

Now we can ask the user to say their postcode phonetically. This is our last chance of capturing what the caller says. You can see we’re now using the phonetic hints in the gather verb.

# Get Phonetic Postcode
@app.route("/get_phonetic_postcode_b", methods=["GET", "POST"])
def get_phonetic_postcode_b():
    resp = VoiceResponse()

    gather = Gather(input="speech",
                    speechTimeout="auto",
                    language="en-GB",
                    hints=phonetic_hints,
                    action="/gather_get_phonetic_postcode_b")

    # Ask for postcode
    gather.say("Please say your full postcode now, using the phonetic alphabet.")
    resp.append(gather)

    # Didn't hear anything, so go to /no_b, which checks the retry count
    resp.redirect("/no_b")

    return str(resp)

Is It Correct Phonetic

Now is the interesting part. I’ve written some logic that should be able to take a phonetic postcode and tidy it up into just the first half.

You’ll notice the first thing we have is our text2digits module which we’ve set as t2d.

We take the full phonetic postcode and set it to what the caller said. Then we replace all the weird symbols again and make sure that “ for ” is replaced with “4”.

Now we create a new variable called tidy_postcode and set it to be blank. We use this in the for loop that comes next. The for loop finds any digits and makes sure there are spaces around them. We need this because the next bit of code removes the last 3 words by splitting on spaces. If we have a postcode that ends up like “Romeo Golf 127 Hotel Tango”, then “127” counts as a single word and the whole of it would be deleted. The for loop makes sure that 1, 2 and 7 are all separated. Then we finish by replacing any run of more than one space in tidy_postcode with a single space.

Now we can remove the last 3 words from our postcode so we’re left with “Romeo Golf 1 2”. If we end up with “Romeo Golf one two” instead, we use t2d to convert the number words into digits.

After that we can extract just the first letter from each of the words (this is why it was important to have a digit at the end and not a number word). Then we can extract just the number or numbers. Finally we merge them all together and use new_postcode = "".join(new_postcode) to join it all up without spaces. Then as a final measure we make sure it’s all in capitals so we should be left with a very neat and tidy “RG12” which we can again ask the caller if that’s the correct postcode.

# Is It Correct Phonetic
@app.route("/is_it_correct_phonetic_b", methods=["GET", "POST"])
def is_it_correct_phonetic_b():
    resp = VoiceResponse()

    t2d = text2digits.Text2Digits()

    # Get the postcode the caller said
    full_postcode = session["postcode"]
    # Remove symbols and replace " for " with 4
    full_postcode = full_postcode.replace("-", "").replace(".", "").replace("/", "").replace("?", "").replace(",", "").replace("#", "").replace(" for ", "4")
    # Set the tidy postcode to be blank for now
    tidy_postcode = ""
    # Find any digits and make sure they have a space between them
    for character in full_postcode:
        if not character.isalpha() and character != ' ':
            # Check tidy_postcode isn't empty first, in case the transcription starts with a digit
            if tidy_postcode and tidy_postcode[-1] != ' ':
                tidy_postcode += ' '
            tidy_postcode += character
            tidy_postcode += ' '
        else:
            tidy_postcode += character
    tidy_postcode = re.sub(' +', ' ', tidy_postcode)

    # Remove last 3 words
    postcode_half = tidy_postcode.rsplit(' ', 3)[0]
    # Convert any number words into numbers
    postcode_half = t2d.convert(postcode_half)

    # Extract just the first letter from the words
    postcode_words = postcode_half.split()
    postcode_words = [words for words in postcode_words if words.isalpha()]
    postcode_letters = [letter[0] for letter in postcode_words]

    # Extract just the numbers
    postcode_numbers = list(filter(str.isdigit, postcode_half))

    # Join letters and numbers to make the first half of the postcode
    new_postcode = postcode_letters + postcode_numbers
    # Remove the spaces
    new_postcode = "".join(new_postcode)
    # Make sure it's all capitals
    session["new_postcode"] = new_postcode.upper()

    gather = Gather(input="speech",
                    speechTimeout="auto",
                    language="en-GB",
                    speechModel="numbers_and_commands",
                    hints="yes, no, yeah, nope",
                    action="/gather_is_it_correct_phonetic_b")

    # Ask if correct
    gather.say("Is the first part of your postcode " + session["new_postcode"] + "? Please say yes or no.")
    resp.append(gather)

    # Didn't hear anything, so go to /no_b, which checks the retry count
    resp.redirect("/no_b")

    return str(resp)
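
If you want to experiment with the phonetic logic without making a call, the sketch below follows the same steps as a plain function. It is not the exact code from the route: NUMBER_WORDS is a small stand-in I’ve written in place of text2digits so the example has no extra dependencies, and phonetic_to_first_half is a made-up name.

```python
import re

# Stand-in for text2digits, covering single digits only (an assumption for this sketch)
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def phonetic_to_first_half(transcription):
    """Turn a phonetically spoken postcode into the first half, e.g. RG12."""
    # Remove symbols and replace " for " with 4, as before
    cleaned = re.sub(r"[-./?,#]", "", transcription).replace(" for ", "4")
    # Put spaces around every digit so "127" becomes three separate words
    spaced = re.sub(r"(\d)", r" \1 ", cleaned)
    words = spaced.split()
    # Remove the last 3 words, which are the second half of the postcode
    first_half = words[:-3]

    letters = []
    digits = []
    for word in first_half:
        if word.isdigit():
            digits.append(word)
        elif word.lower() in NUMBER_WORDS:
            digits.append(NUMBER_WORDS[word.lower()])
        elif word.isalpha():
            # Keep just the first letter of each phonetic word
            letters.append(word[0])

    # Letters first, then digits, all in capitals
    return "".join(letters + digits).upper()


print(phonetic_to_first_half("Romeo Golf 127 Hotel Tango"))
print(phonetic_to_first_half("Romeo Golf one two seven Hotel Tango"))
print(phonetic_to_first_half("Sierra Whiskey 1 Alpha 1 Alpha Alpha"))
```

The first two calls give "RG12". The third gives "SWA1" rather than "SW1A", because the letters are always placed before the digits. The route above joins letters and digits the same way, so postcodes whose first half ends in a letter would need extra handling.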

Thank you for taking the time to read through this tutorial. If you’d like to know more or just want to get in touch, you can email me at me@willhelliwell.com.

Happy coding!