Developers don’t need to be building a voice or messaging tool to find Twilio’s APIs useful. The company, which is best known for its communications platform, also offers a wide variety of APIs that help developers embed more functionality in their apps, including authentication, session management and data synchronization.
The new TwilioAuth SDK in the Twilio Console can be used to add push notification authentication, passwordless logins and in-app transaction approval to apps on iOS and Android.
With recent successful attacks on SMS-based authentication, the ability to handily add an authentication agent to your app is definitely useful.
If a developer is only targeting Android, there’s a new Twilio Verification SDK for Android in developer preview that works with the Google SMS Retriever API. This lets apps registered with Google Play Services use SMS to verify the user’s identity without giving the app access to all of the user’s text messages, just the SMS the app has told the API it’s looking for.
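To make the "only the SMS the app is looking for" idea concrete, here is a minimal Python sketch. It is not the Google SMS Retriever API or the Twilio SDK; the function name, the sample hash value and the six-digit code format are all assumptions for illustration. It models the convention that a verification SMS carries an 11-character string identifying the app, and that messages without that string are never handed over.

```python
import re
from typing import Optional

# Hypothetical 11-character app hash; a real one is derived from the app's signing key.
APP_HASH = "FA+9qCX9VSu"


def extract_code(inbox: list[str], app_hash: str = APP_HASH) -> Optional[str]:
    """Return the one-time code from the first message tagged with app_hash.

    Messages that do not end with the hash are skipped untouched, which
    models an app that cannot read the rest of the user's texts.
    """
    for message in inbox:
        if not message.rstrip().endswith(app_hash):
            continue  # not addressed to this app
        match = re.search(r"\b(\d{6})\b", message)  # assumed six-digit code
        if match:
            return match.group(1)
    return None
```

Called with a mixed inbox, the function ignores personal messages entirely and returns only the code from the message that carries the app's hash.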
Google and Twilio collaborated on this API to make it easier for developers around the world to use without worrying about geographical numbers. iOS doesn’t allow programmatic access to text messages, so there’s no equivalent for iPhones, but on Android this lets apps verify users by phone number instead of email, which is harder to fake.
The SDK can spot fraudsters using VoIP numbers rather than numbers connected to real devices, and you don’t have to worry about users making a typo when they fill in their email address and never completing the signup process.
This Twilio Authentication service is part of what Twilio calls its Engagement Cloud. This is a package of APIs for different ways to communicate with customers. It includes the Notify API to send push notifications, the TaskRouter API to connect customers to the agent in a call center who has the right skills or speaks the right language, and the new Proxy API for connecting a customer to a specific employee without sharing personal phone numbers that ought to stay private.
Lyft uses these services to connect riders and drivers. Morgan Stanley Wealth Management will start using Proxy to let clients text a broker without either side sharing a real phone number.
Another customer is Nordstrom.
“Nordstrom wants the personal shoppers to be high touch and great experience. They want the shoppers to text with customers when they see something that would suit them, but they won’t want to give out the customer’s personal phone number,” Twilio CEO Jeff Lawson explained to The New Stack. Privacy is only part of that; it’s also about session management. “What if my usual personal shopper goes away? You want the next person to seamlessly take over the conversation. With Remind, they want to proxy the communications between students and tutors so it’s safe and secure, and they want to log and report the calls.”
Proxy doesn’t just handle the routing of calls and texts without disclosing the original numbers; it also includes session management and logging. For the routing, it manages and load balances a pool of phone numbers so it can provision temporary numbers in real time, avoiding queuing and latency, and it uses geographically local numbers to keep costs down. A session is a JSON representation of a conversation between two people, which might cover multiple channels; if a text message and a voice call are both about the same order, you want them grouped into the same session and logged together.
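A rough way to picture this is a small Python model of a masked session backed by a number pool. This is an illustrative sketch only, not the Proxy API: the class names, fields and the 555 phone numbers are hypothetical, and the real service's JSON schema may look quite different.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Interaction:
    channel: str   # e.g. "sms" or "voice"
    sender: str    # participant label, never a phone number
    summary: str


@dataclass
class Session:
    session_id: str
    proxy_number: str              # masked number both parties call or text
    participants: dict[str, str]   # label -> real number, kept server-side
    interactions: list[Interaction] = field(default_factory=list)

    def log(self, channel: str, sender: str, summary: str) -> None:
        """Record an interaction, whatever channel it arrived on."""
        self.interactions.append(Interaction(channel, sender, summary))

    def public_view(self) -> str:
        """JSON that is safe to show either party: real numbers are left out."""
        data = asdict(self)
        data["participants"] = list(self.participants)  # labels only
        return json.dumps(data)


class NumberPool:
    """Hands out proxy numbers and takes them back when a session ends."""

    def __init__(self, numbers: list[str]) -> None:
        self._free = list(numbers)

    def acquire(self) -> str:
        if not self._free:
            raise RuntimeError("number pool exhausted")
        return self._free.pop()

    def release(self, number: str) -> None:
        self._free.append(number)


# Usage: one session groups a text and a call about the same order.
pool = NumberPool(["+15555550100", "+15555550101"])
session = Session("order-42", pool.acquire(),
                  {"shopper": "+15555550123", "customer": "+15555550199"})
session.log("sms", "shopper", "New jacket in stock")
session.log("voice", "customer", "Called to confirm size")
print(session.public_view())
```

Keeping the real numbers in a server-side mapping and exposing only labels is the design point: a replacement shopper can be swapped into the same session without the customer's view changing.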
Proxy uses Twilio Channels, so a session can include chat through the Twilio Chat API across a range of services, like a Facebook wall message. Developers can set the “time to live” on a session through Twilio Chat; a chat session can be closed when an agent has handled a support call so they can close that support ticket rather than having to stay in the channel, but the interaction between a buyer and seller in a marketplace could take days or weeks (and you can set the time-to-live to zero for a multiyear conversation that just keeps going).
Closing a Twilio Chat session can be done automatically, either by checking whether the user is reachable (they’re active on a device or they’ve registered for push notifications) or with timers. If a customer is chatting with an agent on a web site and closes the tab, the agent won’t know they’re gone, but the backend can use a timer to close the chat and clean up the session.
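The time-to-live and timer behavior described above can be sketched as a small decision function a backend might run periodically. This is a hypothetical model, not Twilio Chat code: the field names and the flag that opts a session into reachability checks are assumptions, and only the "zero means never expire" convention comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ChatSession:
    ttl_seconds: float                    # 0 means the session never times out
    last_activity: float                  # time of the last message, in seconds
    user_reachable: bool = True           # active on a device or registered for push
    close_when_unreachable: bool = False  # hypothetical opt-in for support chats


def should_close(session: ChatSession, now: float) -> bool:
    """Decide whether a backend sweep should close this session."""
    if session.close_when_unreachable and not session.user_reachable:
        return True   # e.g. the customer closed the browser tab
    if session.ttl_seconds == 0:
        return False  # open-ended conversation, such as a marketplace thread
    return now - session.last_activity >= session.ttl_seconds
```

A support chat might use a TTL of a few minutes with the reachability check switched on, while a buyer and seller thread would set the TTL to zero and leave it off.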
The Chat API uses pre and post events to handle messages; the synchronous pre-events can trigger notifications or block events before they’re processed — perhaps to block a message with swearing or a credit card or phone number in — and the asynchronous post-events that occur after a message has been processed are useful for logging, or for triggering a chatbot.
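As an illustration of the pre-event and post-event split, here is a minimal Python sketch of a message pipeline. It does not use the Chat API or its webhook format; the function names, the placeholder word list and the regular expressions are assumptions, and the post-event hooks run in-line here for simplicity where a real system would run them asynchronously. The card check uses the standard Luhn checksum.

```python
import re
from typing import Callable

BLOCKED_WORDS = {"darn"}  # placeholder list; supply your own


def luhn_valid(digits: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_card_number(text: str) -> bool:
    for match in re.finditer(r"(?:\d[ -]?){13,19}", text):
        if luhn_valid(re.sub(r"\D", "", match.group())):
            return True
    return False


def contains_phone_number(text: str) -> bool:
    # Deliberately loose pattern: a run of nine or more digits and separators.
    return re.search(r"\+?\d[\d\s().-]{7,}\d", text) is not None


def pre_event(text: str) -> bool:
    """Synchronous check: True lets the message through, False blocks it."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & BLOCKED_WORDS:
        return False
    return not (contains_card_number(text) or contains_phone_number(text))


def process(text: str, delivered: list[str],
            post_hooks: list[Callable[[str], None]]) -> bool:
    if not pre_event(text):
        return False          # blocked before the message is processed
    delivered.append(text)    # the message is processed
    for hook in post_hooks:   # post-events: logging, chatbot triggers
        hook(text)
    return True
```

Because the pre-event check sits in the delivery path, it needs to be fast; slower work such as logging or bot replies belongs in the post-event hooks.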
Twilio’s Chat and Proxy APIs take care of delivery to whatever devices users are active on. To manage and sync state between users and devices, Twilio Sync has an SDK and REST API for Android, iOS and major browsers that let you store, view and update state on devices, 16KB at a time, using token-authenticated WebSockets and bi-directional webhooks to invoke your backend and processing logic.
The 16KB limit doesn’t stop developers sending more data; they can have multiple 16KB Sync objects, such as documents, lists or unordered JSON collections, or they may keep one Sync object that’s a pointer to an S3 bucket. Developers can use Sync to create collaborative or cross-platform apps, with each update to the app getting synchronized back and forth between devices, or for real-time applications like co-browsing, dashboards, route planning, tracking apps and anything else where you need to make sure that no state is lost, because the user can always look back at the state that was stored in the cloud to decide what information needs to be sent to a device (and devices that are offline store state locally and resynchronize once connected). Twilio’s programmable chat is actually built on top of the Sync API, storing the state for all the message and user objects.
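The two workarounds for the per-object size limit, splitting data across several objects or syncing only a pointer, can be sketched in plain Python. This does not call the Sync SDK; the helper names are hypothetical, and treating the limit as 16 KiB of JSON-encoded bytes is an assumption made for the example.

```python
import json

MAX_OBJECT_BYTES = 16 * 1024  # assumed per-object limit, in bytes


def encoded_size(value) -> int:
    """Size of a value once serialized as UTF-8 JSON."""
    return len(json.dumps(value).encode("utf-8"))


def split_into_objects(items: list[dict],
                       limit: int = MAX_OBJECT_BYTES) -> list[list[dict]]:
    """Greedily pack items into groups whose JSON encoding fits within limit."""
    groups: list[list[dict]] = []
    current: list[dict] = []
    for item in items:
        if encoded_size([item]) > limit:
            raise ValueError("single item exceeds the limit; store a pointer instead")
        if current and encoded_size(current + [item]) > limit:
            groups.append(current)  # current group is full, start a new one
            current = []
        current.append(item)
    if current:
        groups.append(current)
    return groups


def pointer_object(url: str) -> dict:
    """Alternative: keep the bulk data elsewhere and sync only a reference."""
    return {"type": "pointer", "url": url}
```

Each group returned by `split_into_objects` would become its own synchronized object, while `pointer_object("s3://example-bucket/state.json")` (a made-up location) shows the shape of the single-pointer approach.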
The Channels model will work well for Twilio’s Speech Recognition and natural language Understand APIs. Speech recognition is initially for voice calls, especially to call centers, letting any developer create speech-driven interactive voice response systems (instead of making users press buttons in the dialer on their phone); it’s built into Twilio’s Gather API, so the options are those that make sense inside a call, and at the moment it only does 60 seconds of voice recognition. You can set timeouts that allow for pauses, to make sure people have finished speaking before recognition starts, but you have to specify which of the 89 languages and variants the spoken words are in. You can give the recognizer hints to boost the speech model, which handles general rather than specialized vocabulary; you’ll want to do that for names and number formats.
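To show how the language, hints and timeout options fit together, here is a short Python sketch that builds a TwiML response by hand with the standard library. The attribute names (input, language, hints, speechTimeout, action) reflect the Gather verb as commonly documented and should be checked against Twilio's current reference; the helper name and the /handle-speech URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def speech_gather_twiml(prompt: str, language: str,
                        hints: list[str], action_url: str) -> str:
    """Build a TwiML document that asks a question and gathers a spoken reply."""
    response = ET.Element("Response")
    gather = ET.SubElement(response, "Gather", {
        "input": "speech",          # recognize speech rather than keypad digits
        "language": language,       # must be set explicitly, e.g. "en-GB"
        "hints": ", ".join(hints),  # names and formats the general model may miss
        "speechTimeout": "auto",    # let the recognizer decide when speech has ended
        "action": action_url,       # hypothetical endpoint that receives the result
    })
    say = ET.SubElement(gather, "Say")
    say.text = prompt
    return ET.tostring(response, encoding="unicode")


print(speech_gather_twiml("What can we help you with?", "en-GB",
                          ["Nordstrom", "order number"], "/handle-speech"))
```

The endpoint named in the action attribute is where the recognized text would arrive, which is the point at which your own code can email a transcript or fill in a form.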
But, Lawson told us, “We’ll have more flexible ways to use it in future. If you build your natural language understanding for IVR and now you want it for text messaging or for an Alexa skill, you can reuse your models. And if Alexa or Apple or Google does something new next week, you can use that with Twilio.” And because these are APIs, you can use the recognized speech to drive other code; emailing a transcript to a customer, passing it into a form or using an intent classifier to pick out verbs, nouns and dates so you can make a booking or process an order.