Dialogflow Voicebot Best Practices: More Than Chatbots

Conversational AI has become immensely popular in recent years among businesses, users, and developers. With so many conversational AI platforms on the market, it can be difficult to tell which one offers the features you need for your business.

One popular platform is Google Dialogflow. It's most commonly used as a tool for building text-based chatbots, but it can do more than that. In fact, Dialogflow works behind the scenes for many of the most popular customer service solutions, including Mosaicx.

So if it's best known for text-based chatbots, why should you trust voice-based conversational AI using Google Dialogflow? Because Dialogflow is also widely used for another type of application: voicebots.

What is a voicebot?

A voicebot is a conversational agent that uses artificial intelligence and natural language understanding (NLU) to interpret the intent and meaning in the speech of its conversational partner. Voicebots use speech recognition and transcription (commonly referred to as speech-to-text or STT) alongside a text-to-speech (TTS) engine to understand human speech and respond using everyday language.
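
As a rough illustration of that pipeline, the Python sketch below chains the three stages for a single caller turn. The functions `transcribe`, `detect_intent`, and `synthesize` are hypothetical stand-ins, not Dialogflow or Mosaicx APIs, and the intent names are invented for the example. A real voicebot would call an STT engine, an NLU service, and a TTS engine at these points.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Turn:
    transcript: str  # what the STT stage heard
    intent: str      # what the NLU stage decided the caller wants
    reply_text: str  # what the bot will say back


def transcribe(audio: bytes) -> str:
    # Hypothetical STT stub: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def detect_intent(transcript: str) -> str:
    # Hypothetical NLU stub: a keyword check stands in for a trained model.
    text = transcript.lower()
    if "balance" in text:
        return "check_balance"
    if "agent" in text or "representative" in text:
        return "talk_to_agent"
    return "fallback"


# Hypothetical canned replies, one per intent.
REPLIES = {
    "check_balance": "Sure, I can help you check your balance.",
    "talk_to_agent": "One moment while I connect you to an agent.",
    "fallback": "Sorry, I didn't catch that. Could you rephrase?",
}


def synthesize(text: str) -> bytes:
    # Hypothetical TTS stub: return the reply as bytes instead of audio.
    return text.encode("utf-8")


def handle_turn(audio: bytes) -> Tuple[Turn, bytes]:
    # STT -> NLU -> reply lookup -> TTS, in that order.
    transcript = transcribe(audio)
    intent = detect_intent(transcript)
    reply = REPLIES[intent]
    return Turn(transcript, intent, reply), synthesize(reply)
```

Calling `handle_turn(b"What is my account balance?")` walks one utterance through all three stages and returns both the structured turn and the bytes that would be played back to the caller.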

Voicebots are similar to chatbots—which interact with users through text-based channels, such as websites and messaging apps—but they are designed to be used over the phone or other voice-enabled devices. This allows users to interact with voicebots in a more natural way, without having to type or use a keyboard. Some examples of voicebots include Siri, Alexa, Google Assistant, Cortana, and Bixby.

Voicebots can be used for a variety of purposes, such as:

  • Customer service: Answer customer questions, provide support, and resolve issues.
  • Sales: Generate leads, qualify prospects, and close deals.
  • Marketing: Promote products and services, collect feedback, and drive traffic to websites.
  • Education: Provide educational content, answer questions, and help students learn.
  • Entertainment: Play games, tell jokes, and provide other forms of entertainment.

What is Dialogflow?

Dialogflow is a powerful and versatile platform provided by Google that is widely used for building chatbots and voicebots. It offers a range of features that make it an excellent choice for developing conversational applications.

The platform is backed by Google's expertise in natural language processing and artificial intelligence. If you've ever talked to Google Assistant on a smartphone or smart speaker, you've experienced some of the best voice-recognition technology on the market. That same technology backs Dialogflow, which ensures that your voicebot can handle complex queries and provide a great user experience.

Other features that make Dialogflow well-suited for voicebots include:

  • Speech-to-text and text-to-speech capabilities
  • Natural language understanding
  • Intent detection
  • Context management
  • Integration with other Google Cloud Platform services

Getting Started with Conversational AI Using Google Dialogflow

Building a voicebot with Dialogflow starts with creating agents and defining intents. Intents are the reasons customers call. These intents should define what your voicebot can do, such as answering questions, providing information, or completing tasks. You will also need to create entities, which are the specific pieces of information, such as dates, places, or product names, that your voicebot extracts from what a caller says.

Once you have created your agents and defined your intents and entities, you must build conversation flows. Conversation flows are the sequences of steps that your voicebot will take to complete a task or answer a question.
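
One simple way to picture a conversation flow is as a small table of steps, each with a prompt, a piece of information to collect, and a pointer to the next step. The sketch below is only an illustration in Python: the step names, prompts, and the `run_flow` helper are hypothetical and are not how Dialogflow stores flows internally.

```python
# Hypothetical flow: each step has a prompt, the slot it fills, and the next step.
FLOW = {
    "ask_airport": {
        "prompt": "Which airport are you flying from?",
        "slot": "airport",
        "next": "ask_date",
    },
    "ask_date": {
        "prompt": "What date are you traveling?",
        "slot": "date",
        "next": "confirm",
    },
    "confirm": {
        "prompt": "Thanks, I have everything I need.",
        "slot": None,
        "next": None,
    },
}


def run_flow(answers):
    """Walk the flow from the first step, consuming one caller answer per question."""
    step = "ask_airport"
    slots = {}
    prompts = []
    remaining = iter(answers)
    while step is not None:
        node = FLOW[step]
        prompts.append(node["prompt"])
        if node["slot"] is not None:
            # Store the caller's answer under the slot this step collects.
            slots[node["slot"]] = next(remaining)
        step = node["next"]
    return prompts, slots
```

For example, `run_flow(["LAX", "tomorrow"])` asks both questions, stores the two answers, and ends on the confirmation prompt. A production flow would also need branches for missing or invalid answers, which this sketch leaves out.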

All of this can be time-consuming and confusing. First, you must discover why people call and be sure not to leave out any important intents. Second, people may phrase each intent 100 different ways. Dialogflow calls these variations training phrases. So if a Dialogflow voicebot responds to 50 intents, engineers may need to supply up to 5,000 training phrases. Entities need the same care. For example, the voicebot must understand that “Los Angeles,” “Los Angeles International,” “LA,” and “LAX” all describe the same airport.
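
The airport example can be sketched as a synonym table that maps every variant to one canonical value. This is a minimal Python illustration, not Dialogflow's implementation; the names `AIRPORT_SYNONYMS` and `resolve_airport` are hypothetical. The last lines simply reproduce the 50 × 100 estimate from the paragraph above.

```python
# Hypothetical entity definition: one canonical value and its synonyms.
AIRPORT_SYNONYMS = {
    "LAX": ["los angeles", "los angeles international", "la", "lax"],
}

# Invert the table once so each lookup is a single dictionary access.
SYNONYM_TO_CANONICAL = {
    synonym: canonical
    for canonical, synonyms in AIRPORT_SYNONYMS.items()
    for synonym in synonyms
}


def resolve_airport(phrase):
    """Return the canonical airport code for a phrase, or None if unknown."""
    return SYNONYM_TO_CANONICAL.get(phrase.strip().lower())


# Back-of-the-envelope workload from the text: 50 intents x 100 phrasings each.
INTENTS = 50
PHRASINGS_PER_INTENT = 100
TOTAL_VARIANTS = INTENTS * PHRASINGS_PER_INTENT  # 5,000
```

With this table, `resolve_airport("Los Angeles International")` and `resolve_airport("LA")` both return `"LAX"`, while an airport that was never listed returns `None`, which is exactly the kind of gap that is easy to miss.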

It's easy to miss intents and entities, but the Mosaicx team helps you navigate these pitfalls. We also continually modify intents and entities over time.

Once the voicebot is ready, you must make it accessible to the public. If your voicebot is great but the underlying infrastructure experiences frequent downtime, you'll create a poor customer experience. Mosaicx provides a robust, scalable infrastructure for some of the largest brands in the United States, allowing voicebots to serve their customers via call, text, and other channels. 

How does Dialogflow compare?

Many small conversational AI platforms have popped up in recent years, but there are still only a couple of true alternatives to Dialogflow:

  • Amazon Lex: Amazon Lex is a natural language processing service that can be used to build conversational interfaces, including chatbots and voicebots. It offers a number of features that are similar to Dialogflow, such as speech-to-text and text-to-speech capabilities, natural language understanding, intent detection, and context management.
  • IBM Watson Assistant: IBM Watson Assistant is another natural language processing service that can be used to build conversational interfaces. It offers a number of features that are similar to Dialogflow and Amazon Lex, as well as some unique features, such as the ability to integrate with IBM's Watson AI services.

Although all three are backed by some of the biggest names in tech, Dialogflow offers a number of advantages over its competitors. Here are a few reasons why we use Dialogflow in Mosaicx products:

  • Powerful natural language processing: Dialogflow uses Google's advanced natural language processing technology to understand user queries accurately and respond in a natural way. This makes it a great choice for voicebots that need to handle complex or open-ended queries.
  • Wide range of integrations: Dialogflow integrates with a wide range of other Google Cloud Platform services, as well as with third-party services. This makes it easy to connect your voicebot to your existing systems and data sources.
  • Global reach: Dialogflow supports a wide range of languages and dialects, making it a great choice for voicebots that need to be used in multiple countries.
  • Scalability: Dialogflow is designed to scale to meet the needs of even the most demanding voicebot projects. You can easily add more agents and users as your voicebot grows.

In addition to these advantages, Dialogflow is also backed by Google's expertise in natural language processing and artificial intelligence. This means that you can be confident that your Dialogflow voicebot will be able to handle even the most complex queries and provide your users with a great experience.

Here is a table that compares Dialogflow to some of its top competitors:

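| Feature                     | Dialogflow | Amazon Lex | IBM Watson Assistant |
|-----------------------------|------------|------------|----------------------|
| Ease of use                 | Easy       | Medium     | Difficult            |
| Natural language processing | Powerful   | Good       | Good                 |
| Integrations                | Wide range | Limited    | Limited              |
| Global reach                | Yes        | Yes        | Yes                  |
| Scalability                 | Yes        | Yes        | Yes                  |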

At Mosaicx, we believe Dialogflow offers the best technology, experience, and value for voicebot projects of all sizes. That's why we incorporated Dialogflow into our own products. There's no need to create an all-new language model. Using Dialogflow allows us to do what we do best: create industry-specific, ready-to-use products and offer ongoing consultation, optimization, and support.

What Language Model Do Dialogflow Voicebots Use?

Speaking of language models, Dialogflow uses the BERT large language model (LLM) for its NLU capabilities. BERT was first introduced in 2018 by Google AI, and it has since become one of the most popular and widely used language models for natural language processing (NLP) tasks.

BERT stands for Bidirectional Encoder Representations from Transformers. "Bidirectional" means that it learns the context of each word in a sentence from the words both before and after it. This makes it more powerful than previous language models, which could only learn the context of words in a sentence from left to right.
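
To make the difference concrete, the short Python sketch below shows which words are visible as context for one word under each approach. It only illustrates the idea of left-to-right versus bidirectional context; it does not reproduce how BERT actually computes its representations, and the function names are hypothetical.

```python
def left_context(tokens, i):
    """Context visible to a left-to-right model when processing tokens[i]."""
    return tokens[:i]


def bidirectional_context(tokens, i):
    """Context visible to a bidirectional model: every word except tokens[i]."""
    return tokens[:i] + tokens[i + 1:]


tokens = "I want to book a flight to LAX".split()
i = tokens.index("book")

# Left-to-right: only "I want to" is available, so "book" is still ambiguous.
left_only = left_context(tokens, i)

# Bidirectional: "a flight to LAX" is also available, which points to a reservation.
both_sides = bidirectional_context(tokens, i)
```

Here the words after "book" are what reveal that the caller wants to make a reservation rather than talk about reading material, which is the kind of cue a left-to-right view misses.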

Dialogflow CX, the newer version of Dialogflow, uses a BERT-based NLU model by default. Dialogflow ES, the older version of Dialogflow, can also use a BERT-based NLU model, but it is not enabled by default. To use a BERT-based NLU model in Dialogflow ES, you need to enable the "Experimental Features" flag in the Dialogflow console.

Here are some of the benefits of using the BERT language model in Dialogflow:

  • Improved intent detection: BERT can better understand the context of words in a sentence, which helps Dialogflow to more accurately detect the intent of user queries.
  • Improved entity extraction: BERT can better identify entities in user queries, such as names, places, and products. This information can be used to provide more relevant and personalized responses to users.
  • Improved question answering: BERT can better understand the meaning of user queries, which helps Dialogflow to provide more accurate and informative answers.

Overall, the use of the BERT language model in Dialogflow makes it a more powerful and versatile NLP platform for building conversational interfaces.

BERT vs. GPT and Other LLMs

BERT was large for its time, although it is small next to today's generative models. The base version of BERT has 110 million parameters, and the larger version has 340 million parameters. This makes it more computationally expensive to train than smaller language models, but it also allows it to learn more complex relationships between words.

BERT has been shown to be effective at a variety of NLP tasks, including:

  • Text classification
  • Question answering
  • Natural language inference
  • Named entity recognition
  • Sentiment analysis

The best large language model is a matter of opinion, but some of the most popular and well-regarded models include:

  • GPT-4 is a large language model developed by OpenAI. It is one of the most powerful language models available, and it has been shown to be effective at a variety of tasks, including text generation, translation, and question answering.
  • Turing NLG is a large language model developed by Microsoft. It is designed to generate human-quality text, and it has been used to create a variety of products, including chatbots and virtual assistants.
  • PaLM 2 is a large language model developed by Google AI. It is one of the most recent large language models, and it is already setting new benchmarks in a variety of tasks. It is the LLM that powers Google's Bard.
  • LaMDA is another large language model developed by Google AI. Unlike most other models, it was trained on dialogue, allowing it to pick up on nuances that other models may miss.
  • Llama 2 is a recent large language model developed by Meta and released in partnership with Microsoft. Its weights are openly available under Meta's community license, allowing the tech community to use and modify it within the terms of that license.

Ultimately, the best large language model for you will depend on your specific needs and requirements. If you need a model that can perform a variety of tasks, GPT-4 or Turing NLG may be a good choice. If you need a model that can generate human-quality text, PaLM 2 or LaMDA may be a better option. It is important to note that large language models are still under development, and they are constantly being improved.

Although GPT-4 has gotten a lot of attention, it's not always the best LLM. BERT excels at tasks that require understanding the context of words in a sentence, such as question answering and natural language inference. This is because BERT is a bidirectional model, which means that it can learn the context of words in a sentence from both before and after the word.

GPT-4 is better at tasks that require generating text, such as summarization and translation. This is because GPT is an autoregressive model, which means that it produces text one word at a time, predicting each next word in a sequence from the words that came before it.
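
The autoregressive idea can be shown with a toy generation loop in Python. Everything here is a simplifying assumption: the `NEXT_WORD` table is hypothetical and looks only at the single previous word, whereas a model like GPT learns probabilities from data and conditions on all the previous words.

```python
# Hypothetical next-word table; a real model learns probabilities from data.
NEXT_WORD = {
    "<start>": "your",
    "your": "flight",
    "flight": "is",
    "is": "on",
    "on": "time",
    "time": "<end>",
}


def generate(max_words=10):
    """Greedy autoregressive loop: each word is chosen from the one before it."""
    words = []
    current = "<start>"
    for _ in range(max_words):
        # Predict the next word from what has been generated so far.
        current = NEXT_WORD.get(current, "<end>")
        if current == "<end>":
            break
        words.append(current)
    return " ".join(words)
```

Calling `generate()` builds the sentence "your flight is on time" one word at a time, and `generate(3)` stops early after "your flight is". The loop never looks ahead, which is the key contrast with a bidirectional model.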

Here is a table that summarizes the key differences between BERT and GPT-4:

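| Feature    | BERT                                                                                | GPT-4                                                                               |
|------------|-------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|
| Model type | Bidirectional encoder                                                               | Autoregressive                                                                      |
| Strengths  | Understanding the context of words, question answering, natural language inference | Generating text, summarization, translation                                         |
| Weaknesses | Generating text, summarization, translation                                         | Understanding the context of words, question answering, natural language inference |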

In general, BERT is better for tasks that require understanding the meaning of text, while GPT is better for tasks that require generating text. These differences make BERT (and Dialogflow) a better option for chatbots and voicebots.

For all these reasons, we trust Dialogflow when building Mosaicx voicebots. But choosing a conversational AI platform is just step one. Next come the four steps to build, train, test, and deploy a conversational AI solution.