VoIP for Gaming 101—Part 1

It's time to add voice chat to your game, but figuring out how to introduce it in a way that is simple, cost-effective, and safe can be complicated. At Modulate, we’ve worked with developers across a range of platforms, genres, and scales to augment their voice chat, so we’ve become quite familiar with the ins and outs of the different voice chat systems. Along the way, we’ve repeatedly heard the same concerns: that VoIP is a huge engineering investment, that the concepts are complex and unfamiliar, and that the costs are unmanageable in the long run. So we wanted to do our part by putting together everything that we’ve learned about how to implement VoIP reliably, cost-effectively, and quickly.

We’ll break this guide down into two posts where we’ll share our guidance and opinions on:

Part 1: 

What is a VoIP? First we’ll cover the basics and common terminology you’ll likely come across as you research voice chat technologies.  

How to choose the best VoIP service for gaming: Next we’ll introduce you to the most common VoIP providers. 

Comparison of gaming VoIP services: We’ll then present a quick feature comparison between those offerings for your reference.

Part 2: 

How to set up a VoIP server for gaming and how to optimize your VoIP settings: To wrap up this series, we’ll go a bit deeper into how you actually plug these VoIP solutions into your game and walk through the most fundamental concepts in more detail.

Without further ado -- let's get into it!

What is a VoIP?

Voice over Internet Protocol (VoIP) is technology that allows voice communication over the internet, rather than through traditional phone lines. VoIP converts your voice into digital signals and transmits them over data networks, enabling phone calls and video calls through internet-enabled devices. If you’ve used Zoom to meet with coworkers, Apple’s FaceTime to chat with family, or called a friend using Discord’s voice chat, you’ve used a VoIP!  
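
To make "converts your voice into digital signals" a bit more concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration: the 16 kHz sample rate, the 20 ms frame size, the sine tone standing in for a microphone, and the made-up 4-byte sequence header. Real VoIP stacks typically compress audio with a codec such as Opus and use a transport protocol such as RTP, and none of the providers discussed here expose this exact interface.

```python
import math
import struct

SAMPLE_RATE = 16_000   # samples per second (assumed for this sketch)
FRAME_MS = 20          # milliseconds of audio per packet (assumed)
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000  # 320 samples


def synth_voice(duration_ms: int, freq_hz: float = 220.0) -> list[float]:
    """Stand-in for a microphone: a sine tone with samples in [-1.0, 1.0]."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def to_pcm16(samples: list[float]) -> bytes:
    """Quantize float samples to signed 16-bit little-endian PCM."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def packetize(pcm: bytes) -> list[bytes]:
    """Split PCM into fixed-size frames, each prefixed with a hypothetical
    4-byte sequence number so a receiver could reorder or detect loss."""
    frame_bytes = SAMPLES_PER_FRAME * 2  # 2 bytes per 16-bit sample
    packets = []
    for seq, start in enumerate(range(0, len(pcm), frame_bytes)):
        payload = pcm[start:start + frame_bytes]
        packets.append(struct.pack("<I", seq) + payload)
    return packets


# 100 ms of "voice" becomes five 20 ms packets, ready to send over a network.
packets = packetize(to_pcm16(synth_voice(100)))
```

A VoIP library handles all of this (plus compression, jitter buffering, and echo cancellation) for you, which is a big part of why most games use one rather than building their own.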

A Few Top VoIPs

There’s a huge number of voice chat providers out there, but in this blog post, we’ll be focusing on the four most common low-level libraries: Photon Voice, Epic Online Services Voice Chat, Vivox, and Agora Voice. These four offer good coverage for most online games, and even more specialized libraries (like High Fidelity for high-def spatial audio) will make use of many similar concepts.

Photon Voice

While primarily known for their multiplayer networking ecosystem, Photon also provides a capable voice chat framework. For games already using Photon networking, Photon Voice will look very familiar. Photon also provides convenient Unity plugins to simplify the integration process for games built in Unity. Photon Voice works across desktop, mobile, and console platforms, making it a good choice for cross-platform games. Modulate's partnership with Photon also allows ToxMod to be quickly and easily integrated into any game already using Photon Voice -- VoIP and voice chat moderation services all in one? That's value!

Drawbacks to Photon Voice

While the Photon Voice SDK can be integrated into Unreal Engine, drag-and-drop plugins aren’t available the way they are for Unity. Proximity chat in Photon Voice is implemented by using Photon’s networking ecosystem to keep track of audio source locations. This offers a lot of advanced possibilities, but works best for games already using Photon for networking: games that want to use proximity chat will first have to integrate Fusion, Photon’s networking product.
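
If you're wondering why proximity chat needs to know where audio sources are, the core idea can be sketched in a few lines of Python. This is not Photon's API or algorithm. The function name, the linear falloff, and the two radius parameters are all hypothetical, chosen only to show how a speaker's volume could be derived from tracked positions.

```python
import math


def proximity_gain(listener, speaker, full_volume_radius=2.0, max_radius=20.0):
    """Return a volume multiplier in [0.0, 1.0] for a speaker heard by a listener.

    Positions are (x, y, z) tuples in game units. Both radii are hypothetical
    tuning values: inside full_volume_radius the speaker is at full volume,
    beyond max_radius they are silent, and in between volume falls off linearly.
    """
    distance = math.dist(listener, speaker)
    if distance <= full_volume_radius:
        return 1.0
    if distance >= max_radius:
        return 0.0
    return (max_radius - distance) / (max_radius - full_volume_radius)


# A speaker 11 units away lands halfway through the falloff range.
gain = proximity_gain((0, 0, 0), (11, 0, 0))
```

Since every client needs up-to-date positions for every nearby speaker, it makes sense that this feature leans on the networking layer that is already synchronizing player state.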

Epic Online Services Voice

Epic is the most recent provider to start offering a VoIP solution, though the infrastructure was thoroughly battle-tested within Fortnite itself before Epic opened it up for use by other developers in July of 2021. It has also rapidly gained popularity among indie developers, in particular because Epic Online Services (EOS) Voice is completely free for any number of players.

Since EOS Voice was originally used internally within Fortnite, its core functionality is reliable and tested, but the supporting game engine plugins are new and still in active development. Most notably, the Unity EOS plugin does not yet offer full Voice support or complete platform support, although progress on both fronts is rapid and both can be expected to be completed very soon.

Modulate has also built our own game engine plugins, for both Unreal and Unity, which showcase the use of EOS Voice together with Modulate’s platform for our partners. 

On the flip side, integrating with EOS can also open the door to many other Epic services such as multiplayer, achievements, matchmaking, and anti-cheat. The interoperability between their Lobbies service and their Voice service can be especially appealing since it can simplify the integration of Voice into some titles.

Drawbacks to EOS Voice

It's important to note that EOS Voice does come with some service limitations, most notably that voice chat rooms can be no larger than 16 people. It's also worth noting that EOS Voice does not have certain niche features offered by longer-standing competitors like Vivox, such as positional audio, though we expect Epic to continue extending the EOS Voice solution moving forward. Finally, because it's a new product, EOS Voice does not yet support older game engine versions. If you're looking to integrate EOS Voice into an existing game, make sure you're using Unreal 4.27+, Unreal 5.2+ or Unity 2020.1.x!
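
In practice, the 16-person room cap means a larger lobby has to be split across several voice rooms. Here is a minimal Python sketch of one way to do that. The helper name and the simple "fill rooms in order" strategy are our own assumptions for illustration, not part of the EOS SDK. A real game would more likely group players by team, squad, or location.

```python
MAX_ROOM_SIZE = 16  # the EOS Voice room limit mentioned above


def assign_voice_rooms(player_ids, max_room_size=MAX_ROOM_SIZE):
    """Split a list of player IDs into consecutive groups that each fit
    inside a single voice room (hypothetical helper, not an EOS call)."""
    return [
        player_ids[i:i + max_room_size]
        for i in range(0, len(player_ids), max_room_size)
    ]


# A 40-player lobby needs three rooms: two full rooms and one with 8 players.
rooms = assign_voice_rooms([f"player-{n}" for n in range(40)])
```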

Vivox

Vivox (which was purchased by Unity in 2019) offers a traditional, standard approach for voice chat. It has been used in large games such as League of Legends and Valorant, and includes quite a few features. Vivox provides game engine plugins for both Unity and Unreal, as well as example projects that show how to use the plugins for a single use case. As with EOS Voice, Modulate has also built our own game engine plugins which demonstrate the use of Vivox on both platforms in combination with our own tools.

Vivox does charge for the use of their VoIP systems for any games with more than 5,000 peak concurrent users – so the choice between Vivox and EOS Voice often depends heavily on whether Vivox’s additional features provide enough value to offset that introduction of costs.

Vivox’s additional features include support for text chat, text-to-speech (TTS), and speech-to-text (STT) in addition to voice chat. This diversity of features can be quite powerful, but does come with its tradeoffs.

Drawbacks to Vivox

Some developers we’ve worked with have expressed confusion when navigating Vivox’s different features or searching for appropriate documentation - especially if they are looking to implement a relatively straightforward voice chat ecosystem that bypasses some of the more advanced options.

Agora

While EOS Voice and Vivox are multi-platform VoIP tools that started in the PC space, Agora is a mobile-native VoIP solution. Agora isn’t specifically targeted at the gaming space, but is used in a variety of social mobile apps like Clubhouse as well as by some game developers like Super Evil Megacorp in their mobile MOBA Vainglory.

Like Vivox, Agora charges customers on a pay-per-minute structure for usage that exceeds a free tier (in Agora’s case, this free tier is 10,000 minutes per month, a slightly different approach than Vivox’s peak concurrent users count). As mentioned above, Agora is also optimized for mobile, using a peer-to-peer architecture to scale efficiently.
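
To get a feel for how the two free tiers differ, here is a small Python sketch of the arithmetic. The thresholds come from the figures above. The function names are hypothetical, and we deliberately leave out any per-minute rate since pricing changes over time, so check each provider's current pricing before budgeting.

```python
AGORA_FREE_MINUTES_PER_MONTH = 10_000  # free tier quoted in this post
VIVOX_FREE_PEAK_CONCURRENT = 5_000     # free tier quoted in this post


def agora_billable_minutes(minutes_used: int) -> int:
    """Minutes in a month that fall outside the free tier (never negative)."""
    return max(0, minutes_used - AGORA_FREE_MINUTES_PER_MONTH)


def exceeds_vivox_free_tier(peak_concurrent_users: int) -> bool:
    """True when a game's peak concurrency is above the free threshold."""
    return peak_concurrent_users > VIVOX_FREE_PEAK_CONCURRENT


# Hypothetical game: 500 players who each spend 30 minutes a month in voice.
minutes_used = 500 * 30                          # 15,000 minutes
billable = agora_billable_minutes(minutes_used)  # 5,000 minutes over the tier
paying_for_vivox = exceeds_vivox_free_tier(500)  # False: well under 5,000 PCU
```

The same small game ends up past one free tier but comfortably inside the other, which is why it's worth estimating both total minutes and peak concurrency before choosing.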

In terms of features, Agora offers a strong suite of tools including high-fidelity sound, spatial audio, and noise cancellation. Additionally, Agora supports video streaming (for a higher price). For most games this isn't a need, but if you want video calls between players, this feature could separate Agora from the rest of your choices.

Drawbacks to Agora

Agora’s game engine support is a bit sparser due to their broader focus – they provide a Unity plugin and a raw C++ SDK, but do not currently offer Unreal Engine support. 

Comparing gaming VoIP services

Each VoIP provider offers a few unique features, but there are also a few standard components that typical game developers are looking for. Positional audio (or what we call Proximity Chat), for instance, helps immerse players more deeply in the game environment. Voice quality and noise handling matter just as much, since players are notoriously wary of poor audio or noisy channels interfering with their ability to chat.

The chart below compares these most standard features in a simple form to help you identify the best provider for you. Note that the best way to confirm specific features and availability is to reach out to these providers directly!

What's Next?

Now that we've compared some of the more popular and well-known VoIP services for gaming, how do you integrate one into your game? Stay tuned for part 2 of this blog post series where we'll share our tips for authenticating, testing, and implementing safeguards.