Setting up a VoIP server for gaming and optimizing VoIP settings
Last month, we shared the first part of this guide to VoIP for gaming, where we gave an overview of what VoIP is and shared some insights into common VoIP services for games.
Let’s say you’ve picked your VoIP provider. Now you’ll need to actually integrate voice chat into your game. This process can be intimidating, especially given the new or inconsistent vocabulary used by VoIP providers, but it can be done in a matter of days or even hours once you are comfortable with the basic components of each system. Below, we walk through the basic steps that you’ll need to follow for any provider, and attempt to clarify the different terminology and framings that each provider uses for these steps.
Authenticating With Your Provider
When initializing VoIP within your game, you’ll need to authenticate with the VoIP provider before you can begin transmitting audio over their network. This authentication is typically done on a per-player basis, so it is best handled in your game client whenever a player chooses to participate in voice.
Each provider has a slightly different authentication methodology, but in practice they all look fairly similar - you’re given an API key of some kind, you send that up to the provider's servers along with player information, and you get back a per-user token if everything worked.
For Vivox, this is pretty much exactly the process - you’ll pass their server a key from your game and receive back a unique token for each authenticating player. Epic Voice provides two different authentication methods. The first is to use the EOS Lobbies SDK to manage rooms. In this case the Lobbies SDK manages per-member tokens which Voice can use directly, effectively bypassing the need for extra authentication - but of course, this depends on you already using the Lobbies SDK. The second is more similar to Vivox: instead of relying on the Lobbies SDK, you run a dedicated trusted server which requests per-user tokens from the EOS backend and distributes them to players. This second approach can be a bit more complex, so it is the one we demonstrate in the Modulate demo projects. Finally, Agora also uses tokens as the preferred authentication method, and provides example code in a number of languages for running your own token-issuing server.
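To make the token flow concrete, here is a minimal sketch of what a trusted server might do when it issues and checks per-player tokens. This is not any provider’s actual token format: the secret, the payload fields, and the `issue_token` and `verify_token` names are all assumptions made for illustration. In practice you would call your provider’s own SDK or token library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret held only by your trusted server.
# A real provider supplies its own key and defines its own token format.
API_SECRET = b"example-secret-not-a-real-key"


def issue_token(player_id: str, group_id: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Return a signed, expiring token for one player in one voice group."""
    issued_at = time.time() if now is None else now
    payload = {"player": player_id, "group": group_id,
               "exp": issued_at + ttl_seconds}
    # URL-safe base64 never contains ".", so "." can separate body and signature.
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(API_SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and unexpired, else None."""
    current = time.time() if now is None else now
    body, _, signature = token.rpartition(".")
    expected = hmac.new(API_SECRET, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < current:
        return None
    return payload
```

The key design point, whichever provider you use, is that the secret stays on the server and the game client only ever sees a short-lived token scoped to a single player.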
Who’s Speaking?
Every voice chat has some notion of having separate groups of people talking amongst themselves. Before integrating, make sure you’ve thought about who these groups are in your game. Depending on the specifics of your title, these may be individual teams working together; everyone in a given match; anyone within a certain server or spatial in-game area; or just whichever friends decided they wanted to chat.
Each provider uses different terminology for these groups - for Vivox, they are Channels, while Epic Voice calls them Rooms and Agora uses the term ‘Streams.’ But the interface for each is pretty consistent. You’ll define a unique ID for each group of people, and then have “join” and “leave” events for each player when they enter or exit the group chat. Once a player has joined the chat group, you’ll usually just execute a simple call from your provider’s API to begin uploading audio from the player’s microphone, which your provider will distribute to everyone else in the same group.
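The join/leave model is easier to reason about with a small sketch. The `VoiceGroups` class below is hypothetical and only tracks membership locally; with a real provider, the equivalent join and leave calls go through their SDK and the routing happens on their servers.

```python
from collections import defaultdict
from typing import Dict, Set


class VoiceGroups:
    """Tracks which players are in which voice group (channel, room, and so on)."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)

    def join(self, group_id: str, player_id: str) -> None:
        """Handle a "join" event: add the player to the group."""
        self._members[group_id].add(player_id)

    def leave(self, group_id: str, player_id: str) -> None:
        """Handle a "leave" event; drop the group once it is empty."""
        members = self._members.get(group_id)
        if members is None:
            return
        members.discard(player_id)
        if not members:
            del self._members[group_id]

    def listeners_for(self, group_id: str, speaker_id: str) -> Set[str]:
        """Return everyone who should hear the speaker: the rest of their group."""
        members = self._members.get(group_id, set())
        if speaker_id not in members:
            return set()  # Speakers outside the group are heard by no one.
        return members - {speaker_id}
```

For example, after `join("team-red", "alice")` and `join("team-red", "bob")`, audio from alice would be routed only to bob, regardless of who is in other groups.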
Implement Safeguards
Voice chat is powerful, but can also be a medium for abuse. While it’s clear that the benefits of voice chat outweigh the costs, there are also a number of simple steps you can take to empower your players and reduce the risk of misbehavior causing harm to your community.
First off, all three major VoIP providers support the ability to not only mute a player within a whole group, but to specifically mute them for any individual member of the group. Make sure you expose UI elements for your players to mute individuals who are trolling, harassing, or otherwise harming others within the group. Some games tie this to additional reporting functionality, which can be a powerful step to better identify who the most aggressive bad actors are on your platform, but make sure that you’re not adding extra hoops for your players to jump through in order to protect themselves.
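As a rough illustration of per-listener muting, the sketch below keeps a mute list for each listener and filters the recipients of a speaker’s audio accordingly. The `MuteList` class and `filter_listeners` helper are hypothetical names; each provider exposes its own mute calls, and this only shows the bookkeeping your UI would drive.

```python
from typing import Dict, Iterable, List, Set


class MuteList:
    """Remembers which speakers each listener has chosen to mute."""

    def __init__(self) -> None:
        self._muted_by: Dict[str, Set[str]] = {}

    def mute(self, listener_id: str, speaker_id: str) -> None:
        self._muted_by.setdefault(listener_id, set()).add(speaker_id)

    def unmute(self, listener_id: str, speaker_id: str) -> None:
        self._muted_by.get(listener_id, set()).discard(speaker_id)

    def should_hear(self, listener_id: str, speaker_id: str) -> bool:
        """True unless this particular listener has muted this speaker."""
        return speaker_id not in self._muted_by.get(listener_id, set())


def filter_listeners(mutes: MuteList, speaker_id: str,
                     candidates: Iterable[str]) -> List[str]:
    """Return the candidates who have not muted the speaker, in sorted order."""
    return sorted(c for c in candidates if mutes.should_hear(c, speaker_id))
```

Note that a mute here affects only the listener who set it: everyone else in the group continues to hear the speaker until they mute them too.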
Extending that logic, while muting is great for putting control back in the hands of your players, you also don’t want to make it their job to clean up your community. Some bad behavior (like child grooming or radicalization) involves victims who don’t realize they are in trouble until it’s too late. And even if your players are fully aware that an aggressor deserves to be punished, they may not have the energy, confidence, or motivation to take action against them - and may instead simply stop playing your game. So make sure you consider proactive steps that you can take to prevent misbehavior in your community. This is Modulate’s specialty, and we’ve worked with a variety of game studios to implement multiple tools to ensure voice chat stays safe and immersive for your whole community. Our ToxMod voice moderation solution identifies misbehavior – as defined by your game’s code of conduct – and flags it to your moderators even in the absence of additional player reports.
Test and Deploy
Once the pipes are all in place, be sure to take some time to test voice chat features. It’s well worth the time to test on each hardware platform, as audio parameters can vary (different microphones, sample rates, buffer sizes, etc.) in ways that are hard or impossible to catch from documentation or code alone. Make sure to verify basic functionality (audio is coming through, with no stuttering or glitching) and that the experience is pleasant (speakers are all similar in volume, audio UI is consistent). Finally, just as you’d buckle your seatbelt before you begin driving, make sure to test that your safeguards are working as intended before you go live. If you’re working with Modulate, we’ll collaborate with you to test our integrations - in ToxMod’s case, we’ll ensure ToxMod catches some mock toxic audio you say in a test chat; while for VoiceWear, we’ll want to validate both the quality of the converted voice and that everyone else in the group hears the skinned audio and not the original.
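One of these checks, whether speakers are similar in volume, can be partly automated. The sketch below is one possible approach, not a provider feature: it assumes you can capture each speaker’s audio as floating-point samples in the range -1.0 to 1.0, and the function names and the 6 dB tolerance are illustrative choices.

```python
import math
from statistics import median
from typing import Dict, List, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """Return the RMS level in dB relative to full scale (1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # Pure silence has no finite level.
    return 10 * math.log10(mean_square)


def volume_outliers(levels_db: Dict[str, float],
                    tolerance_db: float = 6.0) -> List[str]:
    """Return speakers whose level differs from the group median by more
    than tolerance_db, in sorted order."""
    if not levels_db:
        return []
    typical = median(levels_db.values())
    return sorted(player for player, level in levels_db.items()
                  if abs(level - typical) > tolerance_db)
```

Running this over short test recordings from each platform will not replace listening tests, but it can flag a microphone that is far quieter or louder than the rest before players notice.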
Unlocking the Potential of Your VoIP
Integrating voice chat can seem daunting at first, given the wide variety of providers each using different terminology and offering a diverse set of features. But the good news is that, as you’ve hopefully seen, VoIP integration in most games can be boiled down to a few simple steps which developers can complete in days or hours, not weeks or months. We hope this guide has been a useful tool for you to identify your ideal provider and get a sense of what that integration will actually require - and encourage you to reach out if you’re interested in learning more or testing out Modulate’s ToxMod technology.