Updated on: February 1, 2024
In a recent SafetyDetectives interview, Mike Pappas, co-founder of Modulate, shares the company’s journey from the halls of MIT to pioneering AI voice moderation in online gaming. The interview delves into Modulate’s primary product, ToxMod, addressing safety concerns in the evolving landscape of online gaming. Pappas discusses key indicators for monitoring player interactions, the delicate balance between proactive moderation and player privacy, and the future of player-driven moderation tools. His insights provide a compelling look into Modulate’s mission and its pivotal role in creating a safer and more inclusive online gaming community.
Can you talk about your journey and what caused you to start up Modulate?
I met my eventual co-founder, Carter, as an undergraduate at MIT. I was walking down the halls of MIT and saw someone working on an interesting-looking physics problem. I mustered all of my social courage, and I stood awkwardly behind him for 10 minutes until I found his missing minus sign. We quickly bonded with each other and found that we really enjoy working with each other. We had a good way of talking through technical problems and breaking them apart.
By sophomore or junior year, we said we wanted to go into business together. Our goal was to do something that matters together; however, we didn’t really have any idea what that would be at the time. We had a bunch of grand ambitions and sort of philosophical ideas. It wasn’t until we kind of left school and went out to industry that we started to come across technology and ideas that really motivated us in terms of what we could actually do to make an impact.
The specific idea for Modulate originally came from Carter’s research looking into neural style transfer. It was originally something like uploading a photo of yourself and then removing the glasses from the photo using AI. Carter was wondering what would happen if you applied that technology to audio – could you use that to change characteristics of someone’s voice? Which is where Modulate got started, and that’s where the name Modulate comes from.
We went and chatted with a lot of industries about how they would use voice changing. We were looking for a use case that really spoke to us, and the one that really resonated was in gaming. We kept hearing of games turning into social platforms and the need for people to be able to feel safe participating socially. The platforms were saying, “Maybe you could let us have kids or women or whoever disguise their voice so they don’t get harassed.”
We looked at that and said, “Okay. That problem statement sounds like something we could be really passionate about. We don’t love that solution to it, which is making the victim go and hide. So, let’s learn more about other ways of solving the problem.” We dug deeper into voice moderation, and ultimately we were able to repurpose a lot of the technology for that area.
Can you tell me a little bit about Modulate and what you do?
We’re a 40-person company based in Boston, Massachusetts. Our primary product is something called ToxMod, which is an AI voice moderation solution. ToxMod works in three high-level stages:
Triage lets us know which conversations are worth taking a closer look at for various reasons, including privacy and costs. We don’t want to be processing everything that everyone is ever saying on the platform. Therefore, we have these very light-touch triage models; you can think of it like watching out of the corner of your eye. It sends a warning, “Hey, something about this conversation looks worrying.” Maybe it’s a young kid that just got into a conversation with a much older person they’ve never spoken with before, or it detects a bunch of emotions that are clearly running high, even if we don’t know exactly what’s being said.
Those prompts trigger us to say, “Let’s take a closer look and do a more detailed analysis, to see if there is anything harmful or not.” If we’ve determined that there is something harmful, then we send it back as a report to the studio so they can decide how to act on it.
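The three stages described above (triage, closer analysis, report to the studio) can be pictured with a minimal sketch. This is an illustration only, not Modulate’s implementation: every name, field, threshold, and the keyword check standing in for "detailed analysis" are hypothetical assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# All fields, thresholds, and terms below are hypothetical illustrations,
# not ToxMod's actual signals or API.
@dataclass
class Conversation:
    conversation_id: str
    youngest_speaker_age: int
    oldest_speaker_age: int
    first_contact: bool        # speakers have never talked before
    emotion_intensity: float   # 0.0 (calm) to 1.0 (very heated)
    transcript: str

@dataclass
class Report:
    conversation_id: str
    reason: str

CHILD_AGE_LIMIT = 13       # assumed cutoff for "young kid"
AGE_GAP_YEARS = 10         # assumed "much older" gap
EMOTION_THRESHOLD = 0.8    # assumed "running high" level
FLAGGED_TERMS = {"example_threat", "example_slur"}  # placeholder terms

def triage(conv: Conversation) -> bool:
    """Stage 1: cheap signals only; does not look at what was said."""
    age_gap = conv.oldest_speaker_age - conv.youngest_speaker_age
    risky_first_contact = (
        conv.first_contact
        and conv.youngest_speaker_age < CHILD_AGE_LIMIT
        and age_gap >= AGE_GAP_YEARS
    )
    heated = conv.emotion_intensity >= EMOTION_THRESHOLD
    return risky_first_contact or heated

def analyze(conv: Conversation) -> Optional[str]:
    """Stage 2: closer look, run only on triaged conversations.
    A keyword match stands in for a real analysis model."""
    words = set(conv.transcript.lower().split())
    hits = sorted(words & FLAGGED_TERMS)
    return f"flagged terms: {', '.join(hits)}" if hits else None

def moderate(conv: Conversation) -> Optional[Report]:
    """Stage 3: return a report for the studio only if harm was found."""
    if not triage(conv):
        return None
    reason = analyze(conv)
    if reason is None:
        return None
    return Report(conv.conversation_id, reason)
```

The point of the layering in this sketch is that the more expensive and more intrusive step, reading the transcript, only runs for conversations that the light-touch stage has already flagged, and the final decision on how to act is left to the studio.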
How do you see the landscape of player behavior evolving in online gaming, and what challenges does this pose for Trust & Safety teams?
I think we’re in the middle of a really big transition. If you go back 5 or 10 years ago, games were curated experiences. You came in to play a very specific thing or do a specific thing. So, games thinking about codes of conduct could afford to be fairly narrow. They could say something like, “If you’re playing this game, it’s because you’re here to compete. Let’s just not talk about this other stuff.” For example, if you’re only an esports platform, you don’t really need a code of conduct about how to have detailed nuanced discussions about complex political topics. You can just say that this is not the place for that type of conversation.
But over the last few years, there’s been this transition to games as social platforms. People aren’t just there to fulfill a purpose or play a game; they’re there because it’s where to hang out with the community. It’s a place to spend time, foster relationships, and enrich relationships with each other.
As games shift to being these social platforms, more akin in certain ways to X/Twitter or Facebook, suddenly there is a need to allow people to have a much wider range of conversations on those platforms.
I think that’s the transition that gaming’s been in the midst of: needing to go from what is “good sportsmanship” as a trust and safety philosophy, to solving a much broader problem about how do you bring lots of people together from different corners of the world, different perspectives, and have them all coexist.
What are some key indicators or red flags that trust and safety teams should be vigilant about when monitoring player interactions?
There are so many different answers to that, depending on what they’re trying to look for, right? For example, if they’re looking for the safety of kids, because it’s a child-focused platform, then you’re going to need to think about who is reaching out to these kids to start conversations. Are there third parties in the conversations, are people trading gifts in the game, and all this kind of stuff? Because when you’re thinking about protecting kids, a lot of the greatest dangers come from people that the kids are deceived into trusting.
In games that have a more mature audience, they need to be aware of the fact that kids are going to get a hold of their parents’ systems. There’s going to be kids on the platform no matter what, but they can tune their focus a little bit and focus on the kinds of conflicts that you tend to see in more mature audiences. That’s going to be things like hate speech and that kind of riling up towards violent extremism and those kinds of behaviors.
The first piece of advice, I know it’s not quite a key indicator, but the first piece of advice I give every studio is – figure out what kind of a venue you are. If you’re the town library versus a playground versus a sports bar, then you need to look for different things. You need to decide what kind of space you are, and then we can talk about what to find for you. However, many platforms haven’t thought about it that way and don’t yet have a clear answer in their head about what kind of space they want to offer in the first place.
How do you strike a balance between proactive moderation and respecting players’ autonomy and privacy within the game?
It’s a huge part of the design, as I mentioned with our triage system earlier. The analogy I use is if you take your kid to the playground, you don’t spend the whole time looming behind them, watching and listening to every single thing they do. At the same time, it’s not okay for parents to just abandon their kids at the playground and go home to respect their privacy.
The compromise that we find in the physical world is the parent hangs out with the other parents over at the sidewalk, watching out of the corner of their eye. You won’t know every single thing your child is doing, but you’ll be in a position to see that there’s a suspicious adult walking up to them, or there’s a bunch of other kids pointing and laughing, and use that as a prompt to get closer.
Our philosophy is to take inspiration from the physical world, where we’ve had to find that balance between privacy and safety. We had to figure out how far we’re willing to go, and ask whether we can replicate in the virtual world the same sort of trade-off that makes sense in the physical world, balancing these two really important concerns.
In your view, what’s the future of player-driven moderation tools and how can they be improved to better serve the community?
I’m going to start with “how can we improve” first, because for players to take more autonomy and engage more deeply in the moderation conversation, they need to be able to talk to platforms about what kind of content they want or expect and what they view as toxic. That’s not solely on players, but there’s complex vocabulary here. When we talk about hate or harassment, what exactly does that mean?
The first thing that platforms need to do to equip their players to engage more meaningfully is to be transparent in the code of conduct and in their moderation policies in general. They need to lay out for their players what is and isn’t allowed, in language that their players can understand and engage with.