So, voice is coming to Second Life. But is that a good thing or a bad thing?
For me, perhaps, it’ll just be “business as usual.” Danielle and I are routinely linked via Skype these days while in-world, and even sometimes while one of us is not in-world (such as last night, when I was busy with a large DVD-burning project for most of the evening). We find it highly convenient, especially on cooperative building projects, where speaking is simply faster than typing. (“OK, you can go align the wall on that side.” “This one over here?” “Yes.”) It also gives us a “back channel” of communication even when we’re not in the same place in-world, which can be useful in our various managerial, DJing, and other activities. Sometimes, we use Skype’s conferencing feature to add similarly-equipped friends to our conversation. This is often useful and occasionally hilarious.
The modes of communication to be supported by SL’s voice chat implementation will include similar capabilities (2-way private communications and multi-way conferencing), but will also include an in-world mode where avatars on “voice-enabled land” (whatever that means) will hear each other with spatially-correct audio (e.g., avatars on your left will sound like they’re on your left, and distant avatars will sound more faint than ones close by). The actual voice processing will occur on servers other than the sim servers, so it shouldn’t contribute to server-side lag; as for client-side lag, it should be no harder on a client system than running Skype alongside SL, and likely less so.
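For the curious, here is roughly what “spatially-correct” means in practice. This is only a minimal sketch under my own assumptions (a flat 2-D world, constant-power stereo panning, and simple inverse-distance falloff); I have no idea what the actual SL voice implementation does internally, and the function name `spatial_gains` and its parameters are invented purely for illustration.

```python
import math

def spatial_gains(listener_pos, listener_heading, source_pos, ref_distance=1.0):
    """Return (left_gain, right_gain) for one voice as heard by a listener.

    Hypothetical helper, not any real SL API. Positions are (x, y) tuples;
    listener_heading is the direction the listener faces, in radians,
    measured counterclockwise from the +x axis.
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)

    # Inverse-distance falloff, clamped so very close voices are not amplified.
    attenuation = ref_distance / max(distance, ref_distance)

    # Angle of the speaker relative to where the listener is facing;
    # a positive angle means the speaker is on the listener's left.
    relative = math.atan2(dy, dx) - listener_heading

    # Pan position in [-1, 1]: -1 is hard left, +1 is hard right.
    pan = -math.sin(relative)

    # Constant-power panning: map pan onto a quarter circle so that
    # left**2 + right**2 stays constant as the voice moves across.
    angle = (pan + 1.0) * math.pi / 4.0
    left = math.cos(angle) * attenuation
    right = math.sin(angle) * attenuation
    return left, right


# A listener at the origin facing "north" (+y), with a speaker 5 m to the west.
print(spatial_gains((0.0, 0.0), math.pi / 2, (-5.0, 0.0)))
```

In this toy model, a speaker off to the west of a north-facing listener comes out almost entirely in the left channel, and doubling the distance halves the volume. A real system would presumably do something more sophisticated, but the principle is the same.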
The question is, is this too big a break in the “fourth wall” of SL, detracting too much from an immersive experience? The upstanding inventrix, Ms. Ordinal Malaprop, has neatly summarized many of the objections to built-in voice, such as:
- It would ruin immersion to have voices that were inappropriate for an avatar’s appearance. Not too big a concern for me or Danielle, but all those guys playing as female avatars must be quaking in their boots (or stiletto heels, as appropriate 🙂 ).
- “People are idiots.” Having previously been an Xbox Live user, where voice is part of the experience, I can vouch for this…
- It will make group discussions more complicated, and too easy for someone to literally “shout down” others.
- Busy areas would be a nightmare. I think of the Gin Rummy with a packed house and voice chat running, and I can understand this.
- Voice spamming might become a problem.
- Non-native English speakers will have a harder time, because of the problem of accents, and also because there’s no Babbler for voice. (Well, there’s kind of no Babbler for text right now, either…Google Translate, the back-end engine Babbler uses, seems to be groaning a bit under the strain. Max Case, Babbler’s inventor, is aware of the problem and is trying to find workarounds.) This would be something of a concern at the GR, too, as we seem to have a certain degree of popularity among Brazilians at the moment.
- It’s harder to log voice chat; you’d have to make a recording, since you can’t just copy and paste the logs. And there’s also no easy way to grep voice chat logs for specific keywords or phrases.
(This just scratches the surface; her post on the subject is well worth your time to read. Go ahead. I’ll wait.)
Many of these objections could be leveled against the use of Skype with SL, too…and I’ve certainly dealt with at least one idiot on Skype. Yet, for every person like Ms. Malaprop who finds the idea of built-in voice an annoyance at best, there’ll probably be one who will see it as a godsend. The Reverend Triste Bertrand, for instance, will probably find it extremely useful for running his weekly Bible study. Language classes in SL will also benefit from the ability to pipe in the voices of native speakers without the use of external software. And certainly, other online worlds, such as There, seem to have integrated the use of voice successfully. For myself, I’m interested to see whether the SL solution is better or worse than the Skype-based system Danielle and I use now. (I haven’t yet tried to sign up for the beta; for all I know, it may be too late to do so. Danielle has, though.) We may continue to use it in much the same way we use Skype now, and just ignore the other modes if they get too out of hand.
Of course, your mileage may vary. Ms. Malaprop, for instance, probably won’t bother with the voice support. I would remind everyone not to judge others by what form of communication they will or won’t accept; the polite thing to do is to simply use the mode of communication that best suits both you and the person you’re speaking to. (This is, incidentally, why I start writing Victorian prose in comments over on An Engine Fit For My Proceeding; it fits the setting and the person to whom I’m addressing my comments. It’s also why I try to dress properly before TP’ing over to any location on Caledon.) “Manners matter,” in the words of Queen Clarice of Genovia, and not just in Caledon, either.
Of greater concern to other people, such as Alexander Lapointe, is that LL seems to have its priorities inverted:
What really bugs me about this is the fact that Linden Lab is doing this now instead of pushing all of their efforts into making the grid more stable. […] come on LL, could you please focus on making the growing grid a little more stable first? Please?
I don’t know that I see the two as mutually exclusive. We already know that the voice service will run through servers that are not part of the SL Grid proper; it shouldn’t have any more of an impact on server stability than, say, the ability to attach music stream URLs to a parcel (which is similar in some respects, and is a capability used daily by DJs and musicians across the Grid without incident; indeed, it’s sometimes the most reliable aspect of an event). The major growing pains for SL seem to involve scalability of the architecture as concurrent users increase, and as LL moves to hosting in multiple data centers…and these would be issues with or without added voice. And trying to throw every single engineer LL has onto those problems would probably be the sort of situation I refer to as a “fustercluck”; Brooks’ Law dictates that you’ll never get as much out of that kind of radical focus as you expect you will. The people who are working on Grid stability are no doubt continuing to do so; I doubt that the voice addition requires the attentions of more than one or two engineers.
Anyway, ready or not, voice is coming. I will be watching developments in this area with interest…but tempered by a dose of healthy skepticism.