The beginning of a new Era or the final defilement - The project to give voice to all Morrowind's dialogues with AI has begun

FriendlyMerchant

Guest
It is just an advanced version of Microsoft Sam though. After you put in the voice, you just put in text, then it plays it back like Microsoft Sam.
"This improved version of text-to-speech tech is just an improved version of older text-to-speech tech.".

Yeah, NO FUCKING SHIT, autistic Sherlock.

The point is where the improvement comes from.
We moved from a flat delivery in a monotone robotic voice to software that can clone the voice timbre of the sample offered and create variations in tone according to the context of the text.

If you don't understand why that's a whole fucking leap in complexity I don't think you understand what we are even talking about.
I don't understand why you're so angry when you're conceding to my points. Also, this has been around since the Jordan Peterson deepfakes. Also, Microsoft Sam hasn't been monotone and robotic for at least a decade.

The fact of the matter is that advanced sound profiling has been around for a couple of decades. It was already pretty good more than 15 years ago. This is just a minor improvement over that anyway. But then you fetishize this stuff and pretend that people who tell the truth about this minor advancement, which is hyped up as a big thing, don't understand anything.
 
Joined
Jun 6, 2010
Messages
2,385
Location
Milan, Italy
I don't understand why you're so angry when you're conceding to my points.
Because you're being pedantic and petulant without having a good point to make.
No one was under the impression that text-to-speech is entirely new stuff.
The point is how much more powerful, flexible and convincing it became with these recent iterations based on AI/machine learning.

Also, you happen to be the one who started going off on a tangent about not liking cartoons (who cares?) and throwing your "retarded" reactions around, so I'm not sure why you are asking for an accommodating reaction in return, now.
 

Squirrel

Novice
Joined
Apr 10, 2013
Messages
9
Location
The Tree
It's flavor of the month and I honestly could not care less.

It reminds me of this recent trend of "this and that as an 80s movie" or whatever pictures that flooded the internet a while back. At first it was cool because nobody had ever seen anything like that before, but it got boring and old really fast. Once you've seen them a couple of times, once the novelty factor wears off, what else is there? Right now, it's already so overused and all over the place that it has practically gotten to the point where it evokes a negative reaction out of me whenever one pops up. Just how many times can one watch stuff like fake Joe Rogan interviewing cartoon characters or Dagoth Ur talking about something stupid before it becomes, well, annoying and lame? It's like every other joke. It does get old. It does get overplayed. And when the intrigue pretty much boils down to only the novelty of it, it gets there even faster, because of high replicability and rampant overuse. And this technology is only going to improve and become more and more readily available and easy to use as time goes on, so it's only going to decline from here.

Yes, it WOULD be cool to have every Morrowind NPC voiced, but then what? They may sound a little bit wooden now but eventually that's going to be fixed as well and they'll sound just as good and natural as top of the line voice acting. And what happens then? Let's go one step further. Let's say we're 20 years into the future and now every single game can have a perfect AI voice for every NPC. Well, what then? It's just going to become something that people take for granted, an afterthought, something cheap and uninspiring, and even worse, outright off-putting, when it is overexposed and turns into yet another thing used as a substitute for quality. The same evolution has already occurred with other technologies.

It's pretty much like graphics, except that it's not visual but audio. With each passing day, I can see the parallels between the two becoming more and more obvious. Digital graphics were the craze back in the day and with so much attention put into them, they were ridiculously improved upon over the past I don't know how many years. We've gotten to the point where CGI, especially the high budget crap in movies, is more than capable of capturing photorealism and in many cases literally looks way grander and more detailed than real life. And what has that brought us? Absolutely nothing. Look at this fully computer generated city, or this digitally recreated actor that looks 100% real. Or look at this video game, look at that normal map, look at that little rock rolling down the hill, etc. Who gives a shit?

At the end of the day, it's basically just dressing up Morrowind with a DX11 shader but for sounds. Better graphics are always good, but at a certain point, you got to play the damn game. No amount of graphics or AI voice lines can help you with that.
 

Seethe

Arbiter
Joined
Nov 22, 2015
Messages
989
Just how many times can one watch stuff like fake Joe Rogan interviewing cartoon characters or Dagoth Ur talking about something stupid before it becomes, well, annoying and lame? It's like every other joke. It does get old. It does get overplayed. And when the intrigue pretty much boils down to only the novelty of it, it gets there even faster, because of high replicability and rampant overuse.
Because "being an adult" is only a state of mind, and an overwhelming number of people are still 13 year old psychopaths with underdeveloped brains in adult bodies, who have no regard for the consequences of their actions. I'm not going to say "most", but definitely a significant portion of mankind is like this. You thought that the internet was going to elevate humanity by offering them constant up-to-date information and knowledge? Think again. It has amplified the ugly human nature (just look at social media), and this AI shit will be that but on steroids.
 

Jarmaro

Liturgist
Joined
Dec 31, 2016
Messages
1,481
Location
Lair of Despair
New Voice Packs keep popping up, this time a Cirilla female voice from The Witcher 3 for Skyrim. Sounds quite good.
https://www.nexusmods.com/skyrimspecialedition/mods/86759?tab=description



Now that I look, there are several new Voice Packs. Of course, there is also a Geralt Voice Pack.
https://www.nexusmods.com/skyrimspecialedition/mods/86660?tab=description


Yennefer can also be found...
https://www.nexusmods.com/skyrimspecialedition/mods/86923?tab=description



Kratos...
https://www.nexusmods.com/skyrimspecialedition/mods/86651?tab=description


Damn, there's even Master Chief.
https://www.nexusmods.com/skyrimspecialedition/mods/86916?tab=description






In other news, AI Voice pack for Morrowind is at 22.4% of completion.
 

0sacred

poop retainer
Patron
Joined
Feb 12, 2021
Messages
1,888
Location
MFGA (Make Fantasy Great Again)
Codex Year of the Donut
yeah well there's very little inflection in all of the things I've heard so far, no variance in speech patterns. It's the uncanny valley of hearing, convincing at first but then you notice how monotonous it sounds, especially with longer texts.
 

dreughjiggers

Maidenhaver
Joined
Dec 26, 2022
Messages
261
Location
Vvardenfell
That's only because the uploader was lazy and didn't use the sliders. It takes several tries, and yes: the smaller the samples, the better the results.
 

None

Arbiter
Joined
Sep 5, 2019
Messages
1,993
yeah well there's very little inflection in all of the things I've heard so far, no variance in speech patterns. It's the uncanny valley of hearing, convincing at first but then you notice how monotonous it sounds, especially with longer texts.
dreughjiggers is right. The creator was just being lazy. Provide the correct voice samples to match the emotion or inflection you want to convey when you generate a line, splice the results together and edit in pauses or whatever else you need. It isn't going to be 100%, but with only a small amount of work you can make the results sound far less monotone.
 
Joined
Jun 6, 2010
Messages
2,385
Location
Milan, Italy
I've already uploaded examples of 3-10 second sound clips which sound like professional recordings.
What's possibly even more relevant is that at any given moment you can open any game and find examples of "professional recording" that sound far worse.

Of course you aren't getting "Mark Hamill as the Joker" tier quality yet, but I'd say we are way past the point of being in the middle of an "uncanny valley for audio".
 
Joined
Oct 18, 2022
Messages
422
Provide the correct voice samples to match the emotion or inflection you want to convey when you generate a line, splice the results together and edit in pauses or whatever else you need. It isn't going to be 100%, but with only a small amount of work you can make the results sound far less monotone.
It's almost like these generative AIs are tools that require knowledge, skill and effort to use effectively.
 

0sacred

poop retainer
Patron
Joined
Feb 12, 2021
Messages
1,888
Location
MFGA (Make Fantasy Great Again)
Codex Year of the Donut
I've already uploaded examples of 3-10 second sound clips which sound like professional recordings.
What's possibly even more relevant is that at any given moment you can open any game and find examples of "professional recording" that sound far worse.

how is it relevant that humans can speak monotonously and without inflection? Point is, they can also do it better. None of the samples I've heard had that.
 

0sacred

poop retainer
Patron
Joined
Feb 12, 2021
Messages
1,888
Location
MFGA (Make Fantasy Great Again)
Codex Year of the Donut
Don't delude yourselves into believing it's anything but shit tier voice acting at this point. It's good at mimicking human voices, but the voice acting itself is terrible. Imagine a whole game filled with it and then tell me how it's not gonna be uncanny valley material.
 

Jarmaro

Liturgist
Joined
Dec 31, 2016
Messages
1,481
Location
Lair of Despair
New Updated Version has been released:


Here is a list of the new features:
1. MCM menu to switch which voice pack you want to use while in game. This means you can now have multiple voice packs installed at once: up to 127 voice packs, plus an unlimited number of patches for those voice packs.

2. Lip syncing, you will now see the lips of your character move when in third person if they are talking.

3. Dynamic response time, this makes it so that if your character speaks a longer dialogue line it won't be cut off by the NPC, and if you speak a shorter dialogue line, the NPC will respond to you quicker.

4. Localized mappings, allowing non-English subtitles to play the voice lines in game. This must be enabled in the MCM menu.
It seems the mod is pretty much feature-complete as it is; it only took a few months for the mod to grow massively. I've seen literally dozens of different voice mods, many with patches for major Story mods. What a time to be alive.
 
