Nvidia may be best known for graphics cards you can’t find in stores, but the company also makes some interesting software tools. One example is the noise removal feature RTX Voice, which was upgraded to work with all GeForce cards earlier this year and does an excellent job of cleaning up background noise.
Now Nvidia (thanks, 80.lv) has been showing off another sound-related tool, in beta this year. Audio2Face is an impressive-looking auto-rigging process that runs within Nvidia’s open real-time simulation platform, Omniverse. It can take an audio file and apply surprisingly well-matched animations to the included Digital Mark 3D character model.
It does this automatically, works well with most languages, and can be adjusted using sliders for more detail.
To do this, Nvidia has implemented a deep neural network that matches the facial animations to the audio in real time. When you first start the program, it may take a moment to build the TensorRT engine, which optimises the neural net for your hardware. From there, you should be able to see changes in real time or bake them in, as long as your hardware can hack it.
As an added bonus, the company’s tutorial website, Nvidia On-Demand, features heaps of videos on Omniverse and Audio2Face. The first video on Audio2Face dates pretty far back, to March 2020, but more recent tutorials even detail how to export the results to other tools like Unreal Engine 4.
Judging by a few of the videos, the results do seem to depend a bit on the quality of the audio used. Charlie Chaplin’s uncharacteristically famous speech from the film The Great Dictator is featured in one video, and the movements just don’t seem as clear as they do with crisper, modern recordings. That being said, unlike some audio quality, this tool is likely to only get better with time.