Sony Might Add Speech Recognition To Create Assets In Its Games

Sony Interactive Entertainment is reportedly working on a video game development technology that uses “natural language input” for creating and modifying in-game assets through machine learning.

UNCHARTED™: Legacy of Thieves Collection | Source: Steam

Rundown:

  • Sony Interactive Entertainment may be working on a video game development technology that uses “natural language input” for creating and modifying in-game assets through machine learning.
  • The video game developer could provide a spoken description of what they want to implement within the computer simulation. A neural network may then generate the asset from that description, and a physics engine could process it to produce a modified version.
  • The patent also discusses modifying existing digital assets with a physics engine within a computer simulation through speech recognition to include certain constraints, which could either be physical or geometrical.
  • The patent discusses maintaining a “library of assets” that the technology could search when a digital asset is requested, instead of generating a new one from scratch, which may require more resources.

In recent years, video game development tools have advanced significantly, making video games more accessible and appealing to players and making development more convenient for the developers who use those tools. Some video game developers have even said that next-generation video game engines, like Unreal Engine 5, have made it possible for indie game developers to make video games that would’ve been unimaginable even a decade ago.

With such advancements in video game development technologies, video game companies are constantly competing to release the next technology that could change the way video game development works. One such video game and digital entertainment company is Sony Interactive Entertainment, which is constantly working on new features and technologies for its video game franchises.

Recently, the company published a patent to track the distribution history of in-game digital assets, such as cosmetics or even gameplay moments, by assigning unique tokens to them, and of course, this isn’t the only thing that it’s currently working on.

Besides innovative features that the company may be incorporating into its video game franchises soon, it seems like Sony Interactive Entertainment may also have some revolutionary video game technologies in development at the moment, and one of them is rather unusual.

Earlier today, we came across a recently published patent from Sony Interactive Entertainment Inc. titled “VOICE DRIVEN MODIFICATION OF PHYSICAL PROPERTIES AND PHYSICS PARAMETERIZATION IN A CLOSED SIMULATION LOOP FOR CREATING STATIC ASSETS IN COMPUTER SIMULATIONS,” which was filed in May 2021 and published only a couple of days ago. From what it seems, the company aspires to create a video game development technology that uses speech recognition to turn natural language into digital assets within computer simulations, such as video games, through machine learning.

Example screenshot prompting a person to enter speech for text identification of a computer simulation asset. | Source: PATENTSCOPE

The abstract of the patent reads, “A computer simulation object such as a chair is described by voice or photo input to render a 2D image. Machine learning may be used to convert voice input to the 2D image. The 2D image is converted to a 3D object and the 3D object or portions thereof are used in the computer simulation, such as a computer game, as the object such as a chair. A physics engine can be used to modify the 3D objects.” Hence, not only could speech recognition be used to create two-dimensional and/or three-dimensional assets through natural language processing within video games, but it could also be used to modify such assets using a physics engine.

One implementation of this technology that the patent discusses involves “processing information associated with the asset using at least one physics engine to render an output.”

Hence, the video game developer could provide a spoken description of what they want to implement within the computer simulation. A neural network may then generate the asset from that description, and a physics engine could process it to produce a modified version.

Furthermore, the patent also discusses modifying existing digital assets with a physics engine within a computer simulation through speech recognition to include certain constraints. “If desired, the method may include inputting at least one rule to the neural network for use in generating the 3D object. In non-limiting implementations the method includes rendering the output of the physics engine at least in part by modifying the information associated with the asset to maintain zero torque on the asset,” it reads. “The method can include inputting to the neural network information defining how the asset absorbs force.”
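To make the “zero torque” constraint more concrete, here is a minimal sketch of how a net-torque check could look in two dimensions. It is only an illustration and is not taken from the patent: the names `net_torque` and `is_torque_balanced`, the 2D simplification, and the tolerance value are all assumptions.

```python
from typing import Iterable, Tuple

Vec2 = Tuple[float, float]


def net_torque(pivot: Vec2, forces: Iterable[Tuple[Vec2, Vec2]]) -> float:
    """Sum the 2D torque (z-component of r x F) of each force about the pivot.

    Each entry in `forces` is (point_of_application, force_vector).
    Hypothetical helper for illustration only.
    """
    total = 0.0
    for (px, py), (fx, fy) in forces:
        rx, ry = px - pivot[0], py - pivot[1]  # lever arm from pivot to the point
        total += rx * fy - ry * fx
    return total


def is_torque_balanced(pivot: Vec2,
                       forces: Iterable[Tuple[Vec2, Vec2]],
                       tol: float = 1e-9) -> bool:
    """Return True when the net torque is zero within an assumed tolerance."""
    return abs(net_torque(pivot, forces)) <= tol
```

For example, two equal downward forces applied at equal distances on either side of the pivot cancel out, while a single off-centre force does not.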

Additionally, these constraints may also be geometrical, as the physics engine could determine whether to “move or deform the asset,” based on the instructions provided by the video game developer. Furthermore, the modifications are not restricted to physical constraints only but may also include “changes to size, shape, color, style of certain parts of the asset (but not to all parts of the asset), texture of the surface of the asset, etc.”

Example logic in example flow chart format for converting text from speech to location and parts of a 3D asset. | Source: PATENTSCOPE

Nonetheless, three-dimensional objects aren’t the only digital assets that the video game developer could create with this technology, as “an artist may also vocally describe a desired background terrain, e.g., ‘dirt’ or ‘palace marble’ or other terrain.” Furthermore, the video game developer may also specify the dimensions of the desired digital assets within the computer simulation. “Also, as mentioned the size of an asset may be specified by the artist. For example, the artist may specify a chair that is twenty feet high,” the patent explains.
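As a rough illustration of the size request, the sketch below uniformly scales a list of 3D vertices so that the asset reaches a requested height. The patent does not describe how sizing is implemented, so the function name `scale_to_height`, the vertex-list representation, the y-up convention, and the choice of uniform scaling are all assumptions.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def scale_to_height(vertices: List[Vertex], target_height: float) -> List[Vertex]:
    """Uniformly scale vertices so the extent along the y axis equals target_height.

    Hypothetical helper: assumes y is the "up" axis and that the asset
    should keep its proportions.
    """
    ys = [v[1] for v in vertices]
    current_height = max(ys) - min(ys)
    if current_height <= 0:
        raise ValueError("asset has no height to scale")
    factor = target_height / current_height
    return [(x * factor, y * factor, z * factor) for x, y, z in vertices]
```

With this sketch, a chair model that is two units tall and is asked to be twenty units tall would have every coordinate multiplied by ten.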

However, Sony Interactive Entertainment mentions certain limitations of its technology, as artificial intelligence isn’t completely accurate. “An AI-only approach can be used to meet more qualitative requirements, like chair with a wide seat, or a tall back,” it reads. Hence, the technology can only predict how a digital asset reacts to physics based on its physical features, and constraints may need to be imposed manually to ensure an accurate result.

Lastly, the patent discusses maintaining a “library of assets” that the technology could search when a digital asset is requested, instead of generating a new one from scratch, which may require more resources.

“A search of the library may first be made for images matching the keywords and only if no match is found may the AI engine generate, based on supervised or unsupervised training in human language, an image of the asset using a text to 2D or 3D generative model,” it reads.
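The search-first, generate-only-on-a-miss flow described in that passage can be sketched as follows. This is a minimal illustration under stated assumptions, not Sony’s implementation: the in-memory dictionary, the keyword-subset matching rule, and the names `find_in_library` and `get_asset` are all hypothetical, and the generative model is represented by a caller-supplied function.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional

# Hypothetical library: a set of descriptive tags mapped to an asset identifier.
AssetLibrary = Dict[FrozenSet[str], str]


def find_in_library(library: AssetLibrary, keywords: Iterable[str]) -> Optional[str]:
    """Return the first stored asset whose tags include every requested keyword."""
    wanted = frozenset(k.lower() for k in keywords)
    for tags, asset in library.items():
        if wanted <= tags:
            return asset
    return None


def get_asset(library: AssetLibrary,
              keywords: Iterable[str],
              generate: Callable[[Iterable[str]], str]) -> str:
    """Search the library first; call the (assumed) generator only if nothing matches."""
    keywords = list(keywords)
    match = find_in_library(library, keywords)
    if match is not None:
        return match
    return generate(keywords)
```

In this sketch, a request for “chair” against a library that already holds a wooden chair returns the stored asset without ever calling the generator, which is where the resource saving would come from.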

Overview of a technique for 2D to 3D asset generation. | Source: PATENTSCOPE

While it’s unclear exactly how Sony Interactive Entertainment intends to implement this video game development technology in its video game and physics engines, it could make video game development more convenient for video game developers and reduce the time it takes to develop video games. Whether this technology is actually used, if at all, only time will tell.

What do you think about this? Do tell us your opinions in the comments below!
