Building the immersive digital future with WebRTC

 

GenUI

GenUI partners with product visionaries to execute on their big ideas. We build innovative solutions that accelerate technology roadmaps and deliver real impact for our clients and their customers.

Updated Jul 27, 2022

The way we work and access media is evolving. Remote collaboration is still developing, promising to enable increasingly seamless and lifelike conversations and interactions. The frontier of spatial computing is unfolding, and the potential for augmented and virtual reality applications is growing. The rapid progress of these technologies means that streaming protocols are ripe for innovation.

At GenUI, we’ve helped our clients innovate with the WebRTC streaming protocol to push the boundaries of immersive audio and visual media. With the capacity to pack different types of data into a container and deliver it in streaming fashion, WebRTC can enable many new kinds of remote collaboration and shared presence applications. Real-time streaming of complex media including volumetric and spatial data will undoubtedly play a part in the future of how humans interact with digital experiences.

Real-time Streaming with WebRTC

WebRTC is an open standard that allows web browsers and mobile applications to perform real-time communication through a set of APIs. Because it enables peer-to-peer communication, media can be streamed directly in a browser without the need for plugins. The specification can also be integrated into an application or device without the use of a browser.

WebRTC supports the real-time transmission of video, audio, and data that powers web conferencing applications like Zoom and Microsoft Teams. The specification can be applied to products that rely on streaming through a variety of open-source and commercial libraries, or customized to enable novel technology.
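As a rough sketch of what this looks like for a web developer, the snippet below builds an ICE configuration and opens a peer connection with a data channel. The helper names, channel label, and STUN server URL are illustrative choices, not part of any particular product; only `buildConfig` runs outside a browser, since `RTCPeerConnection` is a browser API.

```javascript
// Minimal browser-side sketch (assumed structure, not a specific product's API).
// buildConfig() is pure; connectPeer() uses browser-only WebRTC objects.

function buildConfig(stunUrls) {
  // RTCConfiguration-shaped object: one ICE server entry per URL.
  return { iceServers: stunUrls.map((urls) => ({ urls })) };
}

// Browser-only: not invoked here, because RTCPeerConnection exists only in browsers.
async function connectPeer() {
  const pc = new RTCPeerConnection(buildConfig(["stun:stun.l.google.com:19302"]));
  const channel = pc.createDataChannel("app-data"); // arbitrary channel label
  channel.onopen = () => channel.send("hello");     // runs once the peers connect
  const offer = await pc.createOffer();             // SDP offer for signaling
  await pc.setLocalDescription(offer);
  // The offer would now be sent to the remote peer via your signaling server.
}
```

Note that WebRTC deliberately leaves signaling unspecified: the generated offer has to reach the remote peer over a channel you provide, such as a WebSocket.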

Choosing The Right WebRTC Library for Your Project

As a free, open-source framework, WebRTC has spawned a variety of open-source projects as well as commercial offerings that can be used to build new products, software, and applications. Web developers already enjoy native support for WebRTC in all major browsers via JavaScript, while C# developers can choose from a slew of available libraries. To find the right library for your project, consider your requirements in terms of codec selection, customization, and staying up-to-date with releases.

Here's what you should know about your options:

  • Microsoft's MixedReality-WebRTC offers reliable performance and a solid codec selection along with a sophisticated .NET-oriented wrapper around the native APIs. On the downside, configuring codecs can be challenging, and hardware-accelerated support for the latest codecs is lacking. While the project is no longer actively developed, self-hosted modification is possible because it is open-source.
  • Unity WebRTC is an open-source library with a thin wrapper around native APIs. It offers good performance, and you can easily stay up-to-date with official WebRTC releases. While the platform has a sufficient codec selection, codec configuration is not possible without a custom build.
  • WinRTC is an open-source library with the thinnest possible wrapper around the native WebRTC APIs. In theory, this project offers excellent performance, but getting it to work can be tricky in current versions of Visual Studio (although, according to Microsoft's DevDiv, better support for WinRTC assembly registration is coming). The exceptionally thin wrapper makes it easier to keep up-to-date with the WinRTC release line than with other solutions.
  • Native WebRTC offers the best performance available, but we only recommend it for expert .NET developers: it requires knowledge of native interop coding, memory management, and more. The best approach to building a native SDK is to use one of the above projects' forks of the code (Unity and WinRTC hew closest to official) and build with their build scripts. Native WebRTC is open-source, but changes are rigorously reviewed and controlled because they are released to every browser on every platform.
  • LiveSwitch is a commercial product that offers good performance. Its fully native C# implementation means it doesn't depend on the official native APIs. LiveSwitch has experience modifying its offering to support spatial and volumetric video streaming through private builds.

No matter which library you decide to use for your project, the capabilities of WebRTC have a lot of potential for innovating digital collaboration, including applications for the metaverse. 
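Because codec selection comes up with every option above, here is one illustrative way to express a codec preference: reordering payload types in the SDP before it is sent, sometimes called SDP munging. This is a sketch under simplifying assumptions, not production code; browsers that support `RTCRtpTransceiver.setCodecPreferences` offer a cleaner route to the same result.

```javascript
// Illustrative SDP munging: move one codec's payload types to the front of the
// m=video line so the remote peer prefers that codec during negotiation.
function preferCodec(sdp, codecName) {
  const lines = sdp.split("\r\n");
  // Collect payload types whose a=rtpmap line names the codec,
  // e.g. "a=rtpmap:96 VP8/90000" maps payload type 96 to VP8.
  const pts = lines
    .map((l) => l.match(/^a=rtpmap:(\d+) ([^/]+)/))
    .filter((m) => m && m[2] === codecName)
    .map((m) => m[1]);
  return lines
    .map((l) => {
      const m = l.match(/^(m=video \d+ [^ ]+) (.*)$/);
      if (!m) return l; // leave every line except the m=video line untouched
      const rest = m[2].split(" ").filter((pt) => !pts.includes(pt));
      return `${m[1]} ${[...pts, ...rest].join(" ")}`; // preferred codec first
    })
    .join("\r\n");
}
```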

Exploring the technology to develop new experiences

GenUI develops software that relies on and expands WebRTC's capabilities. We have built software that streams volumetric data, in particular depth maps and point clouds, which enables real-time 3D virtual reconstruction approaching the vision of the metaverse. We capture depth data with sensors such as Apple's LiDAR and TrueDepth technology, the Microsoft Kinect, and Intel RealSense RGB+Depth cameras; these sensors form the foundation of cutting-edge real-time communication experiences.
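To make the depth-streaming idea concrete, the sketch below frames a depth buffer for transport over a WebRTC data channel. The 8-byte header layout and chunk size are assumptions for illustration, not GenUI's actual wire format; keeping each message around 16 KB reflects common guidance for interoperable data channel message sizes.

```javascript
// Hypothetical framing for sending one depth frame over a data channel.
// Wire format per chunk (assumed, for illustration):
//   [frameId: uint32][chunkIndex: uint16][chunkCount: uint16][payload bytes...]
const CHUNK_PAYLOAD = 16 * 1024 - 8; // stay near 16 KB, minus the 8-byte header

function chunkDepthFrame(frameId, payload /* Uint8Array */) {
  const count = Math.max(1, Math.ceil(payload.byteLength / CHUNK_PAYLOAD));
  const chunks = [];
  for (let i = 0; i < count; i++) {
    const slice = payload.subarray(i * CHUNK_PAYLOAD, (i + 1) * CHUNK_PAYLOAD);
    const buf = new ArrayBuffer(8 + slice.byteLength);
    const view = new DataView(buf);
    view.setUint32(0, frameId);   // which frame these bytes belong to
    view.setUint16(4, i);         // position of this chunk within the frame
    view.setUint16(6, count);     // total chunks, so the receiver can reassemble
    new Uint8Array(buf, 8).set(slice);
    chunks.push(buf);             // each ArrayBuffer goes to dataChannel.send()
  }
  return chunks;
}
```

The receiver reverses the process: it groups chunks by `frameId` and rebuilds the frame once all `chunkCount` pieces have arrived.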

We are developing software solutions with the goal of creating more satisfying video conferencing experiences. Research, as well as practical experience, reveals that the flat-grid display of contemporary video chat applications, especially combined with the typical monaural audio, is a poor substitute for in-person meetings. Users find the existing experiences draining and often more of a barrier to collaboration than an enabler. We’re testing software that can change this paradigm by integrating volumetric data and spatial audio streamed through WebRTC.

Realistic eye contact with your video-call conversation partner can help minimize the fatigue and disconnection common to virtual experiences. Depth sensors can reconstruct a participant's face and body in 3D while tracking each participant's gaze to simulate real eye contact. 

Now, consider the possibility of hearing a coworker on the right side of your screen as if they were sitting to the right of you in a meeting room. They might even be able to have a side conversation that doesn’t interrupt the broader meeting. These rich details, facilitated by augmented WebRTC capabilities, enable a more realistic and tolerable telepresence.
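The "coworker on the right" effect can be approximated with an equal-power pan law, sketched below in plain JavaScript. In a browser this job is usually delegated to Web Audio nodes such as `StereoPannerNode` or `PannerNode`; the functions here only show the underlying arithmetic.

```javascript
// Equal-power stereo panning: place a mono voice in the horizontal sound field.
// pan ranges from -1 (full left) to +1 (full right); cosine/sine gains keep
// perceived loudness roughly constant as the source moves.
function panGains(pan) {
  const angle = ((pan + 1) / 2) * (Math.PI / 2); // map [-1, 1] onto [0, pi/2]
  return { left: Math.cos(angle), right: Math.sin(angle) };
}

// Apply the gains to a block of mono samples, producing interleaved stereo.
function panMono(samples /* Float32Array */, pan) {
  const { left, right } = panGains(pan);
  const out = new Float32Array(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    out[2 * i] = samples[i] * left;       // left channel
    out[2 * i + 1] = samples[i] * right;  // right channel
  }
  return out;
}
```

A voice panned to +1 ends up entirely in the right channel, which is exactly the "sitting to your right" impression described above.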

We have found that the addition of spatial and volumetric reconstruction of participants and a more natural spatial audio experience increases user comfort and engagement. Additionally, the lack of physical restrictions in the virtual reconstruction environment enables various forms of interaction, information sharing, visualization, locomotion, organization, and communication. On the horizon of this technology are fully reconstructed and livestreamed conference rooms. As we get closer to a natural and realistic metaverse, WebRTC will be an essential tool for innovating on media streaming.

The future of WebRTC applications

While the capabilities of WebRTC will enable software that takes immersive media to the next level, we are still on the cutting edge of fully bringing real-time communications to augmented and virtual reality applications.

The arts and entertainment industry has already benefited from spatial and volumetric capture, providing consumers with immersive audio and visual experiences. An immersive spatial audio experience is possible without any enhancements to the WebRTC specification. Similar to the spatial audio software that GenUI has developed for video conferencing, audio can be streamed in either stereo or mono, then mixed and rendered at the consumer's end to produce a nearly limitless variety of natural soundscapes.

However, the current state of WebRTC restricts frame rates and limits the transmission of the volumetric data that makes 3D virtual reconstruction possible. To make meaningful advances in real-time communications, developers will need modifications to WebRTC that support volumetric data that is temporally synchronized with video streams as well as the high frame rates needed for VR headsets. From improving live performances with 3D human modeling to enhancing situational awareness capabilities, the opportunities to expand on virtual experiences are broad.
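One piece of the synchronization problem can be illustrated without any WebRTC modifications: pairing each video frame with the depth frame closest to it in capture time. The function below is a hypothetical sketch of that matching step, assuming both streams expose sorted millisecond timestamps.

```javascript
// Pair video frames with the nearest depth frame by capture timestamp.
// videoTs and depthTs are sorted arrays of millisecond timestamps; pairs
// further apart than toleranceMs are rejected rather than mismatched.
function matchFrames(videoTs, depthTs, toleranceMs) {
  const pairs = [];
  let j = 0; // index into depthTs; advances monotonically since inputs are sorted
  for (const vt of videoTs) {
    // Advance while the next depth timestamp is a strictly better match.
    while (
      j + 1 < depthTs.length &&
      Math.abs(depthTs[j + 1] - vt) < Math.abs(depthTs[j] - vt)
    ) {
      j++;
    }
    if (depthTs.length && Math.abs(depthTs[j] - vt) <= toleranceMs) {
      pairs.push([vt, depthTs[j]]); // render this depth frame with this video frame
    }
  }
  return pairs;
}
```

A real pipeline would still need the WebRTC modifications described above to carry shared capture timestamps across streams; this sketch only shows what the receiver does once it has them.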

Augmenting live experiences

WebRTC can also enable live augmented reality. Full-scene, multi-angle augmented reality is already being used by VR game streamers, but not in real-time. The current software requires green screen capture of a person’s live performance while body-tracking produces a virtual “skeleton.” Then, stitching with videogame footage generates a post-produced video. By streaming multiple views in separate video streams, modifications of WebRTC can make this happen live. This technology will be the backbone of live performances or events in the burgeoning metaverse. WebRTC is the open standard that should enable this broadly.

Spatial and situational awareness for cognitively demanding jobs is also possible with WebRTC. Medicine, surgery, manufacturing design, architecture, and other fields that rely on models, from paper and digital models to high-fidelity real-time visualization, are already leveraging these protocols.

Currently, militaries are attempting to field AR technology to assist in battlefield situational awareness. The network communication needs of a system like this are the same as any live-streaming spatial or volumetric system. In this case, the project has the added demands of robustness and security. It requires multiple streams of both incoming and outgoing video, incoming and outgoing volumetric data, incoming and outgoing audio, and data channels—all of which are supportable using WebRTC and COTS hardware and software.
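As a back-of-the-envelope sketch, the streams such a system needs can be enumerated as a plan in which each media entry corresponds to one `addTransceiver()` call on an `RTCPeerConnection`, with "sendrecv" covering both the incoming and outgoing leg, while volumetric and control traffic ride on data channels. The structure below is hypothetical and exists only to make the stream inventory concrete.

```javascript
// Hypothetical stream inventory for a live spatial/volumetric system.
// Each "audio"/"video" entry maps to one addTransceiver() call; "data"
// entries map to createDataChannel() calls with the given label.
function transceiverPlan(videoViews) {
  const plan = [{ kind: "audio", direction: "sendrecv" }];
  for (let i = 0; i < videoViews; i++) {
    plan.push({ kind: "video", direction: "sendrecv" }); // one per camera view
  }
  // Volumetric frames and control messages use data channels, not media tracks.
  plan.push({ kind: "data", label: "volumetric" });
  plan.push({ kind: "data", label: "control" });
  return plan;
}
```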

Let’s push the bounds of what is possible with WebRTC

Innovations on WebRTC can enable boundless possibilities in media streaming. Modifying and expanding on this technology will be an essential step to reaching true live metaverse experiences. Join us as we discover the vast potential for connecting people digitally. If you’re interested in innovating with WebRTC, contact GenUI today.