Posts

Showing posts from 2024

OpenAI WebRTC API Review

A new interface has been added to the OpenAI RealTime models: they now support WebRTC! Given the people working on it, I'm sure it has to be great, so as usual let’s take a look and see what is under the hood in terms of audio transmission.

Signalling or Establishment of the connection

There are two options for establishing a RealTime session with the OpenAI servers:

WebSocket signalling: a much nicer API with no ugly SDPs involved, but less suited for public networks.
HTTP/WebRTC signalling: an uglier API that includes SDP offer/answer negotiation, but it works well on real networks, which is critical for most use cases.

In the rest of the post we will focus only on the latter (HTTP/WebRTC), which is the most interesting one.

Authentication

The first step to use these RealTime APIs, sending audio data directly from clients to OpenAI servers, is to obtain an ephemeral key using your OpenAI API secret. This is a simple HTTP request that, for testing, you can do from the command line: `...

Target Bitrates vs Max Bitrates

Not all the simulcast layers have the same encoding quality

When using simulcast video encoding with WebRTC, the encoder generates several versions, or layers, of the video input at different resolutions. Using this technique, a multiparty video server (SFU) can adapt the video that each participant in a room receives based on factors such as available bandwidth, CPU/battery level, or the rendering size of those videos at each receiver.

(Figure: How simulcast works with an SFU forwarding layers selectively)

These different versions of the video have different resolutions, but what about their encoding quality? For example, if a user is receiving a video and rendering it in a 640x360 window, would they get the same quality from the 640x360 layer as from the highest layer of 1280x720? To answer this question about the quality of each resolution, we can first examine the bitrates used by each. But the interesting thing is that the bitrate of each resolution is not always th...
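One rough way to compare layers by bitrate, as the excerpt suggests, is to normalise each layer's bitrate by its pixel count and frame rate. The sketch below does that in Python. The `Layer` type, the `bits_per_pixel` helper, the 30 fps default and all three bitrates are hypothetical values chosen for illustration; none of them come from the post or from any particular encoder configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    width: int
    height: int
    bitrate_bps: int  # hypothetical bitrate, not taken from the post


def bits_per_pixel(layer: Layer, fps: float = 30.0) -> float:
    """Average number of encoded bits spent per pixel per frame."""
    return layer.bitrate_bps / (layer.width * layer.height * fps)


# Hypothetical three-layer simulcast configuration (illustrative only).
LAYERS = [
    Layer(320, 180, 150_000),
    Layer(640, 360, 500_000),
    Layer(1280, 720, 2_500_000),
]

if __name__ == "__main__":
    for layer in LAYERS:
        bpp = bits_per_pixel(layer)
        print(f"{layer.width}x{layer.height}: {bpp:.4f} bits/pixel")
```

With these made-up numbers the 640x360 layer ends up with fewer bits per pixel than the 1280x720 layer, which is the kind of difference that would make two layers look different even when rendered at the same size. Bits per pixel is only a crude proxy for quality, since encoder efficiency also varies with resolution and content.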

The Impact of Bursty Packet Loss on Audio Quality in WebRTC

Ensuring high-quality audio in WebRTC is hard under less-than-ideal network conditions, largely because packet loss tends to be bursty. This is common in congested networks, areas with poor mobile coverage, and public Wi-Fi setups. WebRTC offers several strategies to mitigate packet loss, but their effectiveness depends on the specific network dynamics. Among the most common techniques are:

OPUS Forward Error Correction (FEC): Each audio packet carries low-bitrate data from the preceding packet, allowing recovery when a single packet is lost.
Packet Retransmissions: Using standard NACK/RTX mechanisms, the receiver requests a retransmission when it detects gaps in the packet sequence.
Packet Duplication: Sending multiple copies of the same packet to compensate for potential losses. It is like sending preemptive retransmissions.
Re...
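To see why burstiness matters for the FEC technique described above, here is a minimal Python sketch. It assumes a simplified model in which packet `i + 1` carries a low-bitrate copy of packet `i` and nothing else, so a lost packet is recoverable only when the next packet arrives. The function names and the two loss patterns are hypothetical, and real decoders have more going on (jitter buffers, concealment).

```python
def playable_with_fec(received: list[bool]) -> list[bool]:
    """For each packet, report whether its audio can be played.

    received[i] is True when packet i arrived. A lost packet i is
    recoverable only if packet i + 1 arrived, because in this simplified
    model packet i + 1 carries the redundant copy of packet i.
    """
    playable = []
    for i, arrived in enumerate(received):
        if arrived:
            playable.append(True)
        else:
            next_arrived = i + 1 < len(received) and received[i + 1]
            playable.append(next_arrived)
    return playable


def unrecovered_losses(received: list[bool]) -> int:
    """Count packets that are still missing after FEC recovery."""
    return sum(not ok for ok in playable_with_fec(received))


# Two hypothetical patterns with the same loss rate (3 of 10 packets lost).
ISOLATED = [True, False, True, True, False, True, True, False, True, True]
BURSTY = [True, False, False, False, True, True, True, True, True, True]

if __name__ == "__main__":
    print("isolated:", unrecovered_losses(ISOLATED))
    print("bursty:  ", unrecovered_losses(BURSTY))
```

Both patterns lose 30% of the packets, but under this model only the last packet of a burst can be rebuilt, so the bursty pattern leaves gaps that the isolated pattern does not.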

Loss based bandwidth estimation in WebRTC

Measuring available bandwidth and avoiding congestion is the most critical and complex part of the video pipeline in WebRTC. The concept of bandwidth estimation (BWE) is simple: monitor packet latency, and if latency increases or packet loss occurs, back off and send less data. The first part is known as delay-based estimation, while the second, less well-known part is referred to as loss-based estimation. In the original implementation of WebRTC, the logic for loss-based estimation was straightforward: if there was more than 2% packet loss, don't increase the bitrate being sent, and if there was more than 10%, reduce it. However, this naive approach had a flaw. Some networks experience packet loss that is not due to congestion but is inherent to the network itself (e.g., certain WiFi networks). We call that static or inherent packet loss. To address this issue, the latest versions of Google’s WebRTC library introduced a more modern and sophisticated solution after seve...
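The original threshold rule described above can be sketched in a few lines of Python. The 2% and 10% thresholds come from the text; the step sizes (a 5% increase, and a decrease proportional to half the loss fraction) are assumptions loosely modelled on the Google Congestion Control draft and are not stated in the post. The function name is hypothetical.

```python
def loss_based_update(
    bitrate_bps: float,
    loss_fraction: float,
    low_threshold: float = 0.02,   # from the text: above this, stop increasing
    high_threshold: float = 0.10,  # from the text: above this, reduce
    increase_factor: float = 1.05,  # assumed step size, not from the post
) -> float:
    """Return the next send bitrate given the last reported loss fraction."""
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must be between 0 and 1")
    if loss_fraction > high_threshold:
        # Heavy loss: back off in proportion to the loss (assumed formula).
        return bitrate_bps * (1.0 - 0.5 * loss_fraction)
    if loss_fraction < low_threshold:
        # Negligible loss: probe for more bandwidth.
        return bitrate_bps * increase_factor
    # Between the thresholds: hold the current bitrate.
    return bitrate_bps


if __name__ == "__main__":
    # A link with 12% inherent (non-congestion) loss keeps shrinking the
    # bitrate even though sending less does not reduce the loss.
    bitrate = 1_000_000.0
    for _ in range(10):
        bitrate = loss_based_update(bitrate, 0.12)
    print(f"bitrate after 10 reports at 12% loss: {bitrate:.0f} bps")
```

The loop at the end illustrates the flaw the text mentions: on a link with a constant loss rate above the upper threshold, this rule keeps lowering the bitrate on every report, regardless of whether congestion is the cause.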