What is WebRTC Signaling?
WebRTC signaling refers to the process of setting up, controlling, and terminating a communication session. In order for two endpoints to begin talking to one another, three types of information must be exchanged:
- Session control information determines when to initialize, close, and modify communications sessions. Session control messages are also used in error reporting.
- Network Data reveals where endpoints are located on the Internet (IP address and port) so that callers can find callees.
- Media Data is required to determine the codecs and media types that the callers and callees have in common. If endpoints attempting to start a communications session have differing resolution and codec configurations, then a successful conversation is unlikely. Signaling that exchanges media configuration information between peers occurs by using an offer and answer in the Session Description Protocol (SDP) format.
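To make the media part of that exchange concrete, here is a minimal sketch of how the codecs listed in an SDP offer and answer could be compared. In practice the browser performs this negotiation itself; the helper names `codecsFromSdp` and `commonCodecs` are hypothetical and exist only for illustration. The sketch relies on the standard `a=rtpmap:` attribute lines that SDP uses to name codecs.

```typescript
// Collect codec names from the a=rtpmap lines of an SDP description.
// Example line: "a=rtpmap:111 opus/48000/2"
function codecsFromSdp(sdp: string): string[] {
  const codecs: string[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:\d+ ([^/\s]+)\//.exec(line.trim());
    if (match) {
      const name = match[1].toLowerCase();
      if (codecs.indexOf(name) === -1) {
        codecs.push(name);
      }
    }
  }
  return codecs;
}

// Codecs named in both descriptions, in the order the offer lists them.
function commonCodecs(offerSdp: string, answerSdp: string): string[] {
  const answerCodecs = codecsFromSdp(answerSdp);
  return codecsFromSdp(offerSdp).filter(
    (codec) => answerCodecs.indexOf(codec) !== -1
  );
}
```

If `commonCodecs` returns an empty list for a pair of descriptions, the two endpoints share no codec, which is the situation the paragraph above describes as unlikely to produce a successful conversation.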
Why Are Signaling Servers for WebRTC Needed?
When WebRTC applications are said to operate entirely "in-browser," the perspective is taken from the end user's point of view. Yes, WebRTC app users require nothing beyond their browsers; but under the hood, developers must craft server-side solutions to get peers (i.e. browsers) to communicate with each other. This is where the infrastructure of a communication platform, such as the OnSIP Communications Platform as a Service (CPaaS), becomes useful.
In a nutshell, WebRTC signaling allows users to exchange metadata to coordinate communication. RTCPeerConnection is the API WebRTC uses to establish peer connections and transfer audio and video media. In order for the connection to work, RTCPeerConnection must describe the local media conditions (resolution and codec capabilities, for instance) as metadata, and gather possible network addresses for the application's host. The signaling mechanism for passing this crucial information back and forth is not built into the WebRTC API.
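Because the signaling channel is left to the application, something has to relay messages between peers. The sketch below is a deliberately tiny, in-memory stand-in for that relay. The `SignalingHub` class, the peer IDs, and the `kind` values are all hypothetical; a real deployment would put a network transport (WebSocket or SIP, for example) between the peers and the server.

```typescript
// Illustrative message shapes; the "kind" values are assumptions, not a standard.
type SignalPayload =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string }
  | { kind: "bye" };

type SignalHandler = (from: string, payload: SignalPayload) => void;

// Hypothetical in-memory relay that forwards signaling messages by peer ID.
class SignalingHub {
  private handlers: { [peerId: string]: SignalHandler } = Object.create(null);

  // Register a peer and the callback that receives its messages.
  join(peerId: string, onSignal: SignalHandler): void {
    this.handlers[peerId] = onSignal;
  }

  leave(peerId: string): void {
    delete this.handlers[peerId];
  }

  // Deliver a message; returns false when the recipient is not connected.
  send(from: string, to: string, payload: SignalPayload): boolean {
    const handler = this.handlers[to];
    if (!handler) {
      return false;
    }
    handler(from, payload);
    return true;
  }
}
```

In a browser, the `sdp` string of an offer would come from `RTCPeerConnection.createOffer()`, and each `candidate` message from the connection's `icecandidate` event. The hub only forwards what it is given and never inspects the media itself.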
WebRTC Signaling and NAT Traversal
The task of getting the initial signaling data from one peer to another seems like it should be a simple process. Perhaps in a perfect world, a WebRTC signaling mechanism would be able to connect peers directly, without any detours or sidetracking. But the modern internet is structured in a way that makes this sort of easy relay impossible. NATs of all varieties, and firewalls on many devices, will often filter or rewrite packets, and ALGs and other protective measures can interfere with traffic they were not designed to handle. Beyond generating the SDP description itself, the signaling mechanism is also crucially responsible for ensuring that these signaling messages can be shared between peers in the first place.
So, how does a WebRTC signaling mechanism negotiate the perilous maze of the internet? The answer is simple in theory: it utilizes a versatile framework known as Interactive Connectivity Establishment (ICE). ICE uses a mere three methods to find the quickest and easiest NAT traversal route for a packet to reach its destination. The first method tried, and the least likely to succeed across the public internet, is a direct UDP connection using the host address obtained from a device's operating system and network card. This generally fails when the peers sit behind different NATs, and so there are two remaining methods for ICE to employ: a STUN server or a TURN relay server.
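The three methods correspond to candidate types that show up in the `typ` field of an ICE candidate line: `host`, `srflx` (server reflexive, learned through STUN), and `relay` (allocated on a TURN server). A fourth type, `prflx`, can be discovered during connectivity checks. The sketch below reads the type out of a candidate line and computes a priority with the formula and recommended type preferences from RFC 8445. The function names are hypothetical, and real browsers may choose different preference values.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Type preferences recommended by RFC 8445; implementations may differ.
const TYPE_PREFERENCE: { [K in CandidateType]: number } = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// Extract the candidate type from a line such as
// "candidate:1 1 udp 2130706431 192.168.1.10 50000 typ host".
function candidateType(candidateLine: string): CandidateType | null {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : null;
}

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1
): number {
  return (
    16777216 * TYPE_PREFERENCE[type] +
    256 * localPreference +
    (256 - componentId)
  );
}
```

With these defaults a host candidate outranks a server reflexive one, which in turn outranks a relayed one, mirroring the order in which the methods are described above.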
86% of all WebRTC calls are established via STUN servers [1]. A STUN server checks the IP address and port of an incoming request, and then sends that address back to the device's WebRTC application as a response. The WebRTC application thus uses a STUN server to ascertain its own IP address and port from a public perspective. This allows the application to offer a publicly accessible address, which is then passed to another WebRTC-enabled peer via the signaling mechanism.
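As a rough illustration of what comes back from a STUN server, the sketch below decodes the value of an IPv4 XOR-MAPPED-ADDRESS attribute, which is how current versions of the STUN specification (RFC 5389 and later) report the public address. This detail is an assumption layered on top of the description above, and the function name is hypothetical. Browsers do this work internally; application code never has to parse STUN responses.

```typescript
// The fixed STUN magic cookie 0x2112A442, as individual bytes.
const STUN_MAGIC_COOKIE = [0x21, 0x12, 0xa4, 0x42];

interface MappedAddress {
  address: string;
  port: number;
}

// Decode an IPv4 XOR-MAPPED-ADDRESS attribute value (8 bytes):
// reserved byte, address family, 2-byte X-Port, 4-byte X-Address.
function decodeXorMappedAddress(value: Uint8Array): MappedAddress {
  if (value.length !== 8 || value[1] !== 0x01) {
    throw new Error("this sketch handles only IPv4 XOR-MAPPED-ADDRESS values");
  }
  // The port is XORed with the most significant 16 bits of the magic cookie.
  const port =
    ((value[2] ^ STUN_MAGIC_COOKIE[0]) << 8) |
    (value[3] ^ STUN_MAGIC_COOKIE[1]);
  // Each address byte is XORed with the matching magic cookie byte.
  const octets: number[] = [];
  for (let i = 0; i < 4; i++) {
    octets.push(value[4 + i] ^ STUN_MAGIC_COOKIE[i]);
  }
  return { address: octets.join("."), port: port };
}
```

For example, the bytes `00 01 a1 47 e1 12 a6 43` decode to address 192.0.2.1 and port 32853. That address and port pair is what the application would then advertise to the other peer as a server reflexive candidate.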
If both of those methods fail, the final method employed by ICE is a TURN relay server. TURN servers are used to stream audio, video, and other real-time data between peers. Technically speaking, a TURN server does not relay signaling information; it relays the actual real-time data exchanged between peers. TURN servers have publicly available addresses, so peers can connect to them even if they are behind NATs and firewalls. TURN servers are costlier to maintain than STUN servers, because they carry the media streams themselves rather than simply helping peers discover their public addresses.
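From the application's side, STUN and TURN servers are simply listed in the configuration handed to RTCPeerConnection, and ICE decides which path to use. The sketch below builds such a configuration. The interfaces are local mirrors of the relevant fields of the browser's `RTCConfiguration` and `RTCIceServer` dictionaries so that the code also compiles outside a browser, and the server URLs and credentials are placeholders, not real services.

```typescript
// Local mirrors of the browser's RTCIceServer / RTCConfiguration fields.
interface IceServerConfig {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerConfig[];
  iceTransportPolicy: "all" | "relay";
}

interface TurnAccount {
  url: string;
  username: string;
  credential: string;
}

// Build a configuration with a STUN server and an optional TURN fallback.
// relayOnly forces all media through TURN, which only makes sense if a
// TURN account was supplied.
function buildPeerConfig(
  stunUrl: string,
  turn?: TurnAccount,
  relayOnly: boolean = false
): PeerConfig {
  const iceServers: IceServerConfig[] = [{ urls: stunUrl }];
  if (turn) {
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return {
    iceServers: iceServers,
    iceTransportPolicy: relayOnly && turn ? "relay" : "all",
  };
}

// Placeholder hosts and credentials, for illustration only.
const exampleConfig = buildPeerConfig("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-secret",
});
```

In a browser, `new RTCPeerConnection(exampleConfig)` would accept an object of this shape. Setting `relayOnly` to true is a common way to verify that the TURN path works, at the cost of routing every call through the relay.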
A fully functioning WebRTC application requires all of ICE's capabilities to operate smoothly and effectively. But purchasing and maintaining numerous servers at a significant cost is simply not a feasible option for developers who are looking to make sound economic and personnel decisions. This is why OnSIP's platform is perfect for developers who are looking to harness the power of WebRTC. Our pre-designed, mature SIP network ensures that developers do not have to build complex server-side architectures to solve basic WebRTC signaling problems. Instead, they can harness the power and reliability of our redundant SIP platform to scale WebRTC applications, bridge compatibility gaps between endpoints, broker connections behind firewalls, and track and report communications with ease. Let us deal with the groundwork, so you can focus on making in-browser applications that are innovative, convenient, and expansive for your users.
1. “WebRTC: Plugin-free realtime communication” - Justin Uberti - WebRTC Tech Lead, Google