In a nutshell, WebRTC is an open-source project aimed at creating a simple, standardized way of providing real-time communications (RTC) over the web. Shortly after Google Chrome was released, its team noticed that the Web’s infrastructure fell short when it came to real-time communications. There was no default implementation in any browser, let alone a standard across all browsers, to allow direct data transfers between people. Google set out to define the necessary specifications for smooth data transfer on a common platform, eliminating the need for third-party apps or plug-ins. Within a few years, Mozilla, Microsoft, Opera, and Apple all joined the project.
When you think of video calling, what’s the first example that comes to mind? FaceTime, Skype, or maybe business conference tools like Join.me or GoToMeeting? All of these are examples of real-time communication, but they fall short of the goals set by the WebRTC project. The first two are proprietary systems that require the person on the other end to use the exact same technology: If you want to FaceTime a friend but he doesn’t have an Apple device, you have to find another platform. When joining a video conference, you have to be ready ahead of time to ensure that the app downloads correctly or the plug-ins are working.
This is where WebRTC was born: the need for technology that could unify real-time communications regardless of device or browser, all without extraneous plug-ins and downloads.
Why Do We Need WebRTC?
As its name suggests, WebRTC was initially established for real-time communications in web browsers. Since that was all the way back in 2011, it’s easier to approach the concept today from two directions: looking back and looking ahead.
In the past, the IoT was nothing compared to what it is today, and the focus was on web browsers. You may be thinking that this seems a tad outdated since video chat and IM have been around nearly as long as the World Wide Web, and VoIP arguably longer than that. You’re quite right—real-time communication has existed for a while, but the native web technology has not.
The Web lacked the ability to enable peer-to-peer connections—one person directly connecting to another—on its own. [Note: WebRTC kind of is, kind of isn’t P2P. Technically, P2P means two end users can directly connect without a server. WebRTC doesn’t meet that requirement, but it still enables a direct connection between people. For that reason, we’ll continue to use the term in this post.] You had to (and often still do!) allow third-party plug-ins like Flash access to your camera/microphone or download external apps like with Join.me or Zoom. This was a time-wasting nuisance and could be problematic for those who didn’t understand which steps they needed to take.
Different browsers supported different codecs and were built with different APIs, the building blocks of software communication. Google saw this as a substantial pothole in the path toward communications innovation and the still-budding softphone era.
The real problem lay in plug-in security issues: Browser developers had no control over these middlemen or their updates, which were riddled with bugs and security flaws. For example, Adobe Flash was practically synonymous with security issues and general clunkiness, to the point that Steve Jobs wrote an open letter in 2010 detailing why iOS would not allow the plug-in. In 2017, Adobe announced that Flash would reach its end of life in 2020.
Moving forward, the world is expected to hit 500 billion connected devices by 2030. Streaming is commonplace and the globe is linked by the IoT, which is only growing. Now that Internet speeds have caught up to properly handle real-time video as well as chat and voice, we need a low-latency (short transmission time) solution to enable it. Competitors will differentiate themselves by how they utilize and innovate upon embedded communications. The future is the IoT, and WebRTC is developing the necessary framework for continued innovation: free, high-quality technology that enables P2P connections—connections that work regardless of device, provider, or preferred web browser.
Why do we mention free? Just like licensing music or character likenesses, many pieces of necessary software have royalties attached, meaning you have to pay for the right to use them. Put very generally, the term “open-source” means the source code is available for anyone to inspect, and the author(s) often allows everyone to use, copy, modify, and/or contribute to that code for the benefit of the public.
Google went this route with WebRTC: Prior to the project’s 2011 launch, they bought the VoIP company that had developed several RTC elements and changed the technology to be open-source. Why, when they could pocket the licensing fees? For one thing, the core tech behind the Internet is free—CERN decided that when they made the World Wide Web available to everyone back in 1993. For another thing, choosing open-source code over royalty-based opens the door for anyone to contribute to the project. After all, a project aimed at uniting people and continuing innovation should remove the barrier of entry.
Unfortunately, there were so many options when it came to software that there was no uniformity across browsers. We’ll go into the specifics below, but WebRTC functionality across platforms demands standardization in the base technology. Because Google started WebRTC as open source, the main APIs and codecs that make it work are open-source too, but other browsers only supported royalty-based codecs.
Let’s Get Technical: How WebRTC Video Chat Really Works, and More Technobabble
While the endgame of WebRTC is to simplify Internet-based communication, going under the hood can be confusing for anyone not well versed in software engineering. That said, we think it’s important to break down the central components behind the technology. Not only will doing so help explain why some of the big players took so long to come on board, but many of the terms laid out below will undoubtedly crop up in anything you come across related to future WebRTC, 5G, or general Internet developments.
We’ll start by defining a few terms and clarifying some similar ones, then get into how the puzzle pieces fit together to make WebRTC work.
API: Application programming interface. In very distilled terms, an API is software that defines communication protocols between different programs: It tells the system what you want to do and brings you the response. Think of APIs as the building blocks of computer programs that help programmers put everything together.
Codec: Short for coder-decoder. It encodes and compresses data for transmission across the web, then decompresses and decodes it upon arrival. Audio and video files are massive, and without codecs, they would take so long to send that they’d make dial-up connections look lightning fast in comparison.
Open Source vs. Royalty-Based: The difference between software that’s free for anyone to inspect and that you have to pay to use, respectively. Think of open source like taking a book from your local library and royalty-based like buying that same book at a bookstore.
Plug-In vs. Embedded: Plug-ins are third-party software added to a program to allow specific features, like video streaming. Embedded means the same functionality is built into the original program, removing the need for third-party solutions.
Technology vs. Solution: Technology is the tool at hand, while solutions are how said tools are implemented to solve a specific problem. WebRTC is a piece of technology; how people use it in app and web development are solutions.
Vocabulary lesson over—let’s get into how the technical pieces function within WebRTC.
The APIs Behind WebRTC
These are the central APIs behind WebRTC (for more technical explanations, see https://webrtc.org/):
- getUserMedia(): Requests access to a device’s camera and/or mic to capture audio and video. (Screen capture goes through the related getDisplayMedia() call.)
- MediaRecorder: Records audio and video.
- RTCPeerConnection: Does the heavy lifting to stream audio and video between users (peers).
- RTCDataChannel: Streams data directly between users.
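As a rough sketch of how these pieces fit together, the TypeScript below captures media, attaches each track to a peer connection, and opens a data channel. The helper names (`BrowserApis`, `browserApis`, `attachTracks`, `prepareCall`) and the `"chat"` label are illustrative assumptions, not part of WebRTC; the only real browser calls involved are `navigator.mediaDevices.getUserMedia()`, `new RTCPeerConnection()`, `addTrack()`, and `createDataChannel()`. The browser calls are injected so the flow can also be exercised outside a browser.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface TrackLike { kind: string }
interface StreamLike { getTracks(): TrackLike[] }
interface PeerLike {
  addTrack(track: TrackLike, stream: StreamLike): unknown;
  createDataChannel(label: string): unknown;
}
interface BrowserApis {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
  createPeer(): PeerLike;
}

// Hypothetical helper: wires the interface to the real browser globals.
// It only works where `navigator` and `RTCPeerConnection` exist.
function browserApis(): BrowserApis {
  const g = globalThis as any;
  return {
    getUserMedia: (constraints) => g.navigator.mediaDevices.getUserMedia(constraints),
    createPeer: () => new g.RTCPeerConnection(),
  };
}

// Attach every captured track to the connection; returns the track kinds.
function attachTracks(peer: PeerLike, stream: StreamLike): string[] {
  return stream.getTracks().map((track) => {
    peer.addTrack(track, stream);
    return track.kind;
  });
}

// Capture media, attach it, and open a data channel for arbitrary data.
async function prepareCall(apis: BrowserApis, wantVideo: boolean) {
  const stream = await apis.getUserMedia({ audio: true, video: wantVideo });
  const peer = apis.createPeer();
  const attached = attachTracks(peer, stream);
  const channel = peer.createDataChannel("chat"); // label is arbitrary
  return { stream, peer, channel, attached };
}
```

In a browser, `prepareCall(browserApis(), true)` would prompt for camera and microphone access. The connection still needs a signaling step before any media actually flows between peers.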
WebRTC and Video Codecs
The codecs are where things get tricky. One of the main reasons WebRTC has taken so long to get off the ground is that not all of the major browsers were on board right away, and no one could decide which video codec to use. The two most common ones are H.264 and VP8: The former is more widespread, but the latter is open source. Eventually, the Internet Engineering Task Force (IETF) decided that all browsers should support VP8 as well as H.264. The IETF and the World Wide Web Consortium (W3C) are the organizations that develop Internet standards. When it comes to WebRTC standardization, the IETF focuses on network issues, and the W3C defines which APIs developers can use on top of web browsers.
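To see why a shared baseline matters, here is a simplified sketch of codec selection: each side lists what it supports, and the call uses the first codec in the caller's preference order that the other side also supports. Real WebRTC endpoints negotiate this through an SDP offer/answer exchange rather than a function call, and `pickCommonCodec` and the capability lists below are hypothetical.

```typescript
type VideoCodec = "VP8" | "VP9" | "H.264";

// Return the first locally preferred codec the remote side also supports,
// or null when the two sides have nothing in common.
function pickCommonCodec(
  localPreference: VideoCodec[],
  remoteSupported: VideoCodec[],
): VideoCodec | null {
  const remote = new Set(remoteSupported);
  for (const codec of localPreference) {
    if (remote.has(codec)) return codec;
  }
  return null;
}

// Hypothetical capability lists, for illustration only.
const vp8Only: VideoCodec[] = ["VP8"];
const h264Only: VideoCodec[] = ["H.264"];
const bothCodecs: VideoCodec[] = ["VP8", "H.264"];

const noOverlap = pickCommonCodec(vp8Only, h264Only);       // null: no shared video codec
const withBaseline = pickCommonCodec(bothCodecs, h264Only); // "H.264"
```

Once every browser supports both VP8 and H.264, any two endpoints are guaranteed at least one codec in common, which is the point of the IETF decision.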
Apple was the last holdout. They finally announced WebRTC support with Safari 11 but continued to stick with H.264 at that time. In 2018, Apple announced Safari 12 would support VP8. With the last main browser finally on board, this is the breakdown of the most common video codecs and who supports them (not including AV1):
Codecs are crucial to WebRTC because they affect latency: the amount of time (read: delay) it takes for captured video to appear on the other person’s screen. The lower the latency, the better. Obviously, real-time communications rely on low-latency solutions; no one wants jittery, delayed video during a call. Beyond the immediate experience in the real world, once latency gets too high (usually above 0.2 seconds), there’s a sharp increase in technical problems around noise and echo cancellation.
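A quick way to reason about that threshold is to add up the delay contributed by each stage between the camera and the remote screen, then compare the total against 0.2 seconds. The stage names and millisecond figures below are made-up placeholders rather than measurements, and `totalLatencyMs` and `withinBudget` are hypothetical helpers.

```typescript
interface StageDelay { stage: string; ms: number }

const BUDGET_MS = 200; // the 0.2-second threshold from the paragraph above

// Sum the delay of every stage in the pipeline.
function totalLatencyMs(stages: StageDelay[]): number {
  return stages.reduce((sum, s) => sum + s.ms, 0);
}

// True when the end-to-end delay stays at or under the budget.
function withinBudget(stages: StageDelay[], budgetMs: number = BUDGET_MS): boolean {
  return totalLatencyMs(stages) <= budgetMs;
}

// Placeholder numbers, for illustration only.
const examplePipeline: StageDelay[] = [
  { stage: "capture", ms: 20 },
  { stage: "encode", ms: 30 },
  { stage: "network", ms: 80 },
  { stage: "jitter buffer", ms: 40 },
  { stage: "decode and render", ms: 20 },
];

const exampleTotal = totalLatencyMs(examplePipeline); // 190
const exampleOk = withinBudget(examplePipeline);      // true
```

A slower codec eats directly into this budget: with these placeholder figures, adding just 20 ms to the encode stage would push the total past the threshold.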
Between codec wars, prominent browsers holding back support, the extensive standardization process leading to WebRTC 1.0, and a handful of blockers related to security, it’s no surprise that WebRTC has taken so long to properly get off the ground. But now with 5G around the corner and everyone on board with the core technology, WebRTC has reentered the public eye.
WebRTC and SIP
The RTCPeerConnection API allows P2P data streaming, but it still needs a signaling protocol to initiate and manage the communication session between users. WebRTC doesn’t require a specific signaling protocol, allowing for even greater flexibility among developers.
SIP, one of the signaling protocols behind web-based real-time communications, is the basis of OnSIP’s entire network and the VoIP industry as a whole—and VoIP is one of the most well-known examples of RTC. Together, WebRTC and SIP make a beautifully symbiotic match for developers working on scalable, reliable communications applications—not to mention the end users.
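Since WebRTC leaves signaling open, any channel that can carry a few small messages between the two parties will do. The sketch below uses an in-memory relay as a stand-in for a real transport such as SIP; `SignalMessage`, `InMemorySignaling`, the peer names, and the placeholder SDP strings are all assumptions for illustration. In a browser, the session descriptions would come from `RTCPeerConnection.createOffer()` and `createAnswer()`.

```typescript
// Messages exchanged while setting up a session (simplified).
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

type SignalHandler = (msg: SignalMessage) => void;

// Stand-in for a real signaling transport: routes messages by recipient.
class InMemorySignaling {
  private handlers = new Map<string, SignalHandler>();

  register(peerId: string, handler: SignalHandler): void {
    this.handlers.set(peerId, handler);
  }

  // Returns false when the recipient is not registered.
  send(msg: SignalMessage): boolean {
    const handler = this.handlers.get(msg.to);
    if (!handler) return false;
    handler(msg);
    return true;
  }
}

const relay = new InMemorySignaling();
const signalLog: string[] = [];

// Bob answers any offer he receives.
relay.register("bob", (msg) => {
  signalLog.push(`bob got ${msg.kind}`);
  if (msg.kind === "offer") {
    relay.send({ kind: "answer", from: "bob", to: msg.from, sdp: "<placeholder answer SDP>" });
  }
});
relay.register("alice", (msg) => {
  signalLog.push(`alice got ${msg.kind}`);
});

relay.send({ kind: "offer", from: "alice", to: "bob", sdp: "<placeholder offer SDP>" });
// signalLog is now ["bob got offer", "alice got answer"]
```

Only this setup conversation passes through the signaling channel; once both sides have exchanged descriptions, RTCPeerConnection carries the media itself.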
WebRTC & Your Business: Industry Examples
At the start of this post, we mentioned VoIP, video chat, and smart devices as a few examples of real-time communications. These were merely the first ripples in the giant lake that is WebRTC. This technology is already deeply entrenched in office communications and will only continue to grow. No matter which industry is on your mind, WebRTC is there. Here are a few industries where WebRTC is already in play and continues to advance their capabilities.
OnSIP, a VoIP company actively working on WebRTC-based products, is a living example of the rapid transformations in telecom in recent decades. From fixed lines to mobile networks, circuit-switched systems to VoIP, telephony has had more than a few drastic changes of late. WebRTC is the current driving force behind telecom innovation. The ability to conduct your calls straight from a web browser, like with sayso, eliminates the need to install softphones.
“IoT and Browser communications will drive growth in telecom, and according to experts here at this outstanding conference, IoT spend will be 15X greater than the mobile + Internet + PC waves combined. I believe WebRTC will be the platform of choice, especially as 5G and Over-the-Top (OTT) continue to flourish.” - IoTEvolution
A few seconds of delay is no big deal when sending an email or cat GIF, but is starkly noticeable when engaging in RTC like a video conference. Instead of having to download bulky plug-ins and apps, or dealing with jitter from a codec that can’t handle the high-level data streams to which we’ve become accustomed, your conference calls—whether audio only or video too—can proceed smoothly.
The video side of WebRTC isn’t limited to business conferencing and social chats, either. It’s easy to forget that file sharing is a major aspect of WebRTC when the focus is so often on calling tools. Recording, sharing, and hosting, however, are ripe for innovation and mass expansion.
Once a video call is done, it’s over and gone. Creating and hosting video on your website helps you embolden your brand and foster professional yet customizable relationships with consumers. Instead of shooting off an email response to a support request or question about an online class, you can easily record and share a video response of your team mulling the issue over and responding with empathy and clarity.
Native advertising is advantageous because it inserts a product naturally into your view. What if you could click on an ad and immediately be connected with the company rather than waiting for the web page, scrolling around figuring things out for yourself, and then looking for a phone number you have to type into your phone? WebRTC changes how people interact with ads and the companies behind them.
Healthcare providers and startups have jumped on WebRTC. Telehealth companies like Teladoc have made massive strides in healthcare accessibility, and surgeons are even turning to virtual reality and remote robotics.
WebRTC and 5G together will bring significant improvements to modern healthcare, particularly by continuing to ensure that your location is a non-factor. If you need to see a specialist on the other side of the world, WebRTC video streaming means your feed is less likely to cut in and out while describing symptoms—and is more secure than if you needed a plugin. If you have to be at work during an elderly relative’s home visit, you can stream in and attend virtually. Robotic surgery is on its way to becoming normalized. WebRTC is helping to push medical innovation to everyone: Your location or mobility shouldn’t interfere with your ability to receive healthcare.
The IoT is about connecting things to the Internet to access data. WebRTC is about connecting people over the Internet. Together, the possibilities are endless. The number of connected devices reached 20 billion in 2020 and, as we mentioned above, is forecast to exceed 500 billion by 2030. We’ll let you do the math on that rate of growth.
The initial focus was on social apps like Facebook Messenger, Discord, and Houseparty. In recent years, smart home devices have expanded the field: doorbells that stream video to your phone, remote building security and access, even Alexa telling you the weather. (We won’t pin that creepy laugh on WebRTC, though.) Imagine what the world will look like when all of these devices can do more than connect, when they can communicate seamlessly on a standardized platform. Software engineers and dev teams around the world will have the same technology platform, so they’ll find their edge by innovating on what they can do with said platform.
How OnSIP Contributes to WebRTC
Click-to-call probably seems like a minor thing—to anyone who has yet to experience it. OnSIP developed sayso, a click-to-call button that uses WebRTC to function. With sayso, we’ve eliminated the barriers between your website visitors entering your domain and actually contacting you.
Real-Time Communication: Keeping Us Connected in a New Era
WebRTC may be a complicated matter under the surface, but for end users around the globe, it’s an incredible tool that helps instantaneously connect people from anywhere. Some of the most illustrious organizations in the history of web technology are working on this project, and with standardization imminent, the way we use the Internet is about to change significantly for the better. Yes, real-time communication is already rooted in our professional and personal lives to some degree, but we’re poised on the precipice of a new era of speed and simplicity in daily communications.