How Junction Networks deals with HD Voice
Before discussing High Definition (HD) voice, we have to define a few terms:
User-agent
A User-agent is a phone. It can be a desk phone, a soft phone, or even an app on an iPhone. A SIP User-agent is a phone that uses SIP (the Session Initiation Protocol), the de facto protocol for VoIP. Below, when we talk about connecting to 'phones', we really mean SIP User-agents.
Hosted SIP
Junction Networks provides a Hosted PBX service called OnSIP. OnSIP is a specific type of Hosted PBX called Hosted SIP. Hosted SIP means that our service is entirely based upon the SIP protocol. When two SIP phones need to talk to each other, we allow them to negotiate between themselves what type of phone call it will be (voice-only, or video and voice) and what 'language' they'll speak. Technically, the language is called the 'codec.' The difference between two VoIP codecs is similar to the difference between MP3 and WAV audio files. Yes, they're both audio files, but in order to properly decode a file, you need to know its specific format. The same is true of codecs. The two phones need to agree on a format - a codec - that they both have in common prior to setting up the call.
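To make the idea of two phones agreeing on a codec concrete, here is a minimal sketch in Python. It is a simplified model, not how any particular phone or the OnSIP service implements negotiation: the function name `negotiate_codec` and the phone codec lists are hypothetical, and the rule of honoring the caller's preference order is an assumption made for illustration.

```python
def negotiate_codec(offered, supported):
    """Return the first codec in the caller's preference order that the
    callee also supports, or None if the phones share no codec.

    Simplified, hypothetical model: real SIP phones exchange codec lists
    during call setup and the answering phone makes the final choice.
    """
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None


# Hypothetical codec lists, in order of preference.
desk_phone = ["G722", "G711u", "GSM"]
soft_phone = ["G711u", "G722"]

print(negotiate_codec(desk_phone, soft_phone))  # G722
print(negotiate_codec(["G722"], ["GSM"]))       # None: no common codec
```

If the two lists have nothing in common, the call cannot be set up with media, which is why most phones support at least one widely used codec.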
Back to Back User Agent
The other common type of Hosted PBX is called a back-to-back (B2B) user agent. A B2B hosted VoIP provider is always in the middle of the call. A B2B server connects to each phone individually, creating two separate call legs, and then connects those two call legs together. The advantage of B2B is that the two end points do not have to agree on a common codec. Each end point only has to support a codec that the B2B provider supports, and the B2B provider will handle the translation from one codec to the other. The downside is that whenever there is a new codec, the B2B provider must enable all of their servers with this new codec AND enable the ability to translate from that codec to every other codec they support.
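A rough way to see why each new codec is costly for a B2B provider is to count the translation paths. The sketch below assumes each direction (for example G722 to GSM and GSM to G722) counts as its own path, as the paragraph above describes. The function name `transcoding_paths` is hypothetical.

```python
from itertools import permutations


def transcoding_paths(codecs):
    """List every ordered (source, target) pair of distinct codecs that a
    B2B server would need to be able to translate between."""
    return list(permutations(codecs, 2))


before = transcoding_paths(["G711u", "GSM"])
after = transcoding_paths(["G711u", "GSM", "G722"])

print(len(before))  # 2
print(len(after))   # 6
```

With n codecs there are n * (n - 1) such paths, so the work grows faster than the codec list itself.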
Codecs
Let's quickly talk codecs. The everyday standard-quality codec is called G711u (aka ULAW). The current reigning high definition codec is called G722. Our personal favorite low-bandwidth codec is called GSM.
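For reference, the three codecs can be summarized as a small data structure. The sample rates and nominal bit rates below come from the published codec specifications rather than from this article, and the `Codec` class and helper names are hypothetical, used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codec:
    name: str
    sample_rate_hz: int   # audio sampling rate
    bitrate_kbps: float   # nominal bit rate of the encoded audio
    hd: bool              # True for wideband ("HD") audio


CODECS = [
    Codec("G711u", 8000, 64, False),  # standard quality
    Codec("G722", 16000, 64, True),   # wideband: twice the sampling rate
    Codec("GSM", 8000, 13, False),    # low bandwidth (GSM full rate)
]


def hd_codecs(codecs):
    """Names of the codecs that carry wideband audio."""
    return [c.name for c in codecs if c.hd]


def lowest_bandwidth(codecs):
    """Name of the codec with the smallest nominal bit rate."""
    return min(codecs, key=lambda c: c.bitrate_kbps).name


print(hd_codecs(CODECS))         # ['G722']
print(lowest_bandwidth(CODECS))  # GSM
```

Note that G722 achieves HD quality at the same nominal bit rate as G711u by sampling at 16 kHz instead of 8 kHz.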
How Junction Networks Handles HD Phone Calls
There are three main types of calls on the OnSIP network that handle HD calls differently: SIP-to-SIP calls, SIP-to-PSTN calls, and SIP-to-Conference Bridge calls.
SIP-to-SIP Calls
SIP-to-SIP calls include all extension-to-extension calls within a PBX and any call to/from a SIP-based phone anywhere on the Internet. Since OnSIP is a Hosted SIP provider, we allow the SIP phones, be they desk phones, soft phones, or iPhone apps, to negotiate call setup with each other. Then, once the call is set up, the two phones send the media - the voice and/or video - directly to each other. If Junction Networks needs to be in the middle of the call to handle NAT issues for one or both end points, we simply hand off the media from one phone to the next.
This is a huge advantage of being a Hosted SIP provider. If two SIP end points want to have a video call, great. As long as they can agree on the media, we can support it. Same with high definition. If two phones want to hold a G722 (HD) phone call, it’s no problem. We did not have to do anything to our network to support HD calls to and from any extension. By being a Hosted SIP provider, it just works.
SIP-to-PSTN Calls
The PSTN is the public switched telephone network. Whenever a call goes to the PSTN it is, by default, in the standard G711u codec. The PSTN cannot currently, and never will be able to, handle HD calls. Most phones that support G722 also support G711u, so there is no problem with an HD phone calling out to the PSTN, but the call will not be an HD call.
SIP-to-Conference Bridge
Here is where some work had to be done. A conference bridge acts more like a B2B agent. Whenever a SIP phone calls the bridge, the bridge sets up the call between the bridge application and that phone. If the bridge does not support the G722 (HD) codec, even if all the callers on the bridge do, the call will not be in HD. Therefore, in order for a conference bridge to support HD, it must be able to establish individual HD-capable (G722) calls with the individual callers for the inbound audio AND be able to take the audio from all the other callers and convert it to HD audio to send back to the phone.
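The mixing step described above, where each caller hears everyone except themselves, can be sketched as follows. This is a minimal illustration, not the OnSIP bridge implementation: it assumes each caller's G722 audio has already been decoded to 16-bit PCM samples of equal length, and it leaves out the G722 decoding and re-encoding entirely. The function name `mix_for_each_caller` is hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_for_each_caller(frames):
    """Given {caller: [PCM samples]} with equal-length frames, return
    {caller: [mixed samples]} where each caller receives the sum of all
    OTHER callers' audio, clipped to the 16-bit range.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))

    # Sum all callers once, then subtract each caller's own audio.
    total = [0] * length
    for samples in frames.values():
        for i, sample in enumerate(samples):
            total[i] += sample

    mixed = {}
    for caller, samples in frames.items():
        mixed[caller] = [
            max(INT16_MIN, min(INT16_MAX, total[i] - samples[i]))
            for i in range(length)
        ]
    return mixed


# Hypothetical two-sample frames from three callers.
frames = {"alice": [100, 200], "bob": [10, 20], "carol": [1, 2]}
print(mix_for_each_caller(frames))
# {'alice': [11, 22], 'bob': [101, 202], 'carol': [110, 220]}
```

For the conference to be HD end to end, a bridge would need to do this mixing on wideband audio and then encode each caller's mix back to G722 before sending it.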





