Yesterday, President Obama came to New York to witness the mass destruction caused by Hurricane Sandy. It's been a little over two weeks since Hurricane Sandy stormed through the eastern United States, leaving 400,865 homes and businesses without power (Huffington Post). The President visited several homes and businesses that have yet to recover from this unfortunate event. (One of our own employees is still displaced from his home in Battery Park.)
On Wednesday, October 31st, we published a brief blog post to let our customers know that our business was up and running and that their phone service had remained unaffected throughout the week. Here, in a bit more detail, is the story of how OnSIP weathered the storm.
"[Some of our building] lobbies are destroyed," said an employee of a building management business that's an OnSIP customer. "Our carrier PRIs at [multiple downtown Manhattan buildings] are dead because 75 Broad, where all the carriers sit, is dead. We’re running on a backup ISP, but it’s on generator power, so not the greatest. But of all the dial tones that were available… OnSIP is alive and working great!"
At OnSIP, we believe reliability of service is paramount, which is why we have architected our systems with high redundancy and chosen best-in-class colocation facilities to host our services. We are headquartered in Downtown Manhattan, and our main colocation facility is located at 60 Hudson St., New York, NY. When water gushed through the streets, taking down the subway system and power in the area for a week, both our employees and those who manage the 60 Hudson St. facility were ready.
To ensure that our connections to carriers can still be used during a geographic outage, many of our core systems are distributed. Registration, SIP-SIP calling, PSTN termination (outbound), and inbound bridges are distributed services. For those services, we moved customers to our LA data center in advance of the storm.
The remaining systems are located in our main colocation facility, 60 Hudson, which switched to backup diesel generator power when Lower Manhattan entered a week of darkness. For those services, such as voicemail, auto attendants, our website, and more, we maintained a contingency plan in case the NYC colocation center's backup generators failed: Warm and hot standby applications in our Los Angeles data center were prepared for a cut-over.
"OnSIP was not affected by Sandy. I have other VoIP lines with them, and they work fine, never went down," tweeted a customer just after the hurricane.
When it comes to calling, we've maximized redundancy wherever possible. We purchase four types of services from multiple upstream carriers: IP service, termination service (outbound calling), local origination service (inbound calling), and toll-free origination service (inbound calling). Because we connect to several carriers, we can dynamically route calls around an upstream carrier with a service issue.
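As a rough illustration of routing outbound calls around a troubled carrier, here is a minimal sketch in Python. It is not OnSIP's actual implementation: the `Carrier` class, the `pick_termination_carrier` function, the carrier names, and the priority scheme are all hypothetical, and the health flag is assumed to be set by some separate monitoring process.

```python
from dataclasses import dataclass


@dataclass
class Carrier:
    """A hypothetical upstream termination (outbound) carrier."""
    name: str
    priority: int = 0     # lower number = preferred route (assumed scheme)
    healthy: bool = True  # assumed to be updated by separate monitoring


def pick_termination_carrier(carriers):
    """Return the most preferred healthy carrier, or None if all are down."""
    candidates = [c for c in carriers if c.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.priority)


# Two made-up carriers; the preferred one is marked as having an issue.
carriers = [
    Carrier("carrier-a", priority=0),
    Carrier("carrier-b", priority=1),
]
carriers[0].healthy = False

chosen = pick_termination_carrier(carriers)
print(chosen.name if chosen else "no carrier available")
```

In this sketch, marking a carrier unhealthy plays the role of shutting off its outbound interconnect: new calls go to the next preferred carrier until the flag is cleared.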
The only service we cannot remedy when an upstream carrier has an outage is local origination service (inbound calling to a local number). This is not an OnSIP-specific problem; it is unfortunately a design limitation of the Public Switched Telephone Network (PSTN) and a common problem for telecommunications carriers. If a call comes in to a number via an upstream carrier, and that carrier's system is down, we simply won't get the call.
Unfortunately, an outage did occur for an upstream origination carrier on Friday evening, due to a power failure in an NYC data center that was running on emergency generators. As soon as we found that the carrier's services were down, we shut off the interconnect for outbound calling to that carrier and rerouted those calls, keeping outbound calling up for customers.
On Friday afternoon, November 2nd, the colocation facility notified us that they would cut over from emergency generator power to utility power that evening. We had the majority of our engineering team on standby. Our employees who had power continued to work at home. Other employees fled to friends', family's, and coworkers' homes to take advantage of their power and Internet. Everyone stayed connected via my.OnSIP's built-in web-phone and chat.
When our servers lost power at 11:20 pm EST on November 2nd, we deployed our contingency plan and had critical services such as inbound/outbound calling and voicemail up and running within the hour. Our website was restored soon after, hosted in our LA colocation facility. Had we been told that the cut-over would take place earlier in the week, we would have deployed our contingency plan the prior evening, but we felt Friday evening was one of the best possible times and thus waited to do so.
"Part of our mission at OnSIP is to deliver an Internet service which is truly distributed and scalable with no single points of failure," commented John Riordan, CTO of OnSIP. "One of the benefits of this approach is that service can continue to be provided in the face of localized Internet facilities failure— A benefit our users experienced this week."