
Trusted Payments With SIP: Technical Overview

by Will Mitchell


Last weekend, I ventured to Spain to attend TADHack 2014. Overall, it was a great event; developers from all around the world gathered in Madrid, as well as at the many satellite locations across the globe, to share creative new ideas spanning topics like distributed GSM networks for rural villages, SMS APIs to summon garbage pickup, and, of course, WebRTC. One of the themes was Bitcoin and payment systems, so I asked myself what happens when you combine instant communications in the browser with distributed, electronic cryptocurrencies. What if you could tie payments directly into a session?

Many applications connect you with a premium endpoint, such as a tutor, at a cost. Bitcoin could be used to eliminate the need for trust. You wouldn’t have to worry about paying for an hour, just to have the person hang up after two minutes. At the same time, the tutor wouldn’t have to fear that you will talk with them for hours and never send payment.

The result was my hack: Trusted Payments With SIP.

The Bitcoin Part

Bitcoin is an electronic currency that allows for near-instantaneous submission of transactions between two parties. This is ideal for a real-time communications application. It would be simple to send many micropayments from one party to the other as the call progresses. Each transaction would carry a tiny value associated with the minutes or seconds that have elapsed since the previous payment.

In Bitcoin, though, each transaction also requires a small fee which goes towards keeping the network running and secure. While this fee is tiny compared to that of traditional payment services, sending micropayments every few seconds would quickly accumulate fees and drain your wallet. Luckily, people smarter than I am have already thought of this, too. A great wiki article on Bitcoin contracts explains how to achieve this exact scenario with just two fee-bearing transactions.
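To see why the fees add up, here is a back-of-the-envelope sketch in Python. The fee per transaction and the payment interval are illustrative assumptions, not figures from the Bitcoin network or from the hack.

```python
from decimal import Decimal

# Assumed values, for illustration only.
FEE_PER_TX = Decimal("0.0001")   # hypothetical flat fee per broadcast transaction, in BTC
PAYMENT_INTERVAL_S = 10          # hypothetical: one micropayment every 10 seconds


def naive_fees(call_seconds: int) -> Decimal:
    """Total fees if every micropayment is broadcast as its own transaction."""
    payments = call_seconds // PAYMENT_INTERVAL_S
    return payments * FEE_PER_TX


def contract_fees() -> Decimal:
    """Total fees with the contract approach: one transaction to fund the
    pool and one to settle it, regardless of how long the call lasts."""
    return 2 * FEE_PER_TX


one_hour = 60 * 60
print(naive_fees(one_hour))   # 360 payments, each one paying a fee
print(contract_fees())        # only two fee-bearing transactions
```

Under these assumptions, the naive approach pays a fee 360 times in an hour, while the contract approach pays it twice no matter how long the call runs.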

Here’s how it works:

  1. Alice wants to pay Bob some amount of money. The exact amount is unknown, because she doesn’t know how long she will talk to Bob. She starts by asking for Bob’s public key.
  2. Alice prepares to set aside a pool of coins that requires signatures from both her key and Bob’s key to spend. The coins cannot be spent unless she and Bob agree. This is only preparation at this point; she doesn’t commit coins to this pool just yet.
  3. Alice prepares a refund transaction, time-locked to some future date. This will allow her to eventually receive her money back if she and Bob cannot agree on how to spend it.
  4. Alice asks Bob to sign the refund transaction. If he does, Alice broadcasts the coin pool. At this point, the money is locked in. Coins cannot leave the pool unless both sides agree or the time lock expires.
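The ordering in these steps is what protects Alice, and it can be sketched as a toy state model. This is only an illustration: there are no real keys, signatures, or network calls, and the class and method names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PaymentPool:
    """Toy model of the setup steps; no real cryptography or networking."""
    amount_btc: int
    refund_unlock_time: int            # time after which the refund becomes valid
    refund_signed_by_bob: bool = False
    funded: bool = False

    def bob_signs_refund(self) -> None:
        # Step 4, first half: Bob agrees that Alice may reclaim the coins
        # once the time lock has expired.
        self.refund_signed_by_bob = True

    def alice_broadcasts_pool(self) -> None:
        # Step 4, second half: Alice only locks coins in once she holds a
        # signed refund, so her money cannot be stranded in the pool.
        if not self.refund_signed_by_bob:
            raise RuntimeError("refusing to fund: refund is not signed yet")
        self.funded = True

    def can_spend(self, alice_signs: bool, bob_signs: bool, now: int) -> bool:
        """Coins leave the pool only if both parties sign a new transaction,
        or through the pre-signed refund once the time lock has expired."""
        if not self.funded:
            return False
        both_agree = alice_signs and bob_signs
        refund_valid = self.refund_signed_by_bob and now >= self.refund_unlock_time
        return both_agree or refund_valid


pool = PaymentPool(amount_btc=10, refund_unlock_time=1000)
pool.bob_signs_refund()
pool.alice_broadcasts_pool()
print(pool.can_spend(alice_signs=True, bob_signs=False, now=500))  # Alice alone, before the lock
print(pool.can_spend(alice_signs=True, bob_signs=True, now=500))   # both parties agree
```

If Alice tried to broadcast the pool before Bob signed the refund, the model raises an error, which mirrors why step 3 comes before step 4.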

Once the pool is set up (and verified by Bob), the call can begin. As Alice and Bob chat, Alice can sign over transactions to Bob, splitting the money from the pool between her and Bob. These transactions aren’t actually broadcast to the Bitcoin network, but are instead held by Bob until the end of the call. For example, if Alice is paying Bob 1 BTC per minute (wow!) and the pool contains 10 BTC, transactions may look like this:

Minute 1:  9 BTC -> Alice, 1 BTC -> Bob
Minute 2:  8 BTC -> Alice, 2 BTC -> Bob
Minute 3:  7 BTC -> Alice, 3 BTC -> Bob

At any point, Bob is free to broadcast the transaction that is best for him. So when the call ends (whether mutually or from Alice disappearing suddenly), Bob is guaranteed to be paid for the time he has been on the call, as long as he broadcasts before the time lock expires. Similarly, Alice only pays for the call as it happens, and if Bob never takes his money, she can reclaim her refund after the time lock expires.
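The per-minute splits and Bob’s choice at the end of the call can be sketched in a few lines of Python. The function names are hypothetical, and whole BTC amounts are used only to match the example above; real code would work in smaller units and with actual signed transactions.

```python
from typing import List, NamedTuple


class Split(NamedTuple):
    """One held (not yet broadcast) transaction dividing the pool."""
    to_alice: int
    to_bob: int


def splits_for_call(pool_btc: int, rate_btc_per_min: int, minutes: int) -> List[Split]:
    """Return the split Alice signs at the end of each minute of the call."""
    splits = []
    for minute in range(1, minutes + 1):
        # Bob can never be owed more than the pool contains.
        owed = min(minute * rate_btc_per_min, pool_btc)
        splits.append(Split(to_alice=pool_btc - owed, to_bob=owed))
    return splits


def best_for_bob(held: List[Split]) -> Split:
    """Bob broadcasts whichever held transaction pays him the most."""
    return max(held, key=lambda split: split.to_bob)


held_by_bob = splits_for_call(pool_btc=10, rate_btc_per_min=1, minutes=3)
for minute, split in enumerate(held_by_bob, start=1):
    print(f"Minute {minute}: {split.to_alice} BTC -> Alice, {split.to_bob} BTC -> Bob")
print(best_for_bob(held_by_bob))
```

Because every new split pays Bob at least as much as the previous one, the best transaction for him is always the most recent one he holds.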

For a more detailed description of this interaction, read on at the Bitcoin Wiki.

Figure: Rapidly adjusted micropayments to a predetermined party with Bitcoin.

The SIP Part

WebRTC DataChannels allow for data to be passed from peer to peer in addition to the audio and video media of the session. This is ideal for including payment information directly in the call itself, and it’s easy with SIP.js. Support is good, too; DataChannels are provided in recent versions of Chrome, Firefox, and Opera. If I could just get the browser to talk to the Bitcoin network, WebRTC could take care of sending the data between peers. But how to connect to Bitcoin?

Bitcoind, the reference daemon that runs Bitcoin, provides a local JSON-RPC API that can be accessed when it is running on the same machine. While your desktop browser could connect to that to reach the Bitcoin network, I was already thinking of future integrations. WebRTC is supported on most Android browsers, so would it be possible to access this application from your phone? What about calling in from the PSTN or a traditional SIP endpoint? Can we use the same Bitcoin mechanism to create peering arrangements between federated SIP networks? Can we provide access to the Bitcoin network in a way that is compatible with all of these things?

As it turns out, we can. SIP.js is not just a library for making WebRTC applications; it is a fully capable SIP stack written entirely in JavaScript. By running SIP.js on a Node.js server, we can access server-side functionality through a SIP API. Node.js also has several libraries for interacting with Bitcoin. Perfect!
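Whatever process sits on the server side, talking to bitcoind’s local JSON API comes down to an authenticated HTTP POST. The sketch below is in Python rather than the Node.js used in the hack, and it is not taken from the hack’s source. The credentials are placeholders, the helper name is made up, and the URL assumes bitcoind’s default mainnet RPC port on the local machine.

```python
import base64
import json
import urllib.request


def build_rpc_request(method: str, params: list, user: str, password: str,
                      url: str = "http://127.0.0.1:8332/") -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC request for a local bitcoind."""
    body = json.dumps({"jsonrpc": "1.0", "id": "sketch", "method": method, "params": params})
    # bitcoind's RPC interface uses HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body.encode(),  # supplying a body makes this a POST
        headers={"Content-Type": "text/plain", "Authorization": f"Basic {token}"},
    )


# Placeholder credentials; use the values from your own bitcoind configuration.
request = build_rpc_request("getblockcount", [], user="rpcuser", password="rpcpassword")

# Sending it requires a running bitcoind with matching credentials:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["result"])
```

Keeping request construction separate from sending makes it easy to inspect what would go over the wire before pointing it at a real node.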

Putting It Together

While not all of the pieces came together in time for TADHack, I really wanted to get an application running that would showcase how simple this would look from a user’s perspective. I focused on the following functionality:

    • Calling a SIP address with WebRTC. This is the most basic scenario, creating an anonymous user agent with SIP.js and calling through the OnSIP Network.
    • Connecting a SIP user agent in the browser to one running on a Node.js server. On the browser side, this was as easy as calling a SIP address. On the Node side, it required installing the WebRTC binaries as well as a WebSocket client, then running SIP.js to connect (again) to the OnSIP Network.
    • Registering and setting a price to receive premium calls. This is a trick I have been using in several hacks now. By registering a user agent to a shared address, applications can broadcast information to others, and in turn discover other endpoints available to call. It is effectively a cheap pub-sub mechanism, and a topic for another blog post.
    • Discovering and calling a premium SIP address, watching your Bitcoins drain (and theirs overflow).

Want More?

The full source code of the hack is available on my GitHub page. Feel free to read through it, fork it, or comment below with any questions.

If you are looking for more ideas for telecom applications, check out the TADHack YouTube channel. Each presentation from the hackathon was recorded and posted, and there were so many great ideas presented. And hey, if you have an idea of your own, let us know!
