First posted on Planet Cassandra.
Christian Hasker, Editor at Planet Cassandra, a DataStax Community Service
John Riordan CTO at OnSIP
Christian: John, thanks for joining us today. Could you briefly describe your role at OnSIP and what OnSIP does?

John: I am the Chief Technology Officer at OnSIP. My role at OnSIP is to run the engineering group, including research and development, design, implementation, and operations of the service. I’m also one of the co-founders of the company, so I’ve been around since the beginning.
What OnSIP provides is a business phone service for businesses of all sizes, along with a suite of communication services; these include instant messaging, presence, voice, and video, almost exclusively based on the SIP protocol, which is where our name comes from.
Christian: That makes sense! So why the move to Cassandra? What were you on before, what’s your background, did you use anything else alongside it?
John: The move to Cassandra was made because of the desire to horizontally distribute our platform. Performance and price were huge factors as well; delivering a higher-quality service at a lower cost to our customers was key. It’s an economic driver at the end of the day.
A little bit of our background story: the short of it is, we still make use of MySQL. All of our data was originally stored in MySQL; when we first rolled out our systems they were driven by data stored there, and we continue to use MySQL today.
However, some aspects of our system weren’t a fit for MySQL; we wanted to be able to operate our platform in multiple locations simultaneously without being tied to a single MySQL instance. For example, we have a data center here in New York and we also have a data center in LA; customers located in California route all their traffic through the Los Angeles data center. We also need to serve customers through the New York data center, so the information needs to flow between both data centers.
The location of everyone's phone is stored in a database; their phones connect to us and store their location information in our database - this is a phone registration. When other people want to reach those phones, they need to do a database lookup to retrieve the location information of the phone they are trying to reach. In the original design, we were storing it in MySQL, on a single master server located in the New York data center.
Doing so created a host of problems, particularly: if there’s a network partition, the LA users can no longer get their locations updated, and basically the LA users become completely dependent on the New York data center. This problem just expands as we add data centers in new locations. Cassandra helped us solve this problem.
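To make the shape of that data concrete, a registration store along these lines might look roughly like the following sketch, using Cassandra’s Python driver. The keyspace, table, and column names are illustrative, not OnSIP’s actual schema.

    # Minimal sketch of a phone-registration store in Cassandra.
    # All names (keyspace, table, columns) are hypothetical.
    from datetime import datetime, timezone
    from cassandra.cluster import Cluster

    # Assumes a 'voip' keyspace already exists.
    session = Cluster(['10.0.0.1']).connect('voip')

    # One row per registered phone: the SIP address-of-record maps to
    # the contact URI the phone registered from.
    session.execute("""
        CREATE TABLE IF NOT EXISTS registrations (
            address_of_record text PRIMARY KEY,
            contact_uri       text,
            registered_at     timestamp
        )
    """)

    # A phone registers: store its current location.
    session.execute(
        "INSERT INTO registrations (address_of_record, contact_uri, registered_at) "
        "VALUES (%s, %s, %s)",
        ('sip:alice@example.onsip.com', 'sip:alice@203.0.113.7:5060',
         datetime.now(timezone.utc)))

    # Someone calls Alice: look up where to route the call.
    row = session.execute(
        "SELECT contact_uri FROM registrations WHERE address_of_record = %s",
        ('sip:alice@example.onsip.com',)).one()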
Christian: How many data centers are you in now? You mentioned LA and New York. Are you expanding to any others?
John: We are currently working on expanding into Miami, and we have plans to continue onward from there. That sort of expansion plan is what drove our consideration of Cassandra: we can continue to expand by just opening more data centers and plugging them in.
We have a Cassandra ring running between New York and LA. We actually have two data centers in New York, so there are three data centers total with a Cassandra ring that runs between all three of them. We’re writing location information locally, and the Cassandra ring takes care of distributing the information around. This works really well for us because in the event that our LA data center is disrupted, the other data centers take over and the LA customers can continue to call each other… there’s no lock-up.
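In Cassandra terms, that kind of multi-data-center layout is what NetworkTopologyStrategy provides. A rough sketch follows; the data-center names and replica counts are hypothetical, not OnSIP’s actual settings.

    # Sketch of a keyspace replicated across three data centers.
    # Data-center names ('NY1', 'NY2', 'LA') and replica counts are
    # hypothetical.
    from cassandra.cluster import Cluster
    from cassandra.policies import DCAwareRoundRobinPolicy

    # Pin this client to its local data center; the ring replicates
    # writes to the other data centers in the background.
    cluster = Cluster(
        ['10.0.0.1'],
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='NY1'))
    session = cluster.connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS voip
        WITH replication = {
            'class': 'NetworkTopologyStrategy',
            'NY1': 2, 'NY2': 2, 'LA': 2
        }
    """)

With a layout like this, each data center holds its own replicas, which is what lets the others carry on when one site is disrupted.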
Christian: Are you running on your own hardware or are you in the cloud?
John: It was critical for us to be able to run and operate the network at a level where we weren’t dropping any packets. Both packet loss and jitter add to call-quality issues, so we made a decision early on to build our own data center. The data centers we're in are at 60 Hudson Street and One Wilshire; they're telecom hotels and some of the major telecommunication hubs for interconnection between telephone carriers and data carriers.
We’re co-located in these facilities, so we purchase power, racks, cooling, and internet connections, but we own and operate our own network equipment and hardware. We’re a Juniper shop on the networking side, and the rest of our system is built on Dell equipment. We run on CentOS and we’re heavy users of virtualization. On top of that, we make use of Puppet for configuration management. That’s a short rundown of the underlying system.
Christian: What else did you look at besides Cassandra?
John: Well, we looked at lots of things. We didn’t have anything particular in mind when we set out, although we knew what problem we wanted to solve. We ended up doing a lot of research, and some of the material that moved us forward was our research into Amazon’s architecture. There are a lot of good papers written on that: the whole notion of distributed data and things like the CAP theorem.
We went through a number of these services, but ultimately what we were looking for was a platform that allowed us to sacrifice consistency for availability; something that would allow us to continue to operate in the case of a network partition.
That largely classified the types of platforms we were looking at and whether they fit our needs; it was a big factor in driving our selection. Specifically for our use case: imagine we have a phone in LA registering a location with the LA system. It doesn’t really matter whether there’s any sort of instantaneous update with respect to what the people in New York see; if it takes some time before someone in New York can reach that phone, that’s okay.
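In driver terms, that trade-off shows up in the consistency level you write at. A small sketch of the idea, again with hypothetical names: a write at LOCAL_ONE succeeds as long as one replica in the local data center responds, so a partition between coasts does not block local registrations.

    # Sketch: choosing availability over consistency for registrations.
    # A LOCAL_ONE write needs only one replica in the local data center,
    # so it keeps working during a cross-country network partition.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['10.0.0.1']).connect('voip')  # hypothetical keyspace

    register = SimpleStatement(
        "INSERT INTO registrations (address_of_record, contact_uri) "
        "VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.LOCAL_ONE)

    session.execute(register,
                    ('sip:bob@example.onsip.com', 'sip:bob@198.51.100.9:5060'))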
We ended up with Cassandra pretty quickly after doing our homework. Once we figured out what it was we needed and the type of platform that would solve our problems, it became an easy choice for us. We were also looking at open source and happy to get in there. So that's where we ended up, and we’ve been happy.
Christian: Excellent. Then the last thing is how was adopting Cassandra? Is there anything you would have done differently, data models, things like that?
John: Yes, we’re up to speed now. One of the things that slowed the process down for us was the column-oriented data structure; specifically, the naming was difficult. I think a lot of people have preconceptions about what a particular word means, and just being able to map the naming of some of the concepts to what was actually going on took some effort. Once we got comfortable with that, it was not a problem at all. The structure makes a lot of sense and works well for us, but literally the naming of things was a stumbling block for a little bit, just getting through the concepts.
Which topologies to use for what we’re doing was an issue for us at first, and we struggled a little bit with caching. One of our uses is to store every single SIP packet that runs across our platform in Cassandra… and it’s a lot of data. We store about a week of it and then let it fall off the end.
One of the great features we love is being able to set a TTL on a piece of data and have it automatically cleaned out. But while we were figuring out how to deal with compaction, those packets kept piling up and we wound up with larger and larger files; we’ve gotten through that now. We’re up to speed, but the out-of-the-box settings didn’t necessarily work for us.
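The TTL John mentions is a per-write setting in CQL. A sketch of how a packet store might use it, with a hypothetical table:

    # Sketch: storing captured SIP packets with a one-week TTL so
    # Cassandra expires them automatically. Table and columns are
    # hypothetical.
    from cassandra.cluster import Cluster

    session = Cluster(['10.0.0.1']).connect('voip')

    ONE_WEEK = 7 * 24 * 3600  # TTL is given in seconds

    session.execute(
        "INSERT INTO sip_packets (call_id, seq, raw_packet) "
        "VALUES (%s, %s, %s) USING TTL %s",
        ('a84b4c76e66710', 1, b'INVITE sip:bob@example.com SIP/2.0 ...',
         ONE_WEEK))

Expired cells still occupy disk until compaction rewrites the SSTables that hold them, which is the growing-files behavior John describes.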
For our programmers who are familiar with relational database models and working in that kind of environment, the key questions are: “What kind of data am I getting?” “What do I want?” “What do I need?” Then working from that to “okay, I lay my data out in Cassandra this way in order to make that effective.”
It’s really a shift in the way of looking at the problem. Historically in MySQL, we’ve always set up the relationships first and then worried about the programming problems later.
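As a small illustration of that shift (all names hypothetical): rather than normalizing and joining at read time, you lay out one table per query you need to answer.

    # Sketch of query-first modeling: one table shaped to answer one
    # query directly, duplicating data a relational design would have
    # factored into joined tables. All names are hypothetical.
    from cassandra.cluster import Cluster

    session = Cluster(['10.0.0.1']).connect('voip')

    # Query to serve: "list every registration in a customer's domain".
    # Partition by domain, cluster by username; no join required.
    session.execute("""
        CREATE TABLE IF NOT EXISTS registrations_by_domain (
            domain      text,
            username    text,
            contact_uri text,
            PRIMARY KEY (domain, username)
        )
    """)

    rows = session.execute(
        "SELECT username, contact_uri FROM registrations_by_domain "
        "WHERE domain = %s", ('example.onsip.com',))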
Christian: Yes, it’s a different way of thinking from relational, that’s for sure. Thanks so much for joining us today John and best of luck with your continued use of Cassandra.
John: Thank you.