By Allan Leinwand - 2014-01-28
That’s right… what’s old is new again. In 2014, the standard of POTS (Plain Old Telephone Service) could be considered our goal. Think about it for a second: even in the midst of the rolling blackouts we lived through in California a few years back, we always had a dial tone when we picked up our landline. In our opinion, that’s the platinum standard of enterprise IT cloud availability we seek.
The goal we’ve set is to engineer the ServiceNow cloud infrastructure to make customer instances always available. Uptime that might be good enough for departmental SaaS is not good enough for enterprise IT. If HR or Finance goes down, there’s certainly some associated pain, but the business can still operate. Not so with enterprise IT, which drives the fundamental infrastructure of business and operations.
To set and achieve what we see as new industry standards for enterprise IT cloud availability and then exceed them for our cloud infrastructure, our engineering team needs to be constantly focused on improving operations. Nothing can be taken for granted.
Our cloud infrastructure is built using two redundant data centers in each operating geography. Each data center is staffed 24x7 and features the highest standards for reliable power, fire suppression, redundant fiber networks (including redundant connections across each geography) and physical security. At the server tier, we have failover and redundancy built in throughout. And in each data center, ServiceNow has dedicated physical space, all under our control.
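To see why pairing data centers matters, here is a back-of-the-envelope sketch of the arithmetic. It assumes the two sites fail independently, which is a simplification, and the 99.9% figure is purely illustrative; it is not a ServiceNow measurement or commitment. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def parallel_availability(site_availability: float, sites: int = 2) -> float:
    """Availability of a service that stays up while at least one site is up.

    Assumes sites fail independently (an idealization; shared dependencies
    such as software or configuration would lower the real figure).
    """
    return 1.0 - (1.0 - site_availability) ** sites


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    single = 0.999  # illustrative per-site availability, not a real figure
    paired = parallel_availability(single, sites=2)
    print(f"single site: {downtime_minutes_per_year(single):.1f} min/year")
    print(f"paired sites: {downtime_minutes_per_year(paired):.3f} min/year")
```

Under those assumptions, a second site turns hours of expected annual downtime into well under a minute. Real systems fall short of this ideal because failures are rarely fully independent, which is part of why the rest of the stack needs redundancy too.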
We connect to the Internet using redundant BGP connections to multiple Tier 1 Internet Service Providers. We also have redundant stateful firewalls and load balancers at the edge of our network in each location.
With all of these redundancies in place, we’re working to set the industry’s most aggressive standards for a highly available cloud infrastructure. Yet we recognize that our goal can be nearly unachievable: even with the absolute best infrastructure engineering, there can be downtime. While we are not perfect, maintaining the availability of our customers’ instances is the absolute top priority for our cloud infrastructure teams. Because of this priority, we have engineered automation using our own Service Automation Platform to move customers out of harm’s way should an event ever occur, something we call Advanced High Availability (AHA).
We are not done, and never will be. We will continue to set new standards and provide greater assurance for our customers who rely on us to perform billions of transactions every month on our enterprise IT cloud.