josiahsullivan
Tera Contributor

One of my least favorite questions from potential suppliers is whether ServiceNow is CAPEX or OPEX driven, as if excelling in one forgives sins in the other. My answer is always the same: we account for both.

 

Most of us know the old joke: lies, damned lies, and statistics. I would happily add case studies and TCO models to that list. The assumptions and data elements that feed them are contextual, and the vast majority of sales and marketing content ignores that context entirely.

 

So how then does ServiceNow make rational decisions about what to buy when building our cloud? And how can you?

  • Know your numbers
  • Develop the model
  • Be consistent

 

** Disclaimer: Math ahead. Actual costs and data points obfuscated for confidentiality. **

 

 

Know your numbers

 

ServiceNow has 2-3 primary server roles, depending on the generation:

  • front-end App servers
  • back-end Database (DB) servers
  • Backup servers

 

Each server has, or can support, these elements:

  • capital acquisition cost (CAPEX)
  • operational cost (OPEX)
  • customer capacity
  • performance

 

All four elements must be quantified.

 

CAPEX is usually easy: it's what the quote says. Most vendors come back at a later date with better pricing, so it helps to be transparent up front about the size of the potential business. That way pricing doesn't shift much later in the process.

 

OPEX gets dicey the moment people costs are included. Since people costs can be calculated many different ways, we generally exclude them from evaluation cost models and only account for them at the end of a decision cycle that recommends a significant shift in operational practice. If we are simply refreshing a hardware platform, modeling people impact is overkill.

 

For most online services, data center space, power, and cooling are the dominant server operational costs. ServiceNow uses a weighted cost per watt (W) that includes all three to account for OPEX over a 3-year server lifecycle. For example:

  • Compute the weighted average cost per kW per month across our providers (e.g. $800 per kW-month)
  • Divide by 1,000 to convert to a cost per watt (e.g. $0.80 per watt-month)
  • Multiply by an estimated 36-month lifecycle (e.g. $28.80/W)
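
The three steps above can be sketched in a few lines of Python. The provider rows are hypothetical placeholders (the real contract figures are not part of this post), chosen only so that the weighted average lands on the $800 per kW-month used in the example. The function and variable names are illustrative as well.

```python
# Hypothetical (contracted kW, cost per kW-month) rows; NOT real contract data.
PROVIDER_RATES = [
    (100, 700.0),
    (300, 800.0),
    (100, 900.0),
]

LIFECYCLE_MONTHS = 36  # 3-year server lifecycle, as in the text


def weighted_cost_per_kw_month(rates):
    """Average the monthly cost per kW, weighted by each provider's kW."""
    total_kw = sum(kw for kw, _ in rates)
    total_cost = sum(kw * cost for kw, cost in rates)
    return total_cost / total_kw


def lifecycle_cost_per_watt(cost_per_kw_month, months=LIFECYCLE_MONTHS):
    """Convert $/kW-month into $/W over the whole lifecycle."""
    cost_per_watt_month = cost_per_kw_month / 1000  # 1 kW = 1,000 W
    return cost_per_watt_month * months


kw_month = weighted_cost_per_kw_month(PROVIDER_RATES)
per_watt = lifecycle_cost_per_watt(kw_month)
print(f"${kw_month:.2f} per kW-month -> ${per_watt:.2f} per W over {LIFECYCLE_MONTHS} months")
```

With these placeholder rows the result should match the $28.80/W figure above. Swapping in your own contract rows is the only change needed.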

 

Your data center contracts may be written differently, but the goal is to normalize them and arrive at a cost per watt over a defined lifecycle. A longer term increases Total Cost of Ownership (TCO) and also weights it more heavily toward OPEX.

 

Here is a sample weighted-average cost table, based on a random set of rack draws and quantities, that results in an $800 per kW-month weighted average.

KW Cost.png

If space is accounted for separately from power and/or cooling, amortize it across the planned draw of the cage/contract.

 

Now you need to know how many watts your devices use, and how heavily the devices will be used. Most manufacturers provide "nameplate" guidance and power calculators (e.g. dell.com/essa, shown below). Simply enter the configuration you are testing into the calculator, and it should provide the max potential draw at 100% utilization.

 

A few optimized services can achieve 100%, but most can't. ServiceNow intentionally uses a 60%** utilization target for power and over-provisions performance to handle transient customer load demands. For example:

 

A sample App server draws 200W at 100% and 150W at 60%.

 

power calc.png
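
To show how the draw figure feeds OPEX, here is a minimal sketch that multiplies the sample App server's draw by the $28.80/W lifecycle rate from the earlier example. Combining the two numbers this way is a plausible reading of the approach rather than the exact internal model, and the function name is illustrative.

```python
LIFECYCLE_COST_PER_WATT = 28.80  # $/W over 36 months, from the earlier example


def lifecycle_opex(draw_watts, cost_per_watt=LIFECYCLE_COST_PER_WATT):
    """Space, power, and cooling cost for one server over its lifecycle."""
    return draw_watts * cost_per_watt


opex_at_target = lifecycle_opex(150)     # draw at the 60% utilization target
opex_at_nameplate = lifecycle_opex(200)  # draw at 100% utilization
print(f"OPEX at 60% target: ${opex_at_target:,.2f}")
print(f"OPEX at nameplate:  ${opex_at_nameplate:,.2f}")
```

Budgeting on the nameplate figure instead of the target draw would overstate per-server OPEX by a third in this example.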

 

Customer capacity will be application dependent, and vary depending on the roles in your environment. In our case, App sizing is memory based (e.g. 1GB RAM/Customer, so an App server with 32GB could support 32 customers) and DB sizing is based on a combination of size and transaction rate. The important thing is to have this defined before building your cost model.
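
The memory-based App sizing rule can be written down as a one-line helper. This is a sketch under the simple assumption stated above (a fixed amount of RAM per customer, no allowance for OS or other overhead); the function name is hypothetical.

```python
def app_customer_capacity(server_ram_gb, ram_gb_per_customer=1):
    """Whole customers that fit on an App server under memory-based sizing."""
    return int(server_ram_gb // ram_gb_per_customer)


# A 32GB App server at 1GB RAM per customer
print(app_customer_capacity(32))
```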

 

Performance testing will depend on your application, but each configuration under review should have the same test run against it. For example, let's assume our App server can process 5,555 transactions per second (TPS). We make sure to use benchmarking tools that represent customer load as accurately as possible. We tune public tools like SysBench MySQL and SpecJVM to match our application and also use a variety of in-house benchmarking tools to gauge device capability.

 

Develop the model

 

Next time we will plug these elements into a cost model. It may be difficult to assemble these numbers, but without them it will be extremely difficult to develop any useful conclusions.

 

** Utilization targets are not linear. If a device uses 100W at 100% utilization, its draw at 60% utilization could very well be in the 70-85W range, not 60W. A mistake at this point could eventually lead to significant under- or over-provisioning of capacity.
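
One hedged way to picture this: a common simplification (an assumption here, not necessarily what any vendor calculator does) treats draw as an idle baseline plus a share that scales with utilization. The 75W idle figure below is back-solved from the sample App server (200W at 100%, 150W at 60%) purely for illustration.

```python
def proportional_draw(max_watts, utilization):
    """Naive estimate: draw scales directly with utilization."""
    return max_watts * utilization


def baseline_draw(max_watts, idle_watts, utilization):
    """Assumed model: idle baseline plus a load-dependent share."""
    return idle_watts + (max_watts - idle_watts) * utilization


naive = proportional_draw(200, 0.60)
modeled = baseline_draw(200, 75, 0.60)
print(f"naive: {naive:.0f}W, with idle baseline: {modeled:.0f}W")
```

In this illustrative case the naive estimate is 30W short per server, which at $28.80/W is roughly $864 of unbudgeted OPEX per server over the lifecycle.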
