
Luis Estéfano
ServiceNow Employee


ServiceNow provides powerful REST APIs to integrate with external systems, but like any enterprise platform, it enforces rate limits to ensure performance, stability, and fair usage across tenants. This article breaks down the core concepts of rate limiting, throttling, and how to plan for API consumption effectively.



Key Concepts of Rate Limiting

1. How Rate Limits Are Enforced

 

ServiceNow REST API performance is governed by a combination of system resources and subscription-based limits. While your licensing tier and node count define the infrastructure capacity of your instance, rate limits are enforced at the user or integration account level, not as a fixed throughput ceiling per node.

 

Rate limit rules (sys_rate_limit_rules) control the number of inbound REST API requests processed per hour for specific users, users with specific roles, or all users. Each node maintains a rate limit count per user, committing to the database every 30 seconds. This means that a single integration user hitting the instance is governed by its own rate limit rule — and scaling throughput is achieved by distributing requests across multiple integration users with dedicated rate limit allocations, not simply by adding nodes.
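As a rough mental model only (not ServiceNow's internal implementation), a per-user hourly rate limit behaves like a fixed-window counter: each user gets an independent allowance that resets every hour. The sketch below is a hypothetical illustration of that behavior; the class name and limits are invented for the example.

```python
import time

class HourlyRateLimit:
    """Hypothetical fixed-window counter, per user, per hour.

    Illustrates the behavior of a sys_rate_limit_rules entry:
    each user gets its own independent hourly allowance.
    """

    def __init__(self, limit_per_hour):
        self.limit = limit_per_hour
        self.counts = {}    # user -> request count in the current window
        self.window = None  # current hourly window

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        window = int(now // 3600)      # hourly bucket
        if window != self.window:      # new hour: reset all counters
            self.window = window
            self.counts = {}
        self.counts[user] = self.counts.get(user, 0) + 1
        return self.counts[user] <= self.limit

# Two integration users: each gets its own allowance.
rule = HourlyRateLimit(limit_per_hour=2)
print(rule.allow("int.user.crud", now=0))  # True
print(rule.allow("int.user.crud", now=1))  # True
print(rule.allow("int.user.crud", now=2))  # False - over this user's limit
print(rule.allow("int.user.bulk", now=3))  # True - separate user, separate budget
```

This is why adding integration users (each with a dedicated rule) scales throughput, while adding nodes alone does not.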

 

⚠️ Important clarification: The approximate figures below (e.g. ~25,000 requests/hour per node) represent general planning references based on typical governance defaults — they are not hard throughput ceilings. Actual achievable throughput depends on how rate limit rules are configured, the number of integration users, and the complexity and duration of each request. Thinking of capacity as “RPM × N nodes” is a common misconception — it is an architecture + governance exercise, not a node-counting exercise.

 

The actual throughput per second is also influenced by dynamic factors such as concurrent sessions and the availability of semaphores.

 

Semaphores are internal controls that manage how many transactions can run in parallel. When transactions are long-running or complex, they occupy these resources for longer periods, which can slow down or delay new incoming requests. If the system becomes saturated, it starts queuing transactions. Once the queue reaches its limit, additional requests may be rejected—typically with an HTTP 429 Too Many Requests error.

 

Because of this behavior, it’s difficult to define a fixed number of requests per second that any instance can reliably handle. Instead, throughput varies depending on the volume, complexity, and duration of each request.

 

Approximate Planning References:

• Per Instance: ~100,000 requests/hour (Enterprise plans, aggregate across nodes and users)

• Per User/Integration: ~25,000 requests/hour (typical default governance threshold)

• Burst Limit: ~50–100 requests/second depending on performance tier

Example Throughput Planning:

Plan Type           Approx Requests/Hour   Requests/Second
Small Instance      25,000                 ~7
Medium Instance     50,000                 ~14
Large Enterprise    100,000+               ~28+
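To sanity-check these figures, an hourly budget converts to an average per-second rate by simple division (plain arithmetic, not an official ServiceNow formula):

```python
def per_second(requests_per_hour: int) -> float:
    """Average sustained requests/second implied by an hourly budget."""
    return requests_per_hour / 3600

for plan, hourly in [("Small", 25_000), ("Medium", 50_000), ("Large", 100_000)]:
    print(f"{plan}: {hourly:,}/hour ~ {per_second(hourly):.0f}/second")
```

Note these are averages: a burst of 50 requests in one second can still be fine against a 25,000/hour budget, which is why the burst limit is listed separately.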

 

 


 


2. Concurrent Requests, Queuing & Throttling

 

ServiceNow uses semaphores to manage concurrent transactions. When all active semaphores on a node are busy, incoming requests are queued rather than immediately rejected. Only when the queue itself is full does the platform return a 429 error. This queuing behavior means the real constraint on sustained throughput is concurrency and request duration, not the hourly rate limit envelope.

Limits (Example):

• 10 concurrent API threads per node

• Default Semaphores: 16

• Semaphore Queue Depth: 150

• Max Concurrent Transactions: 166 

Explanation: This example configuration allows a maximum of 166 concurrent transactions (16 active + 150 queued = 166). The 167th transaction is rejected with an HTTP 429 error. Requests may also return HTTP 202 Accepted while queued.

 

Fast transactions (e.g. simple CRUD operations completing in <200ms) cycle through semaphores quickly, supporting high RPM with modest concurrency. Longer-running operations (search, bulk export, complex business rules) hold semaphores longer and fill the queue faster — which is why request design matters more than raw RPM targets.
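Because semaphore occupancy, not the hourly envelope, is the practical constraint, it often helps to cap concurrency on the client side as well. A minimal sketch using Python's threading.Semaphore; the pool size of 8 and the injected `do_request` callable are assumptions for the example, not ServiceNow APIs:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Cap in-flight requests so the client never occupies more than a few
# instance semaphores at once, regardless of caller parallelism.
MAX_IN_FLIGHT = 8
gate = threading.Semaphore(MAX_IN_FLIGHT)

def call_api(record_id, do_request):
    """do_request performs the actual HTTP call (injected for testability)."""
    with gate:  # blocks while MAX_IN_FLIGHT calls are already in flight
        return do_request(record_id)

# Usage: fan out 100 calls, but only 8 ever run concurrently.
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(lambda i: call_api(i, lambda r: r * 2), range(100)))
```

Keeping client-side concurrency below the node's semaphore count leaves headroom for interactive users and other integrations sharing the same instance.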

 

You can monitor semaphore usage at:

https://<INSTANCE>.service-now.com/stats.do

 

 



3. Scaling Throughput: Architecture, Not Node Counting

 

A common misconception is that scaling API throughput is a matter of multiplying per-node numbers. In practice, achieving high sustained RPM requires an architecture and governance approach:

 

Separate API users per workload — Each integration user gets its own rate limit allocation. Distributing calls across dedicated users (e.g. one for transactional CRUD, one for search, one for bulk sync) multiplies effective throughput.

Tuned rate limit rules — Configure sys_rate_limit_rules per user or role to match your actual workload requirements, rather than relying on defaults.

Bulk and async patterns — Use Import Sets, batch APIs, and asynchronous processing for heavy data operations instead of synchronous per-record CRUD.

Concurrency management — Design integrations to keep semaphore queue healthy: paginate large queries, avoid long-running synchronous calls, and implement retry logic with backoff.
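The pagination point above can be sketched with the Table API's `sysparm_limit` and `sysparm_offset` parameters. The HTTP call is injected as a function so the logic is self-contained; the page size of 500 is an arbitrary assumption:

```python
def fetch_all(fetch_page, page_size=500):
    """Paginate a ServiceNow Table API query via sysparm_limit/sysparm_offset.

    fetch_page(limit, offset) performs the actual GET, e.g. against
    /api/now/table/incident?sysparm_limit=<limit>&sysparm_offset=<offset>,
    and returns the list from the response's 'result' field.
    """
    offset, records = 0, []
    while True:
        page = fetch_page(page_size, offset)
        records.extend(page)
        if len(page) < page_size:  # short page: no more data
            return records
        offset += page_size

# Usage with a fake backend of 1,250 records:
data = list(range(1250))
fake = lambda limit, offset: data[offset:offset + limit]
rows = fetch_all(fake, page_size=500)
print(len(rows))  # 1250
```

Each page is a short, fast transaction that releases its semaphore quickly, instead of one long-running export that occupies a semaphore for minutes.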

 

With these patterns, enterprise customers routinely achieve multi-thousand RPM sustained throughput. The key is treating throughput as an architecture + governance conversation with your platform team, not a licensing tier lookup.

 



4. Handling Throttling & Errors

 

When limits are exceeded, ServiceNow may respond with:

• HTTP 429 – Rate limit exceeded (or semaphore queue full)

• HTTP 202 – Request accepted but pending execution (queued)

 

Best Practices:

• Implement retry logic with exponential backoff

• Ensure third-party systems can handle 429 responses and honor the Retry-After header

• Avoid flooding the instance with simultaneous REST calls from a single user
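The first two practices can be sketched together: retry on 429, honoring Retry-After when the server sends it and falling back to exponential backoff with jitter otherwise. The response shape `(status, retry_after, body)` is a simplification for the example:

```python
import random
import time

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Delay before retry number `attempt` (0-based).

    Honors the server's Retry-After header when present; otherwise
    uses exponential backoff with full jitter.
    """
    if retry_after is not None:
        return float(retry_after)
    return random.uniform(0, min(cap, base * 2 ** attempt))

def call_with_retries(do_request, max_attempts=5):
    """do_request() returns (status, retry_after, body)."""
    for attempt in range(max_attempts):
        status, retry_after, body = do_request()
        if status != 429:
            return status, body
        time.sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError("gave up after repeated 429 responses")

# Usage with a fake endpoint that throttles twice, then succeeds:
responses = iter([(429, 0, None), (429, 0, None), (200, None, "ok")])
print(call_with_retries(lambda: next(responses)))  # (200, 'ok')
```

Full jitter (a random delay between 0 and the exponential cap) prevents many throttled clients from retrying in lockstep and re-saturating the semaphore queue.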

 


 

Viewing & Configuring Rate Limits

 

You can view or configure rate limits using the sys_rate_limit_rules table:

https://<INSTANCE>.service-now.com/sys_rate_limit_rules_list.do

 

Create custom rules based on your integration needs and licensing agreements. Rules can target specific users, roles, or all users — with user-level rules taking precedence over role-level, and role-level over all-users.

 


 

Inbound vs. Outbound API Quotas

 

Inbound Calls:

• No hard quotas, but subject to performance limits (semaphores, rate limit rules)

• Throttling may occur under high load or when user-level rate limits are exceeded

 

Outbound Calls:

• No quotas unless using IntegrationHub

• For IntegrationHub limits, refer to: IntegrationHub Licensing

 


 


Final Thoughts

Understanding and planning for ServiceNow REST API limits is essential for building scalable, resilient integrations. The key takeaway: rate limits are a governance mechanism applied per user/integration, and sustained throughput is achieved through architecture — dedicated integration users, tuned rules, async patterns, and concurrency-aware design. Always monitor usage, implement smart retry strategies, and consult your account manager for licensing-specific thresholds.

 

 

 

 

We hope this article has been useful. If it truly addressed your needs, please consider marking it as helpful. If not, we’d greatly appreciate your feedback so we can improve and better support our community. Feel free to reach out with any questions.

 

Thank you!

 

#servicenow #workflow #automation #rest #soap #api #limit #concept #bestpractice #outbound #inbound #call #integration #ratelimit #concurrency #semaphore #throughput

 


Kind regards,

Luis Estéfano
