Here's a technical breakdown of how we implemented a Centralized Outbound API Maintenance module in ServiceNow: the same module that finally ended those 2 AM manual retries and "who re-sent that payload?" conversations.
- Architecture Overview
At the core, the design revolves around three major components:
- Outbound API Log Table – captures every outbound call made from the platform.
- Retry & Recovery Engine – a scheduled job that re-processes failed calls based on retry policies.
- Health Check Monitor – a proactive job that halts retries when a downstream system is unavailable.
Together, these provide full visibility, self-healing retries, and audit-ready traceability.
- Outbound API Log Table
We created a custom table, u_outbound_api_log, that stores one record per outbound call.
Key Fields

| Field | Description |
| --- | --- |
| u_api_name | Integration point / API name. |
| u_http_method | GET/POST/PUT/PATCH. |
| u_endpoint_url | Target endpoint URL. |
| u_request_payload | The JSON/XML payload sent. |
| u_response_body | Response returned by the target system. |
| u_status_code | HTTP status code, for quick filtering. |
| u_result_state | Success, Failed, Retrying, Maxed Out, Skipped. |
| u_retry_count | Number of retry attempts so far. |
| u_next_retry_time | Timestamp of the next scheduled retry. |
| u_integration_owner | Reference to the team responsible for the integration. |
This table acts as a single pane of glass for all outbound API traffic — whether it’s triggered via Scripted REST APIs, IntegrationHub, or Flow Designer actions.
- Logging from Integrations
Instead of duplicating logging and retry plumbing in every Script Include, we centralized it into a utility Script Include:
var OutboundAPIUtils = Class.create();
OutboundAPIUtils.prototype = {
    initialize: function() {},

    // Write one record to the central log table for every outbound call.
    logOutboundCall: function(apiName, method, url, payload, response, status) {
        var log = new GlideRecord('u_outbound_api_log');
        log.initialize();
        log.u_api_name = apiName;
        log.u_http_method = method;
        log.u_endpoint_url = url;
        log.u_request_payload = JSON.stringify(payload);
        log.u_response_body = response ? response.getBody() : '';
        log.u_status_code = response ? response.getStatusCode() : '';
        log.u_result_state = (status == 'success') ? 'Success' : 'Failed';
        log.insert();
    },

    type: 'OutboundAPIUtils'
};
Every outbound integration simply calls:
new OutboundAPIUtils().logOutboundCall('CRM Incident Sync', 'POST', targetURL, requestBody, response, result);
This ensures every API transaction is captured consistently — with no developer guesswork.
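For context, here is a minimal sketch of what a calling integration might look like. The CRM endpoint, payload fields, and trigger context (current) are placeholders for illustration, not the actual integration:

// Hypothetical caller: sync an incident to an external CRM, then log the outcome.
var targetURL = 'https://crm.example.com/api/incidents'; // placeholder endpoint
var requestBody = {
    number: current.getValue('number'),
    short_description: current.getValue('short_description')
};

var result = 'success';
var response = null;
try {
    var rm = new sn_ws.RESTMessageV2();
    rm.setHttpMethod('POST');
    rm.setEndpoint(targetURL);
    rm.setRequestHeader('Content-Type', 'application/json');
    rm.setRequestBody(JSON.stringify(requestBody));
    response = rm.execute();
    if (response.getStatusCode() < 200 || response.getStatusCode() >= 300)
        result = 'failed';
} catch (ex) {
    result = 'failed'; // network/timeout errors are logged as failures too
}

// One line captures the whole transaction in the central log table.
new OutboundAPIUtils().logOutboundCall('CRM Incident Sync', 'POST', targetURL, requestBody, response, result);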
- Retry & Recovery Engine
Next, we built a Scheduled Script Execution (scheduled job) that runs every 30 minutes.
Logic Summary:
- Query u_outbound_api_log where
- u_result_state = Failed
- u_retry_count < Max_Retries
- u_next_retry_time <= now()
- For each record:
- Attempt to resend using the original payload.
- Increment u_retry_count.
- If successful, mark as Success.
- If not, reschedule next retry with exponential backoff (e.g., 15 → 30 → 60 minutes).
- If the max retry limit is reached, set to Maxed Out.
Code Snippet (simplified):
var MAX_RETRIES = 3;
var retry = new GlideRecord('u_outbound_api_log');
retry.addQuery('u_result_state', 'Failed');
retry.addQuery('u_retry_count', '<', MAX_RETRIES);
retry.addQuery('u_next_retry_time', '<=', new GlideDateTime()); // only records that are due
retry.query();

while (retry.next()) {
    try {
        // Rebuild the original request from the logged record.
        var request = new sn_ws.RESTMessageV2();
        request.setHttpMethod(retry.getValue('u_http_method'));
        request.setEndpoint(retry.getValue('u_endpoint_url'));
        request.setRequestBody(retry.getValue('u_request_payload'));

        var res = request.execute();
        if (res.getStatusCode() == 200) {
            retry.u_result_state = 'Success';
        } else {
            scheduleNextRetry(retry);
        }
    } catch (ex) {
        scheduleNextRetry(retry); // connection errors count as failed attempts too
    }
    retry.update();
}

// Increment the attempt counter, then either mark the record Maxed Out or
// schedule the next attempt with exponential backoff (15 -> 30 -> 60 minutes).
function scheduleNextRetry(rec) {
    var count = (parseInt(rec.getValue('u_retry_count'), 10) || 0) + 1;
    rec.u_retry_count = count;
    if (count >= MAX_RETRIES) {
        rec.u_result_state = 'Maxed Out';
        return;
    }
    rec.u_result_state = 'Retrying';
    var next = new GlideDateTime();
    next.addSeconds(15 * Math.pow(2, count - 1) * 60); // next attempt is in the future
    rec.u_next_retry_time = next;
}
This job single-handedly removed dozens of ad-hoc retry scripts across different modules.
- Health Check Monitor
To prevent “API storms,” a health check job runs every hour.
It pings each unique endpoint found in u_outbound_api_log using a lightweight GET request.
If an endpoint returns consistent failures or timeouts:
- The job updates a flag in a companion table u_integration_health_monitor.
- The retry engine then skips retry attempts for that endpoint until it’s marked healthy again.
Admins get an alert that says:
“Retries paused for CRM API – target system unreachable.”
Once the system responds successfully, retries resume automatically.
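Here's a minimal sketch of how that job might look, assuming the companion table holds one row per endpoint with illustrative field names u_endpoint and u_is_healthy (the real schema may differ):

// Ping each monitored endpoint with a lightweight GET and flag unhealthy ones.
var health = new GlideRecord('u_integration_health_monitor');
health.query();
while (health.next()) {
    var healthy = false;
    try {
        var ping = new sn_ws.RESTMessageV2();
        ping.setHttpMethod('GET');
        ping.setEndpoint(health.getValue('u_endpoint')); // illustrative field name
        ping.setHttpTimeout(10000); // fail fast: 10-second timeout
        var res = ping.execute();
        healthy = (res.getStatusCode() >= 200 && res.getStatusCode() < 300);
    } catch (ex) {
        healthy = false;
    }
    // Raise the admin alert only on the healthy -> unhealthy transition.
    if (!healthy && health.getValue('u_is_healthy') == '1') {
        gs.eventQueue('outbound.api.unhealthy', health, health.getValue('u_endpoint'), '');
    }
    health.u_is_healthy = healthy;
    health.update();
}

The retry engine checks this flag before re-sending and marks affected records as Skipped until the endpoint is healthy again.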
- Manual & Bulk Retry UI
We added two UI actions on the log table:
- Retry Now → reprocess a single failed record immediately.
- Bulk Retry → reprocess all failed calls for a specific API or time window.
Both use the same utility functions as the scheduler, ensuring consistency between manual and automated runs.
Each manual retry is logged in an Audit Table (u_api_retry_audit) with:
- Who triggered the retry.
- Timestamp.
- Result of the action.
That transparency has been a game-changer for governance and audit reviews.
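As a rough sketch, the Retry Now UI action could delegate to the shared utility and then write the audit record. The resendCall(record) helper and the audit field names below are illustrative, not the exact implementation:

// Server-side UI action script on the outbound API log table.
// Re-send one failed record via the shared utility (hypothetical helper),
// then record who did it, when, and what happened.
var succeeded = new OutboundAPIUtils().resendCall(current); // assumed to return true/false

var audit = new GlideRecord('u_api_retry_audit');
audit.initialize();
audit.u_triggered_by = gs.getUserID();              // who triggered the retry
audit.u_retry_time = new GlideDateTime();           // timestamp
audit.u_result = succeeded ? 'Success' : 'Failed';  // result of the action
audit.insert();

gs.addInfoMessage('Retry ' + (succeeded ? 'succeeded' : 'failed') + ' for ' + current.getValue('u_api_name'));
action.setRedirectURL(current);

Bulk Retry would loop the same helper over a filtered GlideRecord query, writing one audit row per record.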