Jakob Anker
ServiceNow Employee
01-27-2023 12:57 AM
The Council
In ServiceNow, we hold an architectural council review where the most complex issues are brought before an expert forum of Platform Architects. The goal is to uncover potential steps toward solving the presented situation.
Here I would like to share some topics from the sessions without disclosing any identifying information about the client.
Introduction
Yet again, we are dealing with one of the largest enterprises you can imagine. This particular enterprise manages a vast number of financial transactions and is responsible for preventing and detecting fraudulent behavior.
Our client is using Performance Analytics and would like to lean into its advanced features, as well as the good UX and visibility ServiceNow makes available.
To achieve this, we need to make the data of another large database available to ServiceNow.
Additionally, the client would like the ServiceNow solution to process the transactional data by comparing it against a benchmark data set, from which potentially malicious deviations can be determined.
Problem Statement
We need to determine the best way to deliver the transaction data to the Performance Analytics application of ServiceNow.
We want to keep a query time that doesn't negatively impact the end user's experience.
We expect as many as just below 100K queries/day in the worst-case scenario; averaged over a day, that is more than one query per second around the clock, with peaks far higher. This is a lot.
Considered Options
At a high level, there are two possible directions to serve the data:
- data-on-demand
- data-synchronization
Both would entail database sharding: partitioning the table in ServiceNow by rotation or extension.
The log table of ServiceNow uses rotation, where records move through seven (IIRC) data tables before being archived. Without it, ServiceNow would quickly turn into 90% logs.
That storage risk remains with the extension method, where each partition of the data is kept in a new table. Extension offers a superior audit trail (a plus if one was not available from the source database), though it is potentially a lot heavier storage-wise.
The data partitioning would also enable faster reporting, allowing reports to be built on specific date ranges instead of considering the entire mass of transactional data.
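To make that reporting benefit concrete, here is a minimal sketch (TypeScript; the custom table u_transaction and the credentials are hypothetical, the endpoint is the standard Table API, and Node 18+'s built-in fetch is assumed) of a query restricted to one date range rather than the full transaction mass:
```typescript
// Sketch: query a single date range via the ServiceNow Table API instead of
// scanning the full transaction mass. The table "u_transaction" is hypothetical.
const INSTANCE = "https://example.service-now.com"; // hypothetical instance
const AUTH = "Basic " + Buffer.from("user:password").toString("base64");

async function fetchTransactions(from: string, to: string): Promise<unknown[]> {
  // The encoded query restricts the result to one date range (one partition).
  const query = encodeURIComponent(
    `sys_created_on>=${from}^sys_created_on<=${to}`
  );
  const url =
    `${INSTANCE}/api/now/table/u_transaction` +
    `?sysparm_query=${query}&sysparm_limit=1000`;

  const res = await fetch(url, {
    headers: { Authorization: AUTH, Accept: "application/json" },
  });
  if (!res.ok) throw new Error(`Table API returned ${res.status}`);
  return ((await res.json()) as { result: unknown[] }).result;
}

// Usage: report over January 2023 only, not the entire data set.
fetchTransactions("2023-01-01", "2023-01-31").then((rows) =>
  console.log(`Fetched ${rows.length} rows for the report window`)
);
```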
Data-on-Demand
Two flavors can be found here:
- Remote tables
- REST queries
In essence, both methods provide the data in the same way: on demand.
The data obtained via a remote table might be reused since it can be cached; realistically, though, each query is for a unique context, which removes this as an option.
Additionally, the data in the remote table might be overridden over and over by the many queries, de facto making the data unavailable for meaningful advanced reporting and machine learning.
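To illustrate why caching fails here, a small sketch (illustrative TypeScript only, not a ServiceNow API) of a TTL cache in front of an on-demand lookup; when every query carries a unique context key, nothing is ever reused:
```typescript
// Sketch: a naive TTL cache in front of an on-demand remote lookup.
// Illustrative only; not a ServiceNow API.
type Entry = { value: unknown; expiresAt: number };

const cache = new Map<string, Entry>();
const TTL_MS = 60_000;

async function cachedLookup(
  contextKey: string,
  fetchRemote: (key: string) => Promise<unknown>
): Promise<unknown> {
  const hit = cache.get(contextKey);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // reuse

  const value = await fetchRemote(contextKey);
  cache.set(contextKey, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}

// If every query has a unique context (e.g. one key per transaction id),
// the reuse branch never fires and the cache is dead weight.
```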
REST queries that post into a data table in ServiceNow would be the other option. By doing this, we build a repository containing a subset of the transactional data available in the source database.
This enables us to retain a data set that is meaningful for Performance Analytics and Machine Learning.
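As a sketch of this flavor (table and field names are hypothetical; the endpoint is the standard Table API), a source-side process could post selected transactions into a custom table like this:
```typescript
// Sketch: push a subset of source transactions into a ServiceNow table.
// The table "u_transaction" and its fields are hypothetical.
const INSTANCE = "https://example.service-now.com";
const AUTH = "Basic " + Buffer.from("user:password").toString("base64");

interface SourceTransaction {
  externalId: string;
  amount: number;
  postedAt: string; // ISO timestamp from the source database
}

async function pushTransaction(tx: SourceTransaction): Promise<void> {
  const res = await fetch(`${INSTANCE}/api/now/table/u_transaction`, {
    method: "POST",
    headers: {
      Authorization: AUTH,
      "Content-Type": "application/json",
      Accept: "application/json",
    },
    // Field names are illustrative; map them to the actual schema.
    body: JSON.stringify({
      u_external_id: tx.externalId,
      u_amount: tx.amount,
      u_posted_at: tx.postedAt,
    }),
  });
  if (!res.ok) throw new Error(`Insert failed: ${res.status}`);
}

// Usage: forward one source transaction into the ServiceNow repository.
pushTransaction({
  externalId: "TX-1001",
  amount: 249.5,
  postedAt: "2023-01-27T08:57:00Z",
}).catch(console.error);
```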
The drawbacks here are:
- A subset may be a constraint for identifying fraudulent patterns in the transactional data.
- We are still looking at queries at a mass scale that could impact the performance of ServiceNow and its UX - even before considering the potential impact of Performance Analytics and Machine Learning processing the large data sets.
Data-Synchronization
With data synchronization, we build on some of the same principles as on-demand REST queries, minus the on-demand part.
Instead of querying on demand, we could consider maintaining a complete mirror image of the transactional data set.
To reduce the performance impact, it would be wise to consider utilizing delta data sets (e.g., only sending new/updated data) and merging the incoming data with the information already synced to ServiceNow.
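A hedged sketch of such a delta run follows (the staging table u_imp_transaction and the queryChangedSince helper are hypothetical; the merge would rely on a transform map that coalesces on an external id, so updates overwrite rather than duplicate):
```typescript
// Sketch: delta synchronization from the source database into ServiceNow
// via the Import Set API. Staging table "u_imp_transaction" is hypothetical;
// its transform map is assumed to coalesce on u_external_id (the merge step).
const INSTANCE = "https://example.service-now.com";
const AUTH = "Basic " + Buffer.from("user:password").toString("base64");

// Hypothetical source-side helper: rows changed since the watermark.
declare function queryChangedSince(
  watermark: string
): Promise<Array<{ externalId: string; amount: number; updatedAt: string }>>;

async function syncDelta(lastWatermark: string): Promise<string> {
  const rows = await queryChangedSince(lastWatermark);
  let newWatermark = lastWatermark;

  for (const row of rows) {
    const res = await fetch(`${INSTANCE}/api/now/import/u_imp_transaction`, {
      method: "POST",
      headers: {
        Authorization: AUTH,
        "Content-Type": "application/json",
        Accept: "application/json",
      },
      body: JSON.stringify({
        u_external_id: row.externalId,
        u_amount: row.amount,
        u_updated_at: row.updatedAt,
      }),
    });
    if (!res.ok) throw new Error(`Import failed: ${res.status}`);
    // ISO timestamps compare lexicographically, so a string max works here.
    if (row.updatedAt > newWatermark) newWatermark = row.updatedAt;
  }
  return newWatermark; // persist for the next delta run
}
```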
The benefits of this method would be:
- The entire data set is available for Performance Analytics and Machine Learning (range depending on rotation or extension architecture).
- The number of queries per day could potentially be decreased to a much smaller number; instead of being driven on demand by ServiceNow, the integration would be initiated by the source database as soon as new/updated data is available.
I am leaning toward the last option here, as that would provide optimal performance from my point of view.
A scheduled job could fetch a delta a couple of times a day, though that might not be feasible if end users depend on real-time information.
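For completeness, a sketch of that scheduled variant (running outside ServiceNow; inside the platform this would be a Scheduled Script Execution, and syncDelta here stands in for the hypothetical delta routine sketched above):
```typescript
// Sketch: run a delta sync on a fixed schedule (here, every 8 hours).
// syncDelta is the hypothetical delta routine from the previous sketch.
declare function syncDelta(lastWatermark: string): Promise<string>;

const EIGHT_HOURS_MS = 8 * 60 * 60 * 1000;
let watermark = "1970-01-01T00:00:00Z"; // persist this in real usage

async function tick(): Promise<void> {
  try {
    watermark = await syncDelta(watermark);
    console.log(`Delta sync done, watermark now ${watermark}`);
  } catch (err) {
    console.error("Delta sync failed, will retry next cycle", err);
  }
}

// A couple of runs per day; fine unless end users need real-time data.
setInterval(tick, EIGHT_HOURS_MS);
tick(); // also run once at startup
```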