Integration Hub synchronous REST - question about a large number of slow messages.

Tom Sienkiewicz · ‎09-16-2022

Hi All,

We have a setup where a Scheduled Job triggers a couple of thousand synchronous payloads once per month.

Those payloads are sent to an external system using Subflow/Integration Hub REST step. They go via a middleware into the target system, then a response comes back and is logged in SN.

The problem is that after running an initial e2e handshake test, we found that each payload takes ca. 15 seconds to process (the delay is in the middleware and unfortunately cannot be changed).

I am going to run some stress tests next, but I would also like to hear from anyone who has had a similar scenario. Is this going to cause timeouts in SN? Is there a limit on how many connections Integration Hub can keep open at a time?

What would be the best way to mitigate this? I can see the below options:

1. Switch to asynchronous REST (which I guess means not using Integration Hub, as I think the REST step can only do sync REST). But then we lose the responses and any error handling.
2. Switch to async as above, and expose an additional endpoint to receive the async responses (this allows for error handling but requires a total architecture change that also involves the middleware and the target).
3. Change the logic so that the payloads are not sent once per month but in "real time", as soon as the relevant records are created. This would require a business logic change, which is also a tough one.

Is there any other potential workaround we can use which will not involve too much extra work in the middleware or changing business requirements?

Thanks!

Tony Chatfield1 · ‎09-16-2022

Hi, as a default I would always look at using async messages via the ECC queue, and I would always consider real-time processing to avoid the prolonged/continuous resource consumption incurred by bulk processing.
If the target platform provides poor response times, then I would think both of these requirements would be critical to a reliable solution.
Context is important for any delay/performance issue: your ServiceNow instance has finite resources, so if resources are tied up by a poorly performing integration, they are not available for other functionality, and this may result in other delays within the platform.

Tom Sienkiewicz · ‎09-21-2022

@Tony Chatfield thanks again. In the meantime I ran some stress tests and have some interesting findings. Perhaps you can give me your opinion on this.

Let me illustrate the setup first.

1. A Scheduled Job creates a big array and then triggers a Flow Designer Subflow for each element of the array.

2. The Subflow uses the REST step to hit the middleware; once the response is received, the Subflow logs it.

I have tested with 500 payloads so far. The processing took 22 minutes. When I checked the Outbound HTTP logs, I saw that ca. 3-4 payloads were sent every 10-15 seconds (that's more or less the middleware processing duration). I checked stats.do, and the API_INT semaphores did not look overloaded (although they were running at 4 all the time, the max queue depth was not reached).
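
As a rough cross-check of those numbers, Little's law (average requests in flight = throughput × latency) gives an estimate of how many payloads were effectively being processed in parallel. The sketch below uses only the figures reported above (500 payloads, 22 minutes, 10-15 seconds per payload); the 2000-payload projection is an assumed stand-in for "a couple of thousand", and the function names are made up for illustration.

```python
def effective_concurrency(payloads, total_seconds, latency_seconds):
    """Little's law: average requests in flight = throughput * latency."""
    throughput = payloads / total_seconds  # payloads completed per second
    return throughput * latency_seconds


def projected_minutes(payloads, observed_payloads, observed_seconds):
    """Scale the observed run linearly to a different payload count."""
    seconds_per_payload = observed_seconds / observed_payloads
    return payloads * seconds_per_payload / 60


observed_payloads = 500
observed_seconds = 22 * 60  # the 22-minute stress test

# The per-payload latency was only observed as a range, so try both ends.
for latency in (10, 15):
    in_flight = effective_concurrency(observed_payloads, observed_seconds, latency)
    print(f"latency {latency}s -> about {in_flight:.1f} payloads in flight")

# Assumption: the monthly run is about 2000 payloads and scales linearly.
print(f"2000 payloads -> about {projected_minutes(2000, 500, 22 * 60):.0f} minutes")
```

This works out to roughly 3.8 to 5.7 payloads in flight, which is in the same range as the 3-4 payloads per batch seen in the Outbound HTTP logs, and to about 88 minutes for 2000 payloads if the rate stays the same.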

This makes me think that, due to the for loop in the scheduled job or some specifics of Integration Hub itself, it does not trigger all payloads at once. Instead it generates the first few Subflows/payloads, waits until they finish, then generates the next ones, and so on. I'm not sure exactly where to attribute this behaviour, but it seems good for the API semaphores...
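
The wave-like pattern described above is what a fixed-size worker pool would produce. The sketch below is only a toy model, not a description of how ServiceNow or Integration Hub actually schedules Subflows: it assumes 4 workers (borrowed from the API_INT semaphore reading) and a constant 15-second latency, and it simulates time rather than making any real calls.

```python
import heapq


def simulate(jobs, workers, latency):
    """Return (start, finish) times for each job under a fixed worker pool."""
    free_at = [0.0] * workers  # time at which each worker next becomes free
    heapq.heapify(free_at)
    schedule = []
    for _ in range(jobs):
        start = heapq.heappop(free_at)  # earliest available worker takes the job
        finish = start + latency
        heapq.heappush(free_at, finish)
        schedule.append((start, finish))
    return schedule


# Assumptions: 4 workers and a flat 15 s per payload.
schedule = simulate(jobs=500, workers=4, latency=15.0)

starts = [start for start, _ in schedule]
print(starts[:9])  # [0.0, 0.0, 0.0, 0.0, 15.0, 15.0, 15.0, 15.0, 30.0]
print(schedule[-1][1] / 60)  # 31.25 (minutes until the last job finishes)
```

Under these assumptions the run would take about 31 minutes. The observed 22 minutes would correspond to an average of roughly 10.6 seconds per payload at 4 workers, which sits inside the 10-15 second range seen in the logs.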

Any other place where I might check for potential issues? Many thanks in advance for any input!