S-NOW Performance degradation.

Satyaki Bose1 · ‎03-06-2014

We are seeing performance degradation in the app every day at a particular time.

The homepage does not load, and it shows a running time of more than 1000 secs.

After 30 mins, nodes start going offline. We restart the nodes manually to get the app back online.

Any idea where we should mainly focus?

--> We have been collecting the database transaction log, the database process list, and logs from the time of the outage, but no concrete cause has been found.

--> We have increased the semaphores from 8 to 16 per node.

--> We have changed the SQL worker threads for MS SQL.

I request everyone to share their views on this.

NeilH2 · ‎03-07-2014

Here are a few steps I would take to check performance issues.

1. Log a ticket with ServiceNow, if you haven't already done so, so they can check instance performance and set up monitors for the time you state.

2. Check for any scheduled jobs running at the time or scheduled to run around that time.

a. These would not necessarily show up in the transaction/event logs, and one piece of bad code could be to blame.

3. Check the server overview and ServiceNow Performance pages; these will give you some idea of what's happening with the database.

a. https://INSTANCE NAME.service-now.com/home.do?sysparm_view=server_overview&sysparm_cancelable=true
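As a rough illustration of step 2, the check for jobs running around the outage time can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the job names, run times, outage time, and the `jobs_near_outage` helper are all hypothetical, and the list stands in for whatever export of the scheduled jobs you have. It does not call any ServiceNow API.

```python
from datetime import datetime, timedelta

def jobs_near_outage(jobs, outage_start, window_minutes=30):
    """Return names of jobs whose run time is within the window around outage_start."""
    window = timedelta(minutes=window_minutes)
    return [
        name for name, run_at in jobs
        if abs(run_at - outage_start) <= window
    ]

# Hypothetical list of (job name, run time); not real instance data.
jobs = [
    ("nightly report", datetime(2014, 3, 6, 2, 0)),
    ("custom import", datetime(2014, 3, 6, 13, 50)),
    ("cleanup", datetime(2014, 3, 6, 14, 20)),
]

# Assumed outage start time, chosen only for illustration.
outage_start = datetime(2014, 3, 6, 14, 0)

suspects = jobs_near_outage(jobs, outage_start)
print(suspects)
```

Any job that lands in the window is only a candidate; it still has to be reviewed or disabled to confirm it is involved.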

Normally I would also say to check that it's not your connection, but since you stated your nodes are going down, it looks to be instance related.

Hope this helps.

Neil

Satyaki Bose1 · ‎03-07-2014

Hello Neil,

Yes, we do have a couple of scheduled jobs that keep running during the outage.

--> text index events process

This job is scheduled to run every 30 secs, but when I check its age, it shows around 1 hr or more.

We have tried to kill the job, but we cannot.

I have attached the performance data for one of the nodes.

What do you make of these?

I do see a spike in the database. Whenever we have had this slow performance in SNOW, we have seen spikes in the database.
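To make the comparison above concrete (a job scheduled every 30 secs whose age shows about an hour), here is a minimal Python sketch that flags jobs whose current age far exceeds their schedule interval. The `overdue_jobs` helper, the factor of 10, and the second job in the list are assumptions for illustration only; the values would have to be copied by hand from the scheduled jobs list.

```python
from datetime import timedelta

def overdue_jobs(jobs, factor=10):
    """Flag jobs whose current run age exceeds `factor` times their schedule interval."""
    return [
        name for name, interval, age in jobs
        if age > interval * factor
    ]

# (job name, schedule interval, current age). The first row uses the figures
# from this post; the second row is a hypothetical healthy job for contrast.
jobs = [
    ("text index events process", timedelta(seconds=30), timedelta(hours=1)),
    ("hypothetical healthy job", timedelta(minutes=5), timedelta(seconds=20)),
]

stuck = overdue_jobs(jobs)
print(stuck)
```

A job flagged this way is not necessarily the root cause; it may simply be blocked by whatever is causing the database spike.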

david_legrand · ‎03-10-2014

As Neil said, if the issue occurs at a specific time, an infinite loop (or something similar) in the scheduled jobs is probably the cause.

The first thing you could do is deactivate the scheduled jobs before they are due to run (around the time you experience the issue). This way you'll first identify the faulty script, and then you'll be able to find the root cause.

If the script is the root cause but is also critical, and you can't easily find the reason, you could try implementing a temporary script for the most critical part as a workaround.

And as Lawrence said, you could open an incident on HI support (even if the script is probably custom made).

Regards,

lawrence_eng · ‎03-07-2014

Hi SatyakiBose,

If you're noticing a performance issue, please open an incident via our online Technical Support system.

Thanks,

Lawrence

--

Online Community Program Manager, ServiceNow