Performance issues across entire system, all users

Katie A · 03-16-2016

We are having some very impactful performance issues in our instance. We are concerned because we are on the verge of populating our CMDB with multiple data sources -- but we aren't comfortable proceeding considering the system performance is already terrible. Most of the time pages are just very slow to load, but other times the page does not load at all and the browser crashes.

The issue affects ALL users in ALL areas of the system. The issue is affecting production as well as dev and test instances.

We are seeing client transaction times around 15 seconds, some as high as 20 seconds.

I opened a ticket with support but they gave us the run-around with insufficient answers, such as checking homepage refresh times. They were dismissive and tried to mark the incident as resolved without giving us any real solutions.

They checked out the health of our nodes and explained that everything seemed normal. We moved data centers about a month ago and we have requested moving back to the old data center to see if that might have been the cause of the issue.

Still, we are skeptical and extremely unhappy.

The issue cannot be homepage refresh times, since the same issue is impacting both Dev and Test, where only ONE user works on a daily basis. We are a small company and we only have about 40 users in the system. There is no way that homepage refresh times are causing such response issues.

I checked all of the performance graphs on CPU, JVM memory, SQL transactions, etc. I don't see anything obvious in those graphs to indicate a problem.

Attached is a screenshot of the client response times, which are very slow.

We are on Fuji Patch 10. The issue started about a month ago and has gotten worse in the past few weeks.

Has anyone experienced similar issues? We are not sure how to proceed considering the lack of real help from support.

Graham18 · 01-30-2017

Hi Jan,

We are also an on-premise instance and we are having major issues with application performance.

Is there anything you can recommend we check on either the Linux or the front-end side?

Any assistance will be greatly appreciated!

Thanks,

Graham

JC Moller · 01-30-2017

Hi,

Is the performance issue affecting all users/all transactions or have you managed to narrow down the issue?

If you have access to the localhost logs on the back-end, you could start by grepping for the 'Slow business rule' and 'EXCESSIVE' strings on the nodes and redirecting the output to a text file. Every business rule that executes for more than 100 ms is written to the localhost log. Every client transaction that executes for more than 5 sec has the EXCESSIVE string written to the log when the transaction finishes.

grep 'Slow business rule' localhost_log.2017-01-30.txt > /tmp/slowbr.txt

or count the occurrences with

grep 'Slow business rule' localhost_log.2017-01-30.txt | wc -l

grep 'EXCESSIVE' localhost_log.2017-01-30.txt
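
If the raw counts are hard to interpret, a per-hour breakdown can show whether the slow transactions cluster at certain times of day. Here is a rough sketch, assuming every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp. hits_per_hour is only an illustrative helper name, not something that ships with the product.

```shell
#!/bin/sh
# Count log lines containing a fixed string, grouped by hour.
# Assumption: each line begins with "YYYY-MM-DD HH:MM:SS", so the first
# 13 characters hold the date plus the hour.
# hits_per_hour is a hypothetical helper name used for illustration only.
hits_per_hour() {
    pattern=$1
    logfile=$2
    grep -F -- "$pattern" "$logfile" |
        awk '{ count[substr($0, 1, 13)]++ }
             END { for (hour in count) print hour, count[hour] }' |
        sort
}
```

Call it as hits_per_hour 'EXCESSIVE' localhost_log.2017-01-30.txt, or with 'Slow business rule' as the first argument. Each output line is the date, the hour, and the number of matching lines in that hour.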

Have you looked at the semaphores and session wait statistics in stats.do and on the Performance home page? Anything there?

- Jan

Graham18 · 01-30-2017

Hi Jan,

Thank you for your reply.

I have looked under "Slow business rule"; Metric Definition appears several times and is the most prolific.

I have also looked for EXCESSIVE, and most of them take well over 1600 ms to run, some even 10,000 ms.

We have looked at the stats.do page and home page but nothing really stands out, struggling to see anything obvious.

Are you able to share the specs of your App/DB server, i.e. CPU, RAM, etc.?

Thanks,

Graham

JC Moller · 01-30-2017

Hi,

Have you looked at the ERROR and WARNING rows in the localhost logs? Are the amounts normal?

Make sure no integration user ID is locked out. I have seen situations where a single locked-out MID user generated millions of failed logins and blocked normal users' transactions from executing properly. If that is happening, this is what you would expect to see in the logs:

17:05:08.696 Info	API_INT-thread-4	*** Script: Basic authentication failed for user: xxxxxxxx
17:05:08.700 Warning	API_INT-thread-4	WARNING * WARNING * Failed authorization by script include 'BasicAuth' for value 'aW50ZWdyYXRpb25feWxlX2Jpejo='
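
To see quickly whether one account is responsible for most of the failures, you could tally them per user. This is a minimal sketch that relies only on the "Basic authentication failed for user:" text shown above and assumes the user name is the last thing on the line. failed_logins_by_user is a made-up helper name.

```shell
#!/bin/sh
# Tally "Basic authentication failed" messages per user, highest count first.
# Assumption: the user name follows "user: " and runs to the end of the line,
# as in the sample log rows above.
# failed_logins_by_user is a hypothetical helper name.
failed_logins_by_user() {
    awk -F'Basic authentication failed for user: ' '
        NF > 1 { failures[$2]++ }
        END { for (user in failures) print failures[user], user }
    ' "$1" | sort -rn
}
```

Run it as failed_logins_by_user localhost_log.2017-01-30.txt. An integration or MID user with a count far above everyone else is the one to check for a lockout.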

Grab a 30 min snapshot from the localhost log and go through it line by line. Something may stand out.

Here is an example -> egrep '^2017-01-30 12:[0-2]' localhost_log.2017-01-30.txt > /tmp/30_min_snap.txt
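
If the window you want does not line up with a simple regex (for example 12:15 to 12:45), comparing the timestamp prefix as a string is one alternative. This is a sketch under the same assumption as the egrep above, namely that each line starts with "YYYY-MM-DD HH:MM". log_window is a hypothetical name. Lines without a leading timestamp, such as stack trace continuations, may be dropped or misfiled.

```shell
#!/bin/sh
# Print log lines whose leading "YYYY-MM-DD HH:MM" stamp falls in
# [from_ts, to_ts). Timestamps in this layout sort correctly as plain
# strings, so no date parsing is needed.
# Assumption: every line of interest starts with that 16-character stamp.
# log_window is a hypothetical helper name.
log_window() {
    awk -v from_ts="$1" -v to_ts="$2" '
        { stamp = substr($0, 1, 16) }
        stamp >= from_ts && stamp < to_ts
    ' "$3"
}
```

For example: log_window '2017-01-30 12:15' '2017-01-30 12:45' localhost_log.2017-01-30.txt > /tmp/30_min_snap.txt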

- Jan

JC Moller · 01-30-2017

Hi,

Grep for the "glide.memory.watcher" value in the localhost log and you will get a list of the application node's memory usage every 5 minutes.

Are you seeing some high memory usage all the time or from time to time?

Timestamp	Level	Thread name	Message
20:05:39.018	Info	glide.memory.watcher	Currently using 29% of max memory, last minute's minimum usage was 27% of max memory
20:10:39.873	Info	glide.memory.watcher	Currently using 29% of max memory, last minute's minimum usage was 24% of max memory
20:15:39.910	Info	glide.memory.watcher	Currently using 30% of max memory, last minute's minimum usage was 25% of max memory
20:20:39.941	Info	glide.memory.watcher	Currently using 30% of max memory, last minute's minimum usage was 26% of max memory
20:25:39.972	Info	glide.memory.watcher	Currently using 30% of max memory, last minute's minimum usage was 27% of max memory
20:30:40.657	Info	glide.memory.watcher	Currently using 26% of max memory, last minute's minimum usage was 24% of max memory
20:35:40.722	Info	glide.memory.watcher	Currently using 27% of max memory, last minute's minimum usage was 25% of max memory
20:40:40.753	Info	glide.memory.watcher	Currently using 28% of max memory, last minute's minimum usage was 27% of max memory
20:45:40.794	Info	glide.memory.watcher	Currently using 31% of max memory, last minute's minimum usage was 28% of max memory
20:50:41.385	Info	glide.memory.watcher	Currently using 27% of max memory, last minute's minimum usage was 25% of max memory
20:55:41.416	Info	glide.memory.watcher	Currently using 30% of max memory, last minute's minimum usage was 26% of max memory
21:00:41.458	Info	glide.memory.watcher	Currently using 30% of max memory, last minute's minimum usage was 27% of max memory
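
Rather than reading every row, you could boil the watcher lines down to a sample count plus the lowest and highest "Currently using" value. This is a rough sketch that depends only on the message wording shown above. memory_summary is an illustrative name.

```shell
#!/bin/sh
# Summarise glide.memory.watcher rows: number of samples plus the minimum
# and maximum "Currently using N%" values.
# Assumption: the message text matches the sample rows above.
# memory_summary is a hypothetical helper name.
memory_summary() {
    grep -F 'glide.memory.watcher' "$1" |
        sed -n 's/.*Currently using \([0-9][0-9]*\)% of max memory.*/\1/p' |
        awk '{ v = $1 + 0 }                 # force numeric comparison
             NR == 1 { min = v; max = v }
             { if (v < min) min = v; if (v > max) max = v }
             END { if (NR > 0) print "samples=" NR, "min=" min "%", "max=" max "%" }'
}
```

Run it as memory_summary localhost_log.2017-01-30.txt. A maximum that sits close to 100% for long stretches would be worth a closer look. The sample rows above stay between 26% and 31%.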

- Jan