How to configure F5 load balancer health checks (on-premises installation)

JC Moller · 10-14-2014

Hi,

Our load balancer (F5 BIG-IP) performs health checks over an HTTP session every 5 seconds.

Each check seems to lock a session into Catalina's session management (example below) for 120 minutes (glide.ui.session_timeout = 120).

As a result, the JVM fills up with sessions that are not needed for inactivity monitoring, which causes performance issues.

What is the recommended way for the F5 to do the health check? Any ideas on how this is best done?

Regards,

Jan

2014-10-14 10:03:03 (560) http-14 SYSTEM Session created: 63F56634E5BF61005C791C90498D5EDF, timeout after 120 minutes of inactivity

2014-10-14 10:03:03 (591) http-14 63F56634E5BF61005C791C90498D5EDF Parameters -------------------------

2014-10-14 10:03:03 (601) http-14 63F56634E5BF61005C791C90498D5EDF *** Start #499,488, path: /navpage.do, user: guest

2014-10-14 10:03:03 (609) http-14 63F56634E5BF61005C791C90498D5EDF User agent with HTTP/1.1 and no encoding: null

2014-10-14 10:03:03 (655) http-14 63F56634E5BF61005C791C90498D5EDF [0:00:00.046] getRealForm

2014-10-14 10:03:03 (698) http-14 63F56634E5BF61005C791C90498D5EDF [0:00:00.040] slow evaluate for: getBannerText()

2014-10-14 10:03:03 (742) http-14 63F56634E5BF61005C791C90498D5EDF [0:00:00.044] slow evaluate for: getBannerSrc()

2014-10-14 10:03:03 (750) http-14 63F56634E5BF61005C791C90498D5EDF *** End #499,488, path: /navpage.do, user: guest, time: 0:00:00.182, render: 0:00:00.178, network: 0:00:00.006, chars: 0, SQL time: 101 (count: 14), business rule: 0 (count: 2)

2014-10-14 10:03:03 (750) http-14 63F56634E5BF61005C791C90498D5EDF User agent with HTTP/1.1 and no encoding: null

2014-10-14 10:03:03 (750) http-14 63F56634E5BF61005C791C90498D5EDF /navpage.do -- transaction time: 0:00:00.190, waited: 0:00:00.000, source: 10.133.18.232

Alex North · 11-24-2014

Hi Jan,

What performance issues are you seeing? Having multiple idle sessions in the JVM should not be causing you any headaches. They should just get tidied away after the timeout period has elapsed; it's not like they're doing anything to consume resources.

Alex

Staff Engineer, Service Now

JC Moller · 11-24-2014

The glide.ui.session_timeout parameter had for some reason been set to 480 minutes. As a result, Tomcat's session management was holding on to more than 30,000 guest user sessions generated by the LB's health checks at any given time. After setting the timeout to 1 hour and pointing the LB's health check at the stats.do pages of the SCN server and the app nodes, we now have about 500-600 sessions in session management per app node, which sounds more normal to me. The peak of 30,000 sessions was causing noticeable performance issues. We ran into this issue after upgrading to the Dublin version.

One question comes to mind: how frequently should you run a health check against the server's URL and the various app nodes via the load balancer? Once a second, every five seconds, or some other value? Are there any recommendations for this from ServiceNow's side?
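To make the arithmetic behind those numbers explicit, here is a rough Python sketch. It assumes that every health-check request creates a brand-new session (the probe sends no session cookie, as the log earlier in the thread suggests) and that a session is only released when the timeout expires. The function name and the `probes` parameter are made up for illustration; the real count also depends on how many monitors hit each node.

```python
def steady_state_sessions(timeout_minutes, interval_seconds, probes=1):
    """Approximate number of idle health-check sessions held per node.

    Assumption: each probe opens a new session that lives until
    glide.ui.session_timeout expires, so the node holds roughly
    (timeout / interval) sessions per probing source at any moment.
    """
    return probes * (timeout_minutes * 60) // interval_seconds


# 480-minute timeout, one probe every 5 seconds
print(steady_state_sessions(480, 5))   # 5760
# 120-minute timeout, same probe interval
print(steady_state_sessions(120, 5))   # 1440
# 60-minute timeout, one probe per minute
print(steady_state_sessions(60, 60))   # 60
```

Under these assumptions the session count scales linearly with both the timeout and the probe frequency, so lowering the timeout or polling less often both help.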

Regards,

Jan

Alex North · 11-24-2014

Hi Jan,

I believe the recommendations are being 'formally' documented and will be posted in the selfhost.service-now.com portal in the not too distant future.

In the meantime, I would have the F5 poll each node's stats.do page once a minute to determine whether to keep that node in the member pool.

Alex
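For anyone who wants to see the logic of that check outside the F5, here is a minimal Python sketch of the same idea: request each node's stats.do and keep the node only if it answers with HTTP 200. It is not the F5 monitor configuration itself. The node URLs in the usage note, the function names, and the 5-second request timeout are assumptions for illustration, and treating "HTTP 200" as healthy is a simplification.

```python
import urllib.error
import urllib.request

# Once a minute, as suggested above; a caller would sleep this long between rounds.
POLL_INTERVAL_SECONDS = 60


def node_is_healthy(base_url, timeout=5.0, fetch=urllib.request.urlopen):
    """Return True if <base_url>/stats.do answers with HTTP 200.

    `fetch` is injectable so the check can be exercised without a live node.
    """
    url = base_url.rstrip("/") + "/stats.do"
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def healthy_members(nodes, fetch=urllib.request.urlopen):
    """Return the subset of nodes that should stay in the member pool."""
    return [node for node in nodes if node_is_healthy(node, fetch=fetch)]
```

A caller could run `healthy_members(["http://appnode1:16001", "http://appnode2:16001"])` (hypothetical hosts and ports) every `POLL_INTERVAL_SECONDS` seconds and remove any node missing from the result.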