ServiceNow REST Message - Mimic Python WebScraping

Jordan Rose1
Kilo Expert

I am trying to scrape HTML from behind a password-protected site. I can do this through a REST message when the site requires no authentication, but the site I need uses an Apache token and has to be accessed via a form POST, and I am unable to get through that authentication gate with a REST message. I can work around this hurdle with the following Python script:

import mechanize
try:
    import cookielib  # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3
from bs4 import BeautifulSoup
import html2text

# Browser
br = mechanize.Browser()

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

br.addheaders = [('User-agent', 'Chrome')]

# The site we will navigate into, handling its session
br.open('https://www.acmecorp.com/login')

# View available forms
for f in br.forms():
    print(f)

# Select the second (index one) form (the first form is a search query box)
br.select_form(nr=1)

# User credentials
br.form['userName'] = 'test user'
br.form['password'] = '12345'

# Login
br.submit()

print(br.open('https://www.acmecorp.com/records.do').read())
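In raw HTTP terms, the script above seems to reduce to three requests that share one cookie jar: GET the login page, POST the form fields, then GET the protected page. Below is a minimal sketch of that flow using only the Python 3 standard library. It assumes the form posts back to the same /login URL (the real form action is not shown above) and that userName and password are the only fields that matter; any hidden inputs in the real form would need to be sent too. The helper names are made up for illustration.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "https://www.acmecorp.com/login"          # from the script above
RECORDS_URL = "https://www.acmecorp.com/records.do"   # from the script above


def build_opener_with_cookies():
    """Return an opener that stores and resends cookies, like br.set_cookiejar()."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", "Chrome")]
    return opener, jar


def build_login_request(user, password, action_url=LOGIN_URL):
    """Build the form-encoded POST that br.submit() would roughly send.

    action_url is an assumption: the form may post to a different URL.
    """
    body = urllib.parse.urlencode({"userName": user, "password": password})
    return urllib.request.Request(
        action_url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_records(user, password):
    """Log in and return the raw bytes of the protected page."""
    opener, _jar = build_opener_with_cookies()
    opener.open(LOGIN_URL).read()                            # 1. pick up the session cookie
    opener.open(build_login_request(user, password)).read()  # 2. submit the credentials
    return opener.open(RECORDS_URL).read()                   # 3. reuse the same cookies
```

Whatever replaces Python would have to do the same three things: keep the cookie from the first response, send a form-encoded body rather than JSON, and resend the cookie on the final request.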

My question is: can I duplicate this Python functionality somehow in JavaScript/REST? I'd like to avoid relying on Python script calls if possible.

Thanks

5 REPLIES

Yep, I did that myself before. The issue is the complex authentication on the site I need to access. Python allows me to POST the login form and then access another page once authenticated. Unfortunately, the site I am accessing does not have a REST/SOAP API available, so I'll probably just end up running a scheduled Python script on a server to parse the HTML and then write to the ServiceNow REST API.
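A rough sketch of what that scheduled script could look like, using only the Python 3 standard library. It assumes the scraped page lays its records out as plain table rows, which may not match the real page, and it writes through the ServiceNow Table API (POST to /api/now/table/<table>) with basic auth. The instance host, the table name u_scraped_record, and the field names in the usage note are hypothetical placeholders.

```python
import base64
import json
import urllib.request
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def parse_rows(html):
    """Return a list of rows, each a list of cell strings."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


def build_table_api_request(instance, table, record, user, password):
    """Build (but do not send) a Table API POST that would create one record."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        "https://{}/api/now/table/{}".format(instance, table),
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

Each parsed row would then be mapped to a record and sent, for example urllib.request.urlopen(build_table_api_request("myinstance.service-now.com", "u_scraped_record", {"u_name": row[0], "u_status": row[1]}, user, password)), with the whole script run from cron or another scheduler on the server.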