
Travis Toulson
Administrator

Implementing Camera Controls for CreatorCon C3


 

When we set out to build camera functionality for CreatorCon C3, our requirement seemed straightforward: take the user's picture for AI avatar generation. We knew we needed browser-based photo capture that worked within ServiceNow's Service Portal, but we had no idea what challenges awaited us in the diverse landscape of real-world devices and browsers.
 
What started as a simple "take a photo" feature evolved into a complex journey of discovering mobile device differences, permission handling edge cases, and conference environment realities. This article chronicles both our technical implementation and the unexpected challenges we encountered along the way, providing a realistic roadmap for ServiceNow developers venturing into camera integration.
 
Our resulting solution leveraged Angular Directives to create modular, reusable camera components, but the path to get there taught us valuable lessons about the gap between "works in testing" and "works for thousands of conference attendees with diverse devices."
 

Solution Overview

 

How data flows through the Camera parent widget and Angular Providers

 
At a high level, the solution involved just a few pieces:
 
  1. tcgCameraPicker: Angular Directive that enumerates available cameras and provides selection interface
  2. tcgCamera: Angular Directive that manages camera stream, video display, and photo capture
  3. Parent Widget: Coordinates the data and events between the two Angular Providers, the intake UI, and the server upload
Implementing the solution as a pair of Angular Directives kept the camera logic separate from the business logic while also making the camera UI elements reusable and easily portable. There were a lot of unknowns when we built the initial intake UI, so keeping things modular was extremely important. Plus, it's super fun to see <tcg-camera-picker></tcg-camera-picker> in my widget HTML.
 

Camera Directive Implementation

 

Camera Directive HTML and visual structure

 

The main camera directive handles stream management, video display, and photo capture:

/** 
 * Element directive that supports showing a selected Camera
 * Usage <tcg-camera id="c.cameraId" photo-requested="c.photoRequested" photo="c.photo"></tcg-camera>
 *  {String} id - The Device ID of the camera retrieved using navigator.mediaDevices.enumerateDevices() 
 *  {Any} photo-requested - A unique value that when changed will trigger taking a photo
 *  {String} photo - A Base 64 Data URL representing the photo taken
 */
function() {
    // Note the video feed can be updated using constraints such as: video.srcObject.getVideoTracks()[0].applyConstraints({ aspectRatio: 16/9 })
    let videoEl;
    let canvasEl;

    function selectCamera(cameraDeviceId, $scope) {     
        if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
            return navigator.mediaDevices.getUserMedia({
                video: { 
                    deviceId: cameraDeviceId,
                    width: 4096,
                    height: 2160,
                }
            })
            .then(function (stream) {
                if (videoEl.srcObject) {
                    videoEl.srcObject.getTracks().forEach((track) => {
                        track.stop();
                    });
                }

                videoEl.srcObject = stream;
                videoEl.play();
            })
            .catch(function (err) {
                $scope.$emit('CAMERA_ERROR', err);
            });
        }
    }
    
    function getPhoto() {
        // Get the intrinsic dimensions of the video stream
        const vidWidth = videoEl.videoWidth;
        const vidHeight = videoEl.videoHeight;
        // We will resize the image to a cropped square 
        // using the shorter dimension of the video stream
        const size = Math.min(vidWidth, vidHeight); 
        // Set canvas to the desired square size
        canvasEl.width = size;
        canvasEl.height = size;
        
        // Find the x and y coordinates
        // that will allow centering
        // the cropped region
        const sx = (vidWidth - size) / 2;
        const sy = (vidHeight - size) / 2;
        
        let ctx = canvasEl.getContext('2d');
        
        ctx.scale(-1,1); // Mirror image to reflect mirrored video
        
        // Select the region of the video stream
        // from the top left coordinate (sx, sy)
        // extending to the length and width = size
        // and draw that region to the canvas
        // starting at the top left coordinate (0, 0)
        // and drawing to the length and width = size
        ctx.drawImage(videoEl, sx, sy, size, size, 0, 0, size * -1, size);
        
        // Return a data url of the cropped image
        return canvasEl.toDataURL('image/jpeg');
    }

    function controller($scope, $element, $attrs) {
        videoEl = $element.find('video')[0];
        canvasEl = document.createElement('canvas');

        $scope.$watch('id', function(newValue, oldValue) {
            selectCamera(newValue, $scope);
        });
        
        $scope.$watch('photoRequested', function(newValue, oldValue) {
            if (newValue != oldValue) {
                $scope.photo = getPhoto();
            }
        });
    }

    return {
        restrict: 'E',
        scope: {
            id: '=',
            photoRequested: '=',
            photo: '='
        },
        controller: controller,
        template: `
        <div class="cameraOverlay"></div>
        <video playsinline></video>
        `
    };
}

 

In case you are unfamiliar with Angular Directives, pay close attention to lines 86-98 of the listing above, the object returned at the end of the function. This object defines the resulting directive:
 
  • restrict: The value 'E' indicates that this directive will create a custom HTML Element that can be used in AngularJS templates like Widget HTML. The name of the directive tcgCamera dictates that the element will be <tcg-camera>.
  • scope: This is the isolated scope of the directive which contains the properties that are passed to the controller. In short, this object defines the HTML attributes on the element and how data is bound from the attribute to the controller.
  • controller: This defines the controller function which is very similar to the Client controller script of Service Portal Widgets. This function is executed when the custom element is added to the page, so all the core logic is here.
  • template: This is the HTML template that replaces the custom element tag when it is rendered out to the web page. Think of it as similar to using ng-include.
 
Let's dig into the core of this implementation:
 

Scope

 

The directive uses two-way binding (id: "=") to communicate a few different attributes:
 
  1. id: The selected camera id to display in the video. The parent widget communicates the selected camera to this directive via this attribute.
  2. photoRequested: The parent widget communicates to the directive that it should take a picture by changing this attribute. Totally stole this from the folks that implemented our Photobooth Next Experience component.
  3. photo: This attribute contains the Base64 encoding of the most recent image taken. The directive writes this value to communicate the image to the parent widget.

 

HTML Template

 

The HTML template of the directive contains two main elements:
 
  1. Video: The video tag, seen on line 96 of the listing, is a native HTML element that shows the live camera feed so users can see themselves as they line up the picture. Webcams provide a continuous video stream from which we can capture still frames.
  2. Camera Overlay: Seen on line 95, the camera overlay is a CSS-controlled, semi-transparent oval overlay that guides the user in lining up their face.

 

Controller Function

 

The controller function on lines 71-84 is executed when the <tcg-camera> element is rendered in a widget. This function sets up the core behavior:
 
  1. Gets a reference to the video element so we can manipulate it (line 72)
  2. Creates a hidden canvas element (never added to the DOM) so we can capture and manipulate still images from the camera's video stream (line 73)
  3. Sets up a watcher to update the video when the <tcg-camera> tag's id attribute changes (lines 75-77)
  4. Sets up a watcher to capture a still image and update the photo attribute to the Base64 encoded image when the <tcg-camera> tag's photoRequested attribute changes (lines 79-83)

 

selectCamera

 

Behavior flow on initialization or when a user selects a new camera

 

The selectCamera function uses navigator.mediaDevices.getUserMedia to open a camera device by id and play it in the video tag. Once the new stream is available, it stops every track on the stream currently attached to the video element before swapping in the new one. This prevents resource leaks and keeps a single camera stream attached to the video element. If getUserMedia rejects, the function emits a CAMERA_ERROR event on the directive's scope.
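
The constraints object passed to getUserMedia is worth a closer look. The sketch below is a hypothetical helper, buildVideoConstraints, that is not part of the original directive. It assumes you want the request to fail loudly when the chosen camera is unavailable, so it uses the standard exact keyword for deviceId and the ideal keyword for resolution. The directive in this article passes bare values instead, which the browser treats as preferences.

```javascript
// Hypothetical helper (not in the original directive): builds the
// constraints object handed to navigator.mediaDevices.getUserMedia().
function buildVideoConstraints(cameraDeviceId) {
    const video = {
        // "ideal" values are preferences: the browser picks the closest match
        width: { ideal: 4096 },
        height: { ideal: 2160 }
    };

    if (cameraDeviceId) {
        // "exact" makes the request reject with an OverconstrainedError
        // instead of silently falling back to a different camera
        video.deviceId = { exact: cameraDeviceId };
    }

    return { video: video };
}
```

With no device id (for example on first load, before the picker has run), the helper leaves deviceId out entirely and the browser chooses its default camera.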

 

getPhoto

 

Behavior flow when a user clicks Take Photo

 

The getPhoto function uses the hidden canvas element to write a still image from the video tag. It also does a bit of image manipulation, including cropping and mirroring, which we will discuss separately. Lastly, it returns the canvas contents as a Base64-encoded JPEG data URL.
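
Because getPhoto returns a full data URL rather than raw Base64, whatever receives the photo usually needs to separate the MIME type from the payload. The following is a minimal sketch of that step. The helper name splitDataUrl is an assumption for illustration and does not appear in the C3 code.

```javascript
// Hypothetical helper: splits a Base64 data URL such as the one returned
// by canvas.toDataURL('image/jpeg') into its MIME type and payload.
function splitDataUrl(dataUrl) {
    const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl || '');

    // toDataURL() returns the string "data:," for a zero-sized canvas,
    // which happens if the video has no frames yet; treat that as no photo.
    if (!match) {
        return null;
    }

    return { mimeType: match[1], base64: match[2] };
}
```

Returning null for an unusable value gives the caller a single check to make before moving on to the review or upload step.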
 

Camera Picker Directive Implementation

 

Camera Picker directive's HTML and visual structure

 

The first time I fired up the camera directive in the browser, I ran into my first big problem... it chose my OBS virtual camera which was turned off. Oddly enough, the AI Image Generator couldn't recognize any faces in a camera that was turned off, so I needed a way for the user to pick the right camera if the default choice didn't work. Enter directive number 2: the camera picker.

 

function() {
    function enumerateCameras() {
        if (navigator.mediaDevices && navigator.mediaDevices.enumerateDevices) {
            return navigator.mediaDevices.enumerateDevices()
            .then((devices) => {
                return devices
                    .filter((device) => device.kind == 'videoinput')
                    .map((device) => {
                        return {id: device.deviceId, text: device.label};
                    });
            })
            .catch((err) => {
                console.error('Enumerate cameras error', err);
            });
        }
        else {
            return Promise.reject('Required mediaDevices API not supported');
        }
    }

    function controller($scope, $element, $attrs) {
        navigator.mediaDevices.getUserMedia({
            video: {
                width: 4096,
                height: 2160,
            },
        }).then(() => {
            enumerateCameras().then(function(cameras) {
                $scope.cameras = cameras;
                $scope.selectedCamera = cameras && cameras.length ? cameras[0].id : undefined; // list may be empty or undefined
                $scope.$apply();
            });
        });
    }

    return {
        restrict: 'E',
        scope: {
            selectedCamera: "="
        },
        controller: controller,
        template: `
        <div>
          <label for="cameraPickerSelect">Which camera would you like to use:</label>
        </div>
        <div>
          <select name="cameraPickerSelect" id="cameraPickerSelect" ng-model="selectedCamera">
            <option ng-repeat="camera in cameras" value="{{camera.id}}">{{camera.text}}</option>
          </select>
        </div>
        `
    };
}

 

Scope

 

The directive uses two-way binding (selectedCamera: "=") to communicate camera selection back to the parent widget, enabling real-time camera switching.
 

HTML Template

 

This template is slightly more involved because it is effectively a form field. It includes a label, a select box, and the options for the select box. The select box is populated with the $scope.cameras property and the ng-model is bound to the selectedCamera attribute.
 

Controller

 

The controller function seen on lines 21-34 calls navigator.mediaDevices.getUserMedia to request access to camera devices. The object passed to getUserMedia defines the constraints for the desired device, in this case a high-resolution video device (4096x2160). A camera does not need to support that resolution to satisfy the request: bare values like these are treated as ideal targets, so the browser returns the closest resolution the camera can provide. A lower value in this constraint object can result in lower-resolution streams from the same cameras.
 
Once the user grants permission to access the cameras, the enumerateCameras function is called and the resulting array of camera objects is used to update the select box options via the cameras property. Additionally, the selectedCamera attribute is set to the first camera in the list as a default.
 

enumerateCameras

 

The enumerateCameras function calls navigator.mediaDevices.enumerateDevices() to build an array of the available cameras. The enumeration filters specifically for videoinput devices, excluding audio inputs and other media devices. Each camera is mapped to a simple object with id and text properties for clean data binding.
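
One detail to watch: browsers return an empty label for devices until the user has granted camera permission, which would leave blank options in the select box. The sketch below isolates the filter-and-map step into a hypothetical pure function, toCameraOptions, and adds a fallback label. The "Camera N" wording is an assumption for illustration, not something the C3 picker did.

```javascript
// Hypothetical variant of the mapping inside enumerateCameras().
// Takes the array resolved by navigator.mediaDevices.enumerateDevices().
function toCameraOptions(devices) {
    return (devices || [])
        // Keep cameras only; drop microphones and audio outputs
        .filter((device) => device.kind === 'videoinput')
        .map((device, index) => ({
            id: device.deviceId,
            // Labels are empty strings until permission is granted
            text: device.label || 'Camera ' + (index + 1)
        }));
}
```

Because the function never returns undefined, the caller can safely check the length of the result before picking a default camera.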
 

Parent Widget Implementation

 

The parent widget coordinates between the camera and camera picker directives:
 

HTML Template

<div class="c3CameraWrapper">
    <p class="notifText">
      Please fill the entire oval with your face!
    </p>
    <div class="c3Camera">
      <tcg-camera id="c.selectedCamera" photo-requested="c.photoRequested" photo="c.photo"> 
      </tcg-camera>
      <tcg-camera-picker selected-camera="c.selectedCamera">
      </tcg-camera-picker>  
    </div>
</div>
<div class="cctcgActionContainer">
    <button class="cctcgButton" ng-click="c.currentStage.previous()">
      Previous
    </button>
    <button class="cctcgButtonPrimary" ng-click="c.currentStage.next()">
      Take Photo
    </button>
  </div>

 

The widget's relevant HTML contains the tcg-camera element, the tcg-camera-picker element, and a button that handles the Take Photo logic via c.currentStage.next().
 

Client Controller

 

function takePhoto() {
  c.photoRequested = Date.now();
  goToStage(c.Stage.REVIEW_PHOTO);
}

// Controls which camera is visible in the video
c.selectedCamera = undefined;

// Changing this value triggers the camera to take a picture
c.photoRequested = undefined;

// When the camera takes a photo, it creates
// a data URL with a Base64 encoding of the image file
// stored in this variable
c.photo = undefined;

 

The relevant parts of the Client Controller are simple since most of the logic is contained within the directives themselves. The Client Controller acts mostly as a coordinator, setting up some shared data properties: selectedCamera, photoRequested, and photo. Then it exposes a simple takePhoto function that sets the photoRequested to the current date and time to trigger a new photo being taken by the camera directive and then navigates to the next UI state.
 
The core solution provides:
 
  • Clean separation: Each directive has a single responsibility
  • Reactive updates: Camera switching happens automatically via scope watching
  • Declarative capture: Photo taking uses data binding rather than imperative calls
  • Error propagation: Camera errors bubble up to widget state management

 

Problems and Solutions

 

During the build, the AI integration, testing, and even at the conference itself, we encountered several issues that required some creative problem solving.
 

Logos, Chests, and other impropriety

 

We discovered early on that Replicate AI and our selected prompt had some issues... some "lawsuits and not exactly safe for work type" issues. Despite our best efforts at manipulating the AI prompt, we couldn't quite get rid of them:
 
  • Logos like Superman and Green Lantern appearing on people's chests
  • Shirtless and otherwise beach bodied avatars
  • Comic book... exaggerations of certain body parts
  • Facial recognition errors that resulted in no image at all
 
We tried countless technical solutions, but the best solution ended up being... CSS. Yeah, take that, 'CSS isn't a programming language'.
 
.cameraOverlay {
  position: absolute;
  width: 100%;
  height: 100%;
  background: radial-gradient(ellipse 50% 60% at 50% 50%, 
                                transparent 60%, rgba(0,0,0,0.40) 50px);
  z-index: 1;
}

 

This silly little bit of CSS added an oval overlay to the video stream, and users were encouraged to fill the frame with their face. That bit of psychological... encouragement turned out to be the best solution. Not only did it reduce the appearance of those troublesome chests in the images, but it also solved the issue of facial detection by making sure the face wasn't too large or too small to be seen. While we still had the occasional user who ignored the frame, it greatly cut down on the number of issues and reduced the burden on our manual moderation process to a more reasonable level.
 

Front Facing Camera Inversion

 

Another issue we encountered was that it turns out to be really hard to line up your face in a front facing camera frame. Everything is inverted horizontally from what you would expect. So, we did a bit of trickery with... CSS.
 
video {
  transform: scaleX(-1);
}

 

By mirroring the video tag, the directions were more intuitive as the user attempted to line up their face in the frame. The CSS transform only changes what the user sees, so we also mirrored the captured frame in the canvas element to keep the photo consistent with the preview. That was probably unnecessary and could have ended up being more confusing to the user. Fortunately, we didn't hear many complaints about that.

 

ctx.scale(-1,1); // Mirror the captured image so it matches the mirrored preview

 

Camera Access Errors

 

Samsung. Yes, other devices had issues accessing the camera, but none so many as Samsung phones. We still don't know why, but many users had issues granting permissions to access the camera on these devices and other older Android devices. Occasionally, mostly in testing, we also had issues with camera access due to another app having control of the camera or due to OS level permissions (looking at you, Mac). Sadly, there wasn't a great solution to these issues on the user's side. There were too many devices, too many potential reasons. So, we had to resort to ye ol' error page, which fortunately we baked into our UI state transitions:
 
$scope.$on('CAMERA_ERROR', (evt, err) => {
    console.log('Camera error: ', err);
    goToStage(c.Stage.CAMERA_ERROR);
    $scope.$apply();
});
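
The err object passed along with CAMERA_ERROR is the rejection from getUserMedia, and its name property hints at the cause. As a rough sketch of how an error page could tailor its message, the hypothetical helper below maps the standard error names to text. The function name and the message wording are assumptions for illustration, not what the C3 intake app displayed.

```javascript
// Hypothetical helper: turns a getUserMedia rejection into a message
// suitable for an error page. The wording is illustrative only.
function describeCameraError(err) {
    const name = err && err.name;

    switch (name) {
        case 'NotAllowedError':
            // The user or the platform denied camera permission
            return 'Camera permission was denied. Check the browser and OS privacy settings.';
        case 'NotFoundError':
            // No video input device satisfied the request
            return 'No camera was found on this device.';
        case 'NotReadableError':
            // The device exists but could not be read, e.g. it is in use
            return 'The camera is unavailable. Another app may be using it.';
        case 'OverconstrainedError':
            // The requested constraints could not be met
            return 'The selected camera cannot satisfy the requested settings.';
        default:
            return 'The camera could not be started.';
    }
}
```

A handler could store the result on the controller before switching to the error stage, so the page tells the user whether to check permissions or close another app.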

 

Once we were able to inform users, they made their way to the booth, where we improvised some photo kiosks using our laptops at Knowledge 25. This led to the creation of a new Kiosk mode in the intake app after Knowledge, but at the event itself we hilariously had to improvise a 'Kiosk' using user impersonation. Sometimes we just discover things in production.
 

Cropping the Image Square

 

So, one fun feature of the AI image model is that it tries really hard to imitate the source photo you give it. Really hard. So much so that it will attempt to imitate the aspect ratio of the source image. We needed the resulting avatars to be 1200x1200 squares. And since most webcams are most definitely not square, we had some work to do. Initially, we tried setting the device constraints to a square aspect ratio, but it often resulted in stretching and skewing the source video rather than cropping it.
 
Next, we resorted to CSS... because it seemed to be working for everything else.
 
video {
  max-height: 50vh;
  aspect-ratio: 1 / 1;
  object-fit: cover; 
  display: block;
}

 

This little block of CSS gave us a cropped square video tag from the end user's perspective. Unfortunately, the image sent to the server was still a rectangle. That pesky video stream was still the wrong aspect ratio under the hood and so our hidden canvas tag was receiving the rectangular aspect ratio as well.
 
And that's where I got my hands dirty with the canvas element in the getPhoto function.
 
// Get the intrinsic dimensions of the video stream
const vidWidth = videoEl.videoWidth;
const vidHeight = videoEl.videoHeight;

// We will resize the image to a cropped square 
// using the shorter dimension of the video stream
const size = Math.min(vidWidth, vidHeight); 

// Set canvas to the desired square size
canvasEl.width = size;
canvasEl.height = size;

// Find the x and y coordinates
// that will allow centering
// the cropped region
const sx = (vidWidth - size) / 2;
const sy = (vidHeight - size) / 2;
        
let ctx = canvasEl.getContext('2d');

ctx.scale(-1,1); // Mirror image to reflect mirrored video

// Select the region of the video stream
// from the top left coordinate (sx, sy)
// extending to the length and width = size
// and draw that region to the canvas
// starting at the top left coordinate (0, 0)
// and drawing to the length and width = size
ctx.drawImage(videoEl, sx, sy, size, size, 0, 0, size * -1, size);

 

With a bit of math, I could square the shortest side of the rectangle and then center the square in the middle of the image. Then I could use the canvas context to draw only the square region of the video frame and presto, we had a 1:1 aspect ratio from both the user perspective and in the resulting image.
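
To make the arithmetic concrete, here is the same centering math pulled out into a standalone function. The name centerSquareCrop is hypothetical and used only for this illustration. The calculation itself is the one getPhoto performs.

```javascript
// Hypothetical extraction of the crop math used in getPhoto().
// Returns the top-left corner (sx, sy) and side length of the largest
// centered square that fits inside a width x height video frame.
function centerSquareCrop(width, height) {
    const size = Math.min(width, height);

    return {
        sx: (width - size) / 2,  // horizontal offset; 0 for portrait frames
        sy: (height - size) / 2, // vertical offset; 0 for landscape frames
        size: size
    };
}

// A 1920x1080 landscape webcam frame crops to a 1080x1080 square
// starting 420 pixels in from the left edge.
console.log(centerSquareCrop(1920, 1080)); // { sx: 420, sy: 0, size: 1080 }
```

For a portrait phone stream such as 720x1280, the same function returns sx = 0 and sy = 280, so the crop trims the top and bottom instead of the sides.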
 

Conclusion

 

Implementing camera controls in Service Portal requires balancing web API capabilities with real-world device limitations. Our Angular Directive approach provided a clean, modular solution that handled the technical complexity while maintaining good user experience.
 
The key lessons from our journey through unexpected challenges and real-world deployment:
 
  1. Camera integration is more complex than it appears: what works in testing may not work in production
  2. Mobile device diversity creates challenges that can't be fully anticipated in controlled environments
  3. User experience psychology (like our CSS overlay) can solve technical problems more effectively than complex algorithms
  4. Pragmatic solutions (staff assistance, kiosk alternatives) may be more valuable than perfect technical implementations
  5. Error handling must be workflow-integrated from the beginning, not retrofitted after problems emerge
 
For ServiceNow developers building camera-enabled applications, this implementation pattern provides a proven foundation, but more importantly, it illustrates the iterative discovery process. The technical solution is only part of the story. Understanding the real-world challenges and having adaptable strategies makes the difference between a successful deployment and a failed one.