Barcode Extraction from PDFs & Images using ZXing on ServiceNow MID Server

karthik65 · 04-15-2026

In Part 1, "Extracting Text from PDF Attachments using Apache PDFBox using ServiceNow Midserver", we extracted text from PDF attachments using PDFBox. But many documents carry information encoded in barcodes, which text extraction misses entirely. Medical supply orders have UDI barcodes. Shipping manifests carry tracking barcodes. Warehouse receipts use QR codes. This data is invisible to text extractors.
This article adds barcode scanning to the MID Server pipeline using ZXing (“Zebra Crossing”), a widely used open-source barcode library. Combined with PDFBox for PDF-to-image rendering, it can detect and decode barcodes embedded in PDF documents or in standalone image attachments.
What You Will Build
• A MID Server Script Include that scans PDFs and images for barcodes
• Support for 12 barcode formats: QR Code, Code 128, Code 39, Code 93, EAN-13, EAN-8, UPC-A, UPC-E, Data Matrix, PDF 417, Aztec, and ITF
• Multi-barcode detection per page (GenericMultipleBarcodeReader)
• A combined extraction method that runs text + barcode scanning in a single MID Server call
• GTIN-to-record correlation using serial number prefix matching
Prerequisites
• Completed Part 1 (PDFBox installed, GlobalAttachmentHelper created)
• Or: MID Server with agent/extlib directory access
• Basic understanding of the ECC Queue and JavascriptProbe pattern from Part 1
Architecture
The barcode pipeline extends Part 1 by adding a rendering step between PDF loading and data extraction:
Stage | What Happens
PDF Text (Part 1) | PDFBox loads PDF → PDFTextStripper extracts text directly
PDF Barcode (Part 2) | PDFBox loads PDF → PDFRenderer renders page as image at 600 DPI → ZXing scans image for barcodes
Image Barcode (Part 2) | ImageIO reads PNG/JPG directly → ZXing scans for barcodes

Step 1: Install ZXing JAR Libraries
ZXing requires two JARs:
JAR File | Size | Purpose
core-3.5.3.jar | ~600 KB | Core barcode processing engine: format detection, decoding algorithms, hint system
javase-3.5.3.jar | ~38 KB | Java SE helpers: BufferedImageLuminanceSource for converting images to ZXing’s internal format
Download URLs
https://repo1.maven.org/maven2/com/google/zxing/core/3.5.3/core-3.5.3.jar
https://repo1.maven.org/maven2/com/google/zxing/javase/3.5.3/javase-3.5.3.jar
Install
1. Copy both JARs to your MID Server’s agent/extlib/ directory
2. Register each in the ecc_agent_jar table (same process as PDFBox in Part 1)
3. Restart the MID Server
⚠ Reminder: JARs MUST be registered in ecc_agent_jar or FileSync will delete them on restart.
If You Also Need PDFBox
To scan barcodes embedded in PDFs (not just standalone images), you also need PDFBox for rendering PDF pages as images. If you followed Part 1, pdfbox-app-2.0.31.jar is already installed. If starting fresh, install all three JARs.
Validate All Classes Load
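One way to check that the classes are visible is sketched below. It is a minimal sketch, not the article's original validation script: validateClasses and its loadClass hook are hypothetical names, and the default loader assumes the code runs in the MID Server's Rhino engine, where Packages is available and Class.forName can see the JARs in agent/extlib (this may depend on your MID Server's class loader setup).

```javascript
// Sketch: report which required classes can be loaded.
// loadClass is a hypothetical hook so the check can be exercised anywhere;
// the default is an assumption that only holds inside the MID Server (Rhino).
function validateClasses(classNames, loadClass) {
    loadClass = loadClass || function (name) {
        return Packages.java.lang.Class.forName(name);
    };
    var report = { loaded: [], missing: [] };
    for (var i = 0; i < classNames.length; i++) {
        try {
            loadClass(classNames[i]);
            report.loaded.push(classNames[i]);
        } catch (e) {
            // Any failure to load is treated as "missing"
            report.missing.push(classNames[i]);
        }
    }
    return report;
}

// Class names taken from the BarcodeScanner script in this article
var REQUIRED_CLASSES = [
    "com.google.zxing.MultiFormatReader",
    "com.google.zxing.client.j2se.BufferedImageLuminanceSource",
    "com.google.zxing.multi.GenericMultipleBarcodeReader",
    "org.apache.pdfbox.rendering.PDFRenderer"
];
```

On the MID Server you would call validateClasses(REQUIRED_CLASSES) and return JSON.stringify of the report from the probe script; a non-empty missing list points at a JAR that was not copied or not registered.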

Complete Script:
var BarcodeScanner = Class.create();
BarcodeScanner.prototype = {

initialize: function() {
this.ImageIO = Packages.javax.imageio.ImageIO;
this.MultiFormatReader = Packages.com.google.zxing.MultiFormatReader;
this.BinaryBitmap = Packages.com.google.zxing.BinaryBitmap;
this.HybridBinarizer = Packages.com.google.zxing.common.HybridBinarizer;
this.BufferedImageLuminanceSource =
Packages.com.google.zxing.client.j2se.BufferedImageLuminanceSource;
this.DecodeHintType = Packages.com.google.zxing.DecodeHintType;
this.BarcodeFormat = Packages.com.google.zxing.BarcodeFormat;
this.GenericMultipleBarcodeReader =
Packages.com.google.zxing.multi.GenericMultipleBarcodeReader;
this.PDDocument = Packages.org.apache.pdfbox.pdmodel.PDDocument;
this.PDFRenderer = Packages.org.apache.pdfbox.rendering.PDFRenderer;
},

/*
* Configure decode hints with all supported formats
*/
_getHints: function() {
var hints = new Packages.java.util.HashMap();
var formats = new Packages.java.util.ArrayList();
formats.add(this.BarcodeFormat.QR_CODE);
formats.add(this.BarcodeFormat.CODE_128);
formats.add(this.BarcodeFormat.CODE_39);
formats.add(this.BarcodeFormat.CODE_93);
formats.add(this.BarcodeFormat.EAN_13);
formats.add(this.BarcodeFormat.EAN_8);
formats.add(this.BarcodeFormat.UPC_A);
formats.add(this.BarcodeFormat.UPC_E);
formats.add(this.BarcodeFormat.DATA_MATRIX);
formats.add(this.BarcodeFormat.PDF_417);
formats.add(this.BarcodeFormat.AZTEC);
formats.add(this.BarcodeFormat.ITF);
hints.put(this.DecodeHintType.POSSIBLE_FORMATS, formats);
hints.put(this.DecodeHintType.TRY_HARDER,
Packages.java.lang.Boolean.TRUE);
return hints;
},

/*
* Detect ALL barcodes in a BufferedImage
*/
_decodeMultiple: function(bufferedImage) {
var source = new this.BufferedImageLuminanceSource(bufferedImage);
var bitmap = new this.BinaryBitmap(
new this.HybridBinarizer(source));
var reader = new this.MultiFormatReader();
var multi = new this.GenericMultipleBarcodeReader(reader);
var results = multi.decodeMultiple(bitmap, this._getHints());
var output = [];
for (var i = 0; i < results.length; i++) {
output.push({
text: "" + results[i].getText(),
format: "" + results[i].getBarcodeFormat()
});
}
return output;
},

Method 1: Scan Barcodes in a PDF
/*
* Scan all pages of a PDF for barcodes
* Handles: chunk decoding, gzip decompression, page rendering, barcode detection
*/
scanPDF: function(chunksJson, dpi) {
dpi = dpi || 600;
var response = { status: "success", pages: [] };
var document = null;
try {
// Decode chunks and decompress (same as Part 1)
var chunks = JSON.parse(chunksJson);
var decoder = Packages.java.util.Base64.getDecoder();
var baos = new Packages.java.io.ByteArrayOutputStream();
for (var i = 0; i < chunks.length; i++) {
var bytes = decoder.decode(chunks[i]);
baos.write(bytes, 0, bytes.length);
}
var allBytes = baos.toByteArray();
var isGzip = (allBytes.length > 2
&& (allBytes[0] & 0xFF) == 0x1F
&& (allBytes[1] & 0xFF) == 0x8B);
var pdfBytes;
if (isGzip) {
var gzis = new Packages.java.util.zip.GZIPInputStream(
new Packages.java.io.ByteArrayInputStream(allBytes));
var out = new Packages.java.io.ByteArrayOutputStream();
var buf = Packages.java.lang.reflect.Array.newInstance(
Packages.java.lang.Byte.TYPE, 4096);
var n;
while ((n = gzis.read(buf)) != -1) out.write(buf, 0, n);
gzis.close();
pdfBytes = out.toByteArray();
} else { pdfBytes = allBytes; }

// Load PDF and render each page as image
document = this.PDDocument.load(
new Packages.java.io.ByteArrayInputStream(pdfBytes));
var renderer = new this.PDFRenderer(document);
var pageCount = document.getNumberOfPages();

for (var p = 0; p < pageCount; p++) {
var pageResult = { page: p + 1, barcodes: [] };
try {
// Render page at specified DPI
var image = renderer.renderImageWithDPI(p, dpi);
pageResult.barcodes = this._decodeMultiple(image);
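} catch (pageErr) {
// ZXing throws NotFoundException when a page has no barcode;
// report any other failure (e.g. a rendering error) with its real message
var isNotFound = pageErr.javaException instanceof
Packages.com.google.zxing.NotFoundException;
pageResult.error = isNotFound
? "No barcode found"
: "" + (pageErr.message || pageErr);
}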
response.pages.push(pageResult);
}
} catch (e) {
response.status = "error";
response.error = "" + e.message;
} finally {
if (document != null) document.close();
}
return JSON.stringify(response);
},
Method 2: Scan Barcodes in an Image (PNG/JPG)
/*
* Scan a standalone image for barcodes
* No PDFBox needed - just ZXing + ImageIO
*/
scanImage: function(chunksJson) {
var response = { status: "success", barcodes: [] };
try {
var chunks = JSON.parse(chunksJson);
var decoder = Packages.java.util.Base64.getDecoder();
var baos = new Packages.java.io.ByteArrayOutputStream();
for (var i = 0; i < chunks.length; i++) {
var bytes = decoder.decode(chunks[i]);
baos.write(bytes, 0, bytes.length);
}
var allBytes = baos.toByteArray();

// Decompress if gzipped
var isGzip = (allBytes.length > 2
&& (allBytes[0] & 0xFF) == 0x1F
&& (allBytes[1] & 0xFF) == 0x8B);
var imgBytes;
if (isGzip) {
var gzis = new Packages.java.util.zip.GZIPInputStream(
new Packages.java.io.ByteArrayInputStream(allBytes));
var out = new Packages.java.io.ByteArrayOutputStream();
var buf = Packages.java.lang.reflect.Array.newInstance(
Packages.java.lang.Byte.TYPE, 4096);
var n;
while ((n = gzis.read(buf)) != -1) out.write(buf, 0, n);
gzis.close();
imgBytes = out.toByteArray();
} else { imgBytes = allBytes; }

var bais = new Packages.java.io.ByteArrayInputStream(imgBytes);
var image = this.ImageIO.read(bais);
if (image == null) throw new Error("Unable to read image");
response.barcodes = this._decodeMultiple(image);
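} catch (e) {
// NotFoundException only means the image has no readable barcode
if (e.javaException instanceof
Packages.com.google.zxing.NotFoundException) {
response.barcodes = [];
} else {
response.status = "error";
response.error = "" + (e.message || e);
}
}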
return JSON.stringify(response);
},

type: "BarcodeScanner"
};
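Both scanPDF and scanImage repeat the same gzip check on the first two bytes. The sketch below isolates that check as a standalone function so the logic is easier to follow; isGzip is a hypothetical helper name, not part of the original script. It relies only on the fact that gzip streams start with the bytes 0x1F 0x8B, and the & 0xFF mask makes it work whether the bytes are signed (Java byte arrays) or unsigned.

```javascript
// Returns true when the byte array starts with the gzip magic number 0x1F 0x8B.
// Masking with 0xFF normalizes signed Java bytes (-128..127) to 0..255.
function isGzip(bytes) {
    return bytes.length >= 2 &&
        (bytes[0] & 0xFF) === 0x1F &&
        (bytes[1] & 0xFF) === 0x8B;
}

// A PDF starts with "%PDF" (0x25 0x50 0x44 0x46), so it is not treated as gzip
var pdfHeader = [0x25, 0x50, 0x44, 0x46];
var gzipHeaderSigned = [31, -117, 8]; // 0x8B read as a signed byte is -117
```

If you adopt a helper like this inside BarcodeScanner, both methods could call it instead of duplicating the comparison.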

Step 4: Test Barcode Scanning
4A: Test with a PDF Containing Barcodes
// Get chunks for a PDF attachment
var helper = new global.GlobalAttachmentHelper();
var chunksJson = helper.getAttachmentChunksJson("PDF_ATTACHMENT_SYSID");

// Send barcode scan probe to MID Server
var script = 'var scanner = new BarcodeScanner();' +
'var result = scanner.scanPDF(probe.getParameter("chunks"), 600);' +
'result;';

var eccId = helper.submitMIDProbe("YOUR_MID_SERVER", "BarcodeScan",
script, JSON.stringify({ chunks: chunksJson }));
gs.info("ECC ID: " + eccId);

// Check result after 30 seconds:
var output = helper.getProbeResult("ECC_ID_HERE");
gs.info("Result: " + output);
Sample Output
{
"status": "success",
"pages": [
{
"page": 1,
"barcodes": [
{ "text": "62449010030203", "format": "CODE_128" },
{ "text": "64624619978346", "format": "CODE_128" }
]
}
]
}
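To work with this output on the instance side, it can help to flatten the per-page structure into a single list. The following is a minimal sketch that assumes the probe result has exactly the shape of the sample output; collectBarcodes is a hypothetical helper name.

```javascript
// Flatten a scanPDF result into one list of { page, text, format } entries.
// Returns an empty list when the scan reported an error.
function collectBarcodes(resultJson) {
    var result = JSON.parse(resultJson);
    var found = [];
    if (result.status !== "success") return found;
    var pages = result.pages || [];
    for (var p = 0; p < pages.length; p++) {
        var barcodes = pages[p].barcodes || [];
        for (var b = 0; b < barcodes.length; b++) {
            found.push({
                page: pages[p].page,
                text: barcodes[b].text,
                format: barcodes[b].format
            });
        }
    }
    return found;
}

// The sample output from this article
var sampleJson = JSON.stringify({
    status: "success",
    pages: [{
        page: 1,
        barcodes: [
            { text: "62449010030203", format: "CODE_128" },
            { text: "64624619978346", format: "CODE_128" }
        ]
    }]
});
var flat = collectBarcodes(sampleJson);
```

Pages that carry only an error field and no barcodes contribute nothing to the list.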
4B: Test with an Image (PNG/JPG)
var helper = new global.GlobalAttachmentHelper();
var chunksJson = helper.getAttachmentChunksJson("IMAGE_ATTACHMENT_SYSID");

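var script = 'var scanner = new BarcodeScanner();' +
'var result = scanner.scanImage(probe.getParameter("chunks"));' +
'result;';

// Submit the probe the same way as in 4A
var eccId = helper.submitMIDProbe("YOUR_MID_SERVER", "BarcodeScan",
script, JSON.stringify({ chunks: chunksJson }));
gs.info("ECC ID: " + eccId);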

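// GTIN correlation: for each parsed text record that has a serial number,
// look for a scanned GTIN that contains the serial number's first segment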
if (record.serial_number && gtinValues.length > 0) {
var serialPrefix = record.serial_number.split("-")[0];
for (var g = 0; g < gtinValues.length; g++) {
if (gtinValues[g].indexOf(serialPrefix) > -1) {
record.gtin_number = gtinValues[g];
gtinValues.splice(g, 1); // Remove matched GTIN
break;
}
}
}
The splice() call ensures each GTIN is only matched once. This prevents a single GTIN from being assigned to multiple records when serial prefixes overlap.
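The matching logic above can be sketched as a standalone function. This is a minimal sketch under stated assumptions: correlateGtins is a hypothetical name, records are plain objects with a serial_number string, and the second serial number in the example is invented for illustration (the first follows the '8346-101219' format mentioned in Troubleshooting). It works on a copy of the GTIN list so the caller's array is not modified, and it skips empty prefixes, since indexOf("") would match every GTIN.

```javascript
// Assign each record the first unused GTIN that contains the record's
// serial prefix (the segment before the first "-").
function correlateGtins(records, gtinValues) {
    var remaining = gtinValues.slice(); // copy, so the input stays intact
    for (var r = 0; r < records.length; r++) {
        var serial = records[r].serial_number;
        if (!serial) continue;
        var prefix = ("" + serial).split("-")[0];
        if (!prefix) continue; // an empty prefix would match everything
        for (var g = 0; g < remaining.length; g++) {
            if (remaining[g].indexOf(prefix) > -1) {
                records[r].gtin_number = remaining[g];
                remaining.splice(g, 1); // each GTIN is matched at most once
                break;
            }
        }
    }
    return records;
}

var matched = correlateGtins(
    [
        { serial_number: "8346-101219" },
        { serial_number: "0203-000001" }, // hypothetical serial for illustration
        { serial_number: "9999-000002" }  // hypothetical serial with no GTIN
    ],
    ["62449010030203", "64624619978346"]
);
```

Substring matching is a heuristic: a short prefix can appear inside an unrelated GTIN, so it is worth checking the results against known line items before relying on them.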

Step 7: Integrate into the Automated Pipeline
Update the Submit PDF Extraction Action
Modify the Flow Designer action to classify attachments and call the appropriate method:
// Inside the action script:
var fileType = "" + gr.file_type;
var dp = new YOUR_SCOPE.DocumentProcessor();

if (fileType === "pdf") {
// Combined text + barcode extraction
eccId = dp.submitFullExtraction("" + gr.sys_id);
} else if (fileType === "image") {
// Image barcode scan only
eccId = dp.submitImageBarcodeScan("" + gr.sys_id);
}
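The routing above depends on a reliable "pdf" or "image" classification. The sketch below shows one possible way to derive it; classifyAttachment is a hypothetical helper, and it assumes you have the attachment's file name and MIME content type available as strings (for example from the attachment record). Only PNG and JPG are treated as images, matching the formats this article covers.

```javascript
// Classify an attachment as "pdf", "image", or "unsupported"
// from its MIME content type, falling back to the file extension.
function classifyAttachment(fileName, contentType) {
    var name = ("" + (fileName || "")).toLowerCase();
    var type = ("" + (contentType || "")).toLowerCase();
    if (type === "application/pdf" || /\.pdf$/.test(name)) {
        return "pdf";
    }
    if (type === "image/png" || type === "image/jpeg" ||
            /\.(png|jpe?g)$/.test(name)) {
        return "image";
    }
    return "unsupported";
}
```

Returning "unsupported" explicitly lets the action skip or log other attachment types instead of silently leaving eccId unset.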
Update the Process PDF Result Action
Handle both PDF and image results:
if (fileType === "pdf") {
// Result has both text and barcodes
var barcodeData = result.barcodes || null;
var records = dp.parseExtractedText(result.text.fullText, barcodeData);
dp.insertRecords(records, emailSysId, fileName);
} else if (fileType === "image") {
// Result has barcodes only
if (result.barcodes && result.barcodes.length > 0) {
for (var b = 0; b < result.barcodes.length; b++) {
// Create record with GTIN from barcode
var gr = new GlideRecord("your_target_table");
gr.initialize();
gr.gtin_number = result.barcodes[b].text;
gr.source_file = fileName;
gr.insert();
}
}
}

Troubleshooting
Problem | Solution
No barcode found at 300 DPI | Increase to 600 DPI. Small or thin-bar barcodes need higher pixel resolution for ZXing to decode.
ClassNotFound: BufferedImageLuminanceSource | Missing javase-3.5.3.jar. This JAR provides the bridge between Java images and ZXing.
PDFRenderer is not a function | PDFBox JAR was deleted by FileSync. Re-register in ecc_agent_jar and restart MID.
NotFoundException from ZXing | No barcode detected in image. Normal for pages without barcodes. Wrapped in try/catch per page.
Only one barcode found per page | Using MultiFormatReader.decode() which returns first match. Use GenericMultipleBarcodeReader.decodeMultiple() instead.
Wrong barcode format detected | Restrict POSSIBLE_FORMATS hint to only the formats you expect. Fewer formats = faster and more accurate.
GTIN not matching to records | Check serial number format. Prefix matching assumes format like '8346-101219' where first segment is the lookup key.
Large PDF times out | Rendering at 600 DPI is memory-intensive. For 50+ page PDFs, process specific pages or reduce DPI to 400.
Image barcode works but PDF barcode fails | PDF rendering issue. Check that PDFBox can render the page: try text extraction first to verify the PDF loads.
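To see why 600 DPI rendering is memory-intensive, here is a rough back-of-the-envelope estimate. It is a sketch under stated assumptions: estimateRenderBytes is a hypothetical helper, the page is US Letter (612 × 792 points, where 1 point = 1/72 inch), and the rendered image is assumed to be held uncompressed at 4 bytes per pixel. Actual memory use on your MID Server may differ.

```javascript
// Estimate the pixel size and raw in-memory size of one rendered page.
// widthPt/heightPt are the page size in points (1/72 inch).
function estimateRenderBytes(widthPt, heightPt, dpi, bytesPerPixel) {
    bytesPerPixel = bytesPerPixel || 4; // assumption: 32-bit pixels
    var widthPx = Math.max(Math.floor(widthPt * dpi / 72), 1);
    var heightPx = Math.max(Math.floor(heightPt * dpi / 72), 1);
    return {
        widthPx: widthPx,
        heightPx: heightPx,
        bytes: widthPx * heightPx * bytesPerPixel
    };
}

var at300 = estimateRenderBytes(612, 792, 300); // US Letter at 300 DPI
var at600 = estimateRenderBytes(612, 792, 600); // US Letter at 600 DPI
```

Under these assumptions a Letter page is 5100 × 6600 pixels at 600 DPI, about 135 MB of raw pixel data, which is four times the 300 DPI figure. That is per page, so rendering pages one at a time (as scanPDF does) and lowering the DPI for long documents both reduce peak memory.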

Key Takeaways
1. DPI matters most: Small or thin-bar barcodes that fail to decode at 300 DPI decode at 600 DPI. This single setting is often the difference between barcode detection working and not working on PDF documents.
2. Use GenericMultipleBarcodeReader: The basic MultiFormatReader.decode() only finds the first barcode per image. Real documents often have multiple barcodes per page.
3. TRY_HARDER matters: This hint tells ZXing to spend more time analyzing the image, trading speed for accuracy. It can help with barcodes that are rotated or low contrast.
4. GTIN correlation via serial prefix: The most valuable non-obvious insight. Barcode GTIN values can be automatically matched to the correct line item by searching for the serial number’s first segment within the GTIN string.
5. Combined extraction saves round-trips: Running text extraction and barcode scanning in a single MID Server probe call halves the ECC Queue traffic and wait time.
6. Image vs PDF is a routing decision: PNG/JPG images go directly to ZXing. PDFs go through PDFBox rendering first, then ZXing. Your Flow Designer action should classify and route accordingly.