Issues with Data Discovery Jobs when running a Full Scan Type (for Privacy/Classification) in Yokohama

Heather White
Giga Guru

When creating a data discovery job with a scan type of "Full", we are running into two issues.

 

1. When running a scan on a single table, the scan is not covering all records in the table. It only processes a small number, 3 or 4.

2. When running larger jobs with multiple tables, the jobs get hung up and never complete; they just keep running. I have not been able to determine the maximum number of tables these jobs support, if there is one.

 

Any assistance appreciated!

 

Thank you,

Heather

1 ACCEPTED SOLUTION

Thank you for your answer! 

We discovered our main issue was this: 

In Yokohama, the "Full" scan operates with incremental behavior. Each time a Full discovery job runs, it records the sys_updated_on timestamp of the last scanned record for each table and stores it as scan history. On subsequent runs, the job scans only records that were created or modified after this timestamp. This explains why fewer records are being scanned in the change_request table.

In Zurich, the Full scan behavior has been enhanced to ignore scan history and scan the entire table on every execution. The earlier incremental behavior has been introduced as a separate scan type called "Incremental."
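The difference between the two releases can be sketched with a small stand-alone example. This is plain JavaScript, not the ServiceNow API; the record shape and timestamps are invented purely for illustration:

```javascript
// Contrast of the two "Full" scan behaviors described above.
// Plain JavaScript sketch, not the ServiceNow API; the record shape
// and timestamps are invented for illustration.
const records = [
  { sys_id: 'a1', sys_updated_on: '2025-01-01 10:00:00' },
  { sys_id: 'b2', sys_updated_on: '2025-01-02 09:30:00' },
  { sys_id: 'c3', sys_updated_on: '2025-01-03 14:15:00' },
];

// Yokohama: "Full" keeps a per-table watermark (the newest
// sys_updated_on it has seen) and only picks newer records.
let scanHistory = null;

function yokohamaFullScan(table) {
  const picked = table.filter(
    (r) => scanHistory === null || r.sys_updated_on > scanHistory
  );
  if (picked.length > 0) {
    scanHistory = picked[picked.length - 1].sys_updated_on;
  }
  return picked;
}

// Zurich: "Full" ignores scan history and rescans everything;
// the watermark behavior moved to the separate "Incremental" type.
function zurichFullScan(table) {
  return table.slice();
}

const firstRun = yokohamaFullScan(records);  // picks all 3 records
const secondRun = yokohamaFullScan(records); // nothing newer -> 0 records
const zurichRun = zurichFullScan(records);   // always all 3 records
console.log(firstRun.length, secondRun.length, zurichRun.length); // 3 0 3
```

This is why a second "Full" run in Yokohama can scan almost nothing if few records changed since the previous run.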

In the meantime, we have adjusted our job sizes and enabled a job property that allows parallel job runs, so once we upgrade we should be all set.

 

Thanks for your assistance!


2 REPLIES

ayushraj7012933
Mega Guru

Hi @Heather White ,

We’ve seen similar behavior with Data Discovery (Privacy/Classification) in Yokohama, and it usually comes down to platform limits and the execution model rather than the scan type itself.

Best Practice Approach

1. Check and Tune System Limits

Review system properties related to Data Discovery / Classification:

  • Max records per table

  • Batch size

  • Worker threads

A “Full” scan does not always bypass these limits unless they are tuned.
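As a toy illustration of how an untuned per-table cap can make a "Full" scan look partial (the limit value and names below are placeholders, not actual ServiceNow properties):

```javascript
// Illustration only: a per-table record cap silently truncates a scan.
// `maxRecordsPerTable` is a placeholder value, not a real platform property.
const maxRecordsPerTable = 4;

// Stand-in for a large table
const tableRows = Array.from({ length: 1000 }, (_, i) => ({ sys_id: 'rec' + i }));

function scanWithLimit(rows, limit) {
  // Only the first `limit` rows are ever examined
  return rows.slice(0, limit);
}

const scanned = scanWithLimit(tableRows, maxRecordsPerTable);
console.log(scanned.length + ' of ' + tableRows.length + ' records scanned');
// -> "4 of 1000 records scanned"
```

A cap like this would match the symptom of only 3 or 4 records being scanned, which is why checking these properties is worth doing before digging deeper.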

2. Avoid Large Multi-Table Jobs

  • Do not include too many tables in a single job

  • Recommended:

    • 5–10 tables per job max

This significantly improves the completion rate.

3. Run Jobs in Controlled Batches

  • Schedule jobs in sequence, not parallel

  • Avoid overlap with:

    • Discovery

    • Imports

    • Other heavy background jobs

4. Validate Table Size & Performance

  • Large tables without proper indexing can slow down scans

  • Check execution details to confirm:

    • Records picked vs processed

5. Monitor Execution

  • Use:

    • System Diagnostics → Stats

    • Job execution logs

Look for long-running or stuck workers.

If this helps, please mark it as Helpful and Accept as Solution.

Thanks!
