3 weeks ago
When creating a data discovery job with a scan type of "Full", we are running into 2 issues.
1. When running a scan on a single table, the scan does not cover all records in the table. It scans only a small number of records, 3 or 4.
2. When running larger jobs with multiple tables, the jobs hang and never complete; they just keep running. I have not been able to determine the maximum number of tables for these jobs, if there is one.
Any assistance appreciated!
Thank you,
Heather
2 weeks ago
Thank you for your answer!
We discovered our main issue was this:
In Yokohama, the "Full" scan operates with incremental behavior. Each time a Full discovery job runs, it records the sys_updated_on timestamp of the last scanned record for each table and stores it as scan history. On subsequent runs, the job scans only records that were created or modified after this timestamp. This explains why fewer records are being scanned in the change_request table.
In Zurich, the Full scan behavior has been enhanced to ignore scan history and scan the entire table on every execution. The earlier incremental behavior has been introduced as a separate scan type called "Incremental."
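To make the difference concrete, here is a minimal sketch of the two behaviors described above. It is not ServiceNow code: the function names, the `history` dictionary, and the sample records are hypothetical, and only the `sys_updated_on` field and the `change_request` table name come from this thread.

```python
from datetime import datetime

def records_to_scan(records, last_scanned_on=None, use_history=True):
    """Pick the records a scan would visit.

    use_history=True models the incremental behavior (Yokohama "Full",
    Zurich "Incremental"); use_history=False models the Zurich "Full" scan.
    """
    if use_history and last_scanned_on is not None:
        return [r for r in records if r["sys_updated_on"] > last_scanned_on]
    return list(records)

def run_scan(records, history, table, use_history):
    """Scan one table and store the newest sys_updated_on as scan history."""
    picked = records_to_scan(records, history.get(table), use_history)
    if picked:
        history[table] = max(r["sys_updated_on"] for r in picked)
    return picked

# Hypothetical sample data for illustration only.
table = "change_request"
records = [{"sys_updated_on": datetime(2025, 1, day)} for day in (1, 2, 3)]
history = {}

first = run_scan(records, history, table, use_history=True)    # no history yet
records.append({"sys_updated_on": datetime(2025, 1, 4)})       # one new record
second = run_scan(records, history, table, use_history=True)   # history applies
full = run_scan(records, history, table, use_history=False)    # history ignored

print(len(first), len(second), len(full))
```

In this sketch the second history-based run picks up only the record added after the first run, which matches the "3 or 4 records" symptom, while the run that ignores history visits the whole table.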
We have also adjusted our job sizes and enabled a job property that allows parallel job runs, so once we upgrade, we should be all set.
Thanks for your assistance!
3 weeks ago
Hi @Heather White ,
We’ve seen similar behavior with Data Discovery (Privacy/Classification) in Yokohama, and it usually comes down to platform limits and the execution model rather than the scan type itself.
Best Practice Approach
1. Check and Tune System Limits
Review system properties related to Data Discovery / Classification:
Max records per table
Batch size
Worker threads
A “Full” scan does not always bypass these limits unless they are tuned.
2. Avoid Large Multi-Table Jobs
Do not include too many tables in a single job
Recommended:
5–10 tables per job max
This significantly improves completion rate
3. Run Jobs in Controlled Batches
Schedule jobs in sequence, not in parallel
Avoid overlap with:
Discovery
Imports
Other heavy background jobs
4. Validate Table Size & Performance
Large tables without proper indexing can slow down scans
Check execution details to confirm:
Records picked vs processed
5. Monitor Execution
Use:
System Diagnostics → Stats
Job execution logs
Look for long-running or stuck workers
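The batching advice in points 2 and 3 can be sketched as follows. This is only an illustration of the splitting logic, not a ServiceNow API: `chunk_tables` and the table list are hypothetical, and the limit of 10 tables per job is simply the upper end of the range suggested above.

```python
def chunk_tables(tables, max_per_job=10):
    """Split a list of table names into jobs of at most max_per_job tables."""
    if max_per_job < 1:
        raise ValueError("max_per_job must be at least 1")
    return [tables[i:i + max_per_job] for i in range(0, len(tables), max_per_job)]

# Hypothetical table names for illustration only.
tables = [f"table_{n}" for n in range(23)]
jobs = chunk_tables(tables, max_per_job=10)

# Each inner list would become one discovery job, run one after another.
for number, job in enumerate(jobs, start=1):
    print(f"Job {number}: {len(job)} tables")
```

With 23 tables this yields three jobs of 10, 10, and 3 tables, which can then be scheduled one after another.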
If this helps, please mark it as Helpful and Accept as Solution.
Thanks!
