Concurrent imports
Split incoming data into multiple import sets and transform the import sets concurrently to reduce processing time.
Running a concurrent import can be helpful when order does not matter and imports take a long time due to large data sets with time-consuming scripts. If order matters, you can split the import into multiple partitions to ensure that each partition is processed in order.
Enable concurrent imports only after fine-tuning all other parameters, such as database indexes and transformations.
Scheduling concurrent imports
You enable concurrent imports by selecting Concurrent Import on the Scheduled Data Import form. For instructions, see Schedule a data import.
When the schedule runs a concurrent import, the system loads the data from databases, Excel spreadsheets, CSV files, or other sources into a temporary staging table, and then transforms the data from the staging table to the target table.
When you run a concurrent import, the system creates multiple import sets, up to the value of the glide.scheduled_import.max.concurrent.import_sets system property (default = 10). The number of import sets scales with the size of the cluster until it reaches that limit. For example, a two-node cluster produces four import sets, and a ten-node cluster produces ten import sets.
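The arithmetic behind the two examples can be sketched in a few lines of Python. This is only an illustration: the function name and the per-node factor of two are assumptions inferred from the examples and from the two Import Set Transformer jobs that each node runs, not a documented formula.

```python
def planned_import_sets(active_nodes: int, max_concurrent: int = 10,
                        per_node: int = 2) -> int:
    """Estimate how many import sets a concurrent import creates.

    per_node=2 is an ASSUMPTION inferred from the documented examples;
    max_concurrent stands in for the
    glide.scheduled_import.max.concurrent.import_sets property.
    """
    if active_nodes < 1:
        raise ValueError("a cluster needs at least one active node")
    # Scale with the cluster size, but never exceed the configured limit.
    return min(active_nodes * per_node, max_concurrent)


print(planned_import_sets(2))   # 4
print(planned_import_sets(10))  # 10
```

Under this model, raising the property value only has an effect on clusters large enough to reach the limit.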
Import Set Transformer job
Each active node runs two Import Set Transformer jobs every minute. Those jobs poll the Concurrent Import Sets Jobs queue, pick import sets from the queue, and transform them. The jobs run concurrently, subject to the availability of worker threads.
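The polling behavior resembles a standard worker-queue pattern. The following Python sketch models it with threads and an in-memory queue. All names here (`transform`, `run_transformer_jobs`, the import set labels) are hypothetical, and the sketch does not use or represent any platform API.

```python
import queue
import threading


def transform(import_set: str) -> str:
    """Placeholder for the real transformation work (hypothetical)."""
    return f"{import_set}:transformed"


def run_transformer_jobs(import_sets, nodes=2, jobs_per_node=2):
    """Drain a shared queue with nodes * jobs_per_node concurrent workers."""
    pending = queue.Queue()
    for import_set in import_sets:
        pending.put(import_set)

    results = []
    results_lock = threading.Lock()

    def transformer_job():
        # Each job keeps picking import sets until the queue is empty.
        while True:
            try:
                import_set = pending.get_nowait()
            except queue.Empty:
                return
            outcome = transform(import_set)
            with results_lock:
                results.append(outcome)

    workers = [threading.Thread(target=transformer_job)
               for _ in range(nodes * jobs_per_node)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Completion order is not deterministic, so sort for a stable result.
    return sorted(results)
```

Because the workers finish in no fixed order, this pattern suits imports where the order of records does not matter.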
Concurrent Import Set record
Each concurrent import creates a Concurrent Import Set record. The form view shows all related import sets, concurrent import set jobs, and transform histories.
You can resume or reprocess any import set. For more information, see Monitor concurrent import sets.
Concurrent Import Sets Jobs queue
After loading data, the system adds the import sets to the Concurrent Import Sets Jobs table. The Concurrent Import Sets Jobs table indicates the job type and status of each concurrent import set job.
For more information, see Monitor concurrent import set jobs.
Partitioning concurrent imports
You can partition import sets to maintain the processing order within each partition.
By default, the system allocates records to import sets in a round-robin fashion. However, you can write a custom script to define a custom partition key that identifies the target import set. Every row with the same partition key is added to the same import set, and the data in that import set is processed in sequential order.
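The difference between the two allocation strategies can be illustrated with a short Python model. This is a conceptual sketch, not the platform's partition script interface: the function names are hypothetical, and mapping a key to an import set with a CRC32 hash is an assumption made only so that equal keys land in the same set.

```python
from itertools import cycle
from zlib import crc32


def round_robin(rows, n_sets):
    """Default-style allocation: deal rows out to the sets in turn."""
    sets = [[] for _ in range(n_sets)]
    for row, index in zip(rows, cycle(range(n_sets))):
        sets[index].append(row)
    return sets


def by_partition_key(rows, n_sets, key_fn):
    """Keyed allocation: rows with equal keys share one set, in input order.

    The CRC32-modulo mapping is an ASSUMPTION used for illustration.
    """
    sets = [[] for _ in range(n_sets)]
    for row in rows:
        index = crc32(str(key_fn(row)).encode("utf-8")) % n_sets
        sets[index].append(row)
    return sets


rows = [
    {"user": "a", "seq": 1},
    {"user": "b", "seq": 1},
    {"user": "a", "seq": 2},
    {"user": "b", "seq": 2},
    {"user": "c", "seq": 1},
]
scattered = round_robin(rows, 3)
grouped = by_partition_key(rows, 3, key_fn=lambda row: row["user"])
```

With round-robin allocation, the two rows for user `a` end up in different import sets and may be transformed in either order. With the keyed allocation, they stay together in their original sequence.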
Hierarchical imports
You can create a scheduled import set hierarchy by scheduling an import to run after another import set completes. One parent scheduled import can have many child scheduled imports, and each child scheduled import executes in the order specified. For concurrent scheduled imports, child scheduled imports can be started only after all Import Set Transformer jobs complete.
The system generates an execution plan at the beginning of the parent import process. Each import process uses the execution plan to fetch the next process to invoke. For concurrent imports, the last Import Set Transformer job to finish fetches the next import in the hierarchy and starts it.
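One way to picture the execution plan is as a tree that is flattened into a "what runs next" lookup. The Python sketch below is a minimal model under stated assumptions: the class and function names are hypothetical, and the depth-first ordering of children is an assumption, since only the ordered execution of child imports is described here.

```python
from dataclasses import dataclass, field


@dataclass
class ScheduledImport:
    """Hypothetical stand-in for a scheduled import in a hierarchy."""
    name: str
    order: int = 0
    children: list = field(default_factory=list)


def build_execution_plan(root):
    """Map each import name to the name of the import that runs after it."""
    sequence = []

    def visit(node):
        sequence.append(node.name)
        # Children run in their specified order (depth-first is ASSUMED).
        for child in sorted(node.children, key=lambda c: c.order):
            visit(child)

    visit(root)
    return dict(zip(sequence, sequence[1:] + [None]))


def run_hierarchy(root):
    """Run every import in plan order, one after another."""
    plan = build_execution_plan(root)
    completed = []
    current = root.name
    while current is not None:
        # In a concurrent import, this step ends only when the last
        # transformer job for `current` finishes; that job looks up the next.
        completed.append(current)
        current = plan[current]
    return completed


parent = ScheduledImport("users", children=[
    ScheduledImport("groups", order=2),
    ScheduledImport("departments", order=1),
])
print(run_hierarchy(parent))  # ['users', 'departments', 'groups']
```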
Synchronized inserts
Coalesce fields help define uniqueness among records. The transformation process checks for an existing record with the coalesce values and updates the existing record, if it exists, or inserts a new record if none exists. For more information, see Updating records using coalesce.
By default, concurrent imports allow each running import set to insert new records. When an import set inserts a record, it establishes a write lock on the target table to prevent other import sets from inserting the same record.
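The following Python sketch models why a lock is needed when several import sets coalesce on the same value. It is a simplified illustration: `TargetTable`, `run_import_sets`, and the sample records are hypothetical, and the sketch holds the lock around the whole check-and-write step, which is an assumption rather than a description of the platform's locking.

```python
import threading


class TargetTable:
    """In-memory stand-in for a target table keyed by a coalesce value."""

    def __init__(self):
        self.rows = {}
        self.inserts = 0
        self.updates = 0
        self._write_lock = threading.Lock()

    def upsert(self, coalesce_value, fields):
        # Without the lock, two import sets could both see "no record"
        # and both insert, creating a duplicate.
        with self._write_lock:
            existing = self.rows.get(coalesce_value)
            if existing is None:
                self.rows[coalesce_value] = dict(fields)
                self.inserts += 1
            else:
                existing.update(fields)
                self.updates += 1


def run_import_sets(table, import_sets):
    """Transform every import set on its own thread."""
    def transform(import_set):
        for coalesce_value, fields in import_set:
            table.upsert(coalesce_value, fields)

    threads = [threading.Thread(target=transform, args=(s,))
               for s in import_sets]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


table = TargetTable()
batch = [("abel@example.com", {"active": True}),
         ("beth@example.com", {"active": True})]
run_import_sets(table, [list(batch) for _ in range(4)])
```

Four import sets carry the same two records, yet the table ends up with exactly two rows: one insert per coalesce value, and every later write becomes an update.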
Tables for concurrent imports
| Table | Description |
|---|---|
| Concurrent Import Set (sys_concurrent_import_set) | Stores details of each concurrent import set in import set records. |
| Concurrent Import Set Jobs (sys_concurrent_import_set_job) | Lists the import sets to be processed. |
| Execution Context for Scheduled import (sys_execution_context) | Specifies the execution context for each scheduled import. The execution context specifies the next scheduled import to use when processing a hierarchical scheduled import. |
| Hierarchical scheduled import execution plan (sys_execution_plan) | Stores the execution plan for hierarchical imports. The execution plan is a tree structure that identifies which scheduled import runs after the preceding scheduled import. |
Domain separation with concurrent imports
You can add the sys_domain field to a scheduled import table to enable domain separation for the import set. Both import set loading and transform jobs run in the domain specified in the scheduled import set job.