Apache Kafka integration configuration fields

  • Release version: Yokohama
  • Updated February 23, 2025

Description of the fields on the Apache Kafka integration configuration forms for Health Log Analytics.

Table 1. Provide details

Integration Name
    Unique name of this integration. For example: My Kafka integration. This field is required.
    Note: When you fill in this field, the generic name displayed on the form adjusts automatically to match the name you entered.

Execute on
    Option to select whether to use a specific MID Server or a MID Server cluster. This field is required.

MID server name
    (Only when the Execute on field is set to Specific MID Server.)
    The MID Server to which log data from Apache Kafka is pulled. This field is required.
    Note:
    • You can select only MID Servers that support basic authentication. MID Servers that support mTLS are not listed.
    • The default maximum number of data inputs streaming logs to a single MID Server is 10. You can modify the maximum number by adding the property sn.occ.log_ingestion.max_datainputs_per_mid to the MID Server and then changing the default value.
    • If log ingestion is not enabled for the selected MID Server, Health Log Analytics enables it automatically.

MID Server Cluster
    (Only when the Execute on field is set to Specific MID Server cluster.)
    The MID Server cluster to which the log data is pulled. This field is required.
    The data input runs on a single MID Server in the cluster until that MID Server fails. The system then moves all the data input tasks to the next available MID Server in the cluster according to the configured order.
    Note:
    • Health Log Analytics supports only failover MID Server clusters. In these clusters, multiple MID Servers are grouped together for failover protection. When selecting a cluster from the data input or integration form, the MID Server clusters list displays only failover clusters.
    • The MID Server cluster must include only MID Servers that support basic authentication. mTLS is not supported for log ingestion.
    • Log ingestion must be enabled for each MID Server in the cluster. If log ingestion is not enabled for the active MID Server, Health Log Analytics enables it automatically.
    • The default maximum number of data inputs or integrations streaming logs to a single MID Server is 10. A cluster passes capacity validation if it contains at least one MID Server with fewer than 10 data inputs or integrations running on it, even when that MID Server is down.
    For more information about MID Server clusters, see Configure a MID Server cluster.

Service instance
    The service instance (formerly the application service) to which to bind the log data. This field is required.

Data source
    The source of the log data that the integration pulls to your ServiceNow instance: Kafka. This field is read-only.

Description
    Option to add a brief description of the integration to help identify it.
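
The cluster rules described for the MID Server Cluster field can be sketched in a few lines of Python. This is only an illustrative model, not ServiceNow code: the MidServer class, its fields, and both function names are hypothetical, and the failover function assumes that "next available" means the first running MID Server in the configured order.

```python
from dataclasses import dataclass

DEFAULT_MAX_DATA_INPUTS = 10  # documented default per MID Server


@dataclass
class MidServer:
    # Hypothetical model of a cluster member; not a ServiceNow class.
    name: str
    order: int        # configured failover order within the cluster
    is_up: bool
    data_inputs: int  # data inputs or integrations currently streaming logs


def passes_capacity_validation(cluster, max_inputs=DEFAULT_MAX_DATA_INPUTS):
    """True if at least one member has spare capacity, even if that member is down."""
    return any(m.data_inputs < max_inputs for m in cluster)


def active_mid_server(cluster):
    """Return the first running MID Server in configured order, or None.

    Assumption: failover simply picks the earliest available member.
    """
    for member in sorted(cluster, key=lambda m: m.order):
        if member.is_up:
            return member
    return None


cluster = [
    MidServer("mid-a", order=1, is_up=False, data_inputs=10),
    MidServer("mid-b", order=2, is_up=True, data_inputs=10),
    MidServer("mid-c", order=3, is_up=False, data_inputs=4),
]
print(passes_capacity_validation(cluster))  # mid-c has capacity although it is down
print(active_mid_server(cluster).name)      # mid-b is the first running member
```

Note that in this example the cluster passes validation only because of mid-c, which is down, while the data input would actually run on mid-b, which is already at the default limit. This is the situation that the capacity note describes.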

Table 2. Set data retrieval method

Kafka node names
    A comma-separated list in the format HOST:PORT,HOST:PORT. This field is required.
    The list does not have to include all the Apache Kafka Cluster servers.

Topics
    A comma-separated list of topics to which the data input must subscribe. This field is required.

Kafka credentials
    The Apache Kafka credentials.
    You can select existing Kafka SSL credentials, or create new ones by selecting Create Kafka credentials from the drop-down list. For a description of the fields on the Kafka SSL credentials form, see Kafka SSL credentials fields.

Group ID
    The name of the Apache Kafka Consumer Group.
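
To check values before entering them on the form, the Table 2 formats can be validated with a short Python sketch. The function names, host names, topic names, and group name below are hypothetical. The property names bootstrap.servers, group.id, and security.protocol are standard Apache Kafka consumer settings, but how the MID Server maps the form fields to them internally is an assumption, not something this page documents.

```python
def parse_node_names(value):
    """Split 'HOST:PORT,HOST:PORT' into (host, port) tuples; reject malformed entries."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not (port.isascii() and port.isdigit()):
            raise ValueError(f"expected HOST:PORT, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        nodes.append((host, port_number))
    return nodes


def parse_topics(value):
    """Split a comma-separated topic list; reject empty topic names."""
    topics = [topic.strip() for topic in value.split(",")]
    if any(not topic for topic in topics):
        raise ValueError("topic list contains an empty name")
    return topics


def build_consumer_settings(node_names, group_id):
    """Assumed mapping of the form fields to standard Kafka consumer properties."""
    nodes = parse_node_names(node_names)
    return {
        "bootstrap.servers": ",".join(f"{host}:{port}" for host, port in nodes),
        "group.id": group_id,
        "security.protocol": "SSL",  # the form offers Kafka SSL credentials
    }


# Hypothetical values for illustration only.
settings = build_consumer_settings(
    "kafka1.example.com:9092, kafka2.example.com:9093", "hla-consumers"
)
topics = parse_topics("app-logs, audit-logs")
print(settings)
print(topics)
```

Because the node list only has to contain some of the cluster servers, listing two or three brokers is usually enough for the consumer to discover the rest.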

Table 3. Advanced settings

Timeout
    The time, in milliseconds, that a poll waits when no data is available in the topics.
    Default value: 500

Default timezone
    The time zone that the system applies to events when a log does not specify one.
    By default, the system uses GMT in such cases, but you can specify a different time zone.
    Default value: GMT

Sub sample receive ratio
    The number of events to batch together, out of which all but one are discarded. Use this setting to decrease the number of received events.
    Default value: -1

Character encoding
    The character encoding for this data input. This field is read-only.
    Default value: UTF-8

Node discovery timeout
    The time, in milliseconds, before node discovery times out.
    Default value: 30

Sub sample drop ratio
    The number of events to batch together, out of which one is discarded. Use this setting to reduce the number of fetched events.
    Default value: -1

Max length in bytes
    The maximum length, in bytes, of an event.
    Default value: 32766

Drop if queue is full
    Option to discard logs when the MID Server is under heavy load.
    Default value: False
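
The difference between the two subsampling ratios is easiest to see on a small example. The sketch below is a rough Python model, not the product's implementation. It assumes that the kept event (receive ratio) is the first of each batch, that the discarded event (drop ratio) is the last of each batch, and that the default value of -1 means no subsampling. None of these details are stated on this page, and both function names are hypothetical.

```python
def subsample_receive(events, ratio):
    """Keep one event out of every `ratio` events (assumed: the first of each batch)."""
    if ratio <= 0:  # assumption: -1 (the default) leaves the stream untouched
        return list(events)
    return [event for index, event in enumerate(events) if index % ratio == 0]


def subsample_drop(events, ratio):
    """Discard one event out of every `ratio` events (assumed: the last of each batch)."""
    if ratio <= 0:  # assumption: -1 (the default) leaves the stream untouched
        return list(events)
    return [event for index, event in enumerate(events) if index % ratio != ratio - 1]


events = list(range(10))
print(subsample_receive(events, 5))  # [0, 5]
print(subsample_drop(events, 5))     # [0, 1, 2, 3, 5, 6, 7, 8]
print(subsample_receive(events, -1) == events)  # True
```

With a ratio of 5, the receive ratio keeps 20% of the events, whereas the drop ratio keeps 80% of them. The receive ratio is therefore the more aggressive of the two settings for the same value.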