Using Stream Connect for Apache Kafka
Stream Connect for Apache Kafka connects your Apache Kafka environment to your ServiceNow instance, facilitating the streaming of data between them. This integration allows you to publish and process Kafka events efficiently while leveraging ServiceNow’s capabilities.
To use Stream Connect for Apache Kafka, an Automation Engine subscription and a Stream Connect subscription are required.
Key Features
- Producers and Consumers: Stream Connect includes Kafka producers to publish events and multiple consumer types to read and process these events, such as Kafka Message triggers and ETL consumers.
- Topics and Namespaces: Events are organized in topics, which can be grouped using namespaces for better management and access control in domain-separated instances.
- Subscriptions: Each consumer has a subscription that stores configuration details, which helps monitor consumer performance through metrics.
- Message Replication: Stream Connect enables message replication between Kafka and ServiceNow using a MID Server, simplifying setup with automatic certificate generation.
- Error Handling: Undelivered messages are stored in a dedicated table, with a scheduled job attempting redelivery, while unprocessed messages can be monitored and managed.
Key Outcomes
By using Stream Connect for Apache Kafka, you can:
- Streamline event publishing and processing across systems with low latency and high volume.
- Leverage Flow Designer to create low-code flows that automate data handling between Kafka and ServiceNow.
- Efficiently manage data transformations and imports using existing configurations through ETL and Transform Map consumers.
- Gain visibility into integration performance and statistics via the Stream Connect dashboard, enabling better operational management.
Connect your Apache Kafka environment to your ServiceNow instance with ServiceNow® Stream Connect for Apache Kafka.
Apache Kafka is a distributed event-streaming platform that provides a unified way to exchange data across multiple systems. Stream Connect for Apache Kafka links your Kafka environment to your ServiceNow instance, enabling you to stream data between your instance and your external systems.
Benefits
- Publish and process Kafka events at scale. Publish events to your Kafka environment from your ServiceNow instance and consume Kafka events from your external systems at a high volume with low latency.
- Build flows that produce and consume Kafka events. Stream Connect is integrated with Flow Designer, providing a low-code way to publish and process Kafka messages.
- Import data from your Kafka environment and process that data using your existing Robust Transform Engine (RTE) or transform map configurations.
- Configure a consumer that uses your own scripts to process data from a Kafka topic.
- Monitor your consumers' performance with detailed reporting of statistics and performance metrics.
Components
Stream Connect has the following components.
- Producers
A producer publishes events to a Kafka environment. Stream Connect has two producers.
- Kafka Producer step in Flow Designer
- ProducerV2 API
- Consumers
A consumer reads and processes events from a Kafka environment. Stream Connect has several consumers.
- Kafka Message trigger in Flow Designer
- Extract Transform Load (ETL) Consumer
- Transform Map Consumer
- Script Consumer
- Topics and topic namespaces
Events are organized and stored in topics, and each topic stores events of the same type. Topics are partitioned. Each event has a key, and events with the same key are stored in the same partition.
Topics link to a topic namespace. You can use namespaces to organize topics in logical ways. For example, you can group topics together based on which Kafka cluster they come from. You can also use namespaces to configure which domains can access which topics on a domain-separated instance. For more information, see Managing namespaces and topics in Hermes.
- Subscriptions
A subscription is a record associated with a consumer. It stores configuration information about the consumer, such as the name of the Kafka topic to consume messages from and the number of partitions the topic has. The subscription record is created when a Kafka stream is activated.
Each subscription record has several metrics that enable you to view the performance of the consumer reading from the topic. For more information, see Viewing Kafka subscriptions and statistics.
- Partition groups
A partition group is a set of topic partitions. For example, if a topic has six partitions, they can be divided into three partition groups, with two partitions in each group.
- Kafka consumer job
The Kafka consumer job regularly checks Hermes for new events in a topic. The job picks a free partition group and retrieves its subscription. The subscription gives the topic name, and the job checks the partitions for messages on that topic.
- Kafka streams (not shown in the following image)
A Kafka stream is a record that defines the data stream for a consumer. If you're using the Kafka Message trigger in Flow Designer, the Kafka stream is automatically created for you. If you're using a different consumer, you’ll need to create one manually.
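The partitioning ideas in the component list above (same key, same partition; partitions split into partition groups; a job picking a free group) can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, CRC32 stands in for whatever hash a real Kafka client uses, and the grouping is assumed to be contiguous and even, which the source does not specify.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map an event key to a partition index with a stable hash.

    CRC32 is used only because it is deterministic across runs. It is an
    illustrative stand-in, not the hash a Kafka client actually uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_groups(num_partitions: int, num_groups: int) -> list:
    """Split partitions 0..num_partitions-1 into contiguous, equal-sized groups."""
    if num_groups <= 0 or num_partitions % num_groups != 0:
        raise ValueError("this sketch assumes partitions divide evenly into groups")
    size = num_partitions // num_groups
    return [list(range(start, start + size))
            for start in range(0, num_partitions, size)]


def pick_free_group(groups: list, busy: set):
    """Return the index of the first group that no job is working on, or None."""
    for index in range(len(groups)):
        if index not in busy:
            return index
    return None


# Six partitions divided into three groups of two, as in the example above.
groups = partition_groups(6, 3)
print(groups)  # [[0, 1], [2, 3], [4, 5]]

# Two events with the same key always map to the same partition.
print(partition_for("INC0010001", 6) == partition_for("INC0010001", 6))  # True
```

Because the key alone determines the partition, all events for one record are read in order by whichever job holds that partition's group.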
To link your Kafka environment to your ServiceNow instance, Stream Connect uses the Hermes Messaging Service. The Hermes Messaging Service enables your instance to produce and consume large volumes of Kafka events. It manages the flow of data between your Kafka environment and your instance. For more information, see Hermes Messaging Service.
The following diagram shows key components of Stream Connect, how they relate to ServiceNow and third-party applications, and how they connect to your Kafka environment through Hermes.
Stream Connect and Flow Designer
Build flows that produce and consume Kafka events with Stream Connect and Flow Designer. Stream Connect has a flow trigger for consuming Kafka events and an action step for producing them.
Use the Kafka Message trigger to create flows that process Kafka events. You can build a flow that consumes data from Kafka and inserts it into a table, or uses spokes to communicate the data to third-party environments.
The trigger is enabled when the flow is activated. After it's activated, the trigger starts the flow whenever there's a message in the specified Kafka topic. When you use the Kafka Message trigger, you don't need to create a Kafka stream or subscription record. The system automatically creates both when the flow is activated. Messages are read from the topic as long as the flow is active.
Use the Kafka Producer step to create actions that publish events to a topic in your Kafka environment. For example, you can use the step to create a message about an update on an incident in ServiceNow, then push the message to a topic in your Kafka environment.
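As a rough illustration of the incident-update example, the following Python sketch builds the kind of key and JSON value a producer might publish. The field names and the choice of the incident number as the key are assumptions for illustration; they are not the format the Kafka Producer step emits.

```python
import json
from datetime import datetime, timezone


def build_incident_update_message(number: str, state: str, updated_by: str) -> dict:
    """Build a key/value pair describing an incident update.

    The incident number is used as the key (an assumption) so that all
    updates to one incident share a key and therefore a partition.
    """
    value = {
        "number": number,
        "state": state,
        "updated_by": updated_by,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    return {"key": number, "value": json.dumps(value, sort_keys=True)}


message = build_incident_update_message("INC0010001", "In Progress", "example.user")
print(message["key"])
print(message["value"])
```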
ETL, Transform Map, and Script Consumers
Import data from your Kafka environment using your existing RTE or transform map configurations. The Extract Transform Load (ETL) and Transform Map consumers simplify your data imports by providing an efficient way to take a payload from a Kafka message, transform the data, and insert or update a record in a table. You can switch from a scheduled data import to one using Stream Connect and process the data with the same configurations.
You can also use the Script Consumer to process data from your Kafka environment. The Script Consumer is intended for more advanced use cases, such as when the message data isn't structured or when processing requires data lookups in code.
When you configure an Extract Transform Load (ETL) consumer, a Transform Map consumer, or a Script Consumer, you also need to create a Kafka stream.
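The path described for the ETL and Transform Map consumers (take a payload from a message, transform it, then insert or update a record) can be sketched as follows. This is a minimal model, not how RTE or transform maps are implemented: the field mapping, the in-memory "table", and the matching field are all hypothetical.

```python
import json

# Hypothetical mapping from payload fields to target table columns.
FIELD_MAP = {"id": "correlation_id", "title": "short_description", "sev": "priority"}


def transform(payload: str) -> dict:
    """Parse a JSON payload and rename the fields listed in FIELD_MAP."""
    source = json.loads(payload)
    return {target: source[field]
            for field, target in FIELD_MAP.items() if field in source}


def upsert(table: dict, record: dict, match_field: str = "correlation_id") -> str:
    """Insert the record, or merge it into the existing row with the same match value."""
    key = record[match_field]
    action = "update" if key in table else "insert"
    table[key] = {**table.get(key, {}), **record}
    return action


table = {}
print(upsert(table, transform('{"id": "A1", "title": "Disk full", "sev": 2}')))  # insert
print(upsert(table, transform('{"id": "A1", "sev": 1}')))  # update
```

The second message carries only a changed severity, so it updates the existing row rather than creating a duplicate.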
ProducerV2 API
Publish events to a Kafka topic with the ProducerV2 API.
Stream Connect Message Replication
You can replicate data between your Kafka environment and ServiceNow with Stream Connect Message Replication.
Stream Connect Message Replication enables you to configure and manage message replications directly from your ServiceNow instance. It uses a MID Server to run the data replications, so you don't need to configure or host additional replication services. It also simplifies the message replication setup by automatically generating the required certificates.
For more information, see Stream Connect Message Replication.
Unprocessed and undelivered messages
If a message can't be delivered, it’s stored in the Kafka Undelivered Messages [sys_kafka_undelivered_messages] table. A scheduled job, Kafka Producer Retry, regularly reads this table and tries to redeliver any messages.
If a batch of messages can't be processed because it has timed out, it’s stored in the Kafka Unprocessed Messages [sys_kafka_unprocessed_messages] table. The time-out for a message batch can be set with the com.glide.kafka_consumer.timeout property. The default value is 60 seconds. This table is a rotated table, so it cleans records automatically.
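The redelivery behavior described for undelivered messages can be modeled as a single retry pass over a stored list. This is a sketch under assumptions: the list stands in for the undelivered messages table, `send` is a hypothetical callable, and a failed delivery is assumed to raise `ConnectionError`. It does not reflect how the Kafka Producer Retry job is implemented.

```python
def retry_undelivered(undelivered: list, send) -> int:
    """Try to send each stored message once.

    Messages that still fail stay in the list for the next scheduled pass.
    Returns the number of messages delivered in this pass.
    """
    still_failing = []
    delivered = 0
    for message in undelivered:
        try:
            send(message)
            delivered += 1
        except ConnectionError:
            still_failing.append(message)
    # Replace the contents in place so callers see the updated backlog.
    undelivered[:] = still_failing
    return delivered


backlog = ["a", "b", "c"]
sent = []


def flaky_send(message):
    """Hypothetical sender that fails for one message."""
    if message == "b":
        raise ConnectionError("broker unavailable")
    sent.append(message)


print(retry_undelivered(backlog, flaky_send))  # 2
print(backlog)  # ['b']
```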
Producer compression formats
- NONE
- GZIP
- LZ4
The property that sets the compression format is not in the System Properties [sys_properties] table by default, so you must add it manually. The property applies to all Stream Connect producers.
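To give a feel for what a compression format changes, the sketch below compresses a sample message value with Python's standard `gzip` module. It is only an illustration of the NONE and GZIP options: the helper name and the sample payload are hypothetical, LZ4 is left out because the Python standard library has no LZ4 codec, and actual savings depend on the payload.

```python
import gzip
import json


def compress_value(data: bytes, fmt: str) -> bytes:
    """Compress a message value according to a format name."""
    if fmt == "NONE":
        return data
    if fmt == "GZIP":
        return gzip.compress(data)
    if fmt == "LZ4":
        raise NotImplementedError("LZ4 needs a third-party library; omitted here")
    raise ValueError("unknown compression format: " + fmt)


# A repetitive sample payload; repetitive text compresses well.
sample = json.dumps(
    {"number": "INC0010001", "comments": "disk usage high " * 50}
).encode("utf-8")

for fmt in ("NONE", "GZIP"):
    print(fmt, len(compress_value(sample, fmt)))
```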
Domain separation
Use Stream Connect topic namespaces to configure which domains can access a Kafka topic on a domain-separated instance. Group topics into ServiceNow namespaces, then link the namespaces to specific domains. For more information, see Domain separation and Stream Connect.
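The relationship between topics, namespaces, and domains can be pictured as two lookups. The sketch below is a simplified model with hypothetical names. It treats domains as a flat set and ignores any domain hierarchy, so it is not a description of how the platform evaluates access.

```python
# Hypothetical data: which domains each namespace is linked to,
# and which namespace each topic belongs to.
NAMESPACE_DOMAINS = {
    "acme_ns": {"ACME"},
    "shared_ns": {"ACME", "Initech"},
}
TOPIC_NAMESPACE = {
    "acme.incidents": "acme_ns",
    "shared.alerts": "shared_ns",
}


def can_access(domain: str, topic: str) -> bool:
    """Return True if the topic's namespace is linked to the given domain."""
    namespace = TOPIC_NAMESPACE.get(topic)
    if namespace is None:
        return False
    return domain in NAMESPACE_DOMAINS.get(namespace, set())


print(can_access("Initech", "shared.alerts"))   # True
print(can_access("Initech", "acme.incidents"))  # False
```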