
Create an agent which rewrites knowledge articles for improvement

vinodk830951845
Kilo Contributor

We have created an Agentic AI agent, "Knowledge Rewrite Agent". This agent enhances knowledge base articles by restructuring and clarifying content, making information easier to understand for end‑users and support staff.

• The agent is working fine for small knowledge articles.

• But for large knowledge articles, the generation stops midway and returns only partial output. This appears related to LLM processing limits on request/response size (token limits).

The agent is not producing the expected result due to this limit constraint. The output is captured in the 'AI Enhanced Draft' field, where we observe that only part of the knowledge article has been enriched.

 

 

Note: the input format is HTML and it may contain images as well.

 

2 REPLIES

rpriyadarshy
Tera Guru

@vinodk830951845  

 

The LLM has a maximum token limit. It seems your request/response is exceeding that limit, resulting in truncation.

 

You can refer to a few KBs for more details: KB2038552, KB2622143.

 

You can use a chunking strategy for your use case.
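A chunking strategy can be sketched in plain JavaScript (assumptions here: a rough 4-characters-per-token estimate and a paragraph-based split; the actual prompt orchestration in your agent is not shown):

```javascript
// Rough heuristic: ~4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Split an article into chunks that each stay under a token budget,
// breaking on blank-line boundaries so HTML blocks are not cut mid-tag.
function chunkArticle(html, maxTokensPerChunk) {
  var paragraphs = html.split(/\n\s*\n/);
  var chunks = [];
  var current = '';
  paragraphs.forEach(function (p) {
    var candidate = current ? current + '\n\n' + p : p;
    if (estimateTokens(candidate) > maxTokensPerChunk && current) {
      chunks.push(current);
      current = p;
    } else {
      current = candidate;
    }
  });
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk would then be sent to the LLM separately, and the enhanced pieces concatenated back into the 'AI Enhanced Draft' field.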

 

 

How tokens work in ServiceNow

===============================

Tokens are units of measurement that represent the amount of work done to parse the request and compose the response of a large language model (LLM). Each LLM has a maximum token limit, also known as the context window, which it cannot exceed. This limit is set by the model provider and cannot be altered. You can find this information in the "Max Tokens" field within the sys_generative_ai_model_config table. The equation that is used to ensure requests do not exceed this amount is:


Model maximum token limit = Request token amount + Response token amount + Buffer token amount

===========================================
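The equation above can be rearranged to answer the practical question: given a model's limit, how much room is left for the request itself? A small sketch (the numbers are illustrative, not any real model's limits):

```javascript
// Model maximum token limit = request tokens + response tokens + buffer tokens.
// Rearranged: the request may use whatever the response and buffer leave over.
function maxRequestTokens(modelLimit, responseTokens, bufferTokens) {
  return modelLimit - responseTokens - bufferTokens;
}

// Example: an 8192-token model reserving 2000 for the response
// and 192 as buffer leaves 6000 tokens for the request.
var roomForRequest = maxRequestTokens(8192, 2000, 192);
```

If your HTML article alone exceeds that request budget, truncation is guaranteed regardless of prompt wording.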

 

Regards

RP

More input on the same:

 

Issues with Token Limits

If you are encountering issues with request or response truncation due to token limits being insufficient for your needs, here are a few suggestions for resolving the issue:

• Reduce the content in the request. Instead of including an entire record with the prompt, include only the most essential field (e.g., just the description).

• Limit the size of the desired output. You can request that the LLM provide a shorter response, such as asking for a concise paragraph instead of a lengthy explanation.
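The first suggestion, sending only essential content, can be sketched as a pre-processing step that strips embedded images and HTML markup before the text reaches the prompt (a rough regex-based sketch for illustration; a real implementation would use a proper HTML parser):

```javascript
// Strip <img> tags (inline/base64 images can consume a huge share of the
// token budget) and remaining HTML markup, leaving only readable text.
function toPlainText(html) {
  return html
    .replace(/<img[^>]*>/gi, '') // drop images entirely
    .replace(/<[^>]+>/g, ' ')    // drop other tags
    .replace(/\s+/g, ' ')        // collapse whitespace
    .trim();
}
```

Since the input here is HTML and may contain images, this alone can shrink the request substantially; image content itself would need separate handling if it must be preserved in the enhanced draft.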


If these adjustments do not resolve the issue, you have the following options:

 

• Consider switching to a model provider with a higher maximum token limit.

• Modify the maximum number of response tokens for the prompt.

 

Regards

RP