Understanding large language models (LLMs)

  • Release version: Zurich
  • Updated November 20, 2025

    Large language models are generative, not retrieval-based. They create responses dynamically using probability, which means you can’t expect identical outputs every time. This variability is a feature, not a bug, because it allows for flexibility, creativity, and adaptability.

    How LLMs work

    Large language models (LLMs), like ChatGPT or Copilot, are advanced AI systems trained on massive amounts of text to understand and generate human-like language. They build a statistical model of language rather than storing fixed answers like an encyclopedia. When you ask a question, the model generates an answer one word (or token) at a time, choosing each next word according to probabilities learned during training. This prediction process makes LLMs powerful, but it is also why they are non-deterministic: the system does not always produce the exact same result (output) for the same prompt (input).
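
The token-by-token generation described above can be illustrated with a minimal sketch. The probability table, the word choices, and the `generate` function are all hypothetical and exist only for this example. A real LLM computes next-token probabilities with a neural network over a very large vocabulary, not with a lookup table.

```python
import random

# Hypothetical toy "model": for each current word, the probability of each next word.
# These words and numbers are invented for illustration only.
NEXT_WORD_PROBS = {
    "<start>": {"the": 0.6, "this": 0.4},
    "the": {"incident": 0.5, "request": 0.3, "ticket": 0.2},
    "this": {"incident": 0.1, "request": 0.5, "ticket": 0.4},
    "incident": {"was": 0.7, "is": 0.3},
    "request": {"was": 0.4, "is": 0.6},
    "ticket": {"was": 0.5, "is": 0.5},
    "was": {"resolved": 0.6, "closed": 0.4},
    "is": {"resolved": 0.3, "open": 0.7},
}


def generate(rng, max_tokens=4):
    """Build a sentence one token at a time by sampling each next word."""
    token, output = "<start>", []
    for _ in range(max_tokens):
        choices = NEXT_WORD_PROBS.get(token)
        if choices is None:  # no known continuation, so stop early
            break
        words = list(choices)
        weights = list(choices.values())
        # Sample the next word in proportion to its probability.
        token = rng.choices(words, weights=weights, k=1)[0]
        output.append(token)
    return " ".join(output)


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so repeated runs can differ
    for _ in range(3):
        print(generate(rng))
```

Running the script several times typically prints different sentences for the same starting point, which mirrors the non-deterministic behavior described here.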

    Why results may vary

    Even if you provide the same question or prompt twice, the response can differ. Here’s why:
    Probabilistic sampling
    The model doesn’t always pick the single most likely word. It samples from several likely options. This introduces variation.
    Temperature settings
    Temperature controls randomness, and the setting of this internal parameter differs between LLMs. A higher temperature produces more varied, creative responses, while a lower temperature produces more predictable, repetitive ones.
    Multiple valid answers
    Many questions have more than one correct way to explain something. The model may choose different phrasing or emphasis each time.
    Context sensitivity
    Tiny changes in punctuation or prior conversation can shift the output.
    System-level factors
    Hardware concurrency, floating-point math, and backend updates can introduce slight variations, even when everything else is fixed.
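
Two of the factors above, temperature and floating-point math, can be shown with a small sketch. It assumes the common formulation in which raw scores (logits) are divided by the temperature before a softmax. The scores and the function name `softmax_with_temperature` are hypothetical, and individual LLM providers may implement temperature differently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.5)   # sharper: the top word dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: other words get more chance

# Floating-point addition is not associative, so the order in which
# partial sums are combined can change the last digits of a result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

if __name__ == "__main__":
    print("T=0.5:", [round(p, 3) for p in low])
    print("T=2.0:", [round(p, 3) for p in high])
    print("Same sum in both orders?", left_to_right == right_to_left)
```

A lower temperature concentrates probability on the top-scoring word, while a higher one spreads it out. The last comparison shows how a change in summation order alone can alter a numeric result, which is one way parallel hardware can introduce small differences.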

    Think of it like rolling dice to pick words. When you ask a question, the model doesn’t follow a fixed script. Instead, it looks at many possible next words and picks one based on probabilities, like rolling weighted dice. The dice are weighted toward the most likely words, but there’s still a chance for variation. If you roll again (ask the same question), you might get a slightly different sequence, even though the rules didn’t change. This randomness is intentional. It makes the model flexible and creative, rather than rigid and repetitive.
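
The weighted-dice analogy can be made concrete with a short sketch. The candidate words, their weights, and the helper names `roll` and `tally` are hypothetical and are not taken from any real model.

```python
import random
from collections import Counter

# Hypothetical candidate words and weights, invented for illustration.
WORDS = ["resolved", "closed", "pending"]
WEIGHTS = [0.7, 0.2, 0.1]


def roll(rng):
    """One roll of the weighted dice: pick a single word."""
    return rng.choices(WORDS, weights=WEIGHTS, k=1)[0]


def tally(seed, rolls=10_000):
    """Roll many times with a fixed seed and count each outcome."""
    rng = random.Random(seed)
    return Counter(roll(rng) for _ in range(rolls))


if __name__ == "__main__":
    print(tally(seed=1))
```

Over many rolls the most heavily weighted word appears most often, yet the less likely words still come up. Fixing the seed makes this toy example repeatable, but that is a property of the sketch and should not be read as a control that any particular LLM service exposes.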

    For more information about supported LLMs, see Large language models on the ServiceNow AI Platform.

    Variations in search

    The ServiceNow AI Platform® offers a variety of search tools, which may return different answers for the same or similar searches. This disparity in results is expected. For more information, see the following topics: