Candy AI Clone Project — Stuck with Response Logic

albertbadger · ‎10-06-2025

Hey everyone,

I work at Triple Minds and I’m building a Candy AI clone. It’s been exciting so far, but I’ve run into a problem with the AI’s response logic: sometimes it gives unrelated answers or repeats itself, even though the prompts and context seem fine.

I’m using a Node.js backend with the OpenAI GPT API and a React frontend, storing chat history in MongoDB. Here’s what I’ve tried:

Cleaning context before sending it to the model
Limiting maximum tokens for better relevance
Adding metadata for user preferences

Still, the AI occasionally loops or drifts off-topic.

Has anyone experienced this while building a Candy AI clone or any other AI clone? I’m trying to make conversations smooth, engaging, and less repetitive, but this part is tricky.

Would love any suggestions or tips from the community!

Mark Manders · ‎10-06-2025

Can you also relate everything you mention to ServiceNow? It's a well-known issue that AI is prone to repeating itself and giving unrelated answers. OpenAI has even posted that their own way of working is the cause of this and that they don't know if it can be resolved.
But from your question it is hard to see what you expect from the ServiceNow Community. The fact that ServiceNow keeps pushing AI doesn't mean it's an AI platform.

Please mark any helpful or correct solutions as such. That helps others find their solutions.
Mark

jessicag544 · ‎06-17-2026

Hey hi,

I am from fanso.io. We faced the same issue while working on a companion platform, so here is my best attempt at an answer.

The looping and drift you're describing usually trace back to a small set of root causes, and they're worth checking in order before assuming the model itself is at fault.

1. Check how you're actually sending history to the API
If you're storing full chat history in MongoDB and just appending every past message to the messages array on every request, that's the most common cause of looping. Long, undifferentiated context causes the model to anchor on whatever phrasing or structure repeated earlier in the conversation, and it starts echoing patterns instead of generating fresh responses. Fix: cap how many raw turns you send (last 8 to 10 is usually enough), and summarize anything older into a short system-level note rather than including it verbatim.
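
As a rough illustration of the cap described above, here is a minimal TypeScript sketch. The `ChatMessage` type and the `splitHistory` name are made up for this example, and it counts individual messages rather than user/assistant pairs. The `older` part is what you would fold into a summary instead of sending verbatim.

```typescript
// Hypothetical message shape, mirroring the role/content pairs used by chat-style APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Split stored history into the part to summarize (older) and the part to send verbatim (recent).
// maxRecent counts individual messages, not user/assistant pairs.
function splitHistory(
  history: ChatMessage[],
  maxRecent: number = 10
): { older: ChatMessage[]; recent: ChatMessage[] } {
  if (maxRecent <= 0) return { older: history.slice(), recent: [] };
  const cut = Math.max(0, history.length - maxRecent);
  return { older: history.slice(0, cut), recent: history.slice(cut) };
}
```

With 25 stored messages and the default cap, `older` holds the first 15 and `recent` holds the last 10.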

2. Temperature and repetition settings
If temperature is low (anything under 0.7) and you're not setting frequency_penalty or presence_penalty, the model will gravitate toward safe, repeated phrasing, especially in long conversations. For a companion-style bot you want some variability. Try temperature around 0.8 to 0.9, and set presence_penalty to something like 0.3 to 0.6 to push the model away from tokens it has already used. If it still repeats whole phrases turn over turn, add a small frequency_penalty as well, since that one scales with how often a token has already appeared.
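
A minimal sketch of what those settings could look like in a request body. `temperature`, `presence_penalty`, and `frequency_penalty` are Chat Completions request parameters; the `buildRequestBody` helper, the `ChatRequestBody` interface, and the exact values are assumptions for illustration, and the model name is left to the caller.

```typescript
// Hypothetical message shape, mirroring the role/content pairs used by chat-style APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Only the fields discussed above; a real request may carry more.
interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  presence_penalty: number;
  frequency_penalty: number;
}

// Starting values only (assumptions): tune them against your own conversations.
function buildRequestBody(messages: ChatMessage[], model: string): ChatRequestBody {
  return {
    model,
    messages,
    temperature: 0.85,      // inside the 0.8 to 0.9 range suggested above
    presence_penalty: 0.4,  // inside the 0.3 to 0.6 range suggested above
    frequency_penalty: 0.3, // assumed value, not taken from the post
  };
}
```

Keeping the settings in one helper makes it easier to change one value at a time and compare transcripts.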

3. System prompt drift
If your system prompt is long and instructs the model to maintain a persona, tone, and various behavioral rules all at once, the model can lose track of which instruction takes priority as the conversation grows, especially past 15 to 20 turns. This often shows up as unrelated or generic answers. Keep the system prompt short and re-inject a compact reminder of persona and current context each turn rather than relying on one big upfront prompt to hold for the whole session.
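
One possible way to re-inject a reminder each turn, sketched in TypeScript. The `withReminder` helper is hypothetical, and placing the reminder as a system message right before the latest user message is an assumption worth testing against your own transcripts.

```typescript
// Hypothetical message shape, mirroring the role/content pairs used by chat-style APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Insert a compact system reminder immediately before the final message
// (assumed to be the current user message). The input array is not modified.
function withReminder(messages: ChatMessage[], reminder: string): ChatMessage[] {
  const note: ChatMessage = { role: "system", content: reminder };
  return [...messages.slice(0, -1), note, ...messages.slice(-1)];
}
```

Keep the reminder to a sentence or two (persona name, tone, current topic), so it stays cheap to send on every turn.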

4. Token limit cutting context awkwardly
You mentioned limiting max tokens for relevance. If that means the max_tokens request parameter, note that it only caps the length of the reply; it does not trim the context you send. If you mean trimming the history itself, be careful: if you're cutting from the front of the conversation without preserving a summary of what was cut, the model loses thread continuity and starts responding generically because it's missing the setup it needs. Truncation should always preserve a rolling summary, not just chop the oldest messages.
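
A sketch of budget-based trimming that returns the dropped messages instead of discarding them, so they can be folded into the rolling summary. `estimateTokens` is a crude assumption (about 4 characters per token for English text); swap in a real tokenizer for accurate counts. The function names are hypothetical.

```typescript
// Hypothetical message shape, mirroring the role/content pairs used by chat-style APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Rough estimate only (assumption): about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages that fit in the budget; hand back the rest for summarization.
function trimToBudget(
  history: ChatMessage[],
  budget: number
): { kept: ChatMessage[]; dropped: ChatMessage[] } {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent turns survive.
  for (const msg of history.slice().reverse()) {
    const cost = estimateTokens(msg.content);
    if (used + cost > budget) break;
    kept.unshift(msg);
    used += cost;
  }
  return { kept, dropped: history.slice(0, history.length - kept.length) };
}
```

The point of returning `dropped` is that the caller summarizes it before it disappears from the request, rather than losing it silently.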

5. Metadata not actually influencing generation
Adding metadata for user preferences only helps if it's being inserted into the prompt context in a way the model can act on, not just stored alongside the conversation in MongoDB. If it's sitting in the database but not pulled into the system message or a few-shot example each turn, it has zero effect on output, which would explain why the bot feels disconnected from stated preferences.
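
To make that concrete, here is a small sketch that turns stored preferences into text for the system message. The `UserPreferences` fields and the `preferencesToSystemNote` name are invented for illustration; your MongoDB document will have its own shape.

```typescript
// Hypothetical preference fields; adapt to whatever you actually store per user.
type UserPreferences = { name?: string; tone?: string; topics?: string[] };

// Render preferences as plain sentences the model can act on.
// Returns an empty string when nothing is set, so the caller can skip the note.
function preferencesToSystemNote(prefs: UserPreferences): string {
  const lines: string[] = [];
  if (prefs.name) lines.push(`The user's name is ${prefs.name}.`);
  if (prefs.tone) lines.push(`Preferred tone: ${prefs.tone}.`);
  if (prefs.topics && prefs.topics.length > 0) {
    lines.push(`Topics the user enjoys: ${prefs.topics.join(", ")}.`);
  }
  return lines.join(" ");
}
```

For example, `preferencesToSystemNote({ name: "Sam", topics: ["hiking", "jazz"] })` returns `The user's name is Sam. Topics the user enjoys: hiking, jazz.`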

A structural suggestion
Beyond fixing the symptoms, set up your turn structure like this on every request: a short persona/system message, a compact rolling memory summary (3 to 5 sentences max, regenerated periodically), the last 8 to 10 raw turns, and the current user message. This tends to fix looping and drift simultaneously because the model isn't trying to parse a huge undifferentiated blob of history every time.
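
The turn structure above could be assembled roughly like this. It is a sketch under assumptions: `buildMessages` and its parameter names are hypothetical, the memory summary is expected to be produced elsewhere, and `maxRecent` counts individual messages.

```typescript
// Hypothetical message shape, mirroring the role/content pairs used by chat-style APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Order: persona, rolling memory summary (if any), recent raw turns, current user message.
function buildMessages(
  persona: string,
  memorySummary: string,
  history: ChatMessage[],
  userMessage: string,
  maxRecent: number = 10
): ChatMessage[] {
  const messages: ChatMessage[] = [{ role: "system", content: persona }];
  if (memorySummary.trim() !== "") {
    messages.push({ role: "system", content: `Conversation so far: ${memorySummary}` });
  }
  // Guard against slice(-0), which would return the whole array.
  const recent = maxRecent > 0 ? history.slice(-maxRecent) : [];
  messages.push(...recent);
  messages.push({ role: "user", content: userMessage });
  return messages;
}
```

The returned array is what goes into the `messages` field of the request, so every request has the same predictable shape regardless of how long the stored conversation is.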

One more thing worth flagging separately from the technical side: OpenAI's usage policies may restrict the romantic or companion-style content a Candy AI clone is built around, and such requests can be caught by moderation. That's a separate issue from the looping bug, but it's worth checking the current policy text for your use case, since it can affect reliability and account standing independent of how clean your context handling is.