ServiceNow Research

Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows

Abstract

Large Language Models (LLMs) such as GPT-4o can handle a wide range of complex tasks with the right prompt. As per-token costs fall, the advantages of fine-tuning Small Language Models (SLMs) for real-world applications – faster inference, lower costs – may no longer be clear. In this work, we present evidence that, for domain-specific tasks that require structured outputs, SLMs still hold a quality advantage. We compare fine-tuning an SLM against prompting LLMs on the task of generating low-code workflows in JSON format. We observe that while a good prompt can yield reasonable results, fine-tuning improves quality by 10% on average. We also perform a systematic error analysis to reveal model limitations.
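To make the evaluation setting concrete, here is a minimal illustrative sketch, not taken from the paper: the workflow schema (a trigger plus ordered steps), the gold example, and the scoring function are all assumptions chosen for illustration. It shows the two failure modes such a comparison must measure, whether the model's raw text parses as JSON at all, and how closely the predicted steps match the reference.

```python
import json

# Hypothetical gold workflow. The schema (trigger + ordered steps) is an
# illustrative assumption, not the paper's actual workflow format.
GOLD = {
    "trigger": {"type": "record_created", "table": "incident"},
    "steps": [
        {"action": "lookup_user", "inputs": {"field": "caller_id"}},
        {"action": "send_email", "inputs": {"to": "{{user.email}}"}},
    ],
}

def score_output(raw: str, gold: dict) -> dict:
    """Score a model's raw text output against a gold workflow.

    Reports whether the output is well-formed JSON, and a coarse
    step-level accuracy: the fraction of gold steps whose action name
    appears at the same position in the prediction.
    """
    try:
        pred = json.loads(raw)
    except json.JSONDecodeError:
        # Structured-output tasks fail hard here: unparseable text scores zero.
        return {"valid_json": False, "step_accuracy": 0.0}

    pred_steps = pred.get("steps", [])
    gold_steps = gold["steps"]
    hits = sum(
        1
        for i, step in enumerate(gold_steps)
        if i < len(pred_steps) and pred_steps[i].get("action") == step["action"]
    )
    return {"valid_json": True, "step_accuracy": hits / len(gold_steps)}

# Both a fine-tuned SLM and a prompted LLM would be called here; each
# returns raw text that may or may not be well-formed JSON.
raw_output = json.dumps(GOLD)  # stand-in for an actual model call
print(score_output(raw_output, GOLD))  # {'valid_json': True, 'step_accuracy': 1.0}
```

Under this framing, a fine-tuned SLM and a prompted LLM are scored by the same harness, so quality differences reflect the models rather than the evaluation.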

Publication
Knowledge Discovery and Data Mining
Orlando Marquez
Applied Research Scientist

Applied Research Scientist at Azimuth, located in Montreal, QC, Canada.