Labeling

  • Release version: Zurich
  • Updated February 11, 2026

  Data labeling is the process of reviewing and annotating data records to prepare them for AI training or quality validation.

    About data labeling

    Labeled data helps AI systems to understand the data and learn from it. By providing labeled examples, we can teach AI systems how to recognize patterns, categories, features, or relationships in the data. For instance, if we want to train an AI system to identify animals in images, we need to provide it with labeled images of different animals. The AI system can then learn how to associate the labels with the visual features of the animals.
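
    As a minimal sketch of what labeled examples look like, the Python snippet below pairs each image file with a label and groups the files by label. The file names, the labels, and the `LabeledImage` type are hypothetical and are not part of any product API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledImage:
    """One labeled example: the input (an image path) and its label."""
    path: str
    label: str


# Hypothetical labeled examples; a real dataset would hold many more.
examples = [
    LabeledImage("img_001.jpg", "cat"),
    LabeledImage("img_002.jpg", "dog"),
    LabeledImage("img_003.jpg", "cat"),
]


def group_by_label(items):
    """Return a dict mapping each label to the image paths that carry it."""
    groups = defaultdict(list)
    for item in items:
        groups[item.label].append(item.path)
    return dict(groups)


print(group_by_label(examples))
```

    Grouping by label is also a quick way to spot classes that have too few examples before training starts.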

    Labeling matters because it lets us train and evaluate AI systems accurately. Without labeled data, an AI system trained by supervised learning has no examples to learn from and no reference answers to be measured against. Labeled data is essential for:

    • Training:

      Training is the process of feeding labeled data to an AI system so that it can learn from it and adjust its parameters accordingly. In general, the more accurately labeled data we provide, the more accurate and reliable the AI system tends to become.

    • Validation:

      Validation is the process of using a subset of labeled data, held out from training, to check the performance of an AI system during training. Validation helps us to monitor the progress of the AI system and to detect overfitting or underfitting early.

    • Testing:

      Testing is the process of using a separate subset of labeled data, not used for training or validation, to measure the final performance of an AI system after training. Testing helps us to evaluate how well the AI system can generalize to new and unseen data.
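
    The three uses above are often served by splitting one labeled dataset into three non-overlapping parts. The sketch below shows one way to do this with the Python standard library. The function name `split_dataset`, the 70/15/15 ratios, and the sample records are assumptions chosen for illustration, not recommended values.

```python
import random


def split_dataset(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle records and split them into train, validation, and test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1:
        raise ValueError("fractions must be between 0 and 1")
    if train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")

    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


# Hypothetical labeled records: (record id, label).
records = [(f"record_{i}", i % 2) for i in range(100)]
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))
```

    Because every record lands in exactly one of the three lists, the test set stays unseen during training and validation.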

    Labeling in Now Assist Data Kit

    In Now Assist Data Kit, labeling enables you to systematically review datasets, mark data quality, categorize information, and provide structured feedback on individual records.

    To use data labeling, you must have one of the following roles:
    • Data Kit Admin (sn_data_kit.admin): Full access to create projects, manage teams, configure layouts, and review all labeling work
    • Analyst (sn_data_kit.analyst): Limited access to assigned projects only; can view and label tasks but cannot modify project configurations
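
    The access rules for the two roles can be summarized in a short sketch. This only illustrates the rules listed above; it is not how the platform enforces roles. The function names and the project identifiers are hypothetical, and only the two role names come from the list above.

```python
ADMIN = "sn_data_kit.admin"
ANALYST = "sn_data_kit.analyst"


def can_access_project(role, project, assigned_projects):
    """Admins can reach every project; analysts only their assigned ones."""
    if role == ADMIN:
        return True
    if role == ANALYST:
        return project in assigned_projects
    return False  # any other role has no access to data labeling


def can_configure_project(role):
    """Only admins may modify project configurations."""
    return role == ADMIN


# Hypothetical usage: an analyst assigned to a single project.
assigned = {"project_a"}
print(can_access_project(ANALYST, "project_a", assigned))
print(can_access_project(ANALYST, "project_b", assigned))
print(can_configure_project(ANALYST))
```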

    Data foundation

    • Datasets

      Collections of records available for labeling (e.g., synthetic datasets from data curation teams)

    • Data Collections

      Packaged combinations of uploaded and AI-generated records (e.g., 100 uploaded records + 500 AI-augmented records = 600-record collection ready for labeling)
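
    The arithmetic in the data collection example can be sketched as follows. The record contents, the `source` tags, and the `build_collection` helper are hypothetical and only illustrate how uploaded and AI-generated records add up to one collection.

```python
def build_collection(uploaded, generated):
    """Tag each record with its origin and return one combined list."""
    tagged = [{"source": "uploaded", "data": r} for r in uploaded]
    tagged += [{"source": "ai_generated", "data": r} for r in generated]
    return tagged


# 100 uploaded records plus 500 AI-augmented records.
uploaded = [f"uploaded_{i}" for i in range(100)]
generated = [f"generated_{i}" for i in range(500)]

collection = build_collection(uploaded, generated)
print(len(collection))  # 600
```

    Keeping the origin on each record lets a reviewer filter or compare uploaded and AI-generated records during labeling.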