Datasets

  • Release version: Australia
  • Updated May 21, 2026
  • 2 minutes to read

    Datasets in AI Risk and Compliance capture and govern the data used by AI models, enabling organizations to evaluate risk, ensure compliance, and maintain transparency across the AI asset life cycle.

    The AI dataset supports governance objectives by capturing key information about AI models, including risk assessments, compliance status, ownership, audit trails, and performance metrics. It also enables effective oversight, accountability, and decision-making within the organization. The quality and composition of a dataset directly impact the performance, fairness, and accuracy of the AI model. Well-curated datasets help verify that models learn meaningful patterns and generate reliable outputs in real-world scenarios.

    Each dataset should be evaluated for completeness, accuracy, and relevance to the intended use case. Bias in datasets can lead to unfair or inaccurate model predictions and should be identified and mitigated. Tracking data lineage helps verify traceability, transparency, and accountability in how datasets are used and maintained.
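    The completeness and bias checks described above can be sketched in a few lines. This is only an illustration, not part of the product: the function names, the sample records, and the field names (`age`, `region`, `label`) are all hypothetical, and a real evaluation would use criteria suited to the intended use case.

```python
from collections import Counter
from typing import Any


def field_completeness(records: list[dict[str, Any]], fields: list[str]) -> dict[str, float]:
    """Return, per field, the fraction of records with a non-missing value."""
    if not records:
        return {name: 0.0 for name in fields}
    result = {}
    for name in fields:
        # Treat None and the empty string as missing values.
        present = sum(1 for r in records if r.get(name) not in (None, ""))
        result[name] = present / len(records)
    return result


def label_shares(records: list[dict[str, Any]], label_field: str) -> dict[str, float]:
    """Return the share of each label value, a rough first signal of imbalance."""
    counts = Counter(
        r[label_field] for r in records if r.get(label_field) not in (None, "")
    )
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Hypothetical sample records; the field names are illustrative only.
sample = [
    {"age": 34, "region": "EU", "label": "approved"},
    {"age": None, "region": "US", "label": "denied"},
    {"age": 51, "region": "", "label": "approved"},
    {"age": 29, "region": "EU", "label": "approved"},
]

completeness = field_completeness(sample, ["age", "region", "label"])
shares = label_shares(sample, "label")
```

    A low completeness value for a field, or a label share that is heavily skewed, would be a prompt for closer review rather than a verdict on the dataset.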

    Datasets must comply with data protection regulations, including privacy laws and organizational data handling policies. Regular reviews and updates help maintain dataset quality and reflect evolving data standards or business needs.

    The following image shows the overview page of datasets.
    Figure 1. Datasets overview page
    An AI dataset record provides an aggregated risk score. The individual risk scores for entities that have Risk assessment for AI inventory as their Risk Assessment Methodology (RAM) roll up to form the aggregated risk score. You can see the aggregated risk score under the Details tab of the AI system record, in the Aggregated risk score section. For more information about how the risk score is rolled up, see Risk score rollup in Advanced Risk Assessment.
    Important:
    To see the aggregated risk score, you must enable the Migrate to Advanced Risk Assessments property (sn_risk_advanced.migrate_to_advanced_risk) under All > Advanced Risk > Properties.
    Note:
    You can see this section only if the Advanced Risk application is installed.

    The aggregated risk score consolidates individual risks, such as bias, drift, and security, into departmental or enterprise-level AI risk profiles, giving higher-level visibility and oversight. For example, several customer-facing AI models that each show signs of bias can together amount to an organizational risk. With the aggregated risk score, the AI Risk and Compliance team gets a consolidated view of AI risks across multiple models, teams, and business units instead of relying on fragmented risk assessments.
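    To make the idea of consolidation concrete, here is a minimal sketch of one possible rollup. The actual calculation is defined by Advanced Risk Assessment and is not described on this page, so everything below is an assumption made for illustration: the `EntityRisk` type, the 0-100 scale, and the rule of taking the highest score per risk category and then averaging the categories.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EntityRisk:
    entity: str    # hypothetical AI model or AI system name
    category: str  # e.g. "bias", "drift", "security"
    score: float   # assumed 0-100 scale, for illustration only


def worst_score_by_category(risks: list[EntityRisk]) -> dict[str, float]:
    """Keep the highest individual score seen in each risk category."""
    worst: dict[str, float] = {}
    for risk in risks:
        worst[risk.category] = max(worst.get(risk.category, 0.0), risk.score)
    return worst


def aggregated_risk_score(risks: list[EntityRisk]) -> float:
    """Assumed rollup rule: average of the per-category worst scores."""
    worst = worst_score_by_category(risks)
    return mean(worst.values()) if worst else 0.0


# Hypothetical individual risk scores for two models.
risks = [
    EntityRisk("model_a", "bias", 80.0),
    EntityRisk("model_b", "bias", 60.0),
    EntityRisk("model_a", "drift", 40.0),
    EntityRisk("model_b", "security", 30.0),
]

per_category = worst_score_by_category(risks)
overall = aggregated_risk_score(risks)
```

    In this sketch the two bias scores collapse into a single category value before averaging, which shows how many fragmented assessments can be reduced to one figure per department or enterprise.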

    Related AI assets

    The Related AI assets section lists the following for an AI dataset:

    • AI systems: The AI systems that use this AI dataset.
    • AI models: The AI models that use this AI dataset.
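
    The relationship between a dataset and its related assets can be pictured as a simple lookup, which is what makes impact analysis possible when a dataset changes. The sketch below is illustrative only: `DatasetRecord`, `impacted_assets`, and the sample names are hypothetical and do not reflect the product's tables or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    name: str
    ai_systems: list[str] = field(default_factory=list)  # AI systems using the dataset
    ai_models: list[str] = field(default_factory=list)   # AI models using the dataset


def impacted_assets(datasets: list[DatasetRecord], dataset_name: str) -> dict[str, list[str]]:
    """Return the AI systems and AI models that use the named dataset."""
    for record in datasets:
        if record.name == dataset_name:
            return {
                "ai_systems": sorted(record.ai_systems),
                "ai_models": sorted(record.ai_models),
            }
    # Unknown dataset: nothing is impacted.
    return {"ai_systems": [], "ai_models": []}


# Hypothetical inventory.
inventory = [
    DatasetRecord(
        name="loan_applications_2025",
        ai_systems=["credit_portal"],
        ai_models=["risk_scorer_v2", "fraud_detector"],
    ),
    DatasetRecord(name="support_tickets", ai_models=["ticket_router"]),
]

impact = impacted_assets(inventory, "loan_applications_2025")
```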