DatasetDefinition - Global
The DatasetDefinition API provides methods to identify a set of records by table name, columns, and row selection criteria, for use as input to ML training algorithms. A dataset definition doesn't contain the actual data.
This API requires the Predictive Intelligence plugin (com.glide.platform_ml) and is provided within the sn_ml namespace. For more information, see Predictive Intelligence.
For usage guidelines, refer to Using ML APIs.
DatasetDefinition - DatasetDefinition(Object)
Creates an instance of the DatasetDefinition class, enabling you to define a dataset by table name, fields, and query.
Create your dataset definition by passing a table and a list of fields. You can also pass a query to restrict the dataset to rows with specific characteristics.
Once created, a DatasetDefinition object cannot be modified.
| Name | Type | Description |
|---|---|---|
| config | Object | JavaScript object containing the dataset definition properties. |
| config.tableName | String | Name of the table for the dataset. For example, "tableName" : "incident". |
| config.fieldNames | Array | Optional. List of field names from the specified table as strings. For example, "fieldNames" : ["short_description", "priority"]. Default: All fields |
| config.fieldDetails | Array | Optional. List of JavaScript objects that specify field properties. Use this property to force machine learning algorithms to interpret fields as being a specific type. You do not need to provide field details for every field listed in the fieldNames property, but every entry must correspond to a field listed in the fieldNames array. |
| config.fieldDetails.name | String | Name of the field to which the specified type applies. If used, this field name must match the corresponding name listed in the fieldNames property. |
| config.fieldDetails.type | String | Machine-learning field type. Specifying the data type forces the ML trainer to interpret a field as having that type. If no data type is specified, the system determines the type. Supported types include nominal and text, both of which appear in the example below. These types identify data types from a machine learning perspective. The ML type might differ from the type listed in the source table. A field can be a string type, but its purpose can be to encode a nominal value. For example, t-shirt sizes such as "XL", "L", or "M" are string types in the table, but each value represents a category of a nominal attribute from an ML perspective. |
| config.encodedQuery | String | Optional. Encoded query string in standard Glide format. See Encoded query strings. You can construct the query to be absolute or relative. For example, your query can return rows for the previous 3 months (relative), or for the May through July period (absolute). Whether using an absolute or relative pattern, the data a definition identifies can change if the rows in the underlying table change. |
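The difference between relative and absolute queries can be sketched as plain string building. The snippet below is a minimal illustration, not an official pattern: `andConditions` is a hypothetical helper, the relative window reuses the "last 2 quarters" string from the reference example in this section, and the absolute window assumes the common `BETWEEN` form with `gs.dateGenerate()`; the dates are illustrative only. The resulting objects would be passed to `new sn_ml.DatasetDefinition()` on an instance.

```javascript
// Hypothetical helper (not part of sn_ml): joins encoded-query
// conditions with "^", the encoded-query AND operator.
function andConditions(conditions) {
    return conditions.join('^');
}

// Relative window: copied from the reference example (last two quarters).
// The rows it matches shift as time passes.
var relativeWindow = 'sys_created_onONLast%202%20quarters' +
    '@javascript:gs.beginningOfLast2Quarters()' +
    '@javascript:gs.endOfLast2Quarters()';

// Absolute window: assumed BETWEEN syntax with gs.dateGenerate().
// The dates are illustrative (May through July of one year).
var absoluteWindow = 'sys_created_onBETWEEN' +
    "javascript:gs.dateGenerate('2023-05-01','00:00:00')" +
    "@javascript:gs.dateGenerate('2023-07-31','23:59:59')";

// Configuration objects that differ only in their date window.
var relativeConfig = {
    tableName: 'incident',
    encodedQuery: andConditions([relativeWindow, 'state=3'])
};
var absoluteConfig = {
    tableName: 'incident',
    encodedQuery: andConditions([absoluteWindow, 'state=3'])
};
```

In both cases the definition stores only the query, so the matching rows are resolved against the table's current contents.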
The following example shows how to create a dataset definition.
var myData = new sn_ml.DatasetDefinition({
    'tableName' : 'incident',
    'fieldNames' : ['category', 'short_description', 'priority', 'assignment_group.name'],
    'fieldDetails' : [
        {
            'name' : 'category',
            'type' : 'nominal'
        },
        {
            'name' : 'short_description',
            'type' : 'text'
        }
    ],
    'encodedQuery' : 'sys_created_onONLast%202%20quarters@javascript:gs.beginningOfLast2Quarters()@javascript:gs.endOfLast2Quarters()^state=3'
});
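Because a DatasetDefinition object cannot be modified after creation, it can help to check the configuration first. The following is a minimal sketch of one such check, based on the rule that every fieldDetails entry must name a field listed in fieldNames. The helper `findUnmatchedFieldDetails` is hypothetical and not part of the sn_ml namespace. It assumes fieldNames is supplied and skips the check when it is omitted.

```javascript
// Hypothetical helper (not part of sn_ml): returns the fieldDetails names
// that do not appear in fieldNames. If fieldNames is omitted there is
// nothing to compare against, so an empty list is returned.
function findUnmatchedFieldDetails(config) {
    if (!Array.isArray(config.fieldNames)) {
        return [];
    }
    var fieldDetails = config.fieldDetails || [];
    return fieldDetails
        .map(function (detail) { return detail.name; })
        .filter(function (name) { return config.fieldNames.indexOf(name) === -1; });
}

var draftConfig = {
    tableName: 'incident',
    fieldNames: ['category', 'short_description', 'priority'],
    fieldDetails: [
        { name: 'category', type: 'nominal' },
        { name: 'description', type: 'text' } // not listed in fieldNames
    ]
};

// Names to fix before calling new sn_ml.DatasetDefinition(draftConfig).
var unmatched = findUnmatchedFieldDetails(draftConfig);
```

Here `unmatched` contains `'description'`, which signals that the field should either be added to fieldNames or removed from fieldDetails.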
DatasetDefinition - getEligibleFields(String capability)
Returns the fields that are eligible as either input fields (features) or predicted fields for a solution of the given capability, for example, a classification solution. Eligibility is based on whether the fields have the appropriate Glide data types.
| Name | Type | Description |
|---|---|---|
| capability | String | Capability for which to retrieve fields eligible for training. This method currently supports only classification solutions, passed as "classification" in the example below. Any other value for the capability throws a "capability not supported" exception. |
| Type | Description |
|---|---|
| Object | Object containing eligible input field names and eligible output field names. |
| <Object>.eligibleInputFieldNames | List of strings indicating input fields eligible for training. Data type: Array |
| <Object>.eligibleOutputFieldNames | List of strings indicating output fields eligible for training. Data type: Array |
The following example shows how to display eligible fields for a classification solution.
var myIncidentData = new sn_ml.DatasetDefinition({
    'tableName' : 'incident',
    'encodedQuery' : 'activeANYTHING'
});
var eligibleFields = JSON.parse(myIncidentData.getEligibleFields('classification'));
gs.print(JSON.stringify(eligibleFields, null, 2));
Output:
{
  "eligibleInputFieldNames": [
    "resolved_by",
    "short_description",
    "description",
    "notify"
  ],
  "eligibleOutputFieldNames": [
    "parent",
    "caused_by",
    "location",
    "category"
  ]
}
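One possible use of this result is to check planned training fields against the eligible lists before building a solution. The sketch below treats the sample output above as plain data. `partitionByEligibility` is a hypothetical helper, not part of the sn_ml namespace, and the requested field names are illustrative.

```javascript
// Sample result, copied from the output above.
var eligibleFields = {
    eligibleInputFieldNames: ['resolved_by', 'short_description', 'description', 'notify'],
    eligibleOutputFieldNames: ['parent', 'caused_by', 'location', 'category']
};

// Hypothetical helper (not part of sn_ml): splits the requested field
// names into those found in the eligible list and those that are not.
function partitionByEligibility(requested, eligible) {
    var result = { eligible: [], ineligible: [] };
    requested.forEach(function (name) {
        var bucket = eligible.indexOf(name) !== -1 ? result.eligible : result.ineligible;
        bucket.push(name);
    });
    return result;
}

// Check the planned input fields against the eligible input list.
var inputs = partitionByEligibility(
    ['short_description', 'priority'],
    eligibleFields.eligibleInputFieldNames
);

// Check the planned predicted field against the eligible output list.
var outputIsEligible =
    eligibleFields.eligibleOutputFieldNames.indexOf('category') !== -1;
```

With this sample data, `short_description` is reported as eligible, `priority` as ineligible, and `category` is an eligible output field.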