DatasetDefinition - Global
The DatasetDefinition API provides methods to identify a set of records, by table name, columns, and row selection criteria, to use as input for ML training algorithms. A dataset definition describes the data; it doesn't contain the actual data.
This API requires the Predictive Intelligence plugin (com.glide.platform_ml) and is provided within the sn_ml namespace. For information, see Predictive Intelligence.
For usage guidelines, refer to Using ML APIs.
DatasetDefinition - DatasetDefinition(Object)
Creates an instance of the DatasetDefinition class, enabling you to define a dataset by table name, fields, and query.
Create your dataset definition by passing a table and a list of fields. You can also pass a query to restrict datasets to include rows with specific characteristics.
Once created, a DatasetDefinition object cannot be modified.
| Name | Type | Description |
|---|---|---|
| config | Object | JavaScript object containing the dataset definition properties. |
| config.tableName | String | Name of the table for the dataset. For example, "tableName" : "incident". |
| config.fieldNames | Array | Optional. List of field names from the specified table as strings. For example, "fieldNames" : ["short_description", "priority"]. Default: All fields |
| config.fieldDetails | Array | Optional. List of JavaScript objects that specify field properties. Use this property to force machine learning algorithms to interpret fields as being a specific type. You do not need to provide field details for every field listed in the fieldNames property, but every entry must correspond to a field listed in the fieldNames array. |
| config.fieldDetails.name | String | Name of the field whose machine-learning type is being specified. This name must match the corresponding name listed in the fieldNames property. |
| config.fieldDetails.type | String | Machine-learning field type. Specifying the data type forces the ML trainer to interpret a field as having that type. If no data type is specified, the system determines the type. Supported types include nominal and text, both used in the example below. These types identify data types from a machine learning perspective, so the ML type might differ from the type listed in the source table. A field can be a string type, but its purpose can be to encode a nominal value. For example, t-shirt sizes such as "XL", "L", or "M" are string types in the table, but each value represents a category of a nominal attribute from an ML perspective. |
| config.encodedQuery | String | Optional. Encoded query string in the standard platform format. You can construct the query to be absolute or relative. For example, your query can return rows for the previous 3 months (relative), or for the May through July period (absolute). Whether using an absolute or relative pattern, the data a definition identifies can change if the rows in the underlying table change. |
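The absolute and relative query patterns described for config.encodedQuery can be sketched as two small string builders. This is only an illustration: absoluteRange and lastMonths are hypothetical helpers that are not part of sn_ml, the dates are made up, and the BETWEEN and RELATIVEGT operator strings are assumed to match the encoded query syntax on your instance. Confirm them by building the filter in a list view and copying its query.

```javascript
// Hypothetical helpers (not part of sn_ml): they only build encoded query strings.

// Absolute window: rows whose date field falls between two fixed dates.
// Assumes the BETWEEN operator with gs.dateGenerate() script fragments.
function absoluteRange(field, startDate, endDate) {
  return field +
    "BETWEENjavascript:gs.dateGenerate('" + startDate + "','00:00:00')" +
    "@javascript:gs.dateGenerate('" + endDate + "','23:59:59')";
}

// Relative window: rows whose date field is after "n months ago",
// evaluated each time the query runs. Assumes the RELATIVEGT operator.
function lastMonths(field, n) {
  return field + 'RELATIVEGT@month@ago@' + n;
}

// Plain config objects; on an instance they would be passed to
// new sn_ml.DatasetDefinition(config).
var absoluteConfig = {
  tableName: 'incident',
  encodedQuery: absoluteRange('sys_created_on', '2023-05-01', '2023-07-31')
};
var relativeConfig = {
  tableName: 'incident',
  encodedQuery: lastMonths('sys_created_on', 3)
};
```

The absolute config always points at the same date range, while the relative config slides forward as time passes. In both cases the matching rows can still change when the underlying table changes.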
The following example shows how to create a dataset definition.
var myData = new sn_ml.DatasetDefinition({
  'tableName' : 'incident',
  'fieldNames' : ['category', 'short_description', 'priority', 'assignment_group.name'],
  // Force the ML type of selected fields; each name must also appear in fieldNames.
  'fieldDetails' : [
    {
      'name' : 'category',
      'type' : 'nominal'
    },
    {
      'name' : 'short_description',
      'type' : 'text'
    }
  ],
  // Restrict rows to incidents created in the last two quarters with state 3.
  'encodedQuery' : 'sys_created_onONLast%202%20quarters@javascript:gs.beginningOfLast2Quarters()@javascript:gs.endOfLast2Quarters()^state=3'
});
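Because every fieldDetails entry must correspond to a name in fieldNames, it can help to check a config before constructing the definition. The following is a minimal sketch of such a check. findUnlistedFieldDetails is a hypothetical helper, not part of the sn_ml API, and skipping the check when fieldNames is omitted (the "all fields" default) is an assumption.

```javascript
// Hypothetical helper, not part of the sn_ml API.
// Returns the fieldDetails names that do not appear in fieldNames.
function findUnlistedFieldDetails(config) {
  // Assumption: when fieldNames is omitted, all fields are included,
  // so there is no explicit list to check against.
  if (!config.fieldNames) {
    return [];
  }
  var fieldDetails = config.fieldDetails || [];
  return fieldDetails
    .map(function (detail) { return detail.name; })
    .filter(function (name) { return config.fieldNames.indexOf(name) === -1; });
}

var config = {
  tableName: 'incident',
  fieldNames: ['category', 'short_description', 'priority'],
  fieldDetails: [
    { name: 'category', type: 'nominal' },
    { name: 'short_description', type: 'text' }
  ]
};

// Empty array here, because both detail names are listed in fieldNames.
var unlisted = findUnlistedFieldDetails(config);
```

If the returned array is not empty, either add the missing names to fieldNames or remove the extra fieldDetails entries before calling the constructor.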
DatasetDefinition - getEligibleFields(String capability)
Returns the fields that are eligible as input fields (features) or as predicted (output) fields for a solution of a given capability, such as a classification solution. Eligibility is based on whether each field has an appropriate Glide data type.
| Name | Type | Description |
|---|---|---|
| capability | String | Capability for which to retrieve fields eligible for training. This method currently supports only classification solutions; any other value for the capability throws a "capability not supported" exception. Valid value: classification |
| Type | Description |
|---|---|
| Object | Object containing eligible input field names and eligible output field names. |
| <Object>.eligibleInputFieldNames | List of strings indicating input fields eligible for training. Data type: Array |
| <Object>.eligibleOutputFieldNames | List of strings indicating output fields eligible for training. Data type: Array |
The following example shows how to display eligible fields for a classification solution.
var myIncidentData = new sn_ml.DatasetDefinition({
  'tableName' : 'incident',
  'encodedQuery' : 'activeANYTHING'
});
var eligibleFields = JSON.parse(myIncidentData.getEligibleFields('classification'));
gs.print(JSON.stringify(eligibleFields, null, 2));
Output:
{
  "eligibleInputFieldNames": [
    "resolved_by",
    "short_description",
    "description",
    "notify"
  ],
  "eligibleOutputFieldNames": [
    "parent",
    "caused_by",
    "location",
    "category"
  ]
}
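One way to use this result is to screen the fields you plan to train on before building the final dataset definition. The sketch below assumes the result has already been parsed into an object, as in the example above. splitByEligibility and the candidate field list are hypothetical and not part of sn_ml. The sample object is copied from the output shown above.

```javascript
// Hypothetical helper, not part of sn_ml.
// Splits candidate field names into those present in an eligible list and those that are not.
function splitByEligibility(candidates, eligibleNames) {
  var kept = [];
  var dropped = [];
  candidates.forEach(function (name) {
    if (eligibleNames.indexOf(name) !== -1) {
      kept.push(name);
    } else {
      dropped.push(name);
    }
  });
  return { kept: kept, dropped: dropped };
}

// Sample result copied from the output above. On an instance this would come from
// JSON.parse(myIncidentData.getEligibleFields('classification')).
var eligibleFields = {
  eligibleInputFieldNames: ['resolved_by', 'short_description', 'description', 'notify'],
  eligibleOutputFieldNames: ['parent', 'caused_by', 'location', 'category']
};

// Candidate inputs are illustrative; 'work_notes' is not in the sample eligible list.
var inputs = splitByEligibility(
  ['short_description', 'description', 'work_notes'],
  eligibleFields.eligibleInputFieldNames
);

// Check that the intended predicted field is an eligible output field.
var outputIsEligible = eligibleFields.eligibleOutputFieldNames.indexOf('category') !== -1;
```

The kept names can then be passed as fieldNames to a new DatasetDefinition, and the dropped names reviewed or discarded.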