...

The user analyzes data with a BI tool and finds that predictive analytics on that dataset would be valuable. This is the traditional step in which a user interacts with BI.

(a) Obtain a token with the permissions associated with the user making the request. This token is passed to the AI platform, allowing it to access the training data by running a SQL statement against the datastore.
(b) The BI tool, on behalf of the user, requests the AI platform through OBAIC to train/prepare a model that accepts features of certain types (numeric, categorical, text, etc.).

title	API to train model using provided dataset

Model configuration is based on configs from the open-source Ludwig project. At a minimum, we should be able to define inputs and outputs in a fairly standard way. All other model configuration parameters go into the options field (`modelOptions` in the example request).

The data stanza provides a bearer token allowing the ML provider to access the required data table(s) for training. The provided SQL query indicates how the training data should be extracted from the source.

Do not confuse the Bearer token, which authenticates the caller with OBAIC, with the dbToken, which is created in 2(a) and which the AI platform uses to access the data source for training.

HTTP Request	Value
Method	`POST`
Header	`Authorization: Bearer {token}`
URL	`{prefix}/models/`
Query Parameters	`{ "dbToken": "D41C4A382C27A4B5DF824E2D4F148", "inputs": [ { "name": "customerAge", "type": "numeric" }, { "name": "activeInLastMonth", "type": "binary" } ], "outputs": [ { "name": "canceledMembership", "type": "binary" } ], "modelOptions": { "providerSpecificOption": "value" }, "data": { "sourceType": "snowflake", "endpoint": "some/endpoint", "bearerToken": "...", "query": "SELECT foo FROM bar WHERE baz" } }`
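
As a rough sketch of how a BI tool might assemble this call, the Python below builds (but does not send) the training request using only the standard library. The helper name `build_train_request`, the prefix `https://obaic.example.com/v1`, and the `OBAIC_TOKEN` value are hypothetical placeholders; the field names follow the example request above.

```python
import json
import urllib.request


def build_train_request(prefix, obaic_token, db_token, inputs, outputs,
                        data, model_options=None):
    """Assemble a POST {prefix}/models/ request; nothing is sent here."""
    body = {
        "dbToken": db_token,  # token from step 2(a), used by the AI platform
        "inputs": inputs,
        "outputs": outputs,
        "modelOptions": model_options or {},
        "data": data,
    }
    return urllib.request.Request(
        url=f"{prefix}/models/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # The Bearer token authenticates the caller with OBAIC itself.
            "Authorization": f"Bearer {obaic_token}",
            "Content-Type": "application/json",
        },
    )


request = build_train_request(
    prefix="https://obaic.example.com/v1",  # hypothetical prefix
    obaic_token="OBAIC_TOKEN",              # placeholder value
    db_token="D41C4A382C27A4B5DF824E2D4F148",
    inputs=[{"name": "customerAge", "type": "numeric"},
            {"name": "activeInLastMonth", "type": "binary"}],
    outputs=[{"name": "canceledMembership", "type": "binary"}],
    data={"sourceType": "snowflake", "endpoint": "some/endpoint",
          "bearerToken": "...", "query": "SELECT foo FROM bar WHERE baz"},
)
```

Keeping the two tokens as separate arguments makes it harder to mix up the OBAIC credential with the data source credential.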

title	Alternatively, we may also consider supporting SQL-like syntax for model training

If we go beyond a REST API, a SQL-like syntax is an alternative, since SQL is already well known.

The following uses BigQuery ML model creation as an example and generalizes it:

CREATE MODEL (
customerAge WITH ENCODING (
type=numeric
),
activeInLastMonth WITH ENCODING (
type=binary
),
canceledMembership WITH DECODING (
type=binary
)
)
FROM myData (
sourceType=snowflake,
endpoint="some/endpoint",
bearerToken=<...>
)
AS (SELECT foo FROM BAR)
WITH OPTIONS ();

title	200: Training has started and the corresponding ID is returned for future reference

HTTP Response	Value
Header	`Content-Type: application/json; charset=utf-8`
Body	`{ "modelID": "d677b054-2cd4-4711-959b-971af0081a73" }`

modelID is generated and returned to the caller if training starts successfully. It is used to check the status of the training and for later inference (see the Inference section below).

The AI platform provides the implementation that fulfills the request by connecting to the data source with the provided token and extracting the training data specified by the SQL query. How the AI platform interacts with the data source to perform the training is up to the platform.

The BI tool polls for the status or retrieves the training result. If the training is still in progress, the status will be returned. When training is completed, the results and performance of the model will be returned.

title	API to get model status

HTTP Request	Value
Method	`GET`
Header	`Authorization: Bearer {token}`
URL	`{prefix}/modelStatus?modelID=`
Query Parameters	modelID (type: String): The modelID returned by a previous OBAIC call, either from training or from the list of models.

title	200: Status of the Model returned

HTTP Response	Value
Header	`Content-Type: application/json; charset=utf-8`
Body	`{ "modelID": "d677b054-2cd4-4711-959b-971af0081a73", "status": "training", "progress": "80" }`

modelID is the same ID provided in the request
status is one of training | inferencing | ready
progress is the estimated progress of the current status
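
The polling step could be sketched as follows. This is only an illustration under stated assumptions: `wait_until_ready` and the caller-supplied `fetch_status` function are hypothetical names, the polling interval and retry limit are arbitrary, and the demo uses canned responses rather than a live endpoint.

```python
import time


def wait_until_ready(fetch_status, model_id, interval_s=5.0, max_polls=120,
                     sleep=time.sleep):
    """Poll until the model status is "ready" and return that status document.

    fetch_status is a caller-supplied function (hypothetical) that performs
    GET {prefix}/modelStatus?modelID=<model_id> and returns the parsed JSON.
    """
    for _ in range(max_polls):
        status = fetch_status(model_id)
        if status["status"] == "ready":
            return status
        sleep(interval_s)  # still "training" (or "inferencing"): wait and retry
    raise TimeoutError(f"model {model_id} not ready after {max_polls} polls")


# Demo with canned responses; the "ready" document is made up for illustration.
_canned = iter([
    {"modelID": "m1", "status": "training", "progress": "80"},
    {"modelID": "m1", "status": "ready", "progress": "100"},
])
result = wait_until_ready(lambda _id: next(_canned), "m1",
                          sleep=lambda _s: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.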

The BI tool presents the result to the user in its own way; this is the "secret sauce" that is unique to each tool.

Protocol - Inference

...

1. BI Tool asks for a list of available models

title	API to list models accessible to the recipient

HTTP Request	Value
Method	`GET`
Header	`Authorization: Bearer {token}`
URL	`{prefix}/models/{model}`
Query Parameters	maxResults (type: Int32, optional): The maximum number of results per page that should be returned. If the number of available results is larger than `maxResults`, the response will provide a `nextPageToken` that can be used to get the next page of results in subsequent list requests. The server may return fewer than `maxResults` items even if more are available, so the client should check `nextPageToken` in the response to determine whether more results exist. Must be non-negative. 0 will return no results, but `nextPageToken` may be populated.
	pageToken (type: String, optional): Specifies a page token to use. Set `pageToken` to the `nextPageToken` returned by a previous list request to get the next page of results. `nextPageToken` will not be returned in a response if there are no more results available.

...

title	GET {prefix}/models/6d4b571a-80ca-41ef-bc67-b158f4352ad8

{
    "id": "6d4b571a-80ca-41ef-bc67-b158f4352ad8",
    "name": "Model 1",
    "revision": 3,
    "format": { 
      "name": "PMML",
      "version": "4.3"
    },
    "algorithm": "Neural Network", 
    "tags": [
      "Anomaly detection",         
      "Banking"                    
    ],                              
    "dependency": "",
    "creator": "John Doe",
    "description": "This is a predictive model, refer to {input} and {output} for detailed format of each field, such as value range of a field, as well as possible predictions the model will give. You may also refer to the example data here.",
    "input": {
      "fields": [
        {
          "name": "Account ID",
          "opType": "categorical",
          "dataType": "string",
          "taxonomy": "ID",
          "example": "account abc-001",
          "allowMissing": false,
          "description": "unique value"
        },
        {
          "name": "Account Balance",
          "opType": "continuous",
          "dataType": "double",
          "taxonomy": "currency",
          "example": "1,378,560.00",
          "allowMissing": true,
          "description": "Minimum: 0, Maximum: 999,999,999.00"
        }
      ],
      "ref": "http://dmg.org/pmml/v4-3/pmml-4-3.xsd"
    },
    "output": {
      "fields": [
        {
          "name": "Churn",
          "opType": "continuous",
          "dataType": "string",
          "taxonomy": "ID",
          "example": "0.67",
          "allowMissing": false,
          "description": "the probability that the account stops doing business with the company over 6 months"
        }
      ],
      "ref": "http://dmg.org/pmml/v4-3/pmml-4-3.xsd"
    },
    "performance": {            
      "metric": "accuracy",     
      "value": 0.85
    },
    "rating": 5,
    "url": "uri://link_to_the_model"  
}
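
One way a BI tool could use the `input.fields` metadata is to check a record before sending it for inference. The sketch below is illustrative only: `missing_required_fields` is a hypothetical helper, and treating an absent `allowMissing` as `true` is an assumption, not something this page specifies.

```python
def missing_required_fields(model, record):
    """Return names of input fields with allowMissing == false that the
    record does not supply (absent key or None value)."""
    missing = []
    for field in model["input"]["fields"]:
        # Assumption: a field without allowMissing is treated as optional.
        if field.get("allowMissing", True):
            continue
        if record.get(field["name"]) is None:
            missing.append(field["name"])
    return missing


# Excerpt of the model description above, reduced to the fields used here.
model = {
    "input": {
        "fields": [
            {"name": "Account ID", "allowMissing": False},
            {"name": "Account Balance", "allowMissing": True},
        ]
    }
}
problems = missing_required_fields(model, {"Account Balance": 1378560.00})
```

Here `problems` lists `Account ID`, because that field does not allow missing values and the record omits it.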

Error - Applies to all API calls above

title	400: The request is malformed

HTTP Response	Value
Header	`Content-Type: application/json`
Body	`{` `"errorCode": "string",` `"message": "string"` `}`
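
On the client side, the error body could be turned into an exception along these lines. This is a sketch, not part of the protocol: the names `ObaicError` and `raise_for_error` are hypothetical, and falling back to `"unknown"` when the body is not the expected JSON object is an assumption.

```python
import json


class ObaicError(Exception):
    """Carries the errorCode and message fields of an OBAIC error body."""

    def __init__(self, status, error_code, message):
        super().__init__(f"{status} {error_code}: {message}")
        self.status = status
        self.error_code = error_code
        self.message = message


def raise_for_error(status, body_text):
    """Raise ObaicError for HTTP status >= 400; otherwise return None."""
    if status < 400:
        return None
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    # Assumption: fall back to placeholders if the body is not as documented.
    raise ObaicError(status,
                     body.get("errorCode", "unknown"),
                     body.get("message", body_text))
```

A caller would invoke `raise_for_error(response_status, response_text)` right after each request and handle `ObaicError` in one place.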

...

Versions Compared

Old Version 23

New Version 24
