...

  • Machine learning has 2 key steps: Model Training and Result Inference. The first release of this protocol focuses only on inference. Training is described here but is subject to further discussion.

Overall Flow

[Image: OBAIC overall flow]

High Level Protocol - Training

  1. BI tool has some data on which predictive analytics would be valuable. 
  2. BI tool asks the AI platform, through OBAIC, to train/prepare a model that accepts features of given types (numeric, categorical, text, etc.). The request carries a token that grants access to the training data, which is selected by a SQL statement run against the datastore.
  3. BI tool polls for the status/result of the training. When training is completed, results and performance will be returned.
  4. AI vendor provides predictions on data shared by BI vendor, again using an access token.
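
The training steps above amount to a submit-then-poll loop, which can be sketched as follows. This is a minimal sketch rather than part of the protocol: the `submit` and `get_status` callables stand in for the HTTP calls to the AI platform, and the `status` field and its `COMPLETED` value are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict


def train_and_wait(
    submit: Callable[[Dict[str, Any]], str],
    get_status: Callable[[str], Dict[str, Any]],
    request: Dict[str, Any],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Submit a training request (step 2), then poll until it finishes (step 3).

    `submit` returns an identifier for the model being trained and
    `get_status` returns the decoded status document for that identifier.
    Both shapes are hypothetical and are not defined by OBAIC.
    """
    model_id = submit(request)
    for _ in range(max_polls):
        status = get_status(model_id)
        # "COMPLETED" is an assumed status value, not one taken from the spec.
        if status.get("status") == "COMPLETED":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"training of {model_id} did not complete in time")
```

Passing the transport in as callables keeps the loop independent of any particular HTTP library, and the `max_polls` bound prevents a stalled training job from blocking the BI tool indefinitely.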


High Level Protocol - Inference

[Image: High Level Protocol - Inference]

REST APIs

All of the REST API calls presented below use bearer tokens for authorization. The {prefix} of each API is configurable on the hosting server. This protocol is inspired by Delta Sharing.

List Models - Step (1)

API to list models accessible to the recipient


HTTP Request      Value
Method            GET
Header            Authorization: Bearer {token}
URL               {prefix}/models/{model}

Query Parameters

maxResults (type: Int32, optional): The maximum number of results to return per page. If more results are available than maxResults, the response provides a nextPageToken that can be used to fetch the next page in a subsequent list request. The server may return fewer than maxResults items even if more are available, so the client should check nextPageToken in the response to determine whether more results exist. Must be non-negative. A value of 0 returns no results, but nextPageToken may still be populated.

pageToken (type: String, optional): Specifies a page token to use. Set pageToken to the nextPageToken returned by a previous list request to get the next page of results. nextPageToken will not be returned in a response if there are no more results available.
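
The paging rules for maxResults and pageToken can be illustrated with a small client-side loop. This is a sketch under stated assumptions: `fetch_page` is a hypothetical callable standing in for the GET request, and the `items` key of the response is assumed, since the response body is not shown in this section.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_models(
    fetch_page: Callable[[Optional[str], Optional[int]], Dict[str, Any]],
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every model by following nextPageToken until it is absent.

    `fetch_page(page_token, max_results)` must return the decoded JSON body
    of one list response. The "items" key is an assumption.
    """
    if max_results is not None and max_results < 0:
        raise ValueError("maxResults must be non-negative")
    page_token: Optional[str] = None
    while True:
        page = fetch_page(page_token, max_results)
        # A page may hold fewer than max_results items, or none at all,
        # so the loop is driven by nextPageToken rather than by item count.
        yield from page.get("items", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

Because the format of nextPageToken is not yet defined, the sketch treats it as an opaque string and passes it back unchanged. A caller should avoid `max_results=0` here: the server may keep returning a token with no items, in which case the loop would not terminate.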


...

500: The request is not handled correctly due to a server error


HTTP Response     Value
Header            Content-Type: application/json
Body              {
                    "errorCode": "string",
                    "message": "string"
                  }
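
A client might turn this error body into an exception along the following lines. This is only a sketch: the `ObaicError` class and the `UNKNOWN` fallback code are hypothetical, and the set of real errorCode values has not been defined by the protocol yet.

```python
import json


class ObaicError(Exception):
    """Error decoded from a response body (the class name is hypothetical)."""

    def __init__(self, status: int, error_code: str, message: str) -> None:
        super().__init__(f"{status} {error_code}: {message}")
        self.status = status
        self.error_code = error_code
        self.message = message


def parse_error(status: int, body: str) -> ObaicError:
    """Decode the {"errorCode", "message"} body, tolerating malformed input."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # "UNKNOWN" is an assumed fallback for bodies without an errorCode.
    return ObaicError(
        status=status,
        error_code=str(payload.get("errorCode", "UNKNOWN")),
        message=str(payload.get("message", body)),
    )
```

Falling back to the raw body as the message keeps some diagnostic information even when a proxy or gateway returns a non-JSON error page.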


Next Step

  • Finalize Logo
  • Determine which other AI frameworks OBAIC can support besides ONNX and PMML

Potential Future Enhancement

  • Formally define the JSON using JSON Schema (http://json-schema.org/) so that future development can validate the JSON structure
  • Define data pipeline to transform data before running
  • Define containerized model so that prediction can run in BI instead of in AI
  • Define format of nextPageToken
  • Define different types of errorCode and message for each API call

Decision to be made

  • Data file type: which types of data are we supporting? For example, Delta requires Parquet; what about RDBMS? Jeffrey's initial cut below can be modified to support multiple data types, depending on the use case.
    • Inference: pass by value should be good enough if it's only for predicting
    • Train: not immediate, maybe later in Phase 2
  • Metadata structure: what kind of JSON schema do we need?
  • Do we support only a specific model type (ONNX) or an arbitrary number of frameworks?
  • Decouple model (asking the model to predict and train) and data (listing, upload, download)
  • Finalize Logo

FAQ

Why should I share our model with you?

Ownership? Model and Data?

Security?

How can the data be accessed mechanically, for training?

Original content from Jeffrey. To be integrated with the main content 

This is a short doc illustrating a sample skeleton OBAIC protocol. This proposal envisions a data-centric workflow:

...

FAQ

  1. Why should AI share its model with BI?
    • OBAIC assumes that one organization owns both the BI tool(s) and the AI platform(s). They are nevertheless 2 (or more) discrete systems that may lack a good way to integrate, and OBAIC connects them.
  2. Who owns the model and data?
    • The AI platform owns the model but shares it with BI tools through OBAIC. The data is owned by the business, but BI has been authorized to use it and to re-share it with the AI platform for training and inference.
  3. How do you deal with Security?
    • Calls are made over HTTPS and authorized with standard bearer tokens.
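
As an illustration of the security answer above, a client could refuse to send the bearer token over anything but HTTPS. This is a minimal sketch using only the Python standard library; the `build_request` helper and the example host are assumptions, not part of OBAIC.

```python
import urllib.request
from urllib.parse import urlsplit


def build_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the bearer token over HTTPS only."""
    if urlsplit(url).scheme != "https":
        # Refuse plain HTTP so the token is never sent unencrypted.
        raise ValueError("OBAIC calls should use HTTPS")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical usage; the host and path are placeholders.
request = build_request("https://ai.example.com/obaic/models", "example-token")
```

The request object can then be handed to `urllib.request.urlopen`, which performs TLS certificate verification by default.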

References

Authors

Name                 Affiliation
Cupid Chan           Pistevo Decision
Xiangxiang Meng      Redfin
Deepak Karuppiah     MicroStrategy
Nancy Rausch         SAS
Dalton Ruer          Qlik
Sachin Sinha         Microsoft
Yi Shao              IBM
Jeffrey Tang         Predibase
Lingyan Yin          Salesforce

...


...

Train a New Model

function TrainModel(inputs, outputs, modelOptions, dataConfig) -> UUID

...