Amplify'd ML Model Deployment - Part I: Need for a New Deployment Architecture
Analytics has evolved dramatically in the past two decades, from dedicated server-side machines to the cloud and now to the edge (IoT devices, smartphones).
Some of the reasons for this move are:
- Speed: Data is processed as soon as it is captured, and there is no I/O bottleneck because no data is transferred.
- Privacy: Private data is not transferred to any server, so the chances of a privacy violation are very low.
- Cost: As model inferencing moves to the edge device (e.g. smartphones), it saves data transfer and server costs.
- Connectivity: As inferencing/scoring runs directly on the edge device rather than in the cloud, it keeps working in areas with limited or no internet connectivity.
Let us compare the two prominent ways in which models are deployed in a mobile application today:
|Bundled along with app assets (Ex. |Deployed on a server|
|---|---|
|Model inferencing requires no internet|Model inferencing requires internet|
|Models are strongly coupled with app updates. Running the latest version of a model requires an app update|Users can always run inferencing on the latest models, as they are updated server-side|
I recently came across Amplify DataStore, a persistent storage engine that synchronizes data between apps and the cloud. This on-device data store comes with a programming model for working with shared and distributed data without writing additional code for offline and online scenarios.
This quickly ignited some sparks 🔥 and I started wondering -
What happens when we combine Amplify DataStore & ML Model Deployment?
- Model is not bundled with app assets.
- The latest model is uploaded to the cloud database (DynamoDB) using the AppSync GraphQL API.
- This latest model is synced automatically with the local data store on the device. Users now have the latest model at their disposal without going through an app update.
- The model is on-device => Inferencing works offline.
- ML Model Deployment is now Amplify'd - you get the best of both worlds. 🎉
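To make the idea in the list above concrete, here is a minimal, in-memory sketch of the "newest version wins" behaviour it describes. This is a toy simulation only: `ModelRecord`, `LocalStore`, and `sync_from` are hypothetical names invented for illustration and are not part of the Amplify DataStore API, which handles synchronization for you.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical record: a serialized model plus a version number."""
    name: str
    version: int
    payload: str  # serialized model, e.g. a JSON string


class LocalStore:
    """Toy stand-in for an on-device store (NOT the Amplify DataStore API)."""

    def __init__(self) -> None:
        self._records: Dict[str, ModelRecord] = {}

    def get(self, name: str) -> Optional[ModelRecord]:
        # Reads never touch the network, so they also work offline.
        return self._records.get(name)

    def sync_from(self, cloud: Dict[str, ModelRecord], online: bool) -> int:
        """Copy newer cloud records to the device; return how many changed."""
        if not online:
            return 0  # offline: keep serving what is already on the device
        updated = 0
        for name, remote in cloud.items():
            local = self._records.get(name)
            if local is None or remote.version > local.version:
                self._records[name] = remote
                updated += 1
        return updated


# Usage: the device picks up version 2 only once it is back online.
cloud = {"churn": ModelRecord("churn", 1, '{"w": [0.5]}')}
store = LocalStore()
store.sync_from(cloud, online=True)    # device now holds version 1
cloud["churn"] = ModelRecord("churn", 2, '{"w": [0.7]}')
store.sync_from(cloud, online=False)   # no connectivity: still version 1
store.sync_from(cloud, online=True)    # reconnect: device holds version 2
```

The point of the sketch is that the model lives in the local store, so inference reads never depend on connectivity, while updates arrive without an app release.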
Ecosystem & Architecture
Now let us get started with the process of building 🛠 this capability. Here are the various parts of the ecosystem we will need to build:
- There is a popular model exchange format (.onnx) for exchanging neural network models, but the format is not database friendly. We need a Model Specification Document which can be followed by any developer. This document will outline the JSON schema of some popular traditional machine learning models so that developers can write their own model exporters for the library/language of their choice. The language chosen for this model exchange format is JSON, as JSON parsing is well supported by almost all programming languages and a JSON document can easily be serialized into a string that can be stored in a database.
- The next step is to develop an implementation of the above document in the form of a library. A Python library will be developed which will aid in exporting scikit-learn models into the prescribed JSON model format.
- A client library which can consume the JSON model and run scoring on new data. A Dart library will be developed for this purpose.
- A mobile application will be developed which has authentication (Amplify Authentication) in place to provide secure access to the model synced by Amplify DataStore. As Amplify recently made its libraries generally available for Flutter (which is awesome 💙), we will develop a Flutter application which will also use the above Dart library for running the model on device.
- Finally, the model is uploaded via the AppSync GraphQL API to the cloud database, which syncs with the on-device DataStore and makes the latest model available every time.
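As a rough illustration of the export and scoring steps in the list above, the sketch below serializes a logistic-regression-like model to a JSON string and scores a new row from that string using only the Python standard library. The field names (`model_type`, `schema_version`, `feature_names`, `coefficients`, `intercept`) and both function names are assumptions made for this example; they are not the schema defined by the Model Specification Document, and the scoring side is shown in Python rather than Dart purely to keep the example in one language.

```python
import json
import math
from typing import Dict, List


def export_linear_model(coefficients: List[float], intercept: float,
                        feature_names: List[str]) -> str:
    """Serialize a logistic-regression-like model to a JSON string.

    The field names are hypothetical, not the article's specification.
    """
    if len(coefficients) != len(feature_names):
        raise ValueError("one coefficient per feature is required")
    document = {
        "model_type": "logistic_regression",
        "schema_version": 1,
        "feature_names": feature_names,
        "coefficients": coefficients,
        "intercept": intercept,
    }
    # A plain string can be stored in a single database field.
    return json.dumps(document)


def score(model_json: str, row: Dict[str, float]) -> float:
    """Parse the JSON model and return the predicted probability for a row."""
    model = json.loads(model_json)
    if model["model_type"] != "logistic_regression":
        raise ValueError("unsupported model type: " + model["model_type"])
    z = model["intercept"]
    for name, weight in zip(model["feature_names"], model["coefficients"]):
        z += weight * row[name]
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


model_json = export_linear_model([1.0, -1.0], 0.0, ["age", "tenure"])
probability = score(model_json, {"age": 1.0, "tenure": 1.0})
```

For a fitted binary scikit-learn `LogisticRegression` named `clf`, the inputs could plausibly come from `clf.coef_[0].tolist()` and `float(clf.intercept_[0])`; a real exporter would also need to handle multi-class models and preprocessing steps, which this sketch ignores.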
Let us now begin with the first task of developing the Model Specification Document.