Devops for AI

Your machine learning model distinguishes between breeds of dogs more accurately than seasoned human professionals. Training it required tens of thousands of dog photos, which, though pre-normalized, still required extensive manipulation before becoming useful for supervised training. Now you need to put the model into production, where it will see thousands of random photos rife with stray artifacts like people, landscapes… and cats. Many may contain no dogs at all, causing problems for your model. In other words, you are moving from the clean room of the lab to the messy real world.

Where do you start?

Transitioning from a controlled lab environment to a dynamic, transactional production system involves the same steps with new twists. Data must still be prepared, normalized and transformed, but the transformations must happen automatically and at fire-hose scale. There is still an interface to your model, but now, rather than loading pre-normalized images from a filesystem or database, it involves potentially invalid input streaming through an authenticated REST API and dozens of GPU-enabled, cloud-based virtual machines which hungrily consume data from distributed message queues. Worker processes that seize up during processing need to be gracefully removed from service. Failures need to be reported via well-instrumented dashboards. Performance trends in failures, model accuracy and overall system latency need to be tracked for both immediate systems scaling and long-term planning.
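As a rough illustration of what "potentially invalid input" means in practice, here is a minimal Python sketch that rejects payloads before they ever reach a model. The function name, the size limit and the set of accepted formats are assumptions made for this example, not part of any particular system; the leading-byte signatures are the standard ones for JPEG, PNG and GIF.

```python
from typing import Optional

# Leading "magic bytes" for a few common image formats.
_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
}

# Hypothetical upload limit, chosen only for illustration.
MAX_BYTES = 10 * 1024 * 1024


def sniff_image_format(payload: bytes) -> Optional[str]:
    """Return the detected format name, or None if the payload is rejected."""
    # Reject empty or oversized uploads outright.
    if not payload or len(payload) > MAX_BYTES:
        return None
    # Accept only payloads that start with a known image signature.
    for name, magic in _SIGNATURES.items():
        if payload.startswith(magic):
            return name
    return None
```

A check like this is cheap enough to run at the API boundary, so malformed uploads can be turned away before they consume queue capacity or GPU time. It only inspects the first few bytes; a real service would likely also decode the image to confirm it is well formed.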

S3 performs these roles so that your internal team can remain focused on machine and deep learning.

Data Engineering

The one-off scripts created by model developers to help prepare sample data for training runs seldom scale when the time comes to deploy into production systems. Production data preparation is ETL (extract, transform, load) at scale. It is adaptive pre-processing of data, perhaps involving both traditional data processing techniques and additional pre-processing models. Before a sample image is submitted to the breed categorizer, for example, a dog/not-dog model may need to participate in the data pipeline.
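The dog/not-dog gate might look something like the following Python sketch. Both models are represented by plain callables, and the names `classify_breed`, `is_dog`, `breed_of` and the 0.5 threshold are hypothetical stand-ins chosen for this example; a real pipeline would wrap whatever models and thresholds your team has actually trained and tuned.

```python
from typing import Callable, Optional

# Stand-in types: any callables with these shapes would do.
DogDetector = Callable[[bytes], float]    # returns P(image contains a dog)
BreedClassifier = Callable[[bytes], str]  # returns a breed label


def classify_breed(
    image: bytes,
    is_dog: DogDetector,
    breed_of: BreedClassifier,
    threshold: float = 0.5,  # hypothetical cut-off, to be tuned on real data
) -> Optional[str]:
    """Run the breed classifier only on images the gate model accepts."""
    # Stage 1: the dog/not-dog model filters out cats, landscapes and so on.
    if is_dog(image) < threshold:
        return None
    # Stage 2: only plausible dog photos reach the breed categorizer.
    return breed_of(image)
```

For example, with stub models `lambda img: 0.9` and `lambda img: "beagle"`, the call returns `"beagle"`; with a gate that returns `0.1`, it returns `None` and the breed model is never invoked.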

S3 can help you stay focused on model development by undertaking the engineering involved in creating a data pipeline.

API Development

These days much of API development involves REST. It is not the only model for an interface, but it is pervasive. Your model engineers, focused as they are on machine and deep learning techniques, are unlikely to be fluent in the relatively pedestrian world of web APIs.

S3 develops the front-facing systems that feed your models in production.

Devops & Systems Development

The task of building production systems involves stepping back from the function of a machine or deep learning model. From the scale of the overall system, the model is a black box with certain performance characteristics. How quickly does it respond when presented with a sample? Given those performance characteristics, how many individual processing units are required to handle typical loads? Is it possible to predict load bursts based on scheduled events, time of day, or seasonality?
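A back-of-the-envelope answer to the "how many processing units" question can be sketched in Python. This is only a first approximation under stated assumptions: it uses Little's law (average work in flight equals arrival rate times average latency), treats each worker as handling one sample at a time, and adds headroom through a target utilization. The function name and the example figures are hypothetical.

```python
import math


def workers_needed(
    requests_per_second: float,
    seconds_per_request: float,
    target_utilization: float = 0.7,  # hypothetical headroom for bursts
) -> int:
    """Estimate how many single-sample workers a steady load requires."""
    if requests_per_second < 0 or seconds_per_request < 0:
        raise ValueError("rate and latency must be non-negative")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: average number of requests in flight at any moment.
    in_flight = requests_per_second * seconds_per_request
    # Divide by the utilization target so workers are not run flat out.
    return math.ceil(in_flight / target_utilization)
```

With an assumed load of 200 samples per second and a model that takes 0.05 seconds per sample, `workers_needed(200, 0.05)` comes to 15 workers at 70% utilization. Measured latencies and real traffic patterns, including bursts, would refine that figure considerably.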

S3 handles the systems engineering involved in scaling your services while also providing a feedback loop for ongoing model and business development.

Tooling, Process & Culture

Analysts, Data Scientists, Programmers, Ops: we speak different languages. We each have our specialized tools and techniques, and we have technical and cultural sensibilities unique to our roles. Model developers live in an exploratory world where the winning technique proves itself only in hindsight. Devops lives in a world where anything and everything can and will go wrong, where systemic failure is not an option, and thus structure and process are, perhaps ironically, key. The divide represents tensions which, when successfully bridged, are creative and generative.

Do the model engineers use git? Are systems programmers able to understand and translate R, Matlab, Octave or the notoriously idiosyncratic and non-Pythonic Python used by the model engineers? What will it take to integrate the aforementioned modeling languages with systems languages such as Python, Scala or Clojure? Where are the bridges and interfaces?

S3 works with your data scientists and model developers to create an engaged and efficient culture.