The vast majority of machine learning projects won’t make it into production. That’s true of commercial projects within companies, but also of personal projects developed by hobbyists like you and me. Playing with datasets and trying out different models is fascinating - gathering data (well, that part might not be fascinating at all), training, watching your model learn until it predicts all the answers correctly. But why do we usually stop there? Especially when you can deploy a machine learning model in fewer than 20 lines of code.
But if this is a universal understanding, that AI empirically provides a competitive edge, why do only 13% of data science projects, or just one out of every 10, actually make it into production? (source)
Putting ML models into production
Frankly, I don’t know why so many projects don’t make it into production - after all, that’s how you show others what you’ve been working on lately. Personally, I tend to focus on the “machine learning part” so much because it’s quite exciting to start with a baseline model and then boost it or tweak your data until you reach flawless results.
I never wanted to waste time deploying my model because I was never satisfied with its final performance. I would rather ditch that project and start another one. However, I recently changed my mind - now I think productization is (at least) as important as training models.
More than that, with proper tools you can deploy your machine learning model in a flash.
Deploying made super easy with Streamlit
Streamlit is a framework for Python developers that lets you create “data web apps” in minutes. It’s well suited to the data science community, as it supports Markdown (and LaTeX formulas) and can display Pandas dataframes and (interactive) charts.
It provides all the basic elements to build a website: input fields (for text, numbers or even time), buttons, checkboxes, forms or tables. You can display audio and video there. Upload files, feed them to your model and show results. All of that sounds just like a regular website, so what’s the real power of Streamlit?
With Streamlit, adding a widget is essentially just declaring a variable in a Python script (Figure 2), at least if a simple one-column layout is all you need. Need a sidebar or more columns? That’s just another variable.
I used Streamlit to deploy my garbage classification model from previous posts. Besides serializing the PyTorch model and wrapping it in a GarbageClassPredictor class, here’s what I wrote to make it usable via a web browser:
    from PIL import Image
    import streamlit as st

    from model.predictor import GarbageClassPredictor

    # Header image and upload widgets
    st.image('streamlit/header.png', use_column_width=True)
    input_image = st.file_uploader('Upload image and click the Submit button')
    classifier_btn = st.button('Submit')

    # Inject custom CSS to style the app
    with open("streamlit/style.css") as f:
        st.markdown('<style>{}</style>'.format(f.read()), unsafe_allow_html=True)

    # st.button returns True only on the rerun triggered by a click
    if classifier_btn:
        if input_image is None:
            st.write("Upload image!")
        else:
            predictor = GarbageClassPredictor()
            class_name, certainty = predictor.predict(input_image)
            # Show the icon of the predicted class with the model's certainty
            class_image = Image.open(f'streamlit/{class_name}.png')
            st.image(class_image, width=256,
                     caption=f'{class_name}\n({100.0*certainty:.2f}% sure)')
This script introduces a simple file uploader widget (only one line of code) with a Submit button. The button click is handled by the if classifier_btn condition, where I take the uploaded image and pass it to GarbageClassPredictor to obtain the classification result. Displaying the model’s response as an image (e.g. a plastic bottle icon for the plastic class) is just another variable declaration.
If you don’t care about the header image or the custom CSS that makes the app look better, uploading an input image and displaying the model’s response boils down to only 10 lines of code with Streamlit.
Pretty neat, right?
If you have an old project that you tossed out or a trained model that never made it into production, give it a second chance with Streamlit. After you implement that marvelous application, make sure you share it with other people so that they can play with it 😀