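In this article, you will learn how to deploy a machine learning model as a REST API using the Flask framework and the Flasgger package, and how to give that API a simple graphical user interface (GUI). Specifically, we will wrap a model in a REST API and use Flasgger to add simple UI components to the Flask app, so you can demo the model to stakeholders without any front-end knowledge.
We will first build and save a simple machine learning model with scikit-learn and pickle. We will then create a Flask API that serves this model, and use Flasgger to add an easy-to-use UI to the Flask app.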
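1.1 Flask: Flask is a web application framework written in Python. It lets you interact with Python code (in our case, a machine learning model) directly from a browser, so the person trying the model needs no code files or libraries on their own machine. Flask makes it easy to create web application programming interfaces (APIs), which lets data scientists spend more time on exploratory data analysis, feature engineering, and model building, and less time worrying about how to make the model available to the outside world. By creating a Flask API, you can deploy a machine learning model and make it usable from a browser. For more on Flask, see its official documentation.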
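1.2 Flasgger: Flasgger is a Flask extension that extracts an OpenAPI specification from the Flask views registered in your API. It also bundles SwaggerUI, so you can open the documentation page (http://127.0.0.1:5000/apidocs/ in this article's setup) to visualize and interact with your API resources. In short, Flasgger helps you build Flask APIs that come with documentation and a live playground powered by SwaggerUI. For more on Flasgger, see its official documentation.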
Open Anaconda Prompt (if you use a conda environment) or cmd (if you use a standard Python installation); on macOS, open Terminal. Then run:
pip install flask
pip install flasgger
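To confirm that both packages landed in the environment you intend to use, a quick check like the one below can help. This is a minimal sketch, not part of the original article; the helper name `installed_version` is made up for illustration and relies only on the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report what the current interpreter can see for the two packages used here.
for name in ("flask", "flasgger"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If either line reports "not installed", the pip commands above probably ran against a different Python environment.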
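In the code below, we build a simple binary classifier with logistic regression. We use scikit-learn to generate a dataset and fit the classifier, then save (export) the model with pickle. The model is deliberately simple, because the point of this article is to show how easily a machine learning model can be served through an API built with Flask and Flasgger.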
import numpy as np
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression
import pickle
import os
from flasgger import Swagger
import flasgger
def train_and_save_model():
    '''This function creates and saves a Binary Logistic Regression
    Classifier in the current working directory
    named as LogisticRegression.pkl'''
    ## Creating Dummy Data for Classification from sklearn.make_classification
    ## n_samples = number of rows/number of samples
    ## n_features = number of total features
    ## n_classes = number of classes - two in case of binary classifier
    X, y = make_classification(n_samples=1000, n_features=4, n_classes=2)
    ## Train Test Split for evaluation of data - 20% stratified test split
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=42, stratify=y)
    ## Building Model
    logistic_regression = LogisticRegression(random_state=42)
    ## Training the Model
    logistic_regression.fit(X_train, y_train)
    ## Getting Predictions
    predictions = logistic_regression.predict(X_test)
    ## Analyzing Evaluation Metrics
    print("Accuracy Score of Model : " + str(accuracy_score(y_test, predictions)))
    print("Classification Report : ")
    print(str(classification_report(y_test, predictions)))
    ## Saving Model in pickle format
    ## Exports a pickle file named LogisticRegression.pkl in current working directory
    output_path = os.path.join(os.getcwd(), 'LogisticRegression.pkl')
    with open(output_path, 'wb') as output:
        pickle.dump(logistic_regression, output)

train_and_save_model()
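Calling this function saves a file named LogisticRegression.pkl in the current working directory and prints the model's accuracy score and classification report.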
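Before building an API around the pickle file, it can be worth checking that a saved model really does come back with the same behavior. The sketch below is an illustration, not the article's code: it seeds `make_classification` (the article's call is unseeded) and writes to a temporary directory so it does not touch your working directory.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Seeded here (an assumption; the article's call is unseeded) so reruns match.
X, y = make_classification(n_samples=1000, n_features=4, n_classes=2, random_state=42)
model = LogisticRegression(random_state=42).fit(X, y)

# Save the fitted model and load it back from disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "LogisticRegression.pkl")
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# The restored model should give exactly the same predictions as the original.
same = np.array_equal(model.predict(X), restored.predict(X))
print("Restored model matches original:", same)
```

Keep in mind that pickle files should only be loaded from sources you trust, and that a model is safest to unpickle with the same scikit-learn version that saved it.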
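We now have a pickle file of the logistic regression model (89% accuracy 🙂 in the author's run; the exact figure varies between runs because make_classification is not seeded). Let's see how to wrap this model in a Flask API and embed some Flasgger UI components.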
app = Flask(__name__)
Swagger(app)
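Next we define an app route for the API: whenever http://127.0.0.1:5000/predict_home/ is requested, the function below runs. It uses hard-coded feature values and returns the model's prediction for them.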
@app.route('/predict_home/', methods=['GET'])
def get_predictions_home():
    feature_1 = 1
    feature_2 = 2
    feature_3 = 3
    feature_4 = 4
    test_set = np.array([[feature_1, feature_2, feature_3, feature_4]])
    ## Loading Model
    with open('LogisticRegression.pkl', 'rb') as infile:
        model = pickle.load(infile)
    ## Generating Prediction
    preds = model.predict(test_set)
    return jsonify({"class_name": str(preds)})
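The route above reopens the pickle file on every request, which is fine for a demo but wasteful for anything busier. One possible variation, sketched below under my own assumptions, loads the model once at startup and returns the class as a plain integer. The stand-in model, the `MODEL` global, and the `sketch_app` name are all illustrative and not from the article.

```python
import os
import pickle
import tempfile

import numpy as np
from flask import Flask, jsonify
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in for the article's LogisticRegression.pkl, written to a temp directory
# so that this sketch runs on its own.
X, y = make_classification(n_samples=200, n_features=4, n_classes=2, random_state=0)
model_path = os.path.join(tempfile.mkdtemp(), "LogisticRegression.pkl")
with open(model_path, "wb") as f:
    pickle.dump(LogisticRegression().fit(X, y), f)

# Load the model once, when the module is imported, instead of inside the view.
with open(model_path, "rb") as f:
    MODEL = pickle.load(f)

sketch_app = Flask(__name__)


@sketch_app.route("/predict_home/", methods=["GET"])
def get_predictions_home():
    test_set = np.array([[1, 2, 3, 4]])
    # predict returns an array with one label; convert it to a plain int for JSON.
    predicted = int(MODEL.predict(test_set)[0])
    return jsonify({"class_name": predicted})


# Exercise the route in-process with Flask's test client.
response = sketch_app.test_client().get("/predict_home/")
print(response.get_json())
```

Returning an integer instead of `str(preds)` means clients receive `0` or `1` rather than a string such as `"[0]"`.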
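To start the Flask server, run python model_deployment_blog_script.py in a terminal, then open http://127.0.0.1:5000/predict_home/ in a browser. Note that the listing above never calls app.run(); for this command to actually start the server, the script needs to end with such a call.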
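Now we define a second app route whose function carries the UI elements in its docstring. We add four input fields named feature_1, feature_2, feature_3, and feature_4 so that the values can be supplied by the user.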
@app.route('/predict', methods=['GET'])
def get_predictions():
    """
    A simple Test API that returns the predicted class given the 4 parameters named feature 1, feature 2, feature 3, and feature 4
    ---
    parameters:
      - name: feature_1
        in: query
        type: number
        required: true
      - name: feature_2
        in: query
        type: number
        required: true
      - name: feature_3
        in: query
        type: number
        required: true
      - name: feature_4
        in: query
        type: number
        required: true
    responses:
      200:
        description: Predicted class
    """
    ## Getting Features from Swagger UI
    ## float rather than int, so decimal inputs allowed by "type: number" are accepted
    feature_1 = float(request.args.get("feature_1"))
    feature_2 = float(request.args.get("feature_2"))
    feature_3 = float(request.args.get("feature_3"))
    feature_4 = float(request.args.get("feature_4"))
    test_set = np.array([[feature_1, feature_2, feature_3, feature_4]])
    ## Loading Model
    with open('LogisticRegression.pkl', 'rb') as infile:
        model = pickle.load(infile)
    ## Generating Prediction
    preds = model.predict(test_set)
    return jsonify({"class_name": str(preds)})
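Save the code and restart the server, then visit http://127.0.0.1:5000/apidocs/ to see the Swagger UI.
There you can enter the four feature values and click Execute to see the model's response.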
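The Swagger UI is simply sending a GET request with the four values as query parameters, so the same call can be made from a script. Here is a minimal standard-library sketch; the helper names `build_predict_url` and `fetch_prediction` are made up for illustration, and the actual request is left commented out because it needs the server to be running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_predict_url(base_url, features):
    """Build the /predict URL with feature_1..feature_N query parameters."""
    query = urlencode({f"feature_{i}": value for i, value in enumerate(features, start=1)})
    return f"{base_url}/predict?{query}"


def fetch_prediction(base_url, features):
    """Call the running API and return the decoded JSON response."""
    with urlopen(build_predict_url(base_url, features)) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_predict_url("http://127.0.0.1:5000", [1, 2, 3, 4])
print(url)
# fetch_prediction("http://127.0.0.1:5000", [1, 2, 3, 4])  # requires the Flask server to be running
```

Pasting the printed URL into a browser should return the same JSON that the Swagger UI shows after clicking Execute.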
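In this article we saw how to wrap a machine learning model in a Flask API and how to use Flasgger to add UI components for interacting with that API. You can use these ideas to build quick prototypes of your models and show them to stakeholders.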