How to expose NLP Machine Learning Models, mostly for Spacy, by quickly building an API with FastAPI, and then play with them

Doing AI does not make sense if you cannot expose these “discoveries” for use in digital products. This is the primary reason for my strong interest in Streamlit, which allows you to quickly expose Machine Learning Models and play with them.

But what do we do when we must move from experimentation, via a POC, to industrialization? The immediately obvious idea is to create an API. So, after a quick benchmark, when you are a beginner in Python like me, you pick FastAPI. It is the most sensible choice considering the time spent and the learning curve.

For this post, you can find all files for each project on my GitHub account. See https://github.com/bflaven/ia_usages/tree/main/fastapi_nlp_model

Cons-Pros-mise

Even though I know there are hundreds of articles weighing the pros and cons of frameworks, starting with “Django vs Flask vs FastAPI”, the single list below is what convinced me.

This article presents a clear and, above all, short list of the main points of comparison between the frameworks Django, Flask and FastAPI.

Source : https://www.datacamp.com/tutorial/introduction-fastapi-tutorial

Well, tinkering with a few Machine Learning models is one thing, but deploying and using them via an API in other contexts is another. Creating an API remains my first objective, with the aim of making these same ML models available to “enrich” a CMS, for example. Still, I got the strange feeling that this API creation could turn into a nightmare at the expense of the work on AI itself.

“The API has its reasons that ML knows not”*.

Naturally, this impression reinforced my desire to use a framework like FastAPI. Why? Because choosing a framework means committing to the virtuous path of development best practices, without even knowing it! All the points of vigilance in the creation of an API are provided turnkey: performance, security, deployment, testing and quality, documentation… I inevitably forget some! This will ensure the sustainability of your API, which could be at the heart of your AI strategy, with maintainability and scalability at their best. Who could ask for more?

I mimicked this famous quote from Blaise Pascal: “Le coeur a ses raisons que la raison ignore” (The heart has its reasons that reason knows not). Just for the record, the real quote in French is “Le coeur a ses raisons que la raison ne connaît point.”

Well, enough chatting that only repeats what can be found on the web; let’s move on to real cases.

FastAPI at Work

Just a quick line to begin with, one that summarizes the project scope for using FastAPI:

Think of FastAPI as a wrapper around a data science application to expose its functionality as a RESTFUL microservice.

Source: https://towardsdatascience.com/build-and-host-fast-data-science-applications-using-fastapi-823be8a1d6a0

I made several attempts to demonstrate how to use FastAPI in data science projects, especially for NLP tasks, mostly leveraging Spacy. These attempts are, in a way, the “second part” of another blog post:

In more AI-oriented wording, these attempts answer my question: “How can you operate and develop a machine learning pipeline with existing or trained models then locally deploy an API and see the models in action?”

I borrowed this formulation from this excellent post:
Source: https://pycaret.gitbook.io/docs/learn-pycaret/official-blog/build-and-deploy-ml-app-with-pycaret-and-streamlit

What am I leaving out and why?

For the moment, I am leaving out several subjects that go far beyond a single post:

  • Deployment: An important subject indeed, but at first glance it does not seem so tricky with Python, as plenty of resources are available to make deployment as smooth as possible on Azure, Heroku, AWS, Google Cloud…
  • Front-end: I really need an API more than anything else, but you can then imagine building a web app on top of this API, with Streamlit for instance, or Vue.js. I gave an example, found on the web, with Streamlit just for fun.
  • Testing/Quality: Also important, but premature for a small POC, even if I wrote a few unit tests with Pytest just to see. For a real project, I will probably add testing with Postman/Newman or even Cypress.
  • Dockerization: As a PO, I find that all these images exhaust my computer 🙂 But I must admit Docker has numerous advantages for deploying a turnkey local environment and an API found on GitHub, for instance.

A very basic workflow showing the future place of the FastAPI API

Last word

One more point of terminology: I am easily confused, maybe because I use algorithm, machine learning and model interchangeably for the same thing. Bad habit! Here is what I found to clarify this wording issue for myself.

  1. Machine learning involves the use of machine learning algorithms and models.
  2. The best analogy is to think of the machine learning model as a “program”.
  3. The machine learning model “program” is comprised of both data and a procedure for using the data to make a prediction.

Source: https://machinelearningmastery.com/difference-between-algorithm-and-model-in-machine-learning/
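The distinction in the three points above can be illustrated with a small sketch. This is my own toy example, not taken from the source article: the “algorithm” is ordinary least squares, run once on training data, and the “model” is what it produces, that is, some learned data (a slope and an intercept) plus a procedure to make a prediction. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinearModel:
    """The "model": learned data (slope, intercept) plus a prediction procedure."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


def fit_least_squares(xs: List[float], ys: List[float]) -> LinearModel:
    """The "algorithm": ordinary least squares on one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


# The training points lie exactly on y = 2x + 1.
model = fit_least_squares([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
print(model.predict(4.0))  # 9.0
```

Once trained, only the `LinearModel` object is needed to serve predictions; the fitting function plays no part at prediction time, and this is why an API typically ships the model rather than the algorithm.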

FastAPI Projects Digest

So, I have browsed many projects, especially those made with Spacy. The last one, “017_chatGPT_fastapi_nlp_model”, was made both with the help of ChatGPT and as the result of this exploration.

Here is the prompt below.

Write in python with the best practices a FastAPI that enable 4 languages with Spacy and provide four differents endpoints like summary function, “normal” NER function and “custom” NER function.

In my view, the POC is first of all there to test your recent knowledge through immediate practice, in my case on FastAPI. More importantly, it is the opportunity to make every mistake imaginable rather than to try to set a definitive API architecture in stone. The files made available are therefore working documents resulting from this exploration.

Here is a quick description of each GitHub directory. The code can be found on my GitHub account at https://github.com/bflaven/ia_usages/tree/master/fastapi_nlp_model

  • 002_bamimoretomi_spacy_fastapi: An API POC with FastAPI using NLP libraries such as NLTK for text, in English only, with endpoints on similarity, synonyms, antonyms… And an endpoint “tospeech” that converts a text into audio with the library gTTS.
  • 003_juliensalinas_spacy_fastapi: A quick POC on Spacy from the founder of nlpcloud.com, who leveraged FastAPI.
  • 004_shanealynn_spacy_fastapi: A simple POC on Spacy and Flair.
  • 006_analyticsindiamag_spacy_fastapi: Very similar to the project 003.
  • 008_anaconda_MKTR-ai_YT_Maj9v-Ev7-4: An informative video on YouTube with excellent explanations from a real professional. The project initially used Poetry; I used Anaconda instead, but I have included the Poetry configuration file, and I took only the FastAPI API file. A must-see, as the purpose of the video is to go through a super quick ML app build & deployment. You can see a possible full workflow in action: from building to deploying an NLP / Machine Learning App with Poetry, FastAPI, Docker, Spacy & GCP.
  • 009_streamlit_fastapi_basic_calculator: A quite simple, educational and hybrid project that demonstrates how to create an API with FastAPI (backend) that backs a web app made with Streamlit. A nice source of inspiration and a basic illustration of the FRONTEND, API, MODEL workflow.
  • 010_spacy_projects: A more “advanced” API made with FastAPI taken from projects/integrations/fastapi/ in the explosion GitHub account, the company behind Spacy. See the full project https://github.com/explosion/projects/tree/v3/integrations/fastapi
  • 011_cookiecutter_spacy_fastapi: I only grabbed the API and the tests (pytest) from this project. In reality, it is a somewhat larger project, made by people at Microsoft to promote the Azure platform, using cookiecutter. Here is the full description of the initial project: “A python cookiecutter API for quick deployments of spaCy models with FastAPI. The API interface is compatible with Azure Search Cognitive Skills.”
  • 012_fastapi_tiangolo_testing: A sample to explore testing a bit more. Quality is key. I followed only the beginning, just to make sure that testing with FastAPI was easy. Extracted from “FastAPI Tutorial – User Guide – Testing” found at
    https://fastapi.tiangolo.com/tutorial/testing/
  • 013_fastapi_datacamp: Great article, great source code, found at
    https://www.datacamp.com/tutorial/introduction-fastapi-tutorial. It gives a quick comparison between Django, Flask and FastAPI and provides a good introduction to PyCaret, “An open-source, low-code machine learning library in Python”.
  • 014_fastapi_kinsta: A different kind of API that explores CRUD operations on Users, nothing to do with NLP! But this project gave precious advice on routing, the logical organisation of API code and so on.
  • 017_chatGPT_fastapi_nlp_model: I already mentioned this POC. The files are a mix of the logical code written by ChatGPT (I have given the prompt in main.py), then extended by myself with code grabbed during this exploration. The API served as the basis for a presentation on ML app build & deployment in my company. It illustrates my belief: “A pseudo product must always exist to become the very subject of discussion”.

Some screenshots to remember how to query an API made with FastAPI via Postman 🙂 I always forget how.

  1. fastapi_postman_remember_1.png
  2. fastapi_postman_remember_2.png
  3. fastapi_postman_remember_3.png
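For those who prefer code to screenshots, the same kind of query can be sketched with the Python standard library alone. The address below is uvicorn's default local one, while the `/predict` path and the JSON payload are hypothetical and must be adapted to the API actually running.

```python
import json
import urllib.request

# Assumptions: uvicorn's default local address and a hypothetical /predict endpoint.
BASE_URL = "http://127.0.0.1:8000"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request, the equivalent of a raw JSON body in Postman."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (the API must be running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("/predict", {"text": "FastAPI makes this easy."})
print(request.get_method(), request.full_url)
# With the server running, the call would be: print(send(request))
```

Building the request and sending it are kept separate so that the first part can be inspected without any server running.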

As a personal conclusion, I will say that this “exploration” has a secondary objective: for an ordinary P.O., the labour market clearly indicates the way to follow: transform yourself, at least a bit, into a Data & AI Product Owner 🙂

Videos to tackle this post

#1 How to expose NLP Machine Learning Models for Spacy by quickly building an API with FastAPI

#2 How-to create an API with FastAPI, backend for a web app made with Streamlit

#3 How-to create an API with FastAPI and start testing it with pytest

More infos