Model Asset eXchange: Path to Ubiquitous Deep Learning Deployment

Alex Bozarth (ajbozart@us.ibm.com)
Brendan Dwyer (Brendan.Dwyer@ibm.com)
Fei Hu (Fei.Hu1@ibm.com)
Daniel Jalova (djalova@us.ibm.com)
Karthik Muthuraman (Karthik.Muthuraman@ibm.com)
Nick Pentreath (NickP@za.ibm.com)
Simon Plovyt (simon@ibm.com)
Gabriela de Queiroz (gdq@ibm.com)
Saishruthi Swaminathan (saishruthi.tn@ibm.com)
Patrick Titzler (ptitzler@us.ibm.com)
Xin Wu (xinwu@us.ibm.com)
Hong Xu (ORCID 0000-0001-7874-4518, hongx@ibm.com)
Frederick R Reiss (frreiss@us.ibm.com)
Vijay Bommireddipalli (vijayrb@us.ibm.com)

All authors: Center for Open-Source Data & AI Technologies (CODAIT), IBM, 505 Howard St, San Francisco, California, USA 94105
Abstract.

A recent trend observed in traditionally challenging fields such as computer vision and natural language processing has been the significant performance gains shown by deep learning (DL). In many research fields, DL models have been evolving rapidly and have become ubiquitous. Despite researchers’ excitement, most software developers are not DL experts and often have a difficult time following the booming DL research output. As a result, it usually takes a significant amount of time for the latest superior DL models to prevail in industry. This issue is further exacerbated by the common use of sundry incompatible DL programming frameworks, such as TensorFlow, PyTorch, and Theano. To address this issue, we propose a system, called Model Asset eXchange (MAX), that gives developers easy access to state-of-the-art DL models. Regardless of the underlying DL programming framework, it provides an open source Python library (called the MAX framework) that wraps DL models and unifies programming interfaces with our standardized RESTful APIs. These RESTful APIs enable developers to exploit the wrapped DL models for inference tasks without the need to fully understand different DL programming frameworks. Using MAX, we have wrapped and open-sourced more than 30 state-of-the-art DL models from various research fields, including computer vision, natural language processing, and signal processing. Finally, we demonstrate two web applications that are built on top of MAX, as well as the process of adding a DL model to MAX.

journalyear: 2019
conference: The 28th ACM International Conference on Information and Knowledge Management; November 3–7, 2019; Beijing, China
booktitle: The 28th ACM International Conference on Information and Knowledge Management (CIKM ’19), November 3–7, 2019, Beijing, China
price: 15.00
doi: 10.1145/3357384.3357860
isbn: 978-1-4503-6976-3/19/11
ccs: Computing methodologies Machine learning
ccs: Applied computing Enterprise architecture frameworks
ccs: Applied computing Service-oriented architectures
ccs: Software and its engineering Software architectures

1. Introduction

Figure 1. Design of the MAX architecture. The software components to the right of the vertical blue dashed line are on IBM Cloud. Software components inside the round-cornered green rectangles run in Docker containers.
Source of some subcomponents of the figure: https://upload.wikimedia.org/wikipedia/commons/3/38/Ethernet_Port.svg , https://commons.wikimedia.org/wiki/File:Gears.png , https://commons.wikimedia.org/wiki/File:Icons8_flat_phone_android.svg

In the past decade, the research fields of artificial intelligence (AI) have advanced drastically. Most noticeably, deep learning (DL) has achieved dramatic performance improvements and revolutionized the whole field of AI. On the one hand, for many tasks in which it was once considered “impossible” for AI to achieve performance comparable to that of humans, such as playing Go (Silver et al., 2016) and image recognition (He et al., 2015), DL has demonstrated its power by surpassing human performance by a large margin. On the other hand, DL has also become ubiquitous in many areas, especially in the Internet of Things. Therefore, it is essential to systematize DL models.

DL researchers have been celebrating their remarkable and laudable achievements: some DL models that arose from applied DL research fields, such as computer vision and natural language processing, have been revolutionizing these fields. However, the industry has been struggling to make use of them for the following reasons:

Booming research results: Every year, thousands of new DL research results are published. The state of the art is pushed forward rapidly in many DL research fields. For a non-DL expert, it usually takes a significant amount of time to implement the necessary building blocks derived from research papers.

Mathematical complexity: Understanding DL models often requires solid knowledge of statistics and linear algebra. These mathematical barriers are often insurmountable for software developers, whose backgrounds vary widely.

Confusing terminologies: New terminologies have been emerging due to DL’s rapid evolution and wide adoption in a variety of research fields. Unfortunately, these new terminologies are often inconsistent and thus frequently confuse non-DL experts.

These issues have been severely impeding the adoption of DL in industry. Therefore, to fill this gap, it is imperative to develop a software system that reduces the required minimum knowledge to get started with DL models.

In addition, the difficulty of harnessing DL in industry is further exacerbated by the common use of sundry DL programming frameworks, including TensorFlow (Abadi et al., 2015), PyTorch (Paszke et al., 2017), and Theano (Theano Development Team, 2016). These DL programming frameworks are incompatible with each other, and their software structures are usually fundamentally different. Hence, porting code from one DL programming framework to another usually requires strenuous effort. Therefore, a software system that unifies these programming interfaces and building blocks is critical to the success of DL in industry.

In this demo, we present a software system, called Model Asset eXchange (MAX) (https://developer.ibm.com/exchanges/models/), that addresses the aforementioned difficulties. The paper is organized as follows. In Section 2, we present the architecture and software components of MAX. In Section 3, we describe our demonstration.

2. Architecture Design and Software Components of MAX

In this section, we describe our architecture design and various software components of MAX.

2.1. Architecture Design

MAX is a software system that employs an extensible and distributed architecture and makes use of state-of-the-art container technology and cloud infrastructure. Figure 1 illustrates its architecture. MAX is hosted on a cloud infrastructure, such as IBM Cloud, and communicates with web applications via standardized RESTful APIs. It is undergirded by a powerful abstraction component, named the MAX framework. The MAX framework wraps DL models implemented in different DL programming frameworks and provides programming interfaces in a uniform style, which enables developers to use DL models without the need to dive into DL programming frameworks. Each DL model implementation runs in its own isolated Docker container, which promotes security and makes the architecture easy to distribute and extend. Additionally, we build MAX exclusively on top of open source technologies, which promotes the open and collaborative culture that academia generally embraces.

2.2. Software Components

In this subsection, we describe MAX’s various software components as shown in Figure 1.

2.2.1. The MAX Framework

The MAX framework is a Python library that wraps DL models to unify the programming interface. Wrapping a model simply requires implementing functions that process its input and output. This simplicity is key to the MAX framework’s agnosticism to DL programming frameworks.

2.2.2. DL Models

MAX can accommodate DL models written in different DL programming frameworks. The MAX framework communicates with DL models via standardized Python programming interfaces. To use a DL model in MAX, we only need to adapt its Python programming interface, i.e., wrap the DL model. Once the DL model is wrapped, it is available throughout the whole MAX system and does not require further adaptation in the future. This Python programming interface is object-oriented: wrapping only requires inheriting from specific classes and implementing some predefined class functions that convert the input and output of the DL model to data structures acceptable to the MAX framework.
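The inherit-and-implement pattern described above can be sketched as follows. This is a minimal illustration, not the MAX framework's actual API: the base class `ModelWrapper`, its method names, and the toy word-counting "model" are all hypothetical stand-ins chosen for this example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ModelWrapper(ABC):
    """Hypothetical base class standing in for the class a wrapper inherits."""

    @abstractmethod
    def pre_process(self, raw_input: Any) -> Any:
        """Convert the caller's input into what the underlying model expects."""

    @abstractmethod
    def run_model(self, model_input: Any) -> Any:
        """Invoke the underlying DL model, whatever framework it uses."""

    @abstractmethod
    def post_process(self, model_output: Any) -> List[Dict[str, float]]:
        """Convert raw model output into framework-neutral data structures."""

    def predict(self, raw_input: Any) -> Dict[str, Any]:
        # The pipeline is identical for every wrapped model; only the
        # three hooks above differ from one model to the next.
        output = self.post_process(self.run_model(self.pre_process(raw_input)))
        return {"status": "ok", "predictions": output}


class ToySentimentWrapper(ModelWrapper):
    """Toy stand-in for a real DL model: scores text by counting positive words."""

    POSITIVE_WORDS = {"good", "great", "love"}

    def pre_process(self, raw_input: str) -> List[str]:
        return raw_input.lower().split()

    def run_model(self, model_input: List[str]) -> float:
        if not model_input:
            return 0.5  # no evidence either way
        hits = sum(1 for token in model_input if token in self.POSITIVE_WORDS)
        return hits / len(model_input)

    def post_process(self, model_output: float) -> List[Dict[str, float]]:
        return [{"positive": model_output, "negative": 1.0 - model_output}]
```

Calling `ToySentimentWrapper().predict("great great")` runs the three hooks in order and returns a dictionary with a `status` field and a `predictions` list. A wrapper for a real model would load the model in its constructor and call the framework-specific inference function inside `run_model`.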

The wrapped DL models and their programming interface with the MAX framework are hosted in Docker containers (https://www.docker.com). A container is an isolated environment instance that hosts the software of interest and its runtime. This isolation in general promotes extensibility, distributability, and security.

For example, without the help of containers, it is usually difficult to deploy two DL models that depend on conflicting runtime environments (such as different versions of TensorFlow) on the same host OS on the same physical computer. Containers solve this issue by creating an isolated virtual runtime environment for each DL model. As another example, if multiple DL models are deployed directly on the same host OS, a security vulnerability in one DL model would normally put the other DL models at risk as well.

Additionally, MAX uses Docker containers instead of traditional isolation/container technologies such as physical isolation (multiple physical computers) or virtual machines (emulated hardware environments running on host OSes). The reason is that traditional isolation/container technologies are in general computationally costly, as each isolated node runs a complete operating system (OS) instance, either physical or virtual.

Docker containers mitigate this issue: they consume only moderate computational resources while retaining a high degree of isolation. Based on Linux container technologies, Docker containers share a single OS kernel instance, namely the one that runs the host OS. Therefore, by employing Docker containers, MAX is able to operate at low computational cost with very little compromise in the extensibility, distributability, or security that isolation/container technologies provide.

2.2.3. RESTful APIs: Between Applications and the MAX Framework

MAX provides a standardized programming interface, agnostic to DL programming frameworks, in the form of RESTful APIs. This gives developers access to DL models without requiring them to dive into the various DL programming frameworks.

For each DL model, MAX’s output is in the JSON format following a standardized specification. This standardization enables developers to quickly adapt their applications by replacing the underlying DL model with very little and often zero modification to the code that interacts with the DL model. This is in sharp contrast with the current common practice: due to non-standardized programming interfaces, when replacing underlying DL models, developers usually have to drastically modify their code and frequently find themselves mired in figuring out the correct usage of often abstrusely defined APIs. MAX also integrates Swagger (https://swagger.io/) to make a graphical user interface (GUI) automatically available for all wrapped DL models. An example of the JSON output is shown below (taken from https://github.com/IBM/MAX-Text-Sentiment-Classifier):

    {
      "status": "ok",
      "predictions": [
        [{"positive": 0.9977352619171143, "negative": 0.002264695707708597}],
        [{"positive": 0.001138084102421999, "negative": 0.9988619089126587}]
      ]
    }
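To illustrate how an application might consume output in this shape, the following sketch parses the example response above and extracts the highest-scoring label for each input. The helper name `top_labels` is hypothetical, and the code assumes only the structure visible in the example (a `status` field and a `predictions` list whose entries each hold one dictionary of label scores); other MAX models may return differently shaped predictions.

```python
import json
from typing import List, Tuple

# Response copied from the example above.
RESPONSE_TEXT = """
{"status": "ok",
 "predictions": [
   [{"positive": 0.9977352619171143, "negative": 0.002264695707708597}],
   [{"positive": 0.001138084102421999, "negative": 0.9988619089126587}]
 ]}
"""


def top_labels(response_text: str) -> List[Tuple[str, float]]:
    """Return the (label, score) pair with the highest score for each input."""
    response = json.loads(response_text)
    if response.get("status") != "ok":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    results = []
    for prediction in response["predictions"]:
        scores = prediction[0]  # assumed: one score dictionary per input
        label = max(scores, key=scores.get)
        results.append((label, scores[label]))
    return results


if __name__ == "__main__":
    for label, score in top_labels(RESPONSE_TEXT):
        print(f"{label}: {score:.4f}")
```

Because the application code depends only on the JSON structure, swapping the underlying sentiment model for another one that emits the same structure would leave this function unchanged.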

3. Demonstration

(a) Image caption generator (web UI screenshot)
(b) Object detector (web UI screenshot)
Figure 2. Web application demonstration.

In this section, we describe our demonstration.

3.1. Web Applications Built upon MAX

In this subsection, we describe our demonstration of the architecture of MAX using two web applications, an object detector (Huang et al., 2017; Lin et al., 2014; Liu et al., 2016; Howard et al., 2017) and an image caption generator (Vinyals et al., 2017), that are built on top of MAX. We describe in turn the demonstration of software components (as shown in Figure 1) and interfaces between them.

  • Web UI: The web UIs interact with users via GUIs and communicate with the MAX framework via a RESTful JSON interface. Figures 2(a) and 2(b) preview our demonstration.

  • JSON Interface Between Web UI and the MAX Framework: This interface is RESTful and based on the JSON format. Upon users’ requests, the Web UI accordingly sends the MAX framework a JSON-formatted string following a predefined specification. Figure 3 previews our demonstration (using Swagger) of this JSON interface.

  • Python Programming Interface Between DL Models and the MAX Framework: We will demonstrate the Python interface that bridges these two components.
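The JSON interface between a web UI and the MAX framework can be sketched from the client side as follows. This is an illustration under stated assumptions, not a documented contract: the endpoint path `/model/predict`, the `"text"` payload key, and the `http://localhost:5000` address are assumptions made for this example, and the function names are hypothetical. The Swagger page of a given model is the place to confirm the actual request format.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_predict_request(base_url: str, texts: List[str]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a text model.

    The endpoint path and the payload key are assumptions for illustration.
    """
    body = json.dumps({"text": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/model/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a model container listening locally, `send(build_predict_request("http://localhost:5000", ["A great movie."]))` would return the decoded JSON response. Keeping request construction separate from sending makes the payload easy to inspect without a running server.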

3.2. Adding a DL Model to MAX

We will also interactively demonstrate the process of adding a DL model to MAX and thus making it available to MAX users. This includes three steps: (1) wrapping a DL model using the MAX framework, (2) building a Docker image that hosts the wrapped DL model, and (3) optionally uploading it to IBM Cloud. Additionally, for this process, we have created a skeleton called MAX-Skeleton (https://github.com/IBM/MAX-Skeleton) as a convenient starting point for typical use cases.
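Step (2) can be sketched as a pair of standard Docker CLI invocations, composed here in Python as argument lists. The image tag `max-my-model`, the port number 5000, and the helper names are assumptions made for this example; the sketch also assumes a Dockerfile exists in the build context directory. Steps (1) and (3) are not covered.

```python
import shlex
from typing import List


def docker_build_command(image_tag: str, context_dir: str = ".") -> List[str]:
    """Arguments for building an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", image_tag, context_dir]


def docker_run_command(
    image_tag: str, host_port: int = 5000, container_port: int = 5000
) -> List[str]:
    """Arguments for running the image with the API port published on the host."""
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", image_tag]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch has no
    # side effects; the lists can be passed to subprocess.run(..., check=True).
    for command in (
        docker_build_command("max-my-model"),
        docker_run_command("max-my-model"),
    ):
        print(" ".join(shlex.quote(part) for part in command))
```

Building the commands as lists rather than shell strings avoids quoting problems when an image tag or path contains unusual characters.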

4. Conclusion

Figure 3. Example JSON output from the image caption generator.

In this demo paper, to address the difficulties in the industry’s adoption of DL models from DL research fields, we presented MAX, a software system that employs an extensible and distributive architecture and makes use of state-of-the-art container technology and cloud infrastructures. In particular, we described the architecture and software components of MAX as well as the interaction between them in detail. Finally, we proposed our demonstration of two web applications that are built on top of MAX, as well as the process of adding a DL model to MAX.

References
