C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing


A growing number of malicious users have sought to leverage 3D printing technology to produce unlawful tools for criminal activities. Current regulations are inadequate to deal with the rapid growth of 3D printers. It is therefore vital to enable 3D printers to identify the objects to be printed, so that the manufacturing of an illegal weapon can be terminated at an early stage. Deep learning has yielded significant performance gains in object recognition tasks. However, the lack of large-scale databases in the 3D printing domain stalls the advancement of automatic illegal weapon recognition.

This paper presents a new 3D printing image database, namely C3PO, which comprises two subsets for two different system working scenarios. We extract images from the numerical control programming code files of 22 3D models and categorize the images into 10 distinct labels. The first subset consists of 62,200 images that represent the object projections on the three planes of a Cartesian coordinate system. The second subset consists of image sequences, 671,677 images in total, that simulate camera captures of the objects being printed. Importantly, we demonstrate that weapons can be recognized in either scenario using deep learning based approaches trained on the proposed database. The quantitative results are promising, and further exploration of the database and of crime prevention in 3D printing remains a demanding task.

3D printing; G-code; Object Recognition; Security; CNN; RNN.

1 Introduction

3D printing, also known as additive manufacturing, has in recent years shown its potential to drive the third industrial revolution by transforming product design and manufacturing. It has been widely observed that 3D printing is superior to traditional manufacturing techniques in customization, limited material requirements, low equipment costs, shorter product life cycles, and wide accessibility [8], [37]. However, these characteristics are easily leveraged by malicious users to manufacture unethical products (e.g., counterfeits of a patented product) and even illegal products (especially weapons, as shown in Fig. 1).

Figure 1: Examples of illegal products (weapons) manufactured by 3D printing.

Due to their wide accessibility, 3D printers can be easily acquired by adversaries who exploit the characteristics mentioned above to perform high-impact criminal activities [20], [30], [1], [3], [19]. Illegal weapon production is one of the most immediate concerns to be addressed in the 3D printing domain. A 3D printed gun can be produced with neither a serial number nor a background check [29], [26]. This allows the firearms to remain untraceable, making it impossible to identify their owners. In addition, producing a personalized firearm is inexpensive compared to its market price [14]. The development of 3D printed bullets has also raised concern about the security implications of 3D printing [2]. Concerned about the misuse of the technology, the U.S. Department of State has turned to the International Traffic in Arms Regulations (ITAR) to limit the proliferation of 3D printed firearms. However, these regulations have varying degrees of efficacy and are inadequate to deal with the rapid growth of 3D printers [9].

Figure 2: Overview of our proposed early-stage malicious activity detection system for 3D printing. The system embeds an object recognition module in a 3D printer. This module either reads the numerical control programming language, called G-code (blue arrow), or visually monitors the printing process (green arrow), and then recognizes the object being printed to determine whether it is a weapon. The module is designed to terminate the printing process by sending a control signal to the 3D printer once a weapon is recognized.

Preventing illegal weapon production in real time is an urgent challenge. The intuitive idea is to make the 3D printer aware of the identity of the object being printed, so that production can be supervised and an illegal weapon will not be manufactured. Fig. 2 gives an overview of our solution to illegal weapon production in 3D printing. Our goal is to embed a recognition system in the 3D printer that terminates the printing process when an illegal product is recognized. In the recent decade, deep learning techniques such as Convolutional Neural Networks (CNN) have been widely used for computer vision tasks. From AlexNet [24], VGG [36], and GoogLeNet [39] to ResNet [16], [17], CNNs have evolved significantly to solve natural image recognition problems. Nevertheless, little research has focused on combining highly developed deep learning techniques with 3D printing to address the aforementioned concern of illegal weapon production. The lack of large-scale databases stalls the advancement of automatic illegal weapon recognition in 3D printing.

To tackle the illegal weapon production issue in 3D printing, we propose a new image database for 3D printing objects, namely “C3PO” (representing cognitive 3D printing objects). The first dataset in C3PO comprises grayscale/RGB images for 10 classes collected from 22 well-defined 3D models. The 3D models include real guns, gun-like objects, and common objects. The grayscale images are essentially the projections of randomly posed 3D objects on the $xy$, $yz$, and $xz$ planes. In particular, we demonstrate that these commonly printed objects can be recognized via a supervised image classification formulation. This scenario assumes full prior knowledge of the printed objects, i.e., the recognition system can pre-compute the projections of an object from the control file used for printing and then classify the object by its projections.

On the other hand, what if the recognition system does not have full prior knowledge of the printed objects? In this case, it can only observe what has been printed so far, e.g., through a camera that keeps capturing the top view of the object during the printing process. To address the recognition problem in this scenario, we include another dataset in “C3PO” that simulates the projections of a randomly posed object on the $xy$, $yz$, and $xz$ planes at successive printing steps (a step corresponds to the completion of a layer). The dataset consists of image sequences totaling 671,677 images. Each sequence is extracted for a randomly posed object defined by one of the twenty-two 3D models. The objects are categorized into 10 labels. We demonstrate that image sequence recognition can be formulated as a unified model that combines a convolutional neural network with a recurrent neural network (RNN).

This paper is the first to present a large-scale image database in the 3D printing domain. The proposed approaches show promising quantitative results on the proposed database. However, developing robust deep learning based recognition systems for 3D printed objects remains an arduous task that calls for further exploration.

2 Related work

With the wide use of 3D recognition, several important datasets have been established to help improve recognition performance. In 2012, Silberman et al. established the NYU Depth dataset, which includes RGBD images capturing diverse indoor scenes with detailed annotations [33]. Geiger et al. released the KITTI dataset for autonomous driving, which comprises 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios [15]. The PASCAL3D+ dataset, proposed by Xiang et al. in 2014, provides 2D-3D alignments for images of 12 rigid categories [42]. In 2016, Xiang et al. contributed ObjectNet3D, a larger-scale database for 3D object recognition that comprises 100 categories together with images, the objects in these images, and 3D shapes [41]. Based on the 3D information provided, supervised learning techniques have been introduced and developed to recognize 3D objects. In 2015, Maturana et al. proposed VoxNet, an architecture that integrates a volumetric occupancy grid representation with a supervised 3D CNN [28]. Johns et al. applied CNNs to generic multi-view recognition by decomposing an image sequence into a set of image pairs, classifying each pair independently, and then learning an object classifier by weighting the contribution of each pair [22]. Wang et al. introduced a view clustering and pooling layer based on dominant sets to pool information from views that are similar and thus belong to the same cluster [40]. However, research on 3D printing object identification is almost nonexistent due to the lack of a large-scale database in the 3D printing domain.

3 Construction of malicious activity recognition database

In this section, we describe our approach to building a large-scale malicious activity recognition database for 3D printing, namely “C3PO”, extracted from our pre-defined 3D models. First, we introduce the basics of 3D printing. Then, we define two working scenarios for the 3D printing malicious activity recognition system. We consider that the 3D object recognition module in Fig. 2 is able either to access the numerical control programming code sent to the 3D printer or to visually monitor the printed objects through cameras. For the two scenarios, we collect two sets of projections of the printed objects on one or more planes. The projections are converted to images to form the database “C3PO”.

3.1 3D printing basics

The structure of a 3D printer is shown in Fig. 3. It primarily includes four parts, categorized by physical function: Feeder, Positioner, Extruder, and Controller. The Feeder has two components, a feed motor and a gripping gear. Its function is to feed the specified material, which varies with the printer type, into the material conveyance channel. The spatial movement of the nozzle is governed by the Positioner, which includes a belt transmission (synchronous gear, belt, pulley, shaft, and bearing), a screw rod, three stepper motors, and a platform. The Extruder consists of a nozzle, an aluminum block, a heater, a thermal sensor, and a set of cooling fans. The block defines the end of the material conveyance channel. The molten material is extruded from the nozzle, repeatedly forming lines. The Controller governs the Feeder, Positioner, and Extruder. It regulates the working process according to the instructions in the design file and the sensor feedback. It also contains an integrated master circuit board. To summarize, 3D printing is an additive process in which the successive extrusion of material forms lines and the stacked lines build the object. In addition, the superposition of the lines determines the surface attributes of the printed object.

Figure 3: 3D printer hardware structure.

Fig. 4 illustrates the general procedure of 3D printing. The complete printing process happens in two domains, the cyber domain and the physical domain. The cyber domain resides in a host machine, such as a computer. To instruct a 3D printer, a 3D object model is created with computer aided design (CAD) software and converted to the standard object file (STL). Computer aided manufacturing happens in the controlling software. The controlling software of a 3D printer (Ultimaker Cura [4] in our case) decodes the STL file, and the user can customize the printing settings through its GUI. Once the user initiates printing, the controlling software slices the decoded 3D model into uniform layers and generates a data stream called G-code, a numerical control programming language that humans use to instruct a machine to operate [7], [25]. A G-code file contains all the information that the 3D printer needs to print an object [5]. It is the most widely used file format and contains all the control commands, which together determine shape, dimensions, and volume. In the physical domain, the data stream is fed to the 3D printer to direct its movements and actions. The nozzle extrudes the molten material to form lines while the Positioner follows the order encoded in the G-code.

Figure 4: Overview of the 3D printing procedure (with a pistol as the example). Computer aided design (CAD) software creates the standard object file (STL), which describes the surface geometry of the 3D object. The computer aided manufacturing software slices the model into uniform layers and generates G-code. The 3D printer interprets and follows the instructions in the G-code to perform spatial movements and extrude the molten material.

3.2 Two scenarios in 3D printing object recognition

We define two working scenarios for an intelligent 3D printer to detect the printed objects.

G-code interpretation. In this scenario, we consider that the 3D object recognition module has full access to the G-code. The host computer sends the G-code to the 3D printer and a copy to the 3D object recognition module, as indicated by the blue arrow in Fig. 2. The module internally transforms the G-code into images, from which it recognizes the object before it is printed. If the object is a weapon, the module sends a control signal to the 3D printer to terminate the printing process. In this way, the malicious activity is prevented at the very beginning.

Cumulative visual recognition. In the above scenario, the 3D object recognition module stores a copy of the complete G-code file, which is usually large, and has to transform the G-code into images. To keep the 3D object recognition module lightweight, we consider an alternative scenario in which the module does not handle the G-code directly but monitors the printing process to recognize the object. We mount three cameras on three mutually perpendicular inner surfaces of the 3D printer, facing the printing workspace. Through the cameras, we capture three perpendicular views of the printed object. The cameras are linked to the controller of the 3D printer, which sets the capture frequency as one capture every fixed number of layers. As the object is printed layer by layer, the recognition module recognizes the object with increasing confidence. When the module becomes confident that the object is a weapon by cumulatively observing any view of the printed object, it sends a control signal to the 3D printer to terminate the printing process.

3.3 Transforming prior knowledge to images

3D printing is a procedure of successively adding material from the starting position to the ending position, i.e., a sequence of accumulating layers of material (“printing”). For example, a 3D printer prints an object layer by layer, as Fig. 5(a) shows. The controlling software (Ultimaker Cura [4] in our case) decodes the STL file and generates layer-wise commands in the G-code format for the 3D printer. A G-code file is composed of lines that define the behavior of the printer nozzle, such as its position, its speed, and whether it extrudes material, for the corresponding layer. A 3D printer takes commands from the G-code and executes each line sequentially. To interpret the G-code, we use the starting position of each printing movement (the lines starting with G0 in the G-code for each layer), as these points depict the outline of the object. We save the coordinates into a csv file. Fig. 5(b) shows example G-code generated by Ultimaker Cura and the csv file extracted from it.
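The extraction step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used to build C3PO: the function names, the csv column layout, and the sample G-code are hypothetical, and the sketch assumes that layer boundaries are marked by `;LAYER:n` comment lines and that coordinates are absolute, with axes omitted from a G0 line keeping their previous value.

```python
import csv
import io


def extract_g0_points(gcode_text):
    """Return one (layer, x, y, z) tuple per G0 line in the G-code text."""
    points = []
    layer = -1            # no layer marker seen yet
    x = y = z = 0.0       # last known nozzle position
    for raw in gcode_text.splitlines():
        line = raw.strip()
        if line.startswith(";LAYER:"):          # assumed layer marker comment
            layer = int(line.split(":", 1)[1])
            continue
        words = line.split(";", 1)[0].split()   # drop any trailing comment
        if not words or words[0] != "G0":
            continue                            # only G0 lines are kept
        for word in words[1:]:
            axis, value = word[0], word[1:]
            if axis == "X":
                x = float(value)
            elif axis == "Y":
                y = float(value)
            elif axis == "Z":
                z = float(value)
        points.append((layer, x, y, z))
    return points


def points_to_csv(points):
    """Serialise the extracted points as csv text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["layer", "x", "y", "z"])
    writer.writerows(points)
    return buffer.getvalue()


# Hypothetical G-code fragment used only to exercise the functions.
SAMPLE = """\
;LAYER:0
G0 F3600 X10.0 Y20.0 Z0.3
G1 F1800 X15.0 Y20.0 E0.5
G0 X12.5 Y22.0
;LAYER:1
G0 X10.0 Y20.0 Z0.5
"""

points = extract_g0_points(SAMPLE)
print(points_to_csv(points))
```

Keeping the layer index next to each coordinate is a design choice of this sketch; it makes it straightforward to build per-layer subsets later.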

Figure 5: (a) Layer view from Ultimaker Cura. The STL file is decoded and the G-code is generated. The layer view of the software shows the expected printing result. (b) G-code and extracted csv files. G0 indicates the starting coordinates of each printing movement, and we save all G0 coordinates in the csv files. Other commands are explained in the figure but not considered in this work, as they do not depict the contour of the object.

Note that each line of coordinates in the csv file is a three-dimensional data entry, representing a point in three-dimensional Euclidean space. Using these coordinates, we compute the object projections on the $xy$, $yz$, and $xz$ planes. White pixels in the grayscale image denote the printer trajectories and represent the object, while black pixels are the background.
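As a rough illustration of how such projections can be rasterised, the sketch below drops one coordinate of every point and marks the corresponding pixel white on a black canvas. The image size, the uniform scaling, and the names `PLANES` and `project` are assumptions made for this example; the sketch marks only the extracted points and does not draw the line segments between them.

```python
# Index pairs selecting the two coordinates kept for each projection plane.
PLANES = {"xy": (0, 1), "yz": (1, 2), "xz": (0, 2)}


def project(points, plane, size=64):
    """Rasterise (x, y, z) points onto one plane as a size-by-size image."""
    a, b = PLANES[plane]
    lo_a = min(p[a] for p in points)
    lo_b = min(p[b] for p in points)
    span_a = max(p[a] for p in points) - lo_a
    span_b = max(p[b] for p in points) - lo_b
    # One scale for both axes keeps the aspect ratio; `or 1.0` avoids
    # division by zero when all points coincide.
    span = max(span_a, span_b) or 1.0
    image = [[0] * size for _ in range(size)]        # black background
    for p in points:
        col = int((p[a] - lo_a) / span * (size - 1))
        row = int((p[b] - lo_b) / span * (size - 1))
        image[size - 1 - row][col] = 255             # white object pixel
    return image


# Four hypothetical points standing in for extracted G0 coordinates.
demo_points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 5.0)]
views = {name: project(demo_points, name, size=11) for name in PLANES}
```

Each entry of `views` is a list of pixel rows that an image library could save as a grayscale file.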

As we design two working scenarios for an intelligent 3D printer, we collect two sets of images to fit the corresponding scenarios. For G-code interpretation, we construct a dataset in which the objects’ projections are obtained at the completion stage (the completion stage means that the object is fully printed and the G-code contains the instructions for all layers). For cumulative visual recognition, we prepare a dataset that contains projection images describing the printing procedure. The two construction approaches are as follows:

Objects’ projections at completion stage. For the first scenario, we collect two sub-datasets of images from the G-code. Given the entire G-code, we can process it into three images, one for each of its $xy$, $yz$, and $xz$ projections at the completion stage. The first subset is a set of one-channel grayscale images that are recognizable by humans. As Fig. 6 shows, the object is an illegal gun, and we process it into three 2D projections on the $xy$, $yz$, and $xz$ planes. Only one of the three projections is recognizable to a human; we keep this projection and discard the others.

Figure 6: The plane selection procedure. A G-code file is processed into projections on the three planes $xy$, $yz$, and $xz$. The recognizable images are selected and the others are discarded.

We save the human-recognizable images as our one-channel dataset. The selection is made through crowd-sourcing services: we provide the three projection images with the corresponding labels and select the projection image with the most votes. In practice, the 3D object recognition module also processes the G-code into three images. Once any one of them is recognized as a weapon, the module sends a termination signal to the printer to abort the printing procedure.

The two projections that we discard deserve attention as well. Deep learning approaches often exploit implicit information that is not recognizable by humans, so we also keep the two non-recognizable projection images to preserve the integrity of the feature space. Fig. 7 illustrates how we combine the three single-channel grayscale projections into one three-channel RGB image. The three-channel image is hard for humans to comprehend. For computers, however, three-channel images enlarge the feature space of the object and are expected to give better recognition performance.

Figure 7: The process of generating a 3-channel image

We categorize the 22 3D models into 10 object classes: Cup, FathersDayTrophy, GunPart, Gun_like_objects, Horse, Pistol, PortalGun, Revolver, SpecialRevolver, and ToyGun. Please refer to the supplementary material for plots of the 3D models and more details.

Sequencing the objects’ projections. For the second scenario, the 3D printer is supposed to observe the printing process from beginning to end. The image sub-dataset of the first scenario does not fit this job, since its projection images are collected at the completion stage of the object and cannot reflect the actual printing procedure. A 3D object recognition module that learns only from the images of the first scenario and has no access to the G-code can recognize the object only when printing is finished, through the projections captured by the cameras. To remedy this, we generate separate csv files by extracting layer-wise information from the G-code. We cumulatively convert the coordinates into an individual csv file for every five layers defined in the G-code. Then, as before, we plot the projections for each converted csv file and save them as images. For example, if the G-code defines 60 lines of printer behavior for the first 5 layers, we convert these 60 lines into a csv file. We then save another csv file for the first 10 layers, another for the first 15 layers, and so on. An extracted image sequence that depicts the printing procedure is shown in Fig. 8. This sequence-based dataset properly describes the printing procedure: the pixels in the image accumulate as printing goes on. With a deep learning model trained on the image sequences, the 3D object recognition module is able to recognize the object through the mounted cameras without G-code information. Table 1 shows the technical specifications of the sub-datasets for the cumulative visual recognition task.

Figure 8: Selected images from one sequence for an example model (splatter). As the printer extrudes more material, the object’s shape becomes increasingly clear. The projection sequence reflects the printing process.

Class | Minimum Number of Layers | Maximum Number of Layers | Average Number of Layers | Number of Images
Special Revolver
Toy Gun
Gun Part
Gun-like Object
Father’s Day Trophy
Portal Gun
Table 1: Dataset specifications for the cumulative visual recognition task.
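The cumulative five-layer grouping used to build these sequences can be sketched as follows. The function name and the `(layer, x, y, z)` tuple layout are hypothetical, and appending a final snapshot of the finished object when the layer count is not a multiple of five is an assumption of this sketch rather than something stated in the text.

```python
def cumulative_snapshots(points, step=5):
    """Group (layer, x, y, z) points into cumulative subsets.

    The first subset holds the first `step` layers, the second the first
    2 * `step` layers, and so on.
    """
    n_layers = max(p[0] for p in points) + 1
    cutoffs = list(range(step, n_layers + 1, step))
    if not cutoffs or cutoffs[-1] != n_layers:
        cutoffs.append(n_layers)   # assumption: also keep the finished object
    return [[p for p in points if p[0] < cutoff] for cutoff in cutoffs]


# Hypothetical object with 12 layers and a single point per layer.
demo = [(layer, float(layer), 0.0, 0.2 * layer) for layer in range(12)]
snapshots = cumulative_snapshots(demo)
print([len(s) for s in snapshots])
```

Each snapshot can then be projected and saved as one image of the sequence, so later images always contain the pixels of earlier ones.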

3.4 Diversifying the database

A 3D object can be printed in any orientation. To cope with the variance in object orientation, we rotate the 3D model around each of the three axes in turn with a step size of 15 degrees. We choose 15 degrees because it is the minimum rotation step in Ultimaker Cura [4]. To minimize the manual labor needed, we rotate only 6 times around each axis, which generates all the variations in the first octant of the 3D Euclidean space. The variations in the other seven octants are obtained by rotating and mirroring the collected images. Where required, we select the human-recognizable image or sequence before diversification. When collecting human-recognizable sequences, we select each sequence by applying the aforementioned image selection method to its last image. The projection sequence on one plane is selected and the other two are discarded. We rotate the selected projection images by 90 degrees around one axis three times to obtain the orientation variations in the other 3 quadrants, and then mirror all the images with respect to the orthogonal plane to obtain the variations in all eight octants of the Euclidean space. Hence we produce all possible distinct images, each of which corresponds to an individual orientation.
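In image space, the last step amounts to taking the four 90-degree rotations of a projection image and mirroring each of them. The sketch below illustrates this on a tiny image stored as a list of rows; the helper names are hypothetical, and the sketch assumes that the rotation axis is perpendicular to the projection plane, so that rotating the object corresponds to rotating its image.

```python
def rot90(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def mirror(image):
    """Flip a 2D image left to right."""
    return [row[::-1] for row in image]


def eight_variants(image):
    """Return the 4 rotations of the image and the mirror of each."""
    variants = []
    current = image
    for _ in range(4):
        variants.append(current)
        variants.append(mirror(current))
        current = rot90(current)
    return variants


# A tiny asymmetric image, so that all eight variants differ.
tiny = [[1, 2],
        [3, 4]]
variants = eight_variants(tiny)
```

For a symmetric object some of the eight variants coincide, so the number of distinct images can be smaller than eight.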

4 Case study: early-stage 3D object recognition with G-code

In the first scenario, in which the 3D object recognition module has full access to the G-code, we consider that the module is capable of recognizing the object before printing starts. We apply a convolutional neural network to two sets of collected images, single-view (one-channel) and multi-view (three-channel), to show that the proposed database enables automatic malicious activity detection in the 3D printing domain.

4.1 Model

We build a simple fully convolutional network [27] to classify our one-channel and three-channel datasets. The network consists of three convolutional layers followed by a global pooling layer; the pooled feature maps are then convolved with ten filters to obtain the classification logits. The three convolutional layers have 128, 256, and 512 filters, respectively. A Log-Sum-Exp (LSE) pooling layer, proposed in [34], follows each of the first two convolutional layers. The LSE pooled value is defined as

$$x_p = \frac{1}{r} \log \left[ \frac{1}{S} \sum_{(m,n) \in \mathbf{S}} \exp(r \cdot x_{mn}) \right],$$

where $x_{mn}$ is the activation value at $(m, n)$, $(m, n)$ is one location in the pooling region $\mathbf{S}$, and $S = s \times s$ is the total number of locations in $\mathbf{S}$, with $s$ the pooling filter size. By adjusting the hyper-parameter $r$, the pooled value ranges from the maximum in $\mathbf{S}$ (when $r \to \infty$) to the average (when $r \to 0$). LSE pooling is thus an adjustable operation between max pooling and average pooling. Note that we also apply LSE pooling in the global pooling layer. We use the rectified linear unit (ReLU) [31] as the activation function and apply dropout regularization [38] only before the last classification layer. We derive the prediction probabilities with the softmax function:

$$p_i = \frac{\exp(z_i)}{\sum_{j} \exp(z_j)},$$

where $z_i$ is the logit for a particular class $i$. We train the network using the cross-entropy loss $L$, which can be written as

$$L = -\sum_{i} y_i \log p_i,$$

where $y_i$ represents the actual label for class $i$ and $p_i$ is the $i$-th element of the prediction output.
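To make the behavior of LSE pooling concrete, the following sketch evaluates it on a single pooling region, together with the softmax and cross-entropy steps. It is an illustration in plain Python, not the TensorFlow implementation used in the experiments; the function names are hypothetical, the hyper-parameter is called `r` as in Table 2, and subtracting the maximum before exponentiating is a standard numerical-stability rewrite that leaves the pooled value unchanged.

```python
import math


def lse_pool(values, r):
    """Log-Sum-Exp pooling of a flat list of activations, with r > 0."""
    peak = max(values)   # subtracting the max keeps exp() from overflowing
    mean_exp = sum(math.exp(r * (v - peak)) for v in values) / len(values)
    return peak + math.log(mean_exp) / r


def softmax(logits):
    """Turn logits into probabilities that sum to one."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


region = [0.0, 1.0, 2.0, 5.0]          # hypothetical 2x2 pooling region
near_average = lse_pool(region, 1e-3)  # small r: close to the mean, 2.0
near_maximum = lse_pool(region, 1e3)   # large r: close to the max, 5.0
loss = cross_entropy([0, 1, 0], softmax([0.0, 0.0, 0.0]))
```

With intermediate values of `r`, such as those in Table 2, the pooled value lies between the average and the maximum of the region.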

4.2 Experimental results

Dataset and training. The dataset in this scenario consists of grayscale (single-view) and RGB (multi-view) images, divided into a training set and a test set. The training and test images are randomly selected from each class at a fixed ratio. We add L2 regularization [10] to the loss function to prevent overfitting. We optimize the model with the Adam [23] method on an Nvidia GTX1080 GPU. The model is implemented in TensorFlow [6].

Classification results. We apply the proposed network to both the single-view and the multi-view datasets. We vary the hyper-parameter $r$ to adjust the pooled value during training; $r$ is assigned the values 0.1, 5, and 10. We also compare LSE pooling with average pooling and max pooling. Table 2 shows the evaluation results of our experiment.

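Dataset      Channel  r        Accuracy (%)
Single-View  1        0 (AVG)  90.8
Single-View  1        0.1      91.3
Single-View  1        5        89.5
Single-View  1        10       92.1
Single-View  1        ∞ (MAX)  91.2
Multi-View   3        0 (AVG)  91.7
Multi-View   3        0.1      92.9
Multi-View   3        5        90.9
Multi-View   3        10       93.6
Multi-View   3        ∞ (MAX)  91.9
Table 2: Classification accuracies using single-/multi-view projection images. The best results are 92.1 (single-view) and 93.6 (multi-view), both at r = 10.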

The accuracies achieved with the multi-view dataset are consistently higher than those achieved with the single-view dataset. This is because the multi-view dataset provides the complete feature space of an object through all three perpendicular projection views. The projections that humans cannot recognize carry implicit information that helps to identify the objects. When $r$ is as small as 0.1, LSE pooling is close to average pooling. The accuracies are 91.3% and 92.9%, respectively, which are better than the average/max pooling results. When we increase $r$ to 5, the accuracies reach their lowest values, 89.5% and 90.9%. They reach their best values at $r = 10$, i.e., 92.1% and 93.6%. Overall, with a good choice of $r$, LSE pooling improves the performance compared with simple average or max pooling.

5 Case study: early-stage 3D object recognition with cumulative visual monitoring

In the second scenario, in which the 3D object recognition module has no access to the G-code, we consider that the module receives a stream of images from the embedded cameras, which are perpendicular to the $xy$, $yz$, and $xz$ planes and face the printing workspace. The module runs three recognition threads, each of which keeps reading the object’s projection images on one of the three planes, and terminates the printing process when any of these threads recognizes the object as a weapon.

5.1 Model

Our goal is to enable the system to read a stream of images and give a prediction every time it receives an image. We keep track of the moving averages of the predictions for all classes and take them as the fused prediction at the current step. When the moving-average prediction for a certain class rises above the confidence threshold, the model decides that the object belongs to that class. The images are converted to vector representations, and a recurrent neural network (RNN) model handles the sequence of image representations. The proposed architecture is summarized in Fig. 9. Our problem is analogous to the activity recognition problem for video [13]. Our prediction of the object class is based on a moving average, whereas [13] uses an average over a fixed number of frames.

Figure 9: Model overview. Each image in a sequence is first processed by a CNN to construct a fixed-length feature vector. The feature vector is then transformed into an image embedding, which serves as the input to the RNN model at that step. The initial state of the RNN is set to a vector of zeroes. At each step, the RNN gives a prediction of the object class in 3D printing.

Image representation. In this work, we use a simple 3-layer CNN, similar to that in the previous section, to extract the image features. We use the rectified linear unit (ReLU) [31] as the activation function and apply batch normalization [21] as well. At each step $t$ of the sequence, the representation $v_t$ of the grayscale image $I_t$ is computed as

$$v_t = W \left[ \mathrm{CNN}_{\theta}(I_t) \right] + b,$$

where $\mathrm{CNN}_{\theta}(I_t)$ first transforms the image into a feature tensor with respect to the CNN parameters $\theta$; the tensor is then globally pooled over the spatial dimensions (width and height) to obtain a fixed-length feature vector. $W$ is the transformation matrix that maps this feature vector into the embedding space, and $b$ is the bias. Thus each image at step $t$ is represented as a vector $v_t$ in the embedding space.

Sequence representation. We use an RNN to model the sequence of projection images in the order in which they are observed during printing. Because our aim is not to generate predictions over the complete sequence of images but to make a confident recognition of the object while it is being printed, we use a forward RNN instead of a bidirectional RNN (BRNN) [35]. We use two different RNN cells in this work: long short-term memory (LSTM) [18] and gated recurrent unit (GRU) [11]. The output of the RNN at each step is connected to the final classification layer, a fully-connected layer whose output dimension equals the number of classes. The softmax function [32] is used to normalize the prediction, as the classes are mutually exclusive. We use the truncated backpropagation through time (BPTT) learning algorithm [12] to compute parameter gradients on short subsequences of the training image embeddings. Activations are forward propagated and gradients are backward propagated for a fixed number of steps (typically 20 in this work). Cross-entropy gradients are computed for this subsequence and back-propagated to its start. This truncation handles the relatively long sequences well; as shown in Table 1, the models have many layers, and we collect one image every five layers. We use the average softmax cross-entropy over the truncated BPTT window as the training loss, and the moving average of the predictions during testing. When the moving average of the softmax output for any class exceeds the confidence threshold, we stop the inference for the sequence and take that class as the label prediction of the sequence. If no moving average exceeds the threshold by the last image of the sequence, we take the class with the maximum softmax value at the last step as the label prediction.
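The test-time decision rule can be sketched independently of the network. The function below consumes per-step softmax outputs and returns the predicted class together with the step at which inference stops. Interpreting the moving average as a running mean over all steps seen so far is an assumption of this sketch, and the function name and the sample probabilities are hypothetical.

```python
def classify_stream(step_probs, threshold):
    """Return (predicted_class, stop_step) for per-step softmax outputs.

    A running mean of the outputs is kept for every class; inference stops
    at the first step where the largest running mean exceeds the threshold.
    """
    running = [0.0] * len(step_probs[0])
    for t, probs in enumerate(step_probs, start=1):
        # Incremental update of the mean over steps 1..t.
        running = [avg + (p - avg) / t for avg, p in zip(running, probs)]
        best = max(range(len(running)), key=running.__getitem__)
        if running[best] > threshold:
            return best, t
    # Threshold never exceeded: fall back to the arg-max of the last step.
    last = step_probs[-1]
    return max(range(len(last)), key=last.__getitem__), len(step_probs)


# Hypothetical softmax outputs for a 3-class problem over 4 steps.
outputs = [
    [0.40, 0.30, 0.30],
    [0.20, 0.60, 0.20],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
]
early = classify_stream(outputs, threshold=0.5)
late = classify_stream(outputs, threshold=0.9)
```

Here the lower threshold stops at step 3, while the higher one is never exceeded and falls back to the last step, which mirrors the trade-off between early termination and confidence.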

5.2 Experimental results

Dataset and training. The dataset in this scenario consists of single-view grayscale image sequences categorized into 10 classes. The sequence length varies with the number of layers of the model (Table 1). The training and test sequences are randomly selected from each class at a fixed partitioning ratio. We use the same optimization method and platform as in the first scenario.

Recognition results. We apply the proposed approach to the single-view image sequences. We vary the confidence threshold $th$, which is assigned the values 0.1, 0.3, 0.5, and 0.7. To make sure that the prediction value is able to exceed $th$ in most cases, we do not use larger thresholds. Table 3 shows the evaluation results of our experiment. We also report the average step index at which the inference for a sequence stops.

RNN cell   th   Accuracy (%)  avg. stop step
LSTM [18]  0.1  12.5          17.5
LSTM [18]  0.3  46.8          39.6
LSTM [18]  0.5  69.1          58.3
LSTM [18]  0.7  77.4          73.4
GRU [11]   0.1  14.6          19.4
GRU [11]   0.3  44.7          41.2
GRU [11]   0.5  68.5          50.1
GRU [11]   0.7  78.4          70.8
Table 3: Benchmark results using single-view projection image sequences. The best accuracy, 78.4%, is obtained by GRU at th = 0.7.

GRU and LSTM achieve approximately equivalent performance in this task. Table 3 shows that with a small th the system makes its decision too quickly, and the accuracy is low. As th increases, both the accuracy and the average stop step index increase. This is because the moving-average softmax outputs indicate the prediction confidence for each class: with more steps, the system gets more information about the object and is able to make more confident predictions. A large th ensures that the inference process accepts enough images to make a confident prediction before it stops. Since unconfident predictions give low values for a specific class, inference does not stop easily when th is large. Note that an accuracy higher than 78.4% can be achieved with a confidence threshold above 0.7, but more steps are needed. The setting of the confidence threshold depends on the practical requirement on accuracy or on the stop step index. The 3D printer can save printing material by stopping the printing process at an early step, at the risk of misrecognizing the object.
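The trade-off between accuracy and stop step can be made concrete with the numbers in Table 3. The sketch below is only a lookup over the reported results; the helper names `earliest_stop_with_accuracy` and `best_accuracy_within_steps` are hypothetical, and the requirement values used in practice would be set by the deployment, not by the paper.

```python
from typing import List, Optional, Tuple

Row = Tuple[float, float, float]  # (th, accuracy in %, average stop step)

# Values copied from Table 3.
TABLE3 = {
    "LSTM": [(0.1, 12.5, 17.5), (0.3, 46.8, 39.6), (0.5, 69.1, 58.3), (0.7, 77.4, 73.4)],
    "GRU": [(0.1, 14.6, 19.4), (0.3, 44.7, 41.2), (0.5, 68.5, 50.1), (0.7, 78.4, 70.8)],
}


def earliest_stop_with_accuracy(rows: List[Row], min_accuracy: float) -> Optional[Row]:
    """Row that stops earliest among those meeting an accuracy requirement."""
    ok = [r for r in rows if r[1] >= min_accuracy]
    return min(ok, key=lambda r: r[2]) if ok else None


def best_accuracy_within_steps(rows: List[Row], max_stop_step: float) -> Optional[Row]:
    """Most accurate row among those stopping within a step budget."""
    ok = [r for r in rows if r[2] <= max_stop_step]
    return max(ok, key=lambda r: r[1]) if ok else None
```

For example, requiring at least 65% accuracy from the GRU selects th = 0.5, while no reported setting reaches 90%, in which case the lookup returns `None`.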

6 Conclusion

The abuse of 3D printing technology to produce illegal weapons calls for an intelligent 3D printer with early-stage malicious activity detection. The 3D printer should identify the objects to be printed, so that the manufacturing of an illegal weapon can be terminated at an early stage. The lack of a large-scale dataset obstructs the development of intelligent 3D printers equipped with deep learning techniques. The construction of a 3D printing image database at this scale, together with recognition benchmarks, has not been addressed before this work. We design two working scenarios for an intelligent 3D printer and provide the corresponding image datasets (tens of thousands and hundreds of thousands of images). We also conduct quantitative performance benchmarking of ten-class 3D object recognition from single images and from image sequences using the C3PO database. This work introduces the idea of designing an object-aware 3D printing system. The main goal is to initiate the fusion of 3D printing technology with deep learning techniques from the computer vision domain, enabling the secure use of 3D printing technology. As 3D models are highly customized and diverse, building a robust recognition system remains a tough task. In future work, C3PO will include more common 3D models, especially firearms.

Supplementary material

More details about the data

In Table 4, we show the plots and categorization of the 22 3D models. The models are plotted by Ultimaker. We also show an example projection of each 3D model. Please note that the projections are converted using the G0 coordinates in the generated G-codes. The G0 coordinates depict the outlines of the models. However, some white pixels are missing in the projections, because those locations are not described by G0 coordinates. Instead, they are printed along with the nozzle's movement during the printing process, which is described by G1 codes. In this work, we consider the projections converted from G0 coordinates alone to be sufficient to fully outline the 3D objects. The database can be accessed through link:


| Class Name | Model Name |
|---|---|
| Pistol | Beretta Prop Gun |
| Pistol | Desert Eagle Gun |
| Pistol | PX4 |
| Revolver | Assemble |
| Revolver | Smith & Wesson |
| Revolver | SOTG |
| Special Revolver | Blade runner |
| Special Revolver | Rev3 Rubber Band Gun |
| Special Revolver | Mal GunV2 |
| Gun Part | Peter NERF |
| Gun Part | Pist Corpo |
| Gun Part | Red Hood Gun |
| Gun-like objects | Gun Key-chain |
| Gun-like objects | Banana Gun |
| Gun-like objects | MIB Gun |
| Toy Gun | Ronald Ray gun |
| Toy Gun | Space Dandy gun |
| Toy Gun | Splator |
| Portal Gun | Portal Gun |
| Cup | Cup |
| Father’s Day Gift | Father’s Day Gift |
| Horse | Horse |

Table 4: Model Details (model plot and projection images omitted).
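The conversion from G0 coordinates to a projection image, as described in the paragraph before Table 4, might be sketched as follows. This is an assumption-laden illustration rather than the authors' converter: the function names `g0_points` and `project`, the image size, and the rasterization scheme are all hypothetical, and the parser assumes absolute positioning with whitespace-separated words, as in typical slicer output.

```python
import re
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

_COORD = re.compile(r"([XYZ])(-?\d*\.?\d+)")
_AXIS = {"X": 0, "Y": 1, "Z": 2}


def g0_points(gcode: str) -> List[Point]:
    """Collect the (x, y, z) target of every G0 move.

    G1 moves also update the tracked position (coordinates are modal),
    but only G0 targets are recorded. ASSUMPTION: absolute positioning.
    """
    pos = [0.0, 0.0, 0.0]
    points: List[Point] = []
    for raw in gcode.splitlines():
        words = raw.split(";", 1)[0].split()  # drop ';' comments
        if not words or words[0] not in ("G0", "G00", "G1", "G01"):
            continue
        for word in words[1:]:
            m = _COORD.fullmatch(word)
            if m:
                pos[_AXIS[m.group(1)]] = float(m.group(2))
        if words[0] in ("G0", "G00"):
            points.append((pos[0], pos[1], pos[2]))
    return points


def project(points: Sequence[Point], axes: Tuple[int, int] = (0, 1),
            size: int = 64) -> List[List[int]]:
    """Rasterize points onto one coordinate plane as a binary image.

    axes=(0, 1) is the XY plane, (0, 2) the XZ plane, (1, 2) the YZ plane.
    Both axes share one scale so the aspect ratio is preserved.
    """
    img = [[0] * size for _ in range(size)]
    if not points:
        return img
    us = [p[axes[0]] for p in points]
    vs = [p[axes[1]] for p in points]
    u0, v0 = min(us), min(vs)
    span = max(max(us) - u0, max(vs) - v0) or 1.0
    scale = (size - 1) / span
    for u, v in zip(us, vs):
        col = int(round((u - u0) * scale))
        row = size - 1 - int(round((v - v0) * scale))  # image rows grow downward
        img[row][col] = 1
    return img
```

Calling `project` three times with the three axis pairs gives one binary image per coordinate plane. Locations reached only by G1 moves stay black, which matches the missing pixels noted above.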

Prototyping the intelligent 3D printer

Fig. 10 shows our first prototype for securing the 3D printing process by terminating any illegal firearm production. We implement the proposed networks on the Nvidia Jetson TX1 embedded system, which can achieve fast inference and real-time detection, as the system includes 256 CUDA cores. The pictures captured by the camera need to be processed to obtain the projection. As the wall of the printer's workspace is black and the printed object is monochromatic, it is not complicated to implement a heuristic filtering program that converts the captured image to binary values. Building a ready-for-sale intelligent 3D printer that prevents malicious activities in 3D printing nevertheless remains a strenuous task.
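A heuristic filter of the kind mentioned above could be as simple as a fixed intensity threshold followed by a crop to the object's bounding box. The sketch below is hypothetical: the names `binarize` and `crop_to_object` and the threshold value are assumptions, and it presumes that the printed object appears brighter than the black workspace wall in an 8-bit grayscale capture.

```python
from typing import List

Image = List[List[int]]


def binarize(gray: Image, threshold: int = 64) -> Image:
    """Map pixels brighter than `threshold` to 1 (object), others to 0.

    ASSUMPTION: 8-bit grayscale input and a hand-picked threshold.
    """
    return [[1 if px > threshold else 0 for px in row] for row in gray]


def crop_to_object(binary: Image) -> Image:
    """Crop a binary image to the bounding box of its non-zero pixels."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        return []  # nothing printed yet
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]
```

The cropped binary image would then be resized to the network's input resolution before being fed to the recognition module.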

Figure 10: (a) 3D printer: Stratasys uPrint Plus. (b) 3D object recognition module implemented on the Nvidia Jetson TX1 embedded system. (c) Internal view of the 3D printer; the 3D recognition module and cameras can be mounted on the wall and/or ceiling. (d) The 3D object recognition module with camera; the camera can be connected through an on-board socket or cable.


  1. We only use the sequence variants in the first quadrant of the 3D Euclidean space.


  1. 34-year-old french man 3d printed fake fronts for cashpoints to steal thousands. http://www.3ders.org/articles/20140823-34-year-old-french-man-3d-printed-fake-fronts-for-cashpoints-to-steal-thousands.html
  2. 3d printed bullets developed and tested by russian researchers. https://3dprint.com/156003/russia-3d-prints-bullets/
  3. 3d printer confiscated in organized crime raid. https://www.engineering.com/3DPrinting/3DPrintingArticles/ArticleID/8642/3D-Printer-Confiscated-in-Organized-Crime-Raid.aspx
  4. Ultimaker cura software. https://ultimaker.com/en/products/ultimaker-cura-software
  5. G-code. https://en.wikipedia.org/wiki/G-code (2018)
  6. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015), https://www.tensorflow.org/, software available from tensorflow.org
  7. Association, E.I., et al.: Eia standard eia-274-d interchangeable variable block data format for positioning, contouring, and contouring/positioning numerically controlled machines (1979)
  8. Bayens, C., Le, T., Garcia, L., Beyah, R., Javanmard, M., Zonouz, S.: See no evil, hear no evil, feel no evil, print no evil? malicious fill patterns detection in additive manufacturing. In: 26th USENIX Security Symposium (USENIX Security 17). pp. 1181–1198. USENIX Association (2017)
  9. Bryans, D.: Unlocked and loaded: government censorship of 3d-printed firearms and a proposal for more reasonable regulation of 3d-printed goods. Ind. LJ 90, 901 (2015)
  10. Bühlmann, P., Van De Geer, S.: Statistics for high-dimensional data: methods, theory and applications. Springer Science & Business Media (2011)
  11. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)
  12. Dean, J., Corrado, G., Monga, R., Chen, K., Devin, M., Mao, M., Senior, A., Tucker, P., Yang, K., Le, Q.V., et al.: Large scale distributed deep networks. In: Advances in neural information processing systems. pp. 1223–1231 (2012)
  13. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2625–2634 (2015)
  14. Fruehauf, J.D., Hartle, F.X., Al-Khalifa, F.: 3d printing: The future crime of the present. In: Proceedings of the Conference on Information Systems Applied Research ISSN. vol. 2167, p. 1508 (2016)
  15. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? the kitti vision benchmark suite. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. pp. 3354–3361. IEEE (2012)
  16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778 (2016)
  17. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European Conference on Computer Vision. pp. 630–645. Springer (2016)
  18. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural computation 9(8), 1735–1780 (1997)
  19. Hornick, J.: 3d printing new kinds of crime. http://www.policechiefmagazine.org/3d-printing-new-kinds-of-crime
  20. Hurd, R.: Homemade gun in stanford student’s murder-suicide spurs question on ’ghost guns’. https://www.mercurynews.com/2015/08/06/homemade-gun-in-stanford-students-murder-suicide-spurs-question-on-ghost-guns/ (2016)
  21. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. pp. 448–456 (2015)
  22. Johns, E., Leutenegger, S., Davison, A.J.: Pairwise decomposition of image sequences for active multi-view recognition. In: Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on. pp. 3813–3822. IEEE (2016)
  23. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  24. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems. pp. 1097–1105 (2012)
  25. Libicki, M.: Information technology standards: quest for the common byte. Elsevier (2016)
  26. Little, R.K.: Guns don’t kill people, 3d printing does: Why the technology is a distraction from effective gun controls. Hastings LJ 65, 1505 (2013)
  27. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3431–3440 (2015)
  28. Maturana, D., Scherer, S.: Voxnet: A 3d convolutional neural network for real-time object recognition. In: Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. pp. 922–928. IEEE (2015)
  29. McCutcheon, C.R.: Deeper than a paper cut: Is it possible to regulate three-dimensionally printed weapons or will federal gun laws be obsolete before the ink has dried. U. Ill. JL Tech. & Pol’y p. 219 (2014)
  30. Molitch-Hou, M.: Ar-15 with 3d printed lower receiver seized in oregon. https://3dprintingindustry.com/news/ar-15-with-3d-printed-lower-receiver-seized-in-oregon-52234/ (2018)
  31. Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10). pp. 807–814 (2010)
  32. Nasrabadi, N.M.: Pattern recognition and machine learning. Journal of electronic imaging 16(4), 049901 (2007)
  33. Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from rgbd images. In: ECCV (2012)
  34. Pinheiro, P.O., Collobert, R.: From image-level to pixel-level labeling with convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1713–1721 (2015)
  35. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45(11), 2673–2681 (1997)
  36. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  37. Song, C., Lin, F., Ba, Z., Ren, K., Zhou, C., Xu, W.: My smartphone knows what you print: Exploring smartphone-based side-channel attacks against 3d printers. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. pp. 895–907. ACM (2016)
  38. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15(1), 1929–1958 (2014)
  39. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1–9 (2015)
  40. Wang, C., Pelillo, M., Siddiqi, K.: Dominant set clustering and pooling for multi-view 3d object recognition. In: Proceedings of British Machine Vision Conference (BMVC) (2017)
  41. Xiang, Y., Kim, W., Chen, W., Ji, J., Choy, C., Su, H., Mottaghi, R., Guibas, L., Savarese, S.: Objectnet3d: A large scale database for 3d object recognition. In: European Conference Computer Vision (ECCV) (2016)
  42. Xiang, Y., Mottaghi, R., Savarese, S.: Beyond pascal: A benchmark for 3d object detection in the wild. In: Applications of Computer Vision (WACV), 2014 IEEE Winter Conference on. pp. 75–82. IEEE (2014)