Image Inputs

By default, the coremltools converter generates a model with a multidimensional array (MultiArray) as the input. For example, you can call convert with just the model as the argument to convert a TensorFlow 2 model to a Core ML model with an input of type MLMultiArray:

# Convert to Core ML with an MLMultiArrayType
model = ct.convert(tf_model)

The MultiArray input type is convenient, but you may instead want to generate a model that takes an image as the input. To do so, provide the inputs argument with an ImageType:

# Convert to Core ML with an Image Input Type
model = ct.convert(tf_model, inputs=[ct.ImageType()])

👍

Performance advantages of using ImageType

If your model expects an image as input, the best practice is to convert the model with an ImageType, which can save a few milliseconds of inference time. A few milliseconds can make a big difference because Core ML models are heavily accelerated by the Neural Engine, so an inefficient MLMultiArray copy operation can become a bottleneck in your model. Using an ImageType lets the Core ML prediction API copy over an input of type CVPixelBuffer more efficiently.

By converting a model that takes images as input to Core ML, you can apply classification models and preprocess the images using the Vision framework. You can provide an image of any size when predicting with a Core ML model, and Vision automatically resizes it for you, which makes the model much more convenient to consume on the device. The Core ML API also offers several convenient ways to initialize an image feature value.

📘

Tip

For details on how to use Vision and Core ML for image classification, see Classifying Images with Vision and Core ML.

However, be aware that the MultiArray and Image types have different input interfaces, as shown in the following examples. The differences include the inputs to the predict API in coremltools and the inputs when running on the device using the Core ML prediction API. For details about predictions, see Model Predictions.

Convert a model with a MultiArray

Using the following sample code, you can convert a TensorFlow 2 model to a Core ML model with an input of type MLMultiArray:

import coremltools as ct
import tensorflow as tf # TF 2.2.0


# Load MobileNetV2
keras_model = tf.keras.applications.MobileNetV2()
input_name = keras_model.input_names[0]

# Convert to Core ML with an MLMultiArrayType
model = ct.convert(keras_model)

# In Python, provide a numpy array as input for prediction
import numpy as np
data = np.random.rand(1, 224, 224, 3)

# Make a prediction using Core ML
out_dict = model.predict({input_name: data})

# save to disk
model.save("MobileNetV2.mlmodel")

You can view the resulting model, saved as MobileNetV2.mlmodel, in Xcode.

In the Xcode view, the input is called image_array and is of type MultiArray (1 x 224 x 224 x 3, Float32). You can rename the inputs and outputs using the rename_feature method.

Convert a model with an ImageType

You can use an ImageType to produce a model with image inputs. In the following example, the input type is specified with the class ct.ImageType. The model that is produced has an input of type Image:

import coremltools as ct

# Load MobileNetV2
import tensorflow as tf
keras_model = tf.keras.applications.MobileNetV2()
input_name = keras_model.input_names[0]

# Image Input Type
model = ct.convert(keras_model, inputs=[ct.ImageType()])

# Use PIL to load and resize the image to expected size
from PIL import Image
example_image = Image.open("daisy.jpg").resize((224, 224))

# Make a prediction using Core ML
out_dict = model.predict({input_name: example_image})

# save to disk
model.save("MobileNetV2.mlmodel")

As shown in the above example, the input for an image model must be a PIL image to invoke a prediction in Python.
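If your data starts as a NumPy array rather than an image file, you can convert it to a PIL image first. A minimal sketch, using random pixel values in place of real image data:

```python
import numpy as np
from PIL import Image

# A random H x W x 3 uint8 array standing in for real image data
array = np.random.randint(0, 255, size=(224, 224, 3), dtype=np.uint8)

# Convert to a PIL image, which is what predict() expects for an ImageType input
pil_image = Image.fromarray(array)
print(pil_image.size, pil_image.mode)  # (224, 224) RGB
```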

You can view this model in Xcode as well.

In the Xcode view, the image input is of type Image with attributes set to (Color, 224 x 224). You can rename the inputs and outputs using the rename_feature method.

Add image preprocessing options

Add scale and bias preprocessing parameters for images when initializing the ct.ImageType class. Some models require this preprocessing, unless the preprocessing is already embedded in the model itself.

For example, the TensorFlow 1 MobileNet model expects the input image to be normalized to the interval [-1, 1]. When you specify the bias and scale at conversion time, the rescaling is embedded in the model, and Core ML handles it efficiently at prediction time. The following example shows how to add the bias and scale options for the conversion.

import coremltools as ct
import tensorflow as tf # TF 1.15

keras_model = tf.keras.applications.MobileNet()

mlmodel = ct.convert(keras_model, 
                     inputs=[ct.ImageType(bias=[-1,-1,-1], scale=1/127.0)])
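As a quick sanity check on those values, Core ML applies out = scale * pixel + bias to each channel, so pixel values in [0, 255] land approximately in [-1, 1]:

```python
scale = 1 / 127.0
bias = -1.0

# Core ML applies out = scale * pixel + bias per channel
low = scale * 0 + bias      # darkest pixel
high = scale * 255 + bias   # brightest pixel
print(low, high)  # -1.0 and roughly 1.008
```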

📘

Tip

To learn how to evaluate a Core ML model with image inputs in Python, see Model Prediction.