Flexible Input Shapes

While some neural network models accept only fixed-size input, such as an image with a resolution of 224 x 224 pixels, other models require flexible input shapes that are determined at runtime. Examples include language models that operate on arbitrary input lengths, and style transfer models that work at multiple resolutions.

When converting a model to Core ML using coremltools, you can specify either a fixed input shape or a flexible input shape. With a flexible input shape you can specify the input in one of the following ways:

- Select from predetermined shapes
- Set the range for each dimension
- Enable unbounded ranges

In addition, you can set a default shape to preallocate memory for the input.

Select from predetermined shapes

You can specify a set of predetermined shapes for the input in order to optimize performance, ensure model accuracy, or limit the range of inputs for other purposes. To create a Core ML model with this constraint, use EnumeratedShapes. The following example converts a transformer model, constraining the sequence length to a multiple of 10 by enumerating the allowed shapes:

import coremltools as ct

# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"

# Specifying that the sequence must be in [10, 20, 30, 40, 50]
shapes = [(1, 10*i, 1024) for i in range(1, 6)]
input_shape = ct.EnumeratedShapes(shapes=shapes)
model_input = ct.TensorType(shape=input_shape)

# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])
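
After conversion, any of the enumerated shapes is valid at prediction time. As a quick sanity check, a minimal sketch follows; the feature name "input" is an assumption, since the actual name comes from the source graph (inspect the converted model's input descriptions to find it):

import numpy as np

# The feature name "input" is hypothetical; check the converted
# model's input descriptions for the real name.
for seq_len in (10, 30, 50):
    x = np.random.rand(1, seq_len, 1024).astype(np.float32)
    prediction = mlmodel.predict({"input": x})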

👍

Enumerated shapes provide a performance advantage

Use EnumeratedShapes for best performance. During compilation, the model can be optimized on the device for the finite set of input shapes. If you need more flexibility for inputs, consider setting the range for each dimension or enabling unbounded ranges.

📘

Multi-input models

Note that for a multi-input model, only one of the inputs can be marked with enumerated shapes; the rest must have fixed single shapes, as illustrated in the sketch below. If you require multiple inputs to be flexible, use the range flexibility described in the next section.
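
The following is a minimal sketch of that restriction, assuming a hypothetical two-input model; the model path and the input names "tokens" and "memory" are placeholders rather than names taken from a real graph:

import coremltools as ct

# Hypothetical two-input model
two_input_model = "TwoInput-Transformer.pb"

# One input may use enumerated shapes...
flexible_shape = ct.EnumeratedShapes(shapes=[(1, 10*i, 1024) for i in range(1, 6)])
flexible_input = ct.TensorType(name="tokens", shape=flexible_shape)

# ...while every other input keeps a single fixed shape
fixed_input = ct.TensorType(name="memory", shape=(1, 50, 1024))

mlmodel = ct.convert(model=two_input_model, inputs=[flexible_input, fixed_input])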

Set the range for each dimension

If you know that the input shape will be within a specific interval in each dimension, set the range for each dimension with RangeDim.

The following example uses a specific interval. It converts a machine translation transformer model that requires a 3D input of shape [batch, sequence, feature_len]. At runtime, the model can accept any sequence length in the range [1, 50]. You can specify this range during conversion using the following code:

import coremltools as ct

# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"

# Range for the sequence dimension to be between [1, 50]
input_shape = ct.Shape(shape=(1, ct.RangeDim(1, 50), 1024))
model_input = ct.TensorType(shape=input_shape)

# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])

Enable unbounded ranges

You can enable an unbounded range by constructing RangeDim with no upper bound, which allows the input to be as large as needed. However, the device's memory constraints determine how large the input can actually be. Whenever possible, use RangeDim with a set maximum so that the model doesn't crash at runtime for your users. The following example demonstrates an unbounded range:

import coremltools as ct

# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"

# Range for the sequence dimension is "arbitrary" (no upper bound)
input_shape = ct.Shape(shape=(1, ct.RangeDim(), 1024))
model_input = ct.TensorType(shape=input_shape)

# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])

Set a default shape

You can set a default shape for RangeDim or EnumeratedShapes so that Core ML preallocates memory for that shape, which speeds up inference. The following example converts a model with a default shape of (1, 50, 1024) while maintaining input flexibility:

import coremltools as ct

# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"

# Sequence must be in [10, 20, 30, 40, 50]; default is (1, 50, 1024)
shapes = [(1, 10*i, 1024) for i in range(1, 6)]
input_shape = ct.EnumeratedShapes(shapes=shapes, 
                                  default=(1, 50, 1024))
model_input = ct.TensorType(shape=input_shape)

# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])

📘

About memory preallocation

When the default is set, Core ML preallocates the memory needed for the default shape. No reallocation occurs as long as the inputs fed to the model fit within the preallocated block. For seq2seq models, for example, you might set the default to the maximum sequence length so that no memory reallocation happens at inference time. The tradeoff is that the model may use more memory than it needs.
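
As a sketch of that approach, you can pin the default of a RangeDim to its upper bound so the preallocated block covers every shape the model accepts (RangeDim takes a default argument):

import coremltools as ct

# Default pinned to the maximum sequence length: the preallocated
# block is large enough for every input in the range [1, 50]
input_shape = ct.Shape(shape=(1, ct.RangeDim(1, 50, default=50), 1024))
model_input = ct.TensorType(shape=input_shape)

mlmodel = ct.convert(model="MT-Transformer.pb", inputs=[model_input])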

API reference

For details, see EnumeratedShapes and RangeDim.