Flexible Input Shapes
While some neural network models accept only fixed-size input, such as an image with a resolution of 224 x 224 pixels, other models require flexible input shapes that are determined at runtime. Examples include language models that work on arbitrary input lengths and style transfer models that work at multiple resolutions.
When converting a model to Core ML using coremltools, you can specify either a fixed or a flexible input shape. With a flexible input shape, you can do any of the following:
- Select from predetermined shapes to limit the input to selected shapes, optimizing performance.
- Set the range for each dimension to define the minimum and maximum for more dynamic shapes.
- Enable an unbounded range so the input can be as large as needed, for maximum flexibility.
In addition, you can set a default shape to preallocate memory for the input.
Select from predetermined shapes
You can specify a set of predetermined shapes for the input in order to optimize performance, ensure model accuracy, or otherwise limit the range of inputs. To convert a model with this constraint, set EnumeratedShapes. The following example converts a machine translation transformer model, restricting the sequence length to one of [10, 20, 30, 40, 50] by creating a set of enumerated shapes:
import coremltools as ct
# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"
# Specifying that the sequence must be in [10, 20, 30, 40, 50]
shapes = [(1, 10*i, 1024) for i in range(1, 6)]
input_shape = ct.EnumeratedShapes(shapes=shapes)
model_input = ct.TensorType(shape=input_shape)
# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])
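On a Mac, you can verify the flexibility by running a prediction with any of the enumerated shapes. The following usage sketch is illustrative only; the input name "input" is an assumption and depends on the source model:
import numpy as np
# Any of the enumerated sequence lengths (10, 20, 30, 40, 50) is valid at runtime.
# The input name ("input") is hypothetical; check the converted model's input description.
seq = np.random.rand(1, 20, 1024).astype(np.float32)
prediction = mlmodel.predict({"input": seq})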
Enumerated shapes provide a performance advantage
Use EnumeratedShapes for best performance. During compilation, the model can be optimized on the device for the finite set of input shapes. If you need more flexibility for inputs, consider setting the range for each dimension, or enabling an unbounded range.
Multi-input models
Note that for a multi-input model, only one of the inputs can be marked with enumerated shapes; the rest must have fixed shapes. If you need multiple inputs to be flexible, use the range flexibility described below.
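For illustration, the following minimal sketch converts a hypothetical two-input model; the variable multi_input_model and the input names "tokens" and "mask" are assumptions, not part of the example above:
import coremltools as ct
# Hypothetical model with two inputs: only "tokens" is flexible; "mask" stays fixed.
flexible_input = ct.TensorType(
    name="tokens",
    shape=ct.EnumeratedShapes(shapes=[(1, 10, 1024), (1, 20, 1024)]),
)
fixed_input = ct.TensorType(name="mask", shape=(1, 1024))
mlmodel = ct.convert(model=multi_input_model, inputs=[flexible_input, fixed_input])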
Set the range for each dimension
If you know that the input shape will be within a specific interval in each dimension, set the range for each dimension with RangeDim.
The following is an example of using a specific interval. It converts a machine translation transformer model that requires a 3D input with batch, sequence, and feature_len dimensions. At runtime, the model can accept any sequence length in the range [1, 50]. You can specify this range during conversion using the following code:
import coremltools as ct
# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"
# Range for the sequence dimension to be between [1, 50]
input_shape = ct.Shape(shape=(1, ct.RangeDim(1, 50), 1024))
model_input = ct.TensorType(shape=input_shape)
# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])
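Unlike enumerated shapes, any sequence length within the range is accepted at runtime. The following usage sketch is illustrative; as before, the input name "input" is an assumption:
import numpy as np
# Any sequence length in [1, 50] is valid; the name "input" is hypothetical.
for seq_len in (1, 7, 50):
    x = np.random.rand(1, seq_len, 1024).astype(np.float32)
    prediction = mlmodel.predict({"input": x})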
Enable unbounded ranges
You can specify an empty range to enable an unbounded range, which allows the input to be as large as needed. Keep in mind, however, that the device's memory constraints determine how large the input can actually be; whenever possible, use RangeDim with a set maximum to ensure that the model doesn't crash at runtime for your users. The following example demonstrates an unbounded range:
import coremltools as ct
# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"
# Range for the sequence dimension is "arbitrary" (unbounded)
input_shape = ct.Shape(shape=(1, ct.RangeDim(), 1024))
model_input = ct.TensorType(shape=input_shape)
# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])
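If you later decide to bound the range, only the RangeDim arguments change. A minimal sketch, assuming an upper limit of 512 is acceptable for your use case:
import coremltools as ct
# Same conversion as above, but with an explicit upper bound (512 is an assumed limit).
input_shape = ct.Shape(shape=(1, ct.RangeDim(1, 512), 1024))
model_input = ct.TensorType(shape=input_shape)
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])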
Set a default shape
You can set a default shape for RangeDim or EnumeratedShapes to preallocate memory for that shape, speeding up performance. The following example converts a model with a default shape of (1, 50, 1024) while maintaining input flexibility:
import coremltools as ct
# Transformer model with input shape: [batch, sequence, feature_len]
nmt_model = "MT-Transformer.pb"
# Specifying that the sequence must be in [10, 20, 30, 40, 50]
shapes = [(1, 10*i, 1024) for i in range(1, 6)]
input_shape = ct.EnumeratedShapes(shapes=shapes, default=(1, 50, 1024))
model_input = ct.TensorType(shape=input_shape)
# Convert the model
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])
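RangeDim accepts a default as well. A minimal sketch that keeps the [1, 50] sequence range from the earlier example while defaulting to the maximum length:
import coremltools as ct
# The default keyword makes Core ML preallocate memory for sequence length 50.
input_shape = ct.Shape(shape=(1, ct.RangeDim(1, 50, default=50), 1024))
model_input = ct.TensorType(shape=input_shape)
mlmodel = ct.convert(model=nmt_model, inputs=[model_input])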
About memory preallocation
When the default is set, Core ML preallocates the memory needed for the default shape. Memory reallocation does not occur if the inputs fed into the model consume less than the preallocated block. For example, for seq2seq models you might want to set the default to the maximum sequence length; by doing so, no memory reallocation happens during inference. The tradeoff is the possibility of using more memory than needed.
API reference
For details, see EnumeratedShapes and RangeDim.