Converting TensorFlow 2 BERT Transformer Models
The following examples demonstrate converting TensorFlow 2 models to Core ML using Core ML Tools.
Convert the DistilBERT Transformer Model
The following example converts the DistilBERT model from Hugging Face to Core ML.
Install Transformers
You may need to first install Transformers version 4.17.0.
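For example, using pip:
pip install transformers==4.17.0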
Follow these steps:
- Add the import statements:
import numpy as np
import coremltools as ct
import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertForMaskedLM
- Load the DistilBERT model and tokenizer. This example uses the TFDistilBertForMaskedLM variant:
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased')
distilbert_model = TFDistilBertForMaskedLM.from_pretrained('distilbert-base-cased')
- Describe and set the input layer, and then build the TensorFlow model (tf_model), as shown below. An optional sanity check follows the model-building code.
max_seq_length = 10
input_shape = (1, max_seq_length)  # (batch_size, maximum_sequence_length)
input_layer = tf.keras.layers.Input(shape=input_shape[1:], dtype=tf.int32, name='input')
prediction_model = distilbert_model(input_layer)
tf_model = tf.keras.models.Model(inputs=input_layer, outputs=prediction_model)
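As an optional sanity check (our suggestion, not a required step), you can run the Keras model on a zero-filled batch before converting, to confirm that it builds and produces logits:
# Run the Keras model on a dummy batch to confirm shapes before conversion.
# Depending on the Transformers version, the output may be a dict keyed by 'logits'.
dummy_input = np.zeros(input_shape, dtype=np.int32)
tf_outputs = tf_model.predict(dummy_input)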
- Convert the tf_model to a Core ML neural network (mlmodel):
mlmodel = ct.convert(tf_model)
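You can optionally save the converted model to disk at this point; the filename below is just an example:
mlmodel.save("DistilBERT.mlmodel")  # example filename; use .mlpackage if converting to an ML program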
- Create the input using the tokenizer:
# Fill the input with zeros to adhere to input_shape; use int32 to match the input layer's dtype
input_values = np.zeros(input_shape, dtype=np.int32)
# Store the tokens from our sample sentence into the input
input_values[0, :8] = np.array(tokenizer.encode("Hello, my dog is cute")).astype(np.int32)
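The sentence encodes to eight token IDs ([CLS], six word tokens, and [SEP]), which is why only positions [0, :8] are filled. You can confirm this by printing the encoding:
print(tokenizer.encode("Hello, my dog is cute"))  # expect 8 ids for this sentence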
- Use mlmodel for prediction:
mlmodel.predict({'input': input_values})  # 'input' is the name of the input layer defined earlier
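To interpret the result, look up the output name and decode the highest-scoring token at each position. The following is a minimal sketch, assuming a single logits output of shape (1, max_seq_length, vocab_size); the actual output name depends on the conversion, so it is discovered rather than hard-coded:
outputs = mlmodel.predict({'input': input_values})
print(list(outputs.keys()))           # discover the output name rather than assuming it
logits = list(outputs.values())[0]    # assumed: the only output is the logits tensor
predicted_ids = np.argmax(logits, axis=-1)
print(tokenizer.decode(predicted_ids[0, :8].tolist()))  # decode predictions for the filled positions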
Convert the TF Hub BERT Transformer Model
The following example converts the BERT model from TensorFlow Hub. Follow these steps:
- Add the import statements:
import numpy as np
import tensorflow as tf
import tensorflow_hub as tf_hub
import coremltools as ct
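If tensorflow_hub is not already installed, it is available from pip (note that the package name uses a hyphen):
pip install tensorflow-hub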
- Describe and set the three input layers:
max_seq_length = 384
input_shape = (1, max_seq_length)
input_words = tf.keras.layers.Input(
    shape=input_shape[1:], dtype=tf.int32, name='input_words')
input_masks = tf.keras.layers.Input(
    shape=input_shape[1:], dtype=tf.int32, name='input_masks')
segment_ids = tf.keras.layers.Input(
    shape=input_shape[1:], dtype=tf.int32, name='segment_ids')
- Build the TensorFlow model (tf_model):
bert_layer = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1", trainable=False)
pooled_output, sequence_output = bert_layer(
    [input_words, input_masks, segment_ids])
tf_model = tf.keras.models.Model(
    inputs=[input_words, input_masks, segment_ids],
    outputs=[pooled_output, sequence_output])
- Convert the tf_model to a Core ML neural network (mlmodel):
mlmodel = ct.convert(tf_model, source='TensorFlow')
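To run the converted model, feed it the three inputs defined above. The sketch below uses zero-filled placeholders (an assumption for illustration only); real use requires token IDs, attention masks, and segment IDs produced by a WordPiece tokenizer that matches this TF Hub model's vocabulary:
# Zero-filled placeholders for the three BERT inputs; in practice, fill these
# with token ids, attention masks, and segment ids from a matching tokenizer
input_words_val = np.zeros(input_shape, dtype=np.int32)
input_masks_val = np.zeros(input_shape, dtype=np.int32)
segment_ids_val = np.zeros(input_shape, dtype=np.int32)
outputs = mlmodel.predict({
    'input_words': input_words_val,
    'input_masks': input_masks_val,
    'segment_ids': segment_ids_val,
})
print(list(outputs.keys()))  # the pooled and sequence outputs; names depend on the conversion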