Transformers supports framework interoperability between PyTorch, TensorFlow, and JAX, which means you can train a model in one framework and load it for inference in another. This is super convenient because different frameworks have their own strengths and weaknesses depending on the task at hand.
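As a quick illustration of that interoperability, here is a minimal sketch of loading the same BERT checkpoint into both frameworks: the PyTorch weights are saved locally and then read into the TensorFlow class by passing from_pt=True (this assumes both PyTorch and TensorFlow are installed, and the './bert-pt' folder name is just a placeholder):
# Load the checkpoint with the PyTorch class and save its weights locally
from transformers import BertModel, TFBertModel
pt_model = BertModel.from_pretrained('bert-base-uncased')
pt_model.save_pretrained('./bert-pt')  # './bert-pt' is a placeholder folder name
# Load those PyTorch weights into the TensorFlow class with from_pt=True
tf_model = TFBertModel.from_pretrained('./bert-pt', from_pt=True)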
For example, let’s say you want to fine-tune a pretrained language model like BERT (Bidirectional Encoder Representations from Transformers) for text classification. You can do this in just a few lines of code using TensorFlow’s Keras API:
# Import necessary libraries
from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf
# Load the pretrained tokenizer and the BERT model for sequence classification with 2 labels
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
# Preprocess your data: tokenize the raw texts and collect the matching labels
# (train_texts is your list of strings, train_labels your list of integer class ids)
inputs = dict(tokenizer(train_texts, padding=True, truncation=True, return_tensors='tf'))
labels = tf.convert_to_tensor(train_labels)
# Train the model on your data using TensorFlow's Keras API
# The model outputs raw logits, so the loss needs from_logits=True;
# a small learning rate is typical when fine-tuning BERT
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(inputs, labels, epochs=3)  # Train the model for 3 epochs
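Since the inference snippet further down loads the model from a local folder, you first need to save what you just trained; a minimal sketch using save_pretrained, where 'models/your-model-name' is simply a placeholder path:
# Save the fine-tuned model and tokenizer so they can be reloaded later
# ('models/your-model-name' is a placeholder, use any folder you like)
model.save_pretrained('models/your-model-name')
tokenizer.save_pretrained('models/your-model-name')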
That’s it! You can then use the trained model to make predictions on new data:
# Load your fine-tuned model and tokenizer (assuming you saved them under 'models/your-model-name' as shown above)
# Import necessary libraries
from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf
# Load the tokenizer and model from the saved directory
tokenizer = BertTokenizer.from_pretrained('models/your-model-name')
model = TFBertForSequenceClassification.from_pretrained('models/your-model-name')  # num_labels is read from the saved config
# Preprocess the new data: tokenize the raw texts (no labels are needed for inference)
inputs = dict(tokenizer(new_texts, padding=True, truncation=True, return_tensors='tf'))
# Run the loaded model on the tokenized inputs
# The model returns raw logits, so take the argmax to get the predicted class ids
outputs = model(inputs)
predictions = tf.argmax(outputs.logits, axis=-1)
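If you want human-readable labels instead of class ids, the model config keeps an id2label mapping (it defaults to 'LABEL_0'/'LABEL_1' unless you set your own names when creating the model); a short sketch:
# Map predicted class ids back to label names via the model config
# (defaults to 'LABEL_0'/'LABEL_1' unless you provided custom names)
predicted_labels = [model.config.id2label[int(i)] for i in predictions]
print(predicted_labels)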
Transformers also provides tools for downloading and training state-of-the-art models in other modalities, such as computer vision (e.g., image classification, object detection, segmentation), audio (e.g., automatic speech recognition, audio classification), and multimodal tasks like table question answering and visual question answering.
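For a quick taste of those other modalities, the high-level pipeline API downloads a suitable pretrained model and runs it in one call; a minimal sketch for image classification (the 'cat.jpg' path is just an illustrative placeholder):
from transformers import pipeline
# Create an image-classification pipeline (downloads a default pretrained vision model)
classifier = pipeline('image-classification')
# Classify an image by URL or local path ('cat.jpg' is a placeholder)
results = classifier('cat.jpg')
print(results)  # a list of {'label': ..., 'score': ...} dictionaries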
Overall, Transformers is a super useful library for anyone working with NLP or other modalities who wants to take advantage of pretrained models without having to train them from scratch. It’s also great if you want to experiment with different frameworks (PyTorch, TensorFlow, JAX) and see which one works best for your specific task.