Let me show you an example of how to use the `compile()` function with an explicitly specified loss function (including how to substitute a custom one), as well as how to use early stopping during training. First, let’s create some data for our model to learn from:
# Importing the necessary libraries
import tensorflow.keras as keras # Importing the Keras library and aliasing it as "keras"
from tensorflow.keras.datasets import mnist # Importing the MNIST dataset from Keras
from tensorflow.keras.utils import to_categorical # Importing the to_categorical function from Keras utilities
# Loading and preprocessing the data
(x_train, y_train), (x_test, y_test) = mnist.load_data() # Loading the MNIST dataset and splitting it into training and testing sets
x_train = x_train.reshape(-1, 784) / 255.0 # Flattening each 28x28 image into a 784-dimensional vector (to match the Dense input shape below) and normalizing pixel values to [0, 1]
x_test = x_test.reshape(-1, 784) / 255.0 # Flattening and normalizing the testing data the same way
y_train = to_categorical(y_train, num_classes=10) # Converting the training labels into one-hot encoded vectors with 10 classes
y_test = to_categorical(y_test, num_classes=10) # Converting the testing labels into one-hot encoded vectors with 10 classes
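As an optional sanity check, printing the shapes should confirm that the flattening and one-hot encoding worked as expected:
print(x_train.shape, y_train.shape) # Expected output: (60000, 784) (60000, 10)
print(x_test.shape, y_test.shape) # Expected output: (10000, 784) (10000, 10)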
Next, let’s define our model:
# Importing necessary libraries
from tensorflow.keras.models import Sequential # Importing the Sequential model from the tensorflow.keras library
from tensorflow.keras.layers import Dense # Importing the Dense layer from the tensorflow.keras library
# Defining the model
model = Sequential() # Creating an instance of the Sequential model
# Adding layers to the model
model.add(Dense(512, activation='relu', input_shape=(784,))) # Adding a Dense layer with 512 neurons, using ReLU activation function and specifying the input shape of the layer as (784,)
model.add(Dense(10, activation='softmax')) # Adding a Dense layer with 10 neurons, using Softmax activation function
# The model is now ready to be trained and used for predictions.
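Before training, it’s worth calling `model.summary()` to verify the architecture; it lists each layer with its output shape and parameter count:
model.summary() # Prints the layer stack: Dense(512) contributes 784*512 + 512 = 401,920 parameters and Dense(10) contributes 512*10 + 10 = 5,130, for 407,050 in total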
Now let’s compile the model, passing the loss function explicitly, and set up the early stopping and checkpointing callbacks:
# Import necessary libraries
from tensorflow.keras.losses import categorical_crossentropy # Importing the categorical crossentropy loss function from the tensorflow.keras library
from tensorflow.keras.optimizers import Adam # Importing the Adam optimizer from the tensorflow.keras library
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint # Importing the EarlyStopping and ModelCheckpoint callbacks from the tensorflow.keras library
# Compile the model with custom loss function and early stopping
model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['accuracy']) # Compiling the model with the categorical crossentropy loss function, Adam optimizer, and accuracy metric
early = EarlyStopping(monitor='val_loss', mode='min', patience=3, verbose=1) # Creating an EarlyStopping callback that stops training once the validation loss has failed to improve for 3 consecutive epochs
checkpoint = ModelCheckpoint('mnist-{epoch:02d}-{val_loss:.4f}.h5', monitor='val_loss', save_best_only=True, mode='min') # Creating a ModelCheckpoint callback that saves the model whenever the validation loss improves, embedding the epoch number and validation loss in the filename
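Strictly speaking, `categorical_crossentropy` above is Keras’s built-in loss rather than a custom one. `compile()` also accepts any callable that takes (y_true, y_pred) and returns per-sample losses, so a genuinely custom loss is just a Python function. Here is a minimal sketch (the name my_categorical_crossentropy and the clipping constant are illustrative choices, not Keras APIs):
import tensorflow as tf
def my_categorical_crossentropy(y_true, y_pred): # Any callable with the (y_true, y_pred) signature works here
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7) # Clip predictions away from 0 and 1 to avoid log(0)
    return -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1) # Hand-rolled categorical cross-entropy, one value per sample
# You could then compile with it in place of the built-in:
# model.compile(loss=my_categorical_crossentropy, optimizer=Adam(), metrics=['accuracy'])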
Finally, let’s train the model for up to 10 epochs with early stopping and checkpointing:
# Train the model for 10 epochs with early stopping and checkpointing
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, callbacks=[early, checkpoint]) # Fit the model on the training data, validate on the test data, and use the early stopping and checkpoint callbacks during training
This will train the model for up to 10 epochs, stopping early if the validation loss fails to improve for 3 consecutive epochs. Each time the validation loss improves, a new checkpoint file is saved with the current epoch number and validation loss in its name. Note that this example reuses the test set as validation data for simplicity; in practice you would hold out a separate validation split so the test set stays untouched until the final evaluation. Early stopping helps you avoid overfitting by halting training once the model has stopped improving on the validation set.
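After training, you can restore the best checkpoint and evaluate it on the test set. The exact filename depends on which epoch achieved the lowest validation loss, so the path below is purely illustrative:
from tensorflow.keras.models import load_model
best = load_model('mnist-05-0.0712.h5') # Illustrative filename; use whichever checkpoint file was actually saved
loss, acc = best.evaluate(x_test, y_test, verbose=0) # Evaluate the restored model on the test set
print(f'test loss: {loss:.4f}, test accuracy: {acc:.4f}')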
I hope this helps! Let me know if you have any questions or need further clarification.