It’s kind of like giving instructions to your computer in its own language. And TensorFlow is a library within Python that helps us build machine learning models.
Now let me give you an example. Let’s say we have a dataset with information about different types of flowers, including their petal length and width, as well as the species each one belongs to (e.g., rose or daisy). We want our computer to learn how to classify these flowers based on their features using machine learning techniques.
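For concreteness, assume the dataset lives in a file called flowers.csv with one row per flower and three columns. The values below are made up purely for illustration:
petal_length,petal_width,species
5.1,1.2,rose
3.5,0.5,daisy
4.8,1.1,rose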
Here’s the full script at a glance, and then we’ll walk through it step by step. It starts by importing the necessary libraries in Python:
# Import necessary libraries
import pandas as pd # for data manipulation and analysis
import numpy as np # for numerical operations
import matplotlib.pyplot as plt # for visualization
from sklearn.model_selection import train_test_split # for splitting our dataset into training and testing sets
from sklearn.preprocessing import LabelEncoder # for converting string labels into numbers
from tensorflow.keras.models import Sequential # for building our machine learning model using TensorFlow's Keras library
from tensorflow.keras.layers import Dense # for the fully connected layers of the model
# Import dataset
flowers = pd.read_csv("flowers.csv") # reads the CSV file containing flower data and stores it in a pandas dataframe called "flowers"
# Split dataset into features and labels
X = flowers[['petal_length', 'petal_width']] # selects the features (petal length and width)
y = LabelEncoder().fit_transform(flowers['species']) # selects the labels (flower species) and encodes them as the integers 0 and 1
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2) # splits the data, with 20% held out for testing
# Build a sequential model
model = Sequential() # creates a sequential model (a linear stack of layers)
# Add layers to the model
model.add(Dense(10, input_shape=(2,), activation='relu')) # a dense layer with 10 neurons; the input shape matches our 2 features
model.add(Dense(1, activation='sigmoid')) # an output layer with 1 neuron and a sigmoid activation, suitable for our two-class (rose vs. daisy) problem
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) # specifies the loss function, optimizer, and evaluation metric
# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32) # trains the model for 50 epochs (passes over the data) with a batch size of 32
# Evaluate the model on the testing set
test_loss, test_acc = model.evaluate(X_test, y_test) # stores the loss and accuracy on the held-out data
# Make predictions on new data
new_data = np.array([[5.1, 1.2], [3.5, 0.5]]) # a numpy array with petal length and width for two new flowers
predictions = model.predict(new_data) # the predicted probability that each flower belongs to the positive class
# Print the predictions
print(predictions) # prints one probability per flower; values above 0.5 indicate one species, below 0.5 the other
Now let’s break that down step by step. First, we load the data from a CSV file:
# Import the pandas library
import pandas as pd
# Load the data from the CSV file into a pandas dataframe called 'df'
# 'df' is the object we'll use to manipulate and analyze the data in the steps below
df = pd.read_csv('flowers.csv')
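It’s often worth peeking at the first few rows to confirm the data loaded as expected:
# Display the first five rows of the dataframe to verify the load
print(df.head())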
Then, we’ll split the dataset into training and testing sets using the `train_test_split()` function:
# Split the dataset into training and testing sets using the `train_test_split()` function
# Import the necessary libraries
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
# Create a variable `X` that contains the features (independent variables) from the dataframe `df`
X = df[['petal_length', 'petal_width']].values
# Create a variable `y` that contains the labels (dependent variable) from the dataframe `df`
# The species names are strings, so we encode them as the integers 0 and 1,
# which is what the binary cross-entropy loss used later expects
y = LabelEncoder().fit_transform(df['species'].values)
# Use the `train_test_split()` function to split the dataset into training and testing sets
# Set the test size to 20% of the dataset and use a random state of 42 for reproducibility
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# The `train_test_split()` function returns four arrays:
# x_train: array containing the features for the training set
# x_test: array containing the features for the testing set
# y_train: array containing the labels for the training set
# y_test: array containing the labels for the testing set
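As a quick sanity check, you can print the shapes of the four arrays to confirm the 80/20 split; the exact numbers depend on how many rows flowers.csv contains:
# Inspect the shapes of the resulting arrays (rows, columns)
print(x_train.shape, x_test.shape) # e.g. (120, 2) and (30, 2) for a 150-row dataset
print(y_train.shape, y_test.shape) # e.g. (120,) and (30,)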
Now we’re ready to build our machine learning model using TensorFlow’s Keras library:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
# Create a new sequential model for our neural network
model = Sequential()
# Add the first layer with 64 neurons and ReLU activation function
# Input shape is (2,) since we have 2 features in our dataset
model.add(Dense(64, input_shape=(2,), activation='relu'))
# Add dropout to prevent overfitting (i.e., memorizing training data too well)
# Dropout rate is set to 0.5, meaning 50% of neurons will be randomly dropped during training
model.add(Dropout(0.5))
# Add the output layer with 1 neuron and sigmoid activation function for binary classification
# Since we have only two classes (roses or daisies), sigmoid function is suitable for binary classification
model.add(Dense(1, activation='sigmoid'))
# Compile the model with binary cross-entropy loss, the Adam optimizer, and accuracy as a metric
# Binary cross-entropy is commonly used for binary classification problems
# Adam is an efficient optimization algorithm for training neural networks
# The accuracy metric lets us read off how often the model is right when we evaluate it later
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
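Before training, you can optionally double-check the architecture with Keras’ built-in summary, which lists each layer along with its output shape and parameter count:
# Print a layer-by-layer summary of the model
model.summary()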
Next, we’ll train the model on the training set for 10 epochs (i.e., full passes through the data) with a batch size of 32:
# Train the model we just built and compiled on the training set
# fit() returns a History object that records the loss at each epoch
history = model.fit(x_train, y_train, epochs=10, batch_size=32) # 10 passes over the training data, 32 samples per gradient update
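Because fit() returns a History object, we can also visualize how the training loss decreased over the 10 epochs. This is just an optional sketch using matplotlib (imported in the overview script):
# Plot the training loss recorded at each epoch
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Training loss')
plt.title('Training loss per epoch')
plt.show()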
And that’s it! We can now test our model using the testing set:
# This step evaluates the model's performance on the testing set by calculating its loss and accuracy.
# x_test and y_test are already in memory from the train_test_split() call above,
# so there is nothing extra to load.
test_loss, test_acc = model.evaluate(x_test, y_test) # returns the loss and the metrics specified when the model was compiled
print(f"Test accuracy: {test_acc:.2f}") # the fraction of test flowers classified correctly
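As a last step, just like in the overview script, we can ask the trained model about flowers it has never seen. The measurements below are made-up values purely for illustration:
import numpy as np
# Petal length and width for two hypothetical new flowers
new_flowers = np.array([[5.1, 1.2], [3.5, 0.5]])
probabilities = model.predict(new_flowers) # sigmoid output: one probability per flower
# Probabilities above 0.5 are classified as one species, below 0.5 as the other
print(probabilities)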
Hope this helps you understand how Python and TensorFlow work together for machine learning!