To kick things off, we need some data. Let’s say we want to predict the price of Apple (AAPL) stock over the next month based on its performance over the past year. We can get this data from a financial website or API and store it in a pandas DataFrame for easy manipulation:
# Import the pandas library to use its functions for data manipulation
import pandas as pd
# Import the datetime library to work with dates and times
from datetime import datetime, timedelta
# Set up our date range (one year ago to today)
# Use the datetime.now() function to get the current date and time
end_date = datetime.now()
# Use the timedelta function to subtract 365 days from the end date
start_date = end_date - timedelta(days=365)
# Fetch stock data as CSV from Yahoo Finance using the pd.read_csv() function, which can read from a file path or a URL
# Use the .timestamp() function to convert the datetime objects to Unix timestamps, and int() to drop the fractional seconds
# The period1 and period2 parameters are in seconds, so the timestamps are not multiplied by 1000
# Request daily bars (interval=1d), which suit a one-year history
# This endpoint is unofficial and may reject unauthenticated requests; if it does, download the CSV by hand and pass its file path to pd.read_csv() instead
url = 'https://query1.finance.yahoo.com/v7/finance/download/' + 'AAPL' + '?period1=' + str(int(start_date.timestamp())) + '&period2=' + str(int(end_date.timestamp())) + '&interval=1d'
# Use the parse_dates parameter to parse the 'Date' column into datetime values
df = pd.read_csv(url, parse_dates=['Date'])
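Building the URL by string concatenation is easy to get wrong, so here is a minimal sketch of a small helper that assembles it. The helper name `build_download_url` is hypothetical, and the sketch assumes the endpoint takes `period1` and `period2` as Unix timestamps in seconds. It only builds the string and makes no network request.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Same endpoint as in the tutorial; unofficial, so treat it as an assumption.
BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_download_url(symbol, start, end, interval="1d"):
    """Return the CSV download URL for `symbol` between two datetimes.

    Hypothetical helper for illustration only.
    """
    params = {
        "period1": int(start.timestamp()),  # Unix time in seconds (assumed)
        "period2": int(end.timestamp()),
        "interval": interval,
    }
    return f"{BASE_URL}{symbol}?{urlencode(params)}"


# Fixed, timezone-aware dates make the result reproducible.
end = datetime(2024, 1, 1, tzinfo=timezone.utc)
start = end - timedelta(days=365)
url = build_download_url("AAPL", start, end)
print(url)
```

Using `urlencode` keeps the query string well-formed if more parameters are added later, and passing timezone-aware datetimes avoids surprises from the machine's local time zone.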
Now that we have our data, let’s clean it up and prepare it for training our model. We can do this by removing any missing values or outliers, scaling the data to a standard range (e.g. 0-1), and splitting it into features (historical stock prices) and targets (future price predictions).
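from sklearn.model_selection import train_test_split # Splits arrays into training and testing sets
from sklearn.preprocessing import MinMaxScaler # Scales each feature to the 0-1 range
df['Target'] = df['Close'].shift(-1) # Uses the next trading day's close as the target, so the model learns to predict a future price
# Remove any rows with missing values
df = df.dropna() # Drops any rows with missing values, including the last row, which has no next-day close
X_train, X_test, y_train, y_test = train_test_split(df[['Open', 'High', 'Low']].values, df['Target'].values, shuffle=False) # Splits the data chronologically (shuffle=False), so the testing set comes after the training set in time
# Scale the features between 0 and 1 for easier training. The targets are not scaled.
scaler = MinMaxScaler() # Creates a scaler object to scale the data
X_train = scaler.fit_transform(X_train) # Fits the scaler on the training data only, so no test information leaks into the scaling
X_test = scaler.transform(X_test) # Applies the same scaling to the testing data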
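To see the preparation steps work end to end without a network connection, here is a minimal sketch on synthetic data. The random-walk prices, the `prices` DataFrame, and the `Target` column name are all made up for illustration. It also assumes we want a 0-1 range (hence `MinMaxScaler`) and a next-row closing price as the target.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic random-walk prices stand in for the downloaded AAPL data.
rng = np.random.default_rng(0)
close = 150 + np.cumsum(rng.normal(0, 1, size=250))
prices = pd.DataFrame({
    "Open": close + rng.normal(0, 0.5, size=250),
    "High": close + 1.0,
    "Low": close - 1.0,
    "Close": close,
})
prices.loc[10, "Open"] = np.nan  # inject one missing value to exercise dropna()

# Target: the next row's close, so each row predicts one step ahead.
prices["Target"] = prices["Close"].shift(-1)
prices = prices.dropna()  # drops the NaN row and the final row (no target)

features = prices[["Open", "High", "Low"]].to_numpy()
target = prices["Target"].to_numpy()

# shuffle=False keeps chronological order: train on the past, test on the future.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, shuffle=False
)

# Fit the scaler on the training rows only, then reuse it for the test rows.
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
print(X_train.min(), X_train.max())
```

Because the scaler only sees the training rows, scaled test values can fall slightly outside 0-1 when later prices move beyond the training range. That is expected and is the price of keeping test information out of the preprocessing.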
Next, we’ll choose a machine learning algorithm to use for our predictions. In this case, let’s go with the Gradient Boosting Regressor from scikit-learn because gradient-boosted trees can capture nonlinear relationships in tabular data and usually give a reasonable baseline with little tuning:
# Import the GradientBoostingRegressor from the scikit-learn library
from sklearn.ensemble import GradientBoostingRegressor
# Create an instance of the GradientBoostingRegressor with 100 estimators and a learning rate of 0.05
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.05)
# Train the model using the training data X_train and y_train
model.fit(X_train, y_train)
# The model is now trained and ready to make predictions on new data.
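Before trusting the model, it helps to score it on the held-out test set and compare it with a naive baseline. The sketch below is one way to do that, not the only one. It uses synthetic random-walk prices rather than real AAPL data, a single feature (today's close), and mean absolute error as the metric; all of those choices are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic random-walk closes stand in for real prices.
rng = np.random.default_rng(42)
close = 150 + np.cumsum(rng.normal(0, 1, size=300))

# Feature: today's close; target: tomorrow's close.
X = close[:-1].reshape(-1, 1)
y = close[1:]

# Chronological 80/20 split: train on the past, test on the future.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.05, random_state=0
)
model.fit(X_train, y_train)

model_mae = mean_absolute_error(y_test, model.predict(X_test))
# Naive baseline: predict that tomorrow's close equals today's close.
baseline_mae = mean_absolute_error(y_test, X_test[:, 0])

print(f"model MAE:    {model_mae:.3f}")
print(f"baseline MAE: {baseline_mae:.3f}")
```

If the model cannot beat the "same as today" baseline, it has not learned anything useful. Keep in mind that tree-based models cannot predict values outside the range of targets seen in training, which matters when prices trend to new highs or lows.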
Finally, we can use our trained model to make predictions about future stock prices:
# Define the input data for the prediction
input_data = [[162.95, 163.47, 160.89]] # Creating a list of lists with the latest Open, High, and Low prices, in the same order as the training features
# Scale the input data with the same scaler that was used on the training features
input_scaled = scaler.transform(input_data)
# Make a prediction for the next day's closing price using the trained model
prediction = model.predict(input_scaled)[0] # Using the trained model to make a prediction on the scaled input data and storing the result in the variable "prediction"
# Print the predicted closing price for the next day
print(f"The predicted closing price for AAPL on the next trading day is ${prediction:.2f}") # Using string formatting to print the predicted closing price with two decimal places
And that’s it! We’ve built a simple pipeline that trains a gradient boosting model on Apple (AAPL) historical prices and uses it to produce a price prediction. Of course, this is just a simple example, and many other factors affect stock prices, such as market trends, economic indicators, and company-specific news, none of which this model sees. But hopefully this gives you an idea of how machine learning can be used to make predictions in finance and other fields!