3-1: How machine learning is used at a hedge fund

The ML problem

Plenty of hedge funds leverage machine learning models and just plain models to make predictions about the market using observation. And this essentially what a model does, machine learning or not. Provided some observation, a model produces some prediction.

In this course, we'll cover how we can process large amounts of data, provide it to a machine learning algorithm to produce a model, and use that model to make predictions from provided observations.

ml-problem

Choosing X and Y

Some examples provided by the course lecture to classify observations and predications are as follows:

  • observations
    • price momentum
    • Bollinger value
    • current price
  • predictions
    • future price
    • future return

choosing-x-y

Supervised regression learning

What's the definition of supervised regression learning? Supervised means we provided an example observation and prediction. Regression means the model will be producing some numerical prediction. Learning means we train the model with some data. There are multiple algorithms to conduct supervised regression learning:

  • Linear regression (parametric)
    • Leverage data to create parameters and then discards the data
  • K nearest neighbor (KNN) (instance based)
    • Retains historic data and consults the data
  • Decision trees
  • Decision forests

supervised-regression-learning

Backtesting

Backtesting is a technique wherein we utilize historical data with our machine learning algorithm and subsequent model to make predictions on events that have already happened. Using the results of our forecasting, we can make determinations as to how accurate our model is and how confident we can be in its predictions. Below is a slide from the lectures on this:

backtesting