Using Deep Learning to Improve Traditional Machine Learning Performance

Deep learning for feature extraction, ensemble models, and more

Edwin Maina
Heartbeat


The advent of deep learning has been a game-changer in machine learning, paving the way for the creation of complex models capable of feats previously thought impossible. These models have been used to achieve state-of-the-art performance in many different fields, including image classification, natural language processing, and speech recognition. This article delves into using deep learning to enhance the effectiveness of classic ML models.

Background Information

Decision trees, random forests, and linear regression are just a few examples of classic machine learning models that have been used extensively in business for years. They are easy to implement and deploy because of their simplicity, clarity, and interpretability.

Deep learning is a machine learning specialization that uses artificial neural networks: computational models inspired by the human nervous system. These networks are built from stacked layers of nodes that learn intricate connections between inputs and outputs. Deep learning models have been shown to achieve cutting-edge performance in many applications, making them the go-to solution for many machine learning problems.

Using Deep Learning Models as Feature Extractors

One of the main ways deep learning can improve the performance of traditional machine learning models is by using deep networks as feature extractors. In this approach, a deep learning model is trained to extract useful features from the input data, and those features are then fed as inputs to a traditional machine learning model.

Deep learning models can extract different, and often more useful, features compared to traditional machine learning models for several reasons:

Depth

Deep learning models, especially Convolutional Neural Networks (CNNs), have multiple layers that can learn hierarchical representations of the input data. These layers can learn simple features, such as edges, in early layers, and more complex features, such as objects, in deeper layers. In comparison, traditional machine learning models typically only learn a single representation of the input data.
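To make this concrete, here is a minimal sketch of tapping different depths of a small Keras CNN as feature extractors. The architecture, layer names, and input shape are illustrative assumptions, not taken from a specific application:

from tensorflow import keras

# A small illustrative CNN: early layers tend to capture low-level patterns
# (edges), while deeper layers capture higher-level structure (object parts).
cnn = keras.Sequential([
    keras.layers.Conv2D(16, 3, activation='relu', input_shape=(32, 32, 3), name='early'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, 3, activation='relu', name='deep'),
    keras.layers.GlobalAveragePooling2D(),
])

# Any layer's activations can be tapped and used as features downstream.
early_features = keras.Model(cnn.inputs, cnn.get_layer('early').output)
deep_features = keras.Model(cnn.inputs, cnn.get_layer('deep').output)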

End-to-end learning

Deep learning models can be trained end-to-end, meaning that the model is optimized directly for a task-specific loss function, such as cross-entropy in image classification. This allows the model to learn features that are directly relevant to the task at hand. Traditional machine learning models, on the other hand, usually require manual feature engineering, where domain-specific knowledge is used to extract features from the input data.

Ability to learn from raw data

Deep learning models can learn from raw data, such as images or speech, without the need for manual feature engineering. They also tend to learn more complex and abstract representations of the data.

These advantages of deep learning models make them well suited for tasks where manual feature engineering is difficult or impossible, such as in computer vision or speech recognition. By using deep learning models as feature extractors, we can leverage these advantages to improve the performance of traditional machine learning models.


Computer Vision

One example of using deep learning models as feature extractors is in the field of computer vision. Convolutional Neural Networks (CNNs) trained on large image datasets can be used to extract relevant features from images, such as edges, textures, and objects. These features can then be used as input to another machine learning model, such as a support vector machine (SVM) or a random forest classifier, to perform tasks such as image classification or object detection. This approach has achieved state-of-the-art results on a variety of computer vision tasks, such as facial recognition, where individuals are identified in images or videos based on their facial features.

This can be done by using the output of the deep learning model as an additional feature in the traditional machine learning model or by using the deep learning model to pre-process the input data before passing it to the traditional machine learning model.
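As a concrete sketch of this wiring, the snippet below extracts features with a pretrained Keras backbone and trains a scikit-learn SVM on them. MobileNetV2 is just one convenient pretrained choice here, and images and labels are assumed to be a prepared array of 224x224 RGB images and their label vector:

from tensorflow import keras
from sklearn.svm import SVC

# Pretrained CNN with its classification head removed; global average
# pooling yields one fixed-length feature vector per image.
backbone = keras.applications.MobileNetV2(weights='imagenet', include_top=False, pooling='avg')

# images: array of shape (n_samples, 224, 224, 3); labels: (n_samples,)
features = backbone.predict(keras.applications.mobilenet_v2.preprocess_input(images))

# Train a traditional classifier on the CNN features.
svm = SVC(kernel='rbf')
svm.fit(features, labels)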

Example

We first need to train a deep learning model on the input data. We can do this using a deep learning framework such as TensorFlow or PyTorch. Below we use a TensorFlow implementation of a simple sequential Keras model to learn features from the data.

from sklearn.model_selection import train_test_split

# Split the data into training and validation sets
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.2)

Here, the data x and labels y are being split into a training set x_train and y_train and a validation set x_val and y_val. The test_size argument is set to 0.2, meaning 20% of the data will be used for validation and the remaining 80% for training.

from tensorflow import keras

# Define and train the deep learning model using the training set
model = keras.Sequential([
    keras.layers.Dense(12, activation='relu', input_shape=(8,))
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)

We build a simple sequential model with the Keras API consisting of a single dense layer with 12 nodes, 8 inputs, and the ReLU activation function.

The model is then compiled with the Adam optimizer, the sparse categorical cross-entropy loss function, and accuracy as the metric. Finally, the model is trained on the training set x_train and y_train for 5 epochs.

# Extract features from the validation set using the deep learning model
deep_learning_features = model.predict(x_val)

In this section, features are being extracted from the validation set x_val using the trained deep learning model model. The model.predict method returns the predicted outputs for the validation set, which are used as the extracted features.

import numpy as np

# Combine the extracted features with the original validation set data
x_val_with_features = np.concatenate((x_val, deep_learning_features), axis=1)

Here, the extracted features deep_learning_features are combined with the original validation set data x_val using np.concatenate. The axis argument is set to 1, meaning the arrays are concatenated along the column (feature) axis.

from sklearn.ensemble import RandomForestClassifier

# Train the traditional machine learning model (Random Forest) using the combined data
clf = RandomForestClassifier(n_estimators=100)
clf.fit(x_val_with_features, y_val)

In this section, a traditional machine learning model, a Random Forest classifier, is being trained on the combined validation set data x_val_with_features and labels y_val. The n_estimators argument is set to 100, meaning that 100 decision trees will be used in the forest. The clf.fit method trains the classifier on the combined data.

Finally, we can predict the class labels of unseen test data using the Random Forest classifier clf, which was trained on the original features combined with the features extracted by the deep learning model. The prediction is performed by calling clf.predict() on test data that has been augmented in the same way.
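A minimal sketch of that prediction step, assuming a held-out test set x_test that was preprocessed the same way as the training data:

# Extract features from the test set with the trained deep learning model
deep_learning_features_test = model.predict(x_test)

# Augment the test data with the extracted features, mirroring the training setup
x_test_with_features = np.concatenate((x_test, deep_learning_features_test), axis=1)

# Predict class labels with the trained Random Forest
y_pred = clf.predict(x_test_with_features)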

The Ensemble Approach

Traditional machine learning models can also benefit from an ensemble approach that includes deep learning models. Here, we show how to get deep learning models to work in tandem with the ML classics: beyond cleaning data and extracting valuable features, a deep model can contribute its own predictions as one voter in the ensemble.

An example of how deep learning models can be used with ensemble methods is as follows:

# Import libraries
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression

# Create a deep learning model
def create_model():
    model = Sequential()
    # Add layers
    # ...
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Create a traditional machine learning model
clf1 = LogisticRegression()

# Convert the deep learning model to a scikit-learn model
clf2 = KerasClassifier(build_fn=create_model, epochs=10, batch_size=32)

# Specify weights for each model
weights = [0.7, 0.3]  # You can adjust the weights as needed

# Create the ensemble model with soft voting and weights
ensemble_model = VotingClassifier(estimators=[('lr', clf1), ('dl', clf2)], voting='soft', weights=weights)

# Fit the ensemble model
ensemble_model.fit(X_train, y_train)

# Make predictions
y_pred = ensemble_model.predict(X_test)

In the code, a Voting Classifier from scikit-learn is used to perform an ensemble between a Logistic Regression model, which is a traditional machine learning model, and a KerasClassifier, which is a deep learning model implemented using the Keras library.

The KerasClassifier is created by defining a function create_model that creates and returns a deep learning model using the Keras Sequential API. The create_model function creates a sequential model, adds some layers to it, compiles it with an optimizer, loss function, and evaluation metrics, and then returns the model.

Then, the Logistic Regression model is defined as clf1, and the KerasClassifier is defined as clf2 by passing the create_model function to the KerasClassifier and specifying the number of epochs and batch size for training.

Finally, the ensemble model is created as ensemble_model by passing the estimators list containing clf1 and clf2, the voting strategy voting='soft', and the per-model weights to the VotingClassifier. The ensemble model is then fit to the training data X_train and y_train, and predictions are made on the test data X_test by calling ensemble_model.predict(X_test).

In this way, the deep learning model is used to make predictions and provide an additional source of information for the ensemble model to use in making its final predictions. The ensemble approach can leverage the strengths of both deep learning and traditional machine learning models to achieve better performance compared to using either one alone.

The performance of an ensemble model can be significantly improved by combining deep learning and traditional ML models. However, it’s important to remember that this approach might be more computationally expensive and necessitate more time to train and evaluate.

There are several more real-life examples of how deep learning is being used to improve traditional machine learning models. These include:

  • Reinforcement learning: deep learning models enhance the effectiveness of conventional machine learning approaches by producing more precise estimates of the Q-value, as in deep Q-networks.
  • Anomaly detection: deep learning models can provide a more precise signal of which observations are anomalous, enhancing the performance of conventional detectors (see the sketch after this list).
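As an illustration of the anomaly detection idea, the sketch below trains a small autoencoder on normal data and feeds its reconstruction error, alongside the raw features, to a traditional detector. The architecture is an illustrative assumption, and x_normal and x_all are assumed to be prepared arrays of normal-only samples and all samples, each with n_features columns:

import numpy as np
from tensorflow import keras
from sklearn.ensemble import IsolationForest

n_features = 20  # illustrative input dimensionality

# A small autoencoder trained to reconstruct normal data only.
autoencoder = keras.Sequential([
    keras.layers.Dense(8, activation='relu', input_shape=(n_features,)),
    keras.layers.Dense(n_features, activation='linear'),
])
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(x_normal, x_normal, epochs=10, verbose=0)

# Per-sample reconstruction error: large errors suggest anomalies.
recon_error = np.mean((x_all - autoencoder.predict(x_all)) ** 2, axis=1)

# Hand the raw features plus the learned error signal to a traditional detector.
x_augmented = np.concatenate([x_all, recon_error[:, None]], axis=1)
detector = IsolationForest().fit(x_augmented)
anomaly_scores = detector.score_samples(x_augmented)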

Conclusion

The examples above demonstrate that traditional machine learning methods can be improved with deep learning techniques. However, the best approach will always depend on the specifics of the problem and the data at hand.

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletter (Deep Learning Weekly), check out the Comet blog, join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.
