Category Archives: Deep Learning

Some common kinds of plots

Distribution (Histogram)

import matplotlib.pyplot as plt
# df is assumed to be a pandas DataFrame; this draws one histogram per numeric column
df.hist(bins=50, figsize=(20, 15))
plt.show()

Distribution (Density)

A density plot is a smoothed, continuous version of a histogram estimated from the data.

# Requires SciPy; layout needs at least as many cells as df has numeric columns
df.plot(kind='density', subplots=True, layout=(8, 8), sharex=False, legend=False, figsize=(12, 12))
plt.show()
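
To make "smoothed" concrete, here is a minimal sketch of a Gaussian kernel density estimate in plain Python. The function name `gaussian_kde_at`, the sample values, and the fixed `bandwidth` are assumptions made for illustration; pandas delegates the real computation to SciPy, which picks the bandwidth automatically.

```python
import math

def gaussian_kde_at(x, samples, bandwidth):
    """Estimate the density at x by averaging Gaussian bumps centred on each sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

# Made-up sample values: a cluster around 2 plus one outlier at 5.
samples = [1.0, 1.2, 1.9, 2.1, 2.2, 5.0]

# Evaluate the estimate on a coarse grid from 0.0 to 6.0.
grid = [i * 0.5 for i in range(13)]
for x in grid:
    print(f"{x:4.1f}  {gaussian_kde_at(x, samples, bandwidth=0.5):.3f}")
```

A smaller bandwidth follows the individual samples more closely; a larger one gives a smoother curve.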

Correlations

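# Pairwise Pearson correlation; on recent pandas, pass numeric_only=True if train_df has non-numeric columns
corr = train_df.corr()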
corr
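
By default, each entry of the matrix returned by `corr()` is the Pearson correlation coefficient between two columns. The sketch below computes that coefficient by hand in plain Python; the helper name `pearson` and the two lists of values are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values standing in for two numeric columns.
age = [21, 30, 45, 50, 62]
glucose = [85, 95, 110, 130, 150]

print(round(pearson(age, glucose), 3))  # strongly positive
print(round(pearson(age, age), 3))      # a column against itself
```

Values close to 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.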

Heatmap (needs correlations)

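# Notebook magic; omit this line outside Jupyter/Colab
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(16, 8))
sns.heatmap(corr, annot=True)  # annot=True writes each coefficient in its cell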

Pairplot

sns.pairplot(train_df)

Histogram plot of select variable(s)

from matplotlib import pyplot
# Top panel: second column of train_X; bottom panel: third column
pyplot.subplot(211)
pyplot.hist(train_X.iloc[:, 1])
pyplot.subplot(212)
pyplot.hist(train_X.iloc[:, 2])
pyplot.show()

Machine Learning: Pima Indians Diabetes

Visualise the Dataset

Visualising the data is an important step in data analysis. A graphical view gives us a better understanding of how each feature's values are distributed: for example, we can see the typical age or BMI of the people in the dataset. We could of course limit our inspection to the raw table, but we might miss important things that affect our model's accuracy.

import matplotlib.pyplot as plt
dataset.hist(bins=50, figsize=(20, 15))
plt.show()

Source: Machine Learning: Pima Indians Diabetes

Get Started: 3 Ways to Load CSV files into Colab – Towards Data Science

To upload from your local drive, start with the following code:

from google.colab import files
uploaded = files.upload()

It will prompt you to select a file. Click on “Choose Files” then select and upload the file. Wait for the file to be 100% uploaded. You should see the name of the file once Colab has uploaded it.

Finally, type in the following code to import it into a dataframe (make sure the filename matches the name of the uploaded file).

import io
import pandas as pd
df2 = pd.read_csv(io.BytesIO(uploaded['Filename.csv']))

The dataset is now stored in a pandas DataFrame.
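
The same pattern can be tried outside Colab without the upload widget. `files.upload()` returns a dictionary mapping each filename to the file's raw bytes, so the sketch below builds a stand-in `uploaded` dictionary by hand; the filename and the CSV contents are made up for illustration.

```python
import io
import pandas as pd

# Stand-in for the dict returned by files.upload(): filename -> raw bytes.
uploaded = {"Filename.csv": b"age,bmi\n50,33.6\n31,26.6\n"}

# Wrap the bytes in a file-like object so read_csv can parse them.
df2 = pd.read_csv(io.BytesIO(uploaded["Filename.csv"]))
print(df2.shape)
print(df2.dtypes)
```

The dictionary key must match the uploaded filename exactly, including the extension.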

Source: Get Started: 3 Ways to Load CSV files into Colab – Towards Data Science

Choosing a loss function

Regression Problem

A problem where you predict a real-valued quantity.

  • Output Layer Configuration: One node with a linear activation unit.
  • Loss Function: Mean Squared Error (MSE).
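
As a rough illustration of the loss named above, here is mean squared error written out in plain Python. The function name and the sample values are assumptions for the example; a deep learning library would normally compute this for you.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mean_squared_error(y_true, y_pred))  # 0.375
```

Because the errors are squared, one large miss contributes more to the loss than several small ones.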

Binary Classification Problem

A problem where you classify an example as belonging to one of two classes.

The problem is framed as predicting the likelihood of an example belonging to class one, e.g. the class that you assign the integer value 1, whereas the other class is assigned the value 0.

  • Output Layer Configuration: One node with a sigmoid activation unit.
  • Loss Function: Cross-Entropy, also referred to as Logarithmic loss.
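
The configuration above can be sketched in plain Python: a sigmoid turns the output node's raw score into a probability for class 1, and binary cross-entropy penalises that probability against the 0/1 label. The function names, the `eps` clipping value, and the sample scores are assumptions for illustration.

```python
import math

def sigmoid(z):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up raw scores from the output node and their true labels.
scores = [2.0, -1.0, 0.5]
labels = [1, 0, 1]
probs = [sigmoid(z) for z in scores]
print(probs)
print(binary_cross_entropy(labels, probs))
```

The loss approaches zero as the predicted probability for the true class approaches 1, and grows without bound as it approaches 0.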

Multi-Class Classification Problem

A problem where you classify an example as belonging to one of more than two classes.

The problem is framed as predicting the likelihood of an example belonging to each class.

  • Output Layer Configuration: One node for each class using the softmax activation function.
  • Loss Function: Cross-Entropy, also referred to as Logarithmic loss.
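
For the multi-class case, a minimal plain-Python sketch: softmax converts one raw score per class into probabilities that sum to 1, and cross-entropy is the negative log of the probability assigned to the true class. The function names, the `eps` guard, and the sample scores are assumptions for illustration.

```python
import math

def softmax(scores):
    """Turn one raw score per class into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(true_class, probs, eps=1e-12):
    """Negative log of the probability assigned to the true class index."""
    return -math.log(max(probs[true_class], eps))

# Made-up raw scores for a three-class problem.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
print(probs)
print(cross_entropy(0, probs))  # loss if class 0 is the true class
print(cross_entropy(2, probs))  # larger loss if class 2 is the true class
```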

Source: Loss and Loss Functions for Training Deep Learning Neural Networks

Choosing a neural network for your use case

Multilayer Perceptrons

Use MLPs For:

  • Tabular datasets
  • Classification prediction problems
  • Regression prediction problems

Try MLPs On:

  • Image data
  • Text data
  • Time series data
  • Other types of data

Convolutional Neural Networks

Use CNNs For:

  • Image data
  • Classification prediction problems
  • Regression prediction problems

Try CNNs On:

  • Text data
  • Time series data
  • Sequence input data

Recurrent Neural Networks

Use RNNs For:

  • Text data
  • Speech data
  • Classification prediction problems
  • Regression prediction problems
  • Generative models

Don’t Use RNNs For:

  • Tabular data
  • Image data

Perhaps Try RNNs On:

  • Time series data

Source: When to Use MLP, CNN, and RNN Neural Networks