Artificial Neural Networks (ANN)#

Neural networks are machine learning models that can, given enough hidden units, approximate essentially any continuous non-linear function. In this tutorial, we will first go through examples of how neural networks can be implemented in scikit-learn and TensorFlow (based on this notebook from Géron), and then look at an example of applying a neural network to a climate data set.

Implementing MLPs with scikit-learn#

First, we will use scikit-learn to implement a Multi-layer Perceptron (MLP) on the Iris data set.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

The Iris data set is a simple example data set that includes values for 4 features:

-sepal length (cm)
-sepal width (cm)
-petal length (cm)
-petal width (cm)

for labeled samples of 3 different iris types [‘setosa’, ‘versicolor’, ‘virginica’]. The problem is to classify each sample as belonging to one of these 3 iris types, based on the values of these 4 features.

iris = load_iris()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, test_size=0.1, random_state=42)
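
As a rough check on the two splits above, the subset sizes can be worked out by hand. The sketch below assumes that train_test_split rounds the held-out fraction up to a whole number of samples; the helper name split_sizes is hypothetical and not part of scikit-learn.

```python
def split_sizes(n_samples, test_percent):
    """Return (n_train, n_test), rounding the held-out part up.

    Integer arithmetic avoids floating-point surprises; the rounding-up
    rule is an assumption about how train_test_split sizes the test set.
    """
    n_test = -(-n_samples * test_percent // 100)  # ceiling division
    return n_samples - n_test, n_test


# The Iris data set has 150 samples; both splits hold out 10%.
n_train_full, n_test = split_sizes(150, 10)
n_train, n_valid = split_sizes(n_train_full, 10)
print(n_train, n_valid, n_test)
```

Under that assumption, the 150 samples end up as 121 training, 14 validation and 15 test samples.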

In the scikit-learn implementation, we need to specify whether we are training an MLP for classification or regression. Here we use the MLPClassifier, since we are interested in classifying each sample as belonging to one of the 3 iris types.

We can specify a number of hyper-parameters for the MLPClassifier:

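  • hidden_layer_sizes: a sequence whose ith element is the number of neurons in the ith hidden layer

  • activation: the activation function used in the hidden layers. The default is relu

  • max_iter: the maximum number of iterations (for the default adam solver, the number of epochs)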

  • etc.
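
To make these hyper-parameters concrete, here is a minimal NumPy sketch of the forward pass of an MLP with 4 inputs, one hidden layer of 5 neurons (as in hidden_layer_sizes=[5]) and 3 output classes. It is not scikit-learn's implementation: the weights are random rather than trained, and the ReLU hidden layer and softmax output are assumptions chosen to mirror the defaults described above.

```python
import numpy as np

rng = np.random.default_rng(42)

n_features, n_hidden, n_classes = 4, 5, 3

# Randomly initialised parameters (illustrative values, not trained weights)
W1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)


def forward(X):
    """Map a batch of samples to class probabilities."""
    hidden = np.maximum(0.0, X @ W1 + b1)                 # ReLU activation
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax


X_demo = rng.normal(size=(2, n_features))  # two made-up samples
proba = forward(X_demo)
n_params = W1.size + b1.size + W2.size + b2.size
print(proba.shape, n_params)
```

Each row of proba sums to 1, and this small network has 4*5 + 5 + 5*3 + 3 = 43 trainable parameters.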

mlp_clf = MLPClassifier(hidden_layer_sizes=[5], max_iter=10_000,
                        random_state=42)
pipeline = make_pipeline(StandardScaler(), mlp_clf)
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_valid, y_valid)
accuracy
1.0
mlp_clf
MLPClassifier(hidden_layer_sizes=[5], max_iter=10000, random_state=42)

Implementing MLPs with Keras/TensorFlow#

We would generally not use scikit-learn to implement a neural network, because it offers much less flexibility to customize deep learning models than libraries developed specifically for deep learning. Instead, we will use TensorFlow through its high-level Keras API.

To start, we will use the Fashion MNIST data set, a simple example data set of images of clothing items, each labeled by the type of clothing. We will use it only to illustrate how one sets up and trains a machine learning model in TensorFlow.

import tensorflow as tf
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()

(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

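The full training set contains 60,000 grayscale images, each 28x28 pixels. After holding out the last 5,000 images for validation, X_train contains 55,000 images:

X_train.shape
(55000, 28, 28)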

Each pixel intensity is represented as a byte (0 to 255):

X_train.dtype
dtype('uint8')

Let’s scale the pixel intensities down to the 0-1 range and convert them to floats, dividing by 255:

X_train, X_valid, X_test = X_train / 255., X_valid / 255., X_test / 255.
plt.imshow(X_train[0], cmap="binary")
plt.axis('off')
plt.show()
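
The effect of the division by 255 can be seen on a tiny made-up array. This is only a sketch with hypothetical pixel values, not the Fashion MNIST data itself.

```python
import numpy as np

# A made-up 2x2 "image" stored as bytes, like the raw pixel intensities
pixels = np.array([[0, 128], [64, 255]], dtype=np.uint8)

# Dividing by a float converts the array to floats and rescales it to [0, 1]
scaled = pixels / 255.0

print(pixels.dtype, scaled.dtype)
print(scaled.min(), scaled.max())
```

Dividing by the float 255. is what triggers the conversion from integers to floating-point values, so no separate cast is needed.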

The labels are the class IDs, from 0 to 9:

y_train

The corresponding class names are:

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
n_rows = 4
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
    for col in range(n_cols):
        index = n_cols * row + col
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
        plt.axis('off')
        plt.title(class_names[y_train[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)

plt.show()

Creating the model using the sequential API#

tf.random.set_seed(42)
model = tf.keras.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=[28, 28]))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(300, activation="relu"))
model.add(tf.keras.layers.Dense(100, activation="relu"))
model.add(tf.keras.layers.Dense(10, activation="softmax"))
tf.keras.backend.clear_session()
tf.random.set_seed(42)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
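
To get a feel for the size of this model, the number of trainable parameters in each Dense layer can be computed by hand: every neuron has one weight per input plus one bias. The sketch below is plain Python arithmetic, not a Keras call; the variable names are ours.

```python
# Layer widths: flattened 28x28 input, two hidden layers, 10 output classes
layer_sizes = [28 * 28, 300, 100, 10]

# Each Dense layer has (inputs x outputs) weights plus one bias per output
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)
```

The first hidden layer dominates with 235,500 of the 266,610 parameters, which is worth keeping in mind when judging the risk of overfitting.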
model.summary()
model.layers
hidden1 = model.layers[1]
hidden1.name
model.get_layer('dense') is hidden1
weights, biases = hidden1.get_weights()
weights
weights.shape
biases
biases.shape

Compiling the model#

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
model.compile(loss=tf.keras.losses.sparse_categorical_crossentropy,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=[tf.keras.metrics.sparse_categorical_accuracy])
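
The "sparse" in the loss name refers to the labels being integer class IDs rather than one-hot vectors. As a minimal NumPy sketch of the idea (not the Keras implementation, which among other things also guards against log(0)), the loss for each sample is the negative log of the probability assigned to its true class. The probabilities below are made up.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, y_proba):
    """Per-sample cross-entropy for integer labels and predicted probabilities."""
    # Pick, for each row, the probability predicted for the true class
    picked = y_proba[np.arange(len(y_true)), y_true]
    return -np.log(picked)


y_true = np.array([1, 0])                  # integer class IDs
y_proba = np.array([[0.1, 0.7, 0.2],       # made-up predicted probabilities
                    [0.8, 0.1, 0.1]])

losses = sparse_categorical_crossentropy(y_true, y_proba)
mean_loss = losses.mean()
print(losses, mean_loss)
```

A confident correct prediction gives a loss close to 0, while a low probability on the true class is penalized heavily.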

Training and evaluating the model#

history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))
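
During training, each epoch is split into mini-batches and the weights are updated once per batch. The short calculation below estimates how many updates that amounts to, assuming the 55,000 training images from the split above and Keras's default batch size of 32 (no batch_size is passed to fit here).

```python
import math

n_train = 55_000   # training images left after holding out 5,000 for validation
batch_size = 32    # assumed: the Keras default when batch_size is not given
epochs = 30

# The last batch of an epoch may be smaller, hence the rounding up
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

Under these assumptions, the progress bar for each epoch would count up to 1719 steps.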
history.params
print(history.epoch)
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

pd.DataFrame(history.history).plot(
    figsize=(8, 5), xlim=[0, 29], ylim=[0, 1], grid=True, xlabel="Epoch",
    style=["r--", "r--.", "b-", "b-*"])
plt.legend(loc="lower left")  # extra code

plt.show()
model.evaluate(X_test, y_test)

Using the model to make predictions#

X_new = X_test[:3]
y_proba = model.predict(X_new)
y_proba.round(2)
y_pred = y_proba.argmax(axis=-1)
y_pred
np.array(class_names)[y_pred]
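
The step from probabilities to class names can be illustrated without a trained model. In the sketch below, the probability rows are made up for illustration and are not actual model output; only the argmax and indexing logic mirrors the cells above.

```python
import numpy as np

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up softmax outputs for three images (each row sums to 1)
y_proba = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.01, 0.0, 0.02, 0.0, 0.97],
    [0.0, 0.0, 0.99, 0.0, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
])

# The predicted class is the index of the largest probability in each row
y_pred = y_proba.argmax(axis=-1)
predicted_names = np.array(class_names)[y_pred]
print(y_pred, predicted_names)
```

Taking the argmax discards the confidence information, so it can be useful to inspect y_proba itself when a prediction looks doubtful.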
y_new = y_test[:3]
y_new
array([9, 2, 1], dtype=uint8)
plt.figure(figsize=(7.2, 2.4))
for index, image in enumerate(X_new):
    plt.subplot(1, 3, index + 1)
    plt.imshow(image, cmap="binary", interpolation="nearest")
    plt.axis('off')
    plt.title(class_names[y_test[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)

plt.show()
[Figure: the first three test images, each titled with its class name]