Artificial Neural Networks (ANN)#
Neural networks are machine learning models that can approximate a very broad class of non-linear functions. In this tutorial, we will first go through examples of how neural networks can be implemented in scikit-learn and TensorFlow (based on this notebook from Géron), and then look at an example of applying a neural network to a climate data set.
Implementing MLPs with scikit-learn#
First, we will use scikit-learn to implement a multi-layer perceptron (MLP) on the Iris data set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
The Iris data set is a simple example data set that includes values for 4 features:
-sepal length (cm)
-sepal width (cm)
-petal length (cm)
-petal width (cm)
Each sample is labeled as one of 3 iris types [‘setosa’, ‘versicolor’, ‘virginica’]. The problem is to classify each sample as belonging to one of these 3 iris types, based on the values of these 4 features.
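To make this description concrete, the following minimal sketch loads the data set and prints its shape, feature names, and class names. It only uses `load_iris` from scikit-learn; the variable `first_label` is introduced here purely for illustration.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# 150 samples, each described by 4 features
print(iris.data.shape)

# Names of the 4 features and of the 3 iris types
print(iris.feature_names)
print(list(iris.target_names))

# Labels are stored as integer class IDs (0, 1, 2) that index into target_names
first_label = iris.target[0]
print(first_label, iris.target_names[first_label])
```

The integer labels in `iris.target` are what the classifier is trained to predict.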
iris = load_iris()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, test_size=0.1, random_state=42)
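The two successive calls to `train_test_split` first hold out 10% of the 150 samples as a test set, then hold out 10% of the remainder as a validation set. As a quick check, the sketch below repeats the split and prints the size of each subset; it assumes nothing beyond the calls already used above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# First split: full data -> (train + validation) and test
X_train_full, X_test, y_train_full, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=42)

# Second split: (train + validation) -> train and validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, test_size=0.1, random_state=42)

for name, X in [("train", X_train), ("valid", X_valid), ("test", X_test)]:
    print(name, X.shape)
```

The validation set is used to compare models and settings during development, while the test set is kept aside for a final evaluation.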
In the scikit-learn implementation, we need to specify whether we are training an MLP for classification or regression. Here we use the MLPClassifier since we are interested in classifying each sample as belonging to one of the 3 iris types.
We can specify a number of hyper-parameters for the MLPClassifier, including:
hidden_layer_sizes: the number of neurons in each hidden layer (the ith element is the size of the ith hidden layer)
activation: the activation function to use in the hidden layers. The default is relu
max_iter: the maximum number of training iterations
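As a rough illustration of how these hyper-parameters shape the network, the sketch below trains an MLP with two hidden layers and inspects the resulting weight matrices. The specific settings (`[10, 5]`, `tanh`, `max_iter=2_000`, `random_state=0`) are arbitrary choices for this example, not the ones used in the tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Illustrative settings: two hidden layers with 10 and 5 neurons, tanh activations
mlp = MLPClassifier(hidden_layer_sizes=[10, 5], activation="tanh",
                    max_iter=2_000, random_state=0)
pipe = make_pipeline(StandardScaler(), mlp)
pipe.fit(iris.data, iris.target)

# One weight matrix per layer: 4 inputs -> 10 -> 5 -> 3 outputs
print([w.shape for w in mlp.coefs_])

# Number of iterations actually run (at most max_iter)
print(mlp.n_iter_)
```

Each entry of `hidden_layer_sizes` adds one hidden layer, so the number of weight matrices is always one more than the number of hidden layers.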
mlp_clf = MLPClassifier(hidden_layer_sizes=[5], max_iter=10_000,
                        random_state=42)
pipeline = make_pipeline(StandardScaler(), mlp_clf)
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_valid, y_valid)
accuracy
1.0
mlp_clf
MLPClassifier(hidden_layer_sizes=[5], max_iter=10000, random_state=42)
Implementing MLPs with Keras/TensorFlow#
We would generally not use scikit-learn to implement a neural network, because it gives us far less flexibility to customize deep learning models than libraries developed specifically for deep learning. Instead, we will use TensorFlow with its high-level Keras API as our deep learning library.
To start, we will use the Fashion MNIST data set, a simple example data set of images of different items of clothing, each labeled by the type of clothing. We use it here only to illustrate how one sets up and trains a machine learning model in TensorFlow.
import tensorflow as tf
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]
The full training set contains 60,000 grayscale images, each 28x28 pixels. After setting aside the last 5,000 images for validation, X_train holds the remaining 55,000:
X_train.shape
(55000, 28, 28)
Each pixel intensity is represented as a byte (0 to 255):
X_train.dtype
dtype('uint8')
Let’s scale the pixel intensities down to the 0-1 range and convert them to floats, dividing by 255:
X_train, X_valid, X_test = X_train / 255., X_valid / 255., X_test / 255.
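The effect of dividing by `255.` can be checked on a tiny stand-in array. The values below are made up for illustration; the point is that integer pixel values become floating-point numbers between 0 and 1.

```python
import numpy as np

# Stand-in for a small 8-bit grayscale image (values 0 to 255)
images = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by a float converts the integers to floating point
scaled = images / 255.

print(images.dtype, "->", scaled.dtype)
print(scaled.min(), scaled.max())
```

Keeping inputs in a small, consistent range generally helps gradient-based training behave well.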
plt.imshow(X_train[0], cmap="binary")
plt.axis('off')
plt.show()
The labels are the class IDs, from 0 to 9:
y_train
The corresponding class names are:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
n_rows = 4
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
    for col in range(n_cols):
        index = n_cols * row + col
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
        plt.axis('off')
        plt.title(class_names[y_train[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)
plt.show()
Creating the model using the sequential API#
tf.random.set_seed(42)
model = tf.keras.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=[28, 28]))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(300, activation="relu"))
model.add(tf.keras.layers.Dense(100, activation="relu"))
model.add(tf.keras.layers.Dense(10, activation="softmax"))
tf.keras.backend.clear_session()
tf.random.set_seed(42)
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
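To see what this stack of layers computes, here is a minimal NumPy sketch of the same forward pass: flatten each image, apply two ReLU layers, then a softmax output layer. This is not how Keras implements it internally; the helper functions, the random stand-in images, and the random (untrained) weights are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def dense(x, w, b, activation):
    """Affine map x @ w + b followed by an activation function."""
    return activation(x @ w + b)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Random stand-ins for a batch of 5 scaled 28x28 images
X = rng.random((5, 28, 28))

# Randomly initialised (untrained) parameters for 784 -> 300 -> 100 -> 10
sizes = [28 * 28, 300, 100, 10]
params = [(rng.normal(0.0, 0.05, (n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

h = X.reshape(len(X), -1)              # Flatten: (5, 28, 28) -> (5, 784)
h = dense(h, *params[0], relu)         # Dense(300, activation="relu")
h = dense(h, *params[1], relu)         # Dense(100, activation="relu")
probs = dense(h, *params[2], softmax)  # Dense(10, activation="softmax")

print(probs.shape)        # one row of 10 class probabilities per image
print(probs.sum(axis=1))  # each row sums to 1
```

Because of the softmax output layer, each row can be read as a probability distribution over the 10 clothing classes.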
model.summary()
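The parameter counts that `model.summary()` reports for the Dense layers can be worked out by hand: each layer has one weight per input-output pair plus one bias per output. The short sketch below does this arithmetic for the architecture defined above; `layer_sizes` is a name introduced here for illustration.

```python
# Flattened 28x28 input, followed by the three Dense layers
layer_sizes = [28 * 28, 300, 100, 10]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_params = n_in * n_out + n_out  # weight matrix plus bias vector
    total += n_params
    print(f"Dense({n_out}): weights ({n_in}, {n_out}), biases ({n_out},), "
          f"{n_params} parameters")

print("total:", total)
```

Most of the parameters sit in the first hidden layer, because it connects all 784 input pixels to 300 neurons.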
model.layers
hidden1 = model.layers[1]
hidden1.name
model.get_layer('dense') is hidden1
weights, biases = hidden1.get_weights()
weights
weights.shape
biases
biases.shape
Compiling the model#
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
model.compile(loss=tf.keras.losses.sparse_categorical_crossentropy,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=[tf.keras.metrics.sparse_categorical_accuracy])
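The "sparse" in the loss and metric names means that the labels are integer class IDs rather than one-hot vectors. The NumPy sketch below shows the idea behind both quantities. It is a simplified re-implementation for illustration, not the Keras code; the clipping constant `eps` and the example probabilities are assumptions.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class IDs."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return -np.mean(np.log(y_prob[np.arange(len(y_true)), y_true]))

def sparse_categorical_accuracy(y_true, y_prob):
    """Fraction of samples whose most probable class is the true class."""
    return np.mean(y_prob.argmax(axis=1) == y_true)

# Made-up predicted probabilities for 3 samples and 3 classes
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3]])
y_true = np.array([0, 1, 2])  # integer class IDs, not one-hot vectors

loss = sparse_categorical_crossentropy(y_true, y_prob)
acc = sparse_categorical_accuracy(y_true, y_prob)
print(loss, acc)
```

The loss is small when the model puts high probability on the correct class, and it is the quantity that the optimizer tries to reduce during training.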
Training and evaluating the model#
history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))
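Since no `batch_size` is passed to `fit()`, Keras uses its default of 32 samples per mini-batch for NumPy inputs. The following arithmetic sketch estimates how many gradient updates that implies for this training set; the variable names are introduced here for illustration.

```python
import math

n_train = 60_000 - 5_000   # training images left after the validation split
batch_size = 32            # Keras default when fit() gets no batch_size
epochs = 30

# The last mini-batch of an epoch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(n_train / batch_size)

print(steps_per_epoch)            # mini-batches (gradient updates) per epoch
print(steps_per_epoch * epochs)   # gradient updates over the whole run
```

After each epoch, Keras also evaluates the loss and metrics on the validation data passed through `validation_data`.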
history.params
print(history.epoch)
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
pd.DataFrame(history.history).plot(
    figsize=(8, 5), xlim=[0, 29], ylim=[0, 1], grid=True, xlabel="Epoch",
    style=["r--", "r--.", "b-", "b-*"])
plt.legend(loc="lower left") # extra code
plt.show()
model.evaluate(X_test, y_test)
Using the model to make predictions#
X_new = X_test[:3]
y_proba = model.predict(X_new)
y_proba.round(2)
y_pred = y_proba.argmax(axis=-1)
y_pred
np.array(class_names)[y_pred]
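The three prediction steps above (class probabilities, `argmax`, name lookup) can be tried without a trained model. In the sketch below, `y_proba` is filled with made-up probabilities standing in for the output of `model.predict`; only the post-processing is the same as in the tutorial.

```python
import numpy as np

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up softmax outputs for 2 images (each row sums to 1)
y_proba = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0, 0.2, 0.0, 0.7],
    [0.1, 0.0, 0.6, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0],
])

# Most probable class ID for each row
y_pred = y_proba.argmax(axis=-1)
print(y_pred)

# Class IDs -> readable names via NumPy integer indexing
predicted_names = np.array(class_names)[y_pred]
print(predicted_names)
```

Comparing `y_pred` with the true labels for the same images shows whether the predictions are correct.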
y_new = y_test[:3]
y_new
array([9, 2, 1], dtype=uint8)
plt.figure(figsize=(7.2, 2.4))
for index, image in enumerate(X_new):
    plt.subplot(1, 3, index + 1)
    plt.imshow(image, cmap="binary", interpolation="nearest")
    plt.axis('off')
    plt.title(class_names[y_test[index]])
plt.subplots_adjust(wspace=0.2, hspace=0.5)
plt.show()
