2.8.4 Artificial Neural Networks

https://miro.medium.com/max/1000/1*Ivhk4q4u8gCvsX7sFy3FsQ.png
Bio Data-Science

🧠 Inspiration

  • Warren McCulloch and Walter Pitts (1943) created a computational model for neural networks
  • Frank Rosenblatt, a psychologist, invented the perceptron (1958)
  • Werbos's (1975) back-propagation algorithm
    enabled practical training of multi-layer networks
  • Alex Krizhevsky (2012) showed that deep neural networks can drastically outperform shallow networks
https://towardsdatascience.com/everything-you-need-to-know-about-activation-functions-in-deep-learning-models-84ba9f82c253, https://arstechnica.com/science/2019/12/how-neural-networks-work-and-why-theyve-become-a-big-business/

✍️ Draw any number

https://www.3blue1brown.com/lessons/neural-networks

⌛ 5 minutes


Architecture of a Single Perceptron

🧠 Linear Regression
  • we start from what we already know
  • linear combination of the inputs ($x$) with the weights ($w$)
  • note that we can write the intercept as a weight $w_0$ on a constant input $x_0 = 1$
  • in this way, we can describe any linear model
https://joshuagoings.com/2020/05/05/neural-network/
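The intercept trick can be checked numerically. This is a minimal sketch, assuming made-up values for the features, the intercept, and the weights (none of them come from the slides):

```python
import numpy as np

# Hypothetical example values: two features, an intercept and two weights.
x = np.array([2.0, 3.0])
intercept = 0.5
w = np.array([1.5, -1.0])

# Usual form: intercept plus the weighted sum of the features.
y_explicit = intercept + w @ x

# Same model with the intercept folded into the weights:
# prepend a constant input x_0 = 1 whose weight w_0 is the intercept.
x_ext = np.concatenate(([1.0], x))
w_ext = np.concatenate(([intercept], w))
y_folded = w_ext @ x_ext
```

Both forms give the same prediction, so a linear model can be drawn as a single node that only sums weighted inputs.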
🧠 Logistic Regression
  • we transformed the linear combination with another function to create a non-linear function (e.g., the logistic function, which turns the log of the odds into a probability)
  • in ANNs, we call this function the activation function

  • in the brain, neurons only fire if the activation
    of the input neurons passes a certain threshold
Activation Function
  • in ANNs, this behavior is modeled using different activation functions
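Three commonly used activation functions, as a rough sketch. The selection (step, sigmoid, ReLU) and the function names are chosen here for illustration, not taken from the slides:

```python
import numpy as np

def step(z, threshold=0.0):
    """Fire (1.0) only when the input exceeds the threshold, else stay silent (0.0)."""
    return np.where(z > threshold, 1.0, 0.0)

def sigmoid(z):
    """Smooth version of the step function with values between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Pass positive inputs through unchanged, set negative inputs to 0."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
step_out = step(z)
sigmoid_out = sigmoid(z)
relu_out = relu(z)
```

The step function is closest to the firing threshold of a biological neuron. The smooth variants are easier to train because they have usable derivatives.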
🧠 Perceptron

an artificial neuron using a step function as the activation function

  • linear classifier
  • not very powerful
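A perceptron fits in a few lines. This sketch assumes hand-picked weights and a bias that make it behave like a logical AND; the values are illustrative and not learned from data:

```python
import numpy as np

def perceptron(x, w, b):
    """Step-activated neuron: returns 1 if w·x + b > 0, else 0."""
    return int(np.dot(w, x) + b > 0)

# Hand-picked (hypothetical) weights and bias implementing logical AND
w_and = np.array([1.0, 1.0])
b_and = -1.5

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [perceptron(np.array(x), w_and, b_and) for x in inputs]
# and_outputs → [0, 0, 0, 1]
```

The decision boundary is the straight line x1 + x2 = 1.5, which is why the perceptron is a linear classifier. A problem like XOR cannot be separated by a single line, which illustrates its limited power.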

🧠 Multi-layer Perceptron

  • each green and red dot is a perceptron (think neuron)
  • each connection is associated with a weight ($w$)
  • Input Layer: placeholder for the input values ($x$)
  • Output Layer: perceptron for the predicted variable
  • Hidden Layers: more interconnected perceptrons (many layers = deep learning)
https://machinelearninggeek.com/multi-layer-perceptron-neural-network-using-python/
More detailed view

  • the activation of each neuron depends on the activation of the neurons that come before (feed-forward network)
https://www.stateoftheart.ai/concepts/f83ec537-a447-4727-b1ff-e6dff4363c14
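The feed-forward idea can be sketched as a loop over layers. The architecture (4 inputs, 3 hidden nodes, 2 outputs), the random weights, and the sigmoid activation are assumptions made for this example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Propagate x through a list of (W, b) pairs; return the activations of every layer."""
    activations = [x]
    for W, b in layers:
        # Each layer only sees the activations of the layer before it.
        activations.append(sigmoid(W @ activations[-1] + b))
    return activations

rng = np.random.default_rng(0)
# Hypothetical architecture: 4 inputs -> 3 hidden nodes -> 2 outputs
layers = [
    (rng.normal(size=(3, 4)), np.zeros(3)),
    (rng.normal(size=(2, 3)), np.zeros(2)),
]
acts = forward(np.array([0.1, 0.5, 0.9, 0.3]), layers)
```

Here `acts[0]` is the input layer, `acts[1]` the hidden layer, and `acts[2]` the output layer.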
🧠 Hyper-Parameters for MLPs
  • Architecture of the Network
    • Choice of activation functions
    • Number of hidden layers
    • Number of nodes in hidden layers
    • Connectedness of hidden layers (dropout)
  • In- and Outputs
    • Feature selection
    • Representation of the In- and Outputs (Encoding, Scaling, etc.)
  • Hyper-Parameters
    • Training algorithm
    • Learning rate
Example
  • 10 nodes in the output-layer (one for each class)
  • two hidden layers, fully connected
  • activation function in the output-layer should represent the probability of the class
  • What are the features in the input layer?
https://www.3blue1brown.com/lessons/neural-networks

  • grey-scale values of pixels
  • normalized
https://www.3blue1brown.com/lessons/neural-networks
  • Pixel values in 8-Bit grey-scale
  • Normalized pixel values between 0 and 1
  • Flattened input vector and notation for input layer

$i$ ... index of the observation
$a_j^{(l)}$ ... value of node $j$ in the $l$-th layer
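Preparing one image for the input layer might look like the following sketch. The 28×28 image size is an assumption (as in the MNIST digits), and the pixel values are random stand-ins rather than real data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one 28x28 image in 8-bit grey-scale (random values, not real data)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Normalize the pixel values from 0..255 to the range 0..1
normalized = image.astype(np.float64) / 255.0

# Flatten the 2D image into one long vector: one entry per input node
input_vector = normalized.reshape(-1)
```

Under this assumption the input layer has 28 · 28 = 784 nodes, one per pixel.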


What is the meaning of the output layer?

  • this depends on the activation function of the output layer
  • typically a vector with values between 0 and 1 that indicate the class
  • Output layer shows high confidence in the $k$-th class

$i$ ... index of the observation
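One way to turn the raw values of the ten output nodes into class probabilities is the softmax function. Softmax is an assumption here (the slides only ask for an activation that represents a probability), and the raw values are made up:

```python
import numpy as np

def softmax(z):
    """Map raw outputs to positive values that sum to 1."""
    shifted = z - np.max(z)      # subtracting the maximum avoids overflow in exp
    e = np.exp(shifted)
    return e / e.sum()

# Hypothetical raw outputs of the 10 output nodes (one per digit 0..9)
z = np.array([0.1, 0.3, 4.0, 0.2, 0.0, 0.5, 0.1, 0.2, 0.3, 0.1])
probs = softmax(z)
predicted_class = int(np.argmax(probs))
# predicted_class → 2
```

The node with the highest value marks the predicted digit, and its probability can be read as the confidence of the network.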


Neural Networks as a Black Box


  • What happens in the hidden layers?
  • this is hard for humans to say, as the hundreds of weights are hard to interpret
  • as a mental model, the problem is decomposed into subproblems

  • something like this happens in deeper, more complicated networks
  • even simple networks, as in this example, are not interpretable for humans

🤓 Task

  • given the following input layer
  • the first hidden layer has two nodes
  • write out the matrix multiplication to get
    the values of the hidden layer
  • assume all weights to be 1

, ,

Hint: if you are not sure about the dimensions of the matrix, start by sketching the network first


🤓
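A possible way to check the result numerically. The input values of the task are not reproduced here, so this sketch assumes three hypothetical inputs (1, 2, 3); it also leaves out bias and activation function, since the task only asks for the matrix multiplication:

```python
import numpy as np

# Hypothetical input layer with three nodes (the values are placeholders)
x = np.array([1.0, 2.0, 3.0])

# Weight matrix: one row per hidden node (2), one column per input node (3),
# all weights set to 1 as stated in the task
W = np.ones((2, 3))

# Each hidden node receives the weighted sum of all inputs
hidden = W @ x
# hidden → [6. 6.]
```

The shape of `W` follows directly from the sketch of the network: rows correspond to the nodes of the receiving layer, columns to the nodes of the sending layer.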


🤓 Training ANNs

  • If we have the results of the prediction ($\hat{y}$), we can compare them to the data in the training set
  • and calculate a cost function
🤓 Cost function
  • the first thing that is new is that we now have a vector of
    multiple predictions and true values (one for each digit)
  • we can still calculate a single cost value (scalar),
    e.g. by summing up all squared errors
  • for a single observation we know that

    • the error we make depends on the data ($x$) and the weights ($w$)
    • for the single observation, we can reduce the cost by adjusting the weights
  • we can apply gradient descent again
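For one observation, the scalar cost could be computed as in this sketch. The true digit (3) and the predicted values are made up for illustration:

```python
import numpy as np

# True values: one-hot vector, here for the digit 3 (hypothetical observation)
y_true = np.zeros(10)
y_true[3] = 1.0

# Hypothetical predictions of the 10 output nodes
y_pred = np.full(10, 0.05)
y_pred[3] = 0.55

# Sum of squared errors: ten individual errors become one scalar cost
cost = np.sum((y_pred - y_true) ** 2)
```

Nine nodes contribute 0.05² each and the correct node contributes 0.45², giving a cost of 0.225. A perfect prediction would give a cost of 0.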
Backpropagation
  • However, the network has a large number of weights
  • so we would have to calculate the partial derivatives of a very complicated function (we have many layers and activation functions)
  • the solution: start at the end and adjust the weights layer by layer
Great video: https://www.youtube.com/watch?v=Ilg3gGewQ5U
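The layer-by-layer idea can be sketched for a tiny network with one hidden layer. The network size, the sigmoid activations, the squared-error cost, the learning rate of 0.5, and all values are assumptions made for this example; biases are left out to keep it short:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_gradients(x, y, W1, W2):
    """Forward pass, then gradients of the cost for both weight matrices."""
    # Forward pass
    a1 = sigmoid(W1 @ x)            # activations of the hidden layer
    a2 = sigmoid(W2 @ a1)           # activations of the output layer
    cost = np.sum((a2 - y) ** 2)

    # Backward pass: start at the output layer ...
    delta2 = 2 * (a2 - y) * a2 * (1 - a2)       # d cost / d (W2 @ a1)
    grad_W2 = np.outer(delta2, a1)
    # ... and reuse delta2 to get the error signal of the hidden layer
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)    # d cost / d (W1 @ x)
    grad_W1 = np.outer(delta1, x)
    return cost, grad_W1, grad_W2

rng = np.random.default_rng(0)
x, y = np.array([0.5, -1.0]), np.array([1.0])   # one hypothetical observation
W1 = rng.normal(size=(3, 2))                    # 2 inputs -> 3 hidden nodes
W2 = rng.normal(size=(1, 3))                    # 3 hidden nodes -> 1 output

costs = []
for _ in range(50):
    cost, grad_W1, grad_W2 = cost_and_gradients(x, y, W1, W2)
    costs.append(cost)
    # Gradient descent step with an assumed learning rate of 0.5
    W1 -= 0.5 * grad_W1
    W2 -= 0.5 * grad_W2
```

The error signal of the output layer (`delta2`) is computed once and passed backwards, so no partial derivative has to be derived from scratch for the earlier layer.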

🧠 MLP in practical application

  • an MLP with a single hidden layer can approximate almost any given function, provided it has enough hidden nodes
  • an MLP with a few layers is often similar in performance to a tree-based model
  • due to the large number of hyper-parameters and weights, MLPs can be hard to train
  • learning curves are important to monitor the training
🧠 ANN with sklearn
from sklearn.neural_network import MLPClassifier

clf = MLPClassifier(hidden_layer_sizes=(30, 30),  # two hidden layers with 30 nodes each
                    activation="logistic",
                    solver="sgd",  # stochastic gradient descent
                    learning_rate_init=0.1,  # initial step size of the weight updates
                    max_iter=300).fit(X_train, y_train)
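A self-contained variant that can be run as-is. It assumes the small digits dataset bundled with sklearn (8×8 images with pixel values from 0 to 16) as a stand-in for the training data; the split and the `random_state` values are arbitrary choices:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Assumed example data: 8x8 digit images bundled with sklearn
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this dataset range from 0 to 16
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(30, 30),
                    activation="logistic",
                    solver="sgd",
                    learning_rate_init=0.1,
                    max_iter=300,
                    random_state=0).fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)   # share of correctly classified test images
training_loss = clf.loss_curve_             # one loss value per training epoch
```

Plotting `training_loss` gives a simple learning curve: a curve that is still falling at the last epoch suggests that a larger `max_iter` could help.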
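ANN with keras and tensorflow
# imports (assuming the Keras API bundled with TensorFlow)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# define the keras model: 8 input features, two hidden layers, one sigmoid output
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# binary cross-entropy as loss for a two-class problem, Adam as training algorithm
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])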

Deep Learning

Deep Learning in practical application
  • Frameworks like fastai make it easy to use pre-trained models without dealing with the details of the underlying libraries (tensorflow, pytorch, keras)
  • Training large models on high-resolution data is expensive and works more efficiently with GPUs (graphics processing units) and specialized drivers

See notebook 9 - Image Classification with Deep Learning
