Model
- Each pair of input data and the desired answer is called an example.
- With the help of the examples, the training process automatically discovers the rules.
- A human engineer provides a blueprint for the rules at the outset of training. The
blueprint is encapsulated in a model, which defines a hypothesis space for the rules the
machine may possibly learn.
- Models vary in terms of how many layers the neural network consists of, what types of
layers they are, and how they are wired together.
- With the training data and the model architecture, the training process produces the learned rules, encapsulated in a trained model.
- Training Phase -> Inference Phase
Neural network and Deep learning
- Neural networks are a subfield of machine learning, one in which the transformation of
the data representation is done by a system with an architecture loosely inspired by
how neurons are connected in human and animal brains.
- A frequently encountered theme of neuronal connection is the layer organization.
Many parts of the mammalian brain are organized in a layered fashion. Examples include
the retina, the cerebral cortex, and the cerebellar cortex.
- Neural network layers are different from pure mathematical functions in that they are generally stateful.
- A layer’s memory is captured in its weights.
- weight: a set of numerical values that belong to the layer and govern the details of
how each input representation is transformed by the layer into an output
representation.
- When a neural network is trained through exposure to training data, the weights get
altered systematically in a way that minimizes the value computed by the loss function.
- Generally, backpropagation in a neural network computes the gradient of the loss function with respect to the weights of the network for a single input-output example.
- Basically, a dense (fully connected) layer changes the dimension of its input vectors, with
every output unit connected to every input element.
- Activation: In neural networks, the activation function is a function that transforms the
input values of neurons. Basically, it introduces non-linearity into the network so that
the network can learn the relationship between the input and output values.
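A minimal sketch of both ideas using the TensorFlow.js API (the layer sizes and input values below are made up for illustration):

    const tf = require('@tensorflow/tfjs');

    const model = tf.sequential();
    // A dense layer changes the vector dimension: each of the 2 output units
    // is connected to all 4 input elements. The 'relu' activation applies a
    // non-linearity element by element.
    model.add(tf.layers.dense({units: 2, inputShape: [4], activation: 'relu'}));

    const input = tf.tensor2d([[1, 2, 3, 4]]);  // shape [1, 4]
    const output = model.predict(input);        // shape [1, 2]
    output.print();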
- Deep Learning is the study and application of deep neural networks, which are, quite
simply, neural networks with many layers (typically, from a dozen to hundreds of layers)
- Deep learning (layered representation learning) vs. Shallow learning
- Feature Engineering
- Deep learning automates this feature engineering
- with deep learning, you learn all features in one pass rather than having to engineer them yourself.
- Two essential characteristics
- the incremental, layer-by-layer way in which increasingly complex representations are developed
- the fact that these intermediate incremental representations are learned jointly, each
layer being updated to follow both the representational needs of the layer above and the needs
of the layer below.
- CUDA (2007): Compute Unified Device Architecture
- If hardware and algorithms are the steam engine of the deep-learning revolution, then
data is its coal.
- TensorFlow was made open source in November 2015 by a team of engineers working on deep learning at Google.
- data representations called tensors flow through layers and other data-processing nodes,
allowing inference and training to happen on machine-learning models.
- tensor: a multidimensional array. In neural networks and deep learning, every piece of data
and every computation result is represented as a tensor.
- Each tensor has two basic properties: the data type (such as float32 or int32) and the shape
- The tensor is the lingua franca of deep-learning models.
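A quick sketch of the two properties in TensorFlow.js (the values are arbitrary):

    const tf = require('@tensorflow/tfjs');

    const t = tf.tensor2d([[1, 2, 3], [4, 5, 6]]);
    console.log(t.dtype);  // 'float32' -- the default data type
    console.log(t.shape);  // [2, 3] -- 2 rows, 3 columns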
- TensorFlow and Keras form an ecosystem that leads the field of deep-learning frameworks in
terms of industrial and academic adoption.
- deeplearn.js: released 2017.09
Layer: a data processing module
- You can think of a layer as a tunable function from tensors to tensors.
- the kernel and bias = weights
- To find a good setting for the kernel and bias, we need two things:
- a measure that tells us how well we are doing
- a method to update the weights’ values so that next time we will do better than we currently are doing,
according to the measure previously mentioned.
model compilation
- a loss function: an error measurement
- an optimizer: the algorithm by which the network will update its weights (kernel and bias) based on
the data and the loss function
epoch
- each iteration through the complete training set is called an epoch
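A minimal sketch that ties compilation and epochs together in TensorFlow.js (the data and the epoch count are made up; the model tries to learn y = 2 * x):

    const tf = require('@tensorflow/tfjs');

    const model = tf.sequential();
    model.add(tf.layers.dense({units: 1, inputShape: [1]}));

    // Compilation: choose a loss function (the error measure) and an
    // optimizer (the weight-update algorithm).
    model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

    const xs = tf.tensor2d([[1], [2], [3], [4]]);
    const ys = tf.tensor2d([[2], [4], [6], [8]]);

    // Each complete pass through xs/ys is one epoch; we train for 200 epochs.
    model.fit(xs, ys, {epochs: 200}).then(() => {
      model.predict(tf.tensor2d([[5]])).print();  // should be close to 10
    });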
model’s evaluate method
- it is similar to the fit() method in that it calculates the same loss, but evaluate() does not update
the model’s weights.
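Continuing the sketch above (the test values are made up), evaluate() reports the same loss on held-out data but leaves the weights untouched:

    const xsTest = tf.tensor2d([[6], [7]]);
    const ysTest = tf.tensor2d([[12], [14]]);
    const testLoss = model.evaluate(xsTest, ysTest);  // no weight updates
    testLoss.print();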
backpropagation
- The directions of the weight updates are critical to the neural network’s learning process. They are determined
by the gradients of the loss with respect to the weights, and the algorithm for computing the gradients
is called backpropagation
- Invented in the 1960s
- is one of the foundations of neural networks and deep learning.
gradient of loss
- y’ = v * x
- loss = square(y’ - y) = square(v * x - y)
- how much change in the loss will we get if v is increased by a unit amount
Why do we need the gradient?
- Because once we have it, we can alter v in the direction opposite to it, so we can get a decrease
in the loss value.
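A plain-JavaScript sketch of this idea (the values are made up). For loss = square(v * x - y), the chain rule gives d(loss)/dv = 2 * (v * x - y) * x, and each step moves v against that gradient:

    const x = 2, y = 6;  // one training example; the true rule is y = 3 * x
    let v = 0.5;         // initial guess
    const learningRate = 0.05;

    for (let step = 0; step < 50; step++) {
      const grad = 2 * (v * x - y) * x;  // d(loss)/dv by the chain rule
      v -= learningRate * grad;          // move opposite to the gradient
    }
    console.log(v);  // converges toward 3, where the loss is minimal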
MSE (Mean Squared Error)
- If your application might be sensitive to very incorrect outliers, MSE could be a better choice than MAE,
because squaring penalizes large errors much more heavily.
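A tiny sketch of that sensitivity (the error values are made up): one outlier barely moves MAE but dominates MSE:

    const errors = [1, 1, 1, 10];  // one large outlier among small errors
    const mae = errors.reduce((s, e) => s + Math.abs(e), 0) / errors.length;
    const mse = errors.reduce((s, e) => s + e * e, 0) / errors.length;
    console.log(mae);  // 3.25  -- the outlier counts linearly
    console.log(mse);  // 25.75 -- the squared outlier dominates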
- Standard transformation or z-score normalization
we will scale our features so that they have zero mean and unit standard deviation.
- Refer to this site for more information on zero mean and unit standard deviation.
https://stats.stackexchange.com/questions/305672/what-is-unit-standard-deviation
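A sketch of the transformation in TensorFlow.js, using tf.moments to get per-feature mean and variance (the feature matrix is made up):

    const tf = require('@tensorflow/tfjs');

    // 3 examples, 2 features with very different scales.
    const data = tf.tensor2d([[1, 100], [2, 200], [3, 300]]);

    // Mean and variance per feature (along axis 0, i.e., across examples).
    const {mean, variance} = tf.moments(data, 0);
    const normalized = data.sub(mean).div(tf.sqrt(variance));

    normalized.print();  // each column now has zero mean and unit std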
Adding nonlinearity: Beyond weighted sums
- The primary enhancement we will introduce is nonlinearity – a mapping between input and output that
isn’t a simple weighted sum of the input’s elements.
- MLP: Multilayer Perceptron
- an oft-used term that describes neural networks that (1) have a simple topology without loops (what
are referred to as feedforward neural networks) and (2) have at least one hidden layer.
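A sketch of a small MLP in TensorFlow.js (the layer sizes are arbitrary): a feedforward topology with one hidden layer:

    const tf = require('@tensorflow/tfjs');

    const model = tf.sequential();
    // The hidden layer is what makes this a multilayer perceptron rather
    // than a single weighted sum.
    model.add(tf.layers.dense({units: 8, inputShape: [4], activation: 'sigmoid'}));
    model.add(tf.layers.dense({units: 1}));  // output layer
    model.summary();  // prints each layer's output shape and weight count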
- The number of weight parameters for each layer
- This is a count of all the individual numbers that make up the layer’s weights.
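As a worked example, take the hidden layer of the MLP sketched above (4 inputs, 8 units): its kernel has shape [4, 8] and its bias has shape [8], so the layer has 4 * 8 + 8 = 40 weight parameters.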
- Activation Function
- is an element-by-element transform.
- Sigmoid function
- is a “squashing” nonlinearity, in the sense that it “squashes” all real values from -infinity to +infinity
into a much smaller range (0 to +1).
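For reference, the sigmoid function is sigmoid(x) = 1 / (1 + e^(-x)): as x goes to -infinity the output approaches 0, and as x goes to +infinity it approaches 1.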