
Unit 5. Representation Learning

Need of Representation Learning

Assume you are developing a machine-learning algorithm to predict dog breeds from pictures. Because the image data contains all of the available information, the engineer must rely heavily on it when designing the algorithm. Each observation or feature in the data describes some quality of the dogs, and the system that predicts the outcome must understand how each attribute relates to outcomes such as Pug, Golden Retriever, and so on.

As a result, any noise or irregularity in the input can change the result drastically, a risk shared by most machine learning algorithms, which have only a shallow understanding of the raw data. The solution in such cases is to work with a more abstract representation of the data. For many tasks it is impossible to say in advance which features should be extracted; this is where the concept of representation learning takes shape.

What is Representation Learning?

Representation learning is a class of machine learning approaches that allows a system to discover, from raw data, the representations required for feature detection or classification. It reduces the need for manual feature engineering by allowing a machine to learn the features and apply them to a given task.

In representation learning, data is fed into the machine and it learns the representation on its own: the features, the distance function, and the similarity function that together determine how the predictive model will perform. Representation learning works by reducing high-dimensional data to a low-dimensional representation, making it easier to discover patterns and anomalies while also providing a better understanding of the data's overall behaviour.
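
As a minimal sketch of this idea of mapping high-dimensional data to a low-dimensional representation, the example below uses PCA from scikit-learn (chosen here only as the simplest linear case; it is not mentioned in these notes) to compress 64-dimensional digit images down to 2 dimensions.

    # A minimal sketch: compressing high-dimensional data to a low-dimensional
    # representation with PCA (chosen only as the simplest linear example).
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    X, y = load_digits(return_X_y=True)   # 1797 images, 64 pixel features each
    pca = PCA(n_components=2)             # learn a 2-dimensional representation
    Z = pca.fit_transform(X)              # Z has shape (1797, 2)

    print(X.shape, "->", Z.shape)
    print("variance explained:", pca.explained_variance_ratio_.sum())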

Methods / Types of Representation Learning

We must employ representation learning to ensure that the model produces invariant and disentangled representations, in order to increase its accuracy and performance. In this section, we'll look at how representation learning can improve the model's performance in two learning frameworks: supervised learning and unsupervised learning.

  1. Supervised Learning

This is referred to as supervised learning: the ML or DL model maps the input X to the output Y. The computer tries to correct itself by comparing the model output to the ground truth, and the learning process optimizes the mapping from input to output. This process is repeated until the loss converges, ideally to a global minimum.

Even when the optimization reaches a global minimum, the model does not always perform well on new data, which results in overfitting. While supervised learning does not necessarily require a huge amount of data to learn the mapping from input to output, it does need good features to learn from: incorporating learned attributes into a supervised learning algorithm has been reported to improve prediction accuracy by up to 17 percent.

Using labelled input data, features are learned in supervised feature learning. Supervised neural networks, multilayer perceptrons, and (supervised) dictionary learning are some examples.
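
A minimal sketch of supervised feature learning with a multilayer perceptron is given below, assuming PyTorch and toy data; the hidden-layer activations act as the features learned from the labelled data.

    # A minimal sketch of supervised feature learning with an MLP (PyTorch assumed).
    # The hidden layer learns a representation of X that is useful for predicting Y.
    import torch
    import torch.nn as nn

    X = torch.randn(256, 20)                    # toy inputs (20 raw features)
    Y = (X[:, 0] + X[:, 1] > 0).long()          # toy binary labels

    model = nn.Sequential(
        nn.Linear(20, 8),   # hidden layer: learned 8-dimensional representation
        nn.ReLU(),
        nn.Linear(8, 2),    # output layer: class scores
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(100):
        opt.zero_grad()
        loss = loss_fn(model(X), Y)
        loss.backward()     # backpropagate and update the input-to-output mapping
        opt.step()

    features = model[1](model[0](X))            # the learned representation of X
    print(features.shape)                       # torch.Size([256, 8])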

  2. Unsupervised Learning

Unsupervised learning is a sort of machine learning in which the labels are ignored in favour of the observation itself. Unsupervised learning isn’t used for classification or regression; instead, it’s used to uncover underlying patterns, cluster data, denoise it, detect outliers, and decompose data, among other things.

When working with data x, we must be careful about which features z we use, to ensure that the patterns produced are accurate. It has been observed that having more data does not always imply better representations, so we must develop a model that is flexible and expressive enough that the extracted features convey the critical information.

Unsupervised feature learning learns features from unlabeled input data. Dictionary learning, independent component analysis, autoencoders, matrix factorization, and various forms of clustering are among the examples.
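
As a sketch of unsupervised feature learning, the autoencoder below (PyTorch and toy data assumed) learns a low-dimensional code z from unlabeled data x by trying to reconstruct its own input.

    # A minimal autoencoder sketch (PyTorch assumed): features are learned from
    # unlabeled data x by training the network to reconstruct its own input.
    import torch
    import torch.nn as nn

    X = torch.randn(512, 30)                                # unlabeled data, 30 raw features

    encoder = nn.Sequential(nn.Linear(30, 5), nn.ReLU())    # x -> z (the representation)
    decoder = nn.Linear(5, 30)                               # z -> reconstruction of x

    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    for epoch in range(200):
        opt.zero_grad()
        z = encoder(X)
        loss = nn.functional.mse_loss(decoder(z), X)         # reconstruction error, no labels
        loss.backward()
        opt.step()

    print(encoder(X).shape)   # torch.Size([512, 5]) -- the learned features z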

Greedy Layer-Wise Pretraining

While training a neural network, we perform a forward pass and calculate the cost of the network, then use the calculated cost to backpropagate through the layers and update the weights; this process is repeated until a minimum of the cost is reached. This technique works well for small neural networks (networks with a small number of hidden layers).

In the case of large neural networks, this traditional training method is not effective, as it leads to the vanishing gradient problem.

Vanishing gradient problem: when many layers using certain activation functions (such as sigmoid or tanh) are stacked in a deep network, the gradients of the loss function tend towards zero as they are backpropagated, making the network hard to train.

Due to this problem, the weights near the input layer will not be updated; only the weights near the output layer get updated.
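
Greedy layer-wise pretraining sidesteps this: each hidden layer is pretrained on its own (for example as a small autoencoder on the outputs of the layers already trained) before the whole stack is fine-tuned. A minimal sketch, assuming PyTorch and autoencoder-style pretraining on toy data, is given below.

    # A minimal greedy layer-wise pretraining sketch (PyTorch assumed):
    # each hidden layer is pretrained as a small autoencoder on the output of the
    # layers already trained, then the whole stack can be fine-tuned with labels.
    import torch
    import torch.nn as nn

    X = torch.randn(512, 50)                  # unlabeled training data
    sizes = [50, 32, 16, 8]                   # layer widths of the deep network
    trained_layers = []

    inputs = X
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        layer = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
        decoder = nn.Linear(d_out, d_in)      # temporary decoder, used for pretraining only
        opt = torch.optim.Adam(list(layer.parameters()) + list(decoder.parameters()), lr=1e-3)
        for epoch in range(100):              # pretrain this layer in isolation
            opt.zero_grad()
            loss = nn.functional.mse_loss(decoder(layer(inputs)), inputs)
            loss.backward()
            opt.step()
        trained_layers.append(layer)
        inputs = layer(inputs).detach()       # feed the learned codes to the next layer

    pretrained = nn.Sequential(*trained_layers)   # stacked, ready for supervised fine-tuning
    print(pretrained(X).shape)                    # torch.Size([512, 8])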

Transfer Learning:

Transfer learning is the ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks or new domains that share some commonality.

Why Transfer Learning?

  • In some domains, labeled data are in short supply.

  • In some domains, the labeling cost is very expensive.

  • In some domains, the learning process is time-consuming.

What is Transfer Learning (TL)?

A major assumption in many machine learning and data mining algorithms is that the training data and future data must be in the same feature space and have the same distribution. However, in many real-world applications this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but only have sufficient training data in another domain, where the latter data may be in a different feature space or follow a different data distribution.

In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding expensive data-labeling effort.
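
A common way to transfer such knowledge is to fine-tune a pretrained network. The sketch below assumes PyTorch and torchvision (neither is named in these notes): it freezes a ResNet-18 backbone pretrained on ImageNet and retrains only a new output layer for a small target task.

    # A minimal transfer learning sketch (PyTorch/torchvision assumed):
    # reuse features learned on ImageNet and retrain only the final layer
    # for a new task with few labels. Pretrained weights are downloaded on first use.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    for param in model.parameters():                 # freeze the pretrained backbone
        param.requires_grad = False

    model.fc = nn.Linear(model.fc.in_features, 5)    # new head for 5 target classes

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # one toy training step on dummy target-domain data
    images = torch.randn(8, 3, 224, 224)
    labels = torch.randint(0, 5, (8,))
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()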

Approaches to Transfer Learning:

TL Applications:

  • Sensor-Network-based Localization

  • Text Classification

  • Image Classification

  • Video Classification

  • Social Network Analysis

  • Logical Inference

A central problem in domain adaptation is learning more robust and higher-level feature representations to reduce the divergence between the source and target distributions.

Recently, deep learning methods based on autoencoders have been successfully applied to representation learning for domain adaptation. However, most existing methods rely on a single autoencoder model, which poses challenges for learning the different characteristics of the data.
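
As a hedged illustration of the autoencoder-based idea (an assumed setup, not a method described in these notes), the sketch below trains a single autoencoder on unlabeled data pooled from the source and target domains so that the learned code is shared across domains, and then trains a classifier on that code using source labels only.

    # A hedged sketch of autoencoder-based representation learning for domain
    # adaptation (PyTorch assumed): one encoder is trained on source + target data,
    # and a classifier is trained on the shared codes using source labels only.
    import torch
    import torch.nn as nn

    Xs = torch.randn(300, 40)                 # labeled source-domain data
    ys = torch.randint(0, 2, (300,))
    Xt = torch.randn(300, 40) + 0.5           # unlabeled target-domain data (shifted)

    encoder = nn.Sequential(nn.Linear(40, 10), nn.ReLU())
    decoder = nn.Linear(10, 40)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    X_all = torch.cat([Xs, Xt])               # pool both domains; no labels needed
    for epoch in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(decoder(encoder(X_all)), X_all)
        loss.backward()
        opt.step()

    clf = nn.Linear(10, 2)                    # classifier trained on source codes only
    opt2 = torch.optim.Adam(clf.parameters(), lr=1e-2)
    for epoch in range(200):
        opt2.zero_grad()
        loss = nn.functional.cross_entropy(clf(encoder(Xs).detach()), ys)
        loss.backward()
        opt2.step()

    target_preds = clf(encoder(Xt)).argmax(dim=1)   # predictions in the target domain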

Distributed Representations

Distributed representations of concepts are representations composed of many elements that can be set separately from each other.

  • They are one of the most important tools for representation learning.

  • They are powerful because they can use n features with k values each to describe k^n different concepts.

In deep learning, a distributed representation refers to a way of representing information in which each piece of the representation is distributed across many different neurons in the network. This is in contrast to a localist representation, where each concept or idea is represented by a single, specific neuron.

Distributed representations have several advantages over localist representations. For one, they are more flexible, because they can represent many different concepts using the same set of neurons. This allows the network to learn multiple concepts simultaneously, which can improve its overall performance. Additionally, distributed representations are more robust, because they are not reliant on a single neuron to represent a concept. If one neuron fails, the network can still use the other neurons to represent the concept.

Distributed representations are commonly used in deep learning networks, where they can help the network to learn complex, abstract concepts. For example, in a network that is trained to recognize images of objects, the network might learn a distributed representation of the concept of a "cat" that is spread across many different neurons. This allows the network to recognize a cat in many different contexts and situations, and to make accurate predictions even if the cat is partially obscured or in a different pose.
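
A small numeric illustration of the k^n argument: with n = 4 binary units (k = 2), a localist (one-hot) code can name only 4 concepts, while a distributed code over the same units can name 2^4 = 16. The toy Python snippet below makes this concrete.

    # Toy comparison of localist (one-hot) vs distributed codes.
    # With n = 4 binary units: one-hot can name only 4 concepts,
    # while a distributed code can name 2**4 = 16.
    from itertools import product

    n = 4
    localist = [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]
    distributed = list(product([0, 1], repeat=n))

    print(len(localist))      # 4  concepts representable
    print(len(distributed))   # 16 concepts representable (k**n with k = 2)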

Variants of CNN: DenseNet

DenseNet:

DenseNet is a densely connected convolutional network. It is very similar to ResNet, with some fundamental differences: ResNet uses an additive method, taking a previous output as an input for a future layer, whereas a DenseNet takes the outputs of all previous layers as inputs for a future layer (the feature maps are concatenated).

Why do we need DenseNets?

DenseNet was developed specifically to counter the drop in accuracy caused by the vanishing gradient in very deep neural networks: because of the long distance between the input and output layers, the information vanishes before reaching its destination.

Suppose we have L layers. In a typical network with L layers there will be L connections, that is, one connection between each pair of consecutive layers. In a DenseNet, however, there will be L(L+1)/2 direct connections. Because each layer receives the feature maps of all preceding layers, the individual layers can be kept narrower (fewer feature maps) than in other models, and we can train models with more than 100 layers very easily using this technique.
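
For example, a dense block with L = 5 layers has 5 × 6 / 2 = 15 direct connections, compared with only 5 connections in a plain chain of 5 layers.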

DenseBlocks And Layers

Important concepts in DenseNet (a minimal dense block sketch follows this list):

  • Growth rate – determines the number of feature maps output by each individual layer inside a dense block.

  • Dense connectivity – within a dense block, each layer receives as input the feature maps of all preceding layers.

  • Composite function – the sequence of operations inside a layer: batch normalization, followed by a ReLU activation, and then a convolution layer.

  • Transition layers – aggregate the feature maps from a dense block and reduce their dimensions; a pooling step is applied here (a 1x1 convolution followed by 2x2 average pooling in the original DenseNet).
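
The sketch below assumes PyTorch (not specified in these notes) and implements one dense block with the BN -> ReLU -> Conv composite function, concatenation-based dense connectivity, a growth rate of 12, and a transition layer that shrinks the channels and downsamples with average pooling.

    # A minimal DenseNet-style dense block and transition layer (PyTorch assumed).
    import torch
    import torch.nn as nn

    class DenseLayer(nn.Module):
        """Composite function: BatchNorm -> ReLU -> 3x3 Conv, producing growth_rate feature maps."""
        def __init__(self, in_channels, growth_rate):
            super().__init__()
            self.bn = nn.BatchNorm2d(in_channels)
            self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False)

        def forward(self, x):
            return self.conv(torch.relu(self.bn(x)))

    class DenseBlock(nn.Module):
        """Dense connectivity: each layer sees the concatenation of all previous feature maps."""
        def __init__(self, in_channels, growth_rate, num_layers):
            super().__init__()
            self.layers = nn.ModuleList(
                DenseLayer(in_channels + i * growth_rate, growth_rate) for i in range(num_layers)
            )

        def forward(self, x):
            features = [x]
            for layer in self.layers:
                features.append(layer(torch.cat(features, dim=1)))
            return torch.cat(features, dim=1)

    class Transition(nn.Module):
        """Aggregate and downsample: 1x1 conv to shrink channels, then 2x2 average pooling."""
        def __init__(self, in_channels, out_channels):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
            self.pool = nn.AvgPool2d(2)

        def forward(self, x):
            return self.pool(self.conv(x))

    block = DenseBlock(in_channels=24, growth_rate=12, num_layers=4)   # 24 + 4*12 = 72 channels out
    trans = Transition(72, 36)
    x = torch.randn(1, 24, 32, 32)
    print(trans(block(x)).shape)    # torch.Size([1, 36, 16, 16])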
