Multi-level Data Representation For Training
Deep Helmholtz Machines
Jose Miguel Ramos jose.miguel.ramos@tecnico.ulisboa.pt
Luis Sa-Couto luis.sa.couto@tecnico.ulisboa.pt
Andreas Wichert andreas.wichert@tecnico.ulisboa.pt
Department of Computer Science and Engineering, INESC-ID & Instituto Superior
Técnico, University of Lisbon, 2744-016 Porto Salvo, Portugal
Abstract. The vast majority of current research in the field of Machine Learning relies on algorithms, such as Backpropagation, against which strong arguments for biological implausibility have been raised, shifting the field's focus away from understanding its original organic inspiration and toward a compulsive search for optimal performance. Yet, a few proposed models respect most of the biological constraints present in the human brain and are valid candidates for mimicking some of its properties and mechanisms. In this paper, we focus on guiding the learning of a biologically plausible generative model called the Helmholtz Machine in complex search spaces, using a heuristic based on the human image perception mechanism. We hypothesize that this model's learning algorithm is not fit for deep networks due to its Hebbian-like local update rule, which renders it incapable of taking full advantage of the compositional properties that multi-layer networks provide. We propose to overcome this problem by providing the network's hidden layers with visual cues at different resolutions, using a multi-level data representation. The results on several image datasets showed that the model was able to achieve not only better overall quality but also a wider diversity in the generated images, corroborating our intuition that the proposed heuristic allows the model to take more advantage of the network's growing depth. More importantly, they show the unexplored possibilities underlying brain-inspired models and techniques.
Keywords: Helmholtz Machine · Biologically-inspired Models · Deep Learning · Generative Models · Hebbian Learning · Wake-Sleep.
1 Introduction
Most recent machine learning models have shown great effectiveness at solving a wide range of complex cognitive tasks [40, 27], and back-propagation algorithms seem to be at the core of the majority of those models, proving to be among the most reliable and fastest ways for machines to learn [4, 31, 29]. Visual pattern recognition is one of the many fields in which back-propagation algorithms thrive [38, 27, 18]. The evolution of these models' quality has been impressively swift, but as we get closer to perfection, further improvements become ever more difficult
[2]. For some of the simpler visual tasks, like image classification of handwritten digits in the famous MNIST dataset [28], these models have surpassed the brain's capabilities, performing better than human participants [2, 9].
Surpassing the human brain's accuracy is a remarkable scientific milestone and allows for more reliable and robust technology.
In the midst of this search for better and more powerful models, an ever firmer connection grew between the two concepts of intelligence and accuracy. We seem to have been intuitively led to the conclusion that the better a model performs at a certain task, the more intelligent it is. In a sense, we have deviated from trying to mimic the brain's biological way of processing information and focus instead on neural network models that simply perform better [31, 26].
Nonetheless, even if there are models that compete with the human brain at performing specific tasks, no model comes close to its robustness and flexibility when dealing with general image classification and pattern recognition problems.
Therefore, a large part of the scientific community is still focused on the bi-
ologically plausible side of machine learning, proposing new competitive models
that remain an arguably plausible implementation of some human brain mech-
anisms and properties [26, 22, 5, 34, 10].
1.1 Back-propagation’s Biological Plausibility
Despite the obvious biological inspiration of the Back-propagation (Backprop) algorithm [31, 32], its biological plausibility has been questioned from very early on [11, 36]. In recent years, although there have been many attempts to create biologically plausible yet empirically powerful learning algorithms similar to Backprop [30, 4, 26], there is an overall consensus that some fundamental properties of back-propagation are too difficult for the human brain to implement [22, 31].
The first and most relevant argument is that backprop synaptic weight updates depend on the computations and activations of an entire chain of neurons, whereas biological synapses change their connection strength solely based on local signals. Furthermore, for this gradient-based algorithm to work, biological neurons would have to freeze their updates in time, waiting for the signal to reach its final destination where the error comparison is made; only after the signal travels backwards would the membrane permeability be changed in accordance with its success or failure [5].
The second is the fact that back-propagation uses the same weights when performing the forward and backward passes, which would require identical bidirectional connections in biological neurons that are not present in all parts of the brain.
And lastly, there is the fact that backprop networks propagate firing probabilities, whereas biological neurons only propagate spikes [40].
1.2 Helmholtz Machines’ Biological Inspiration
We propose to look at an older Generative model called Helmholtz Machine
(HM) [12], which uses the Wake-Sleep (WS) algorithm [21] (details in Appendix
A) instead of Back-propagation.
Wake-Sleep is an unsupervised learning algorithm that uses two different networks to simultaneously learn a bottom-up recognition model and a top-down generative model. Despite not being a completely Hebbian algorithm, its activation and learning rules are as local as the Hebb rule [33].
Hebbian learning algorithms respect the original proposition made by Hebb [19] that learning and memory in the brain arise from increased synaptic efficacy, triggered by the coordinated firing of the pre- and post-synaptic neurons [37]. More importantly, they solve the previously mentioned locality problem, because each synaptic weight update depends only on the previous layer. The locality of WS thus helps it avoid that problem in a similar way to the Hebbian rule.
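To make this locality concrete, the sketch below illustrates the kind of delta rule involved. It is a minimal illustration under our own assumptions (stochastic binary units with sigmoid activations); the function and variable names are ours, not taken from the original formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Stochastic binary units: propagate 0/1 spikes, not firing probabilities.
    return (np.random.rand(*p.shape) < p).astype(float)

def local_delta(w, pre, post, lr=0.01):
    # Wake-Sleep's Hebbian-like local rule: the weights predicting `post`
    # from `pre` move toward reducing the prediction error (post - pred).
    # Nothing here requires error signals from any other layer in the chain.
    pred = sigmoid(pre @ w)            # prediction of the post-synaptic layer
    return w + lr * np.outer(pre, post - pred)
```

In the wake phase, a rule of this kind would adjust the generative weights using samples produced by the recognition pass; in the sleep phase the roles reverse, and the recognition weights learn from a "dream" produced by the generative pass.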
The unsupervised nature of the algorithm also contributes to its plausibility, since the human brain's learning is mostly unsupervised. And unlike Back-propagation, for which it is very difficult to find an implementation that works by propagating neuron activations instead of firing probabilities, the WS algorithm can work effectively with both options, resolving the third implausibility argument mentioned above.
Furthermore, the learning algorithm of these machines is based on the biological idea of being awake and asleep. The intuition is that after we experience an event, we also produce our own variations of that event. This idea extrapolates easily to what happens daily on a large scale, where we experience reality during our wake phase and then recreate it in our sleep. But there is a shorter-scale example, found in the interaction between the human eyes and the brain, that perhaps compares better to the actual behavior of the model. Our brain receives the continuous stream of images that our eyes are capturing, and while receiving it, we subconsciously try to predict what will happen in the next frame; when reality does not match our expectation, for example when a magician pulls a rabbit out of a hat, we become surprised. The HM network mimics this behavior: after receiving an observation from the world, it produces a dream, and then adjusts its weights in order to create more plausible dreams, reducing the surprise of experiencing the next event. Likewise, if we see the same magic trick performed enough times, we learn to expect what was previously unexpected.
This “reduction of surprise” corresponds to minimizing a quantity prominent in neuroscientific research called Free Energy [14, 16], which is “an information theory measure that bounds the surprise on sampling some data, given a generative model” [15]. Thus, the minimization of Free Energy corroborates the hypothesis that “a biological agent resists the tendency toward disorder through a minimization of uncertainty” [37, 15, 13], alluded to in the previous example.
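For concreteness, in the standard variational formulation (the notation here is ours, not taken from the original papers), the free energy of a data point $d$ under a generative model $p_\theta$ and a recognition model $q_\phi$ over hidden states $h$ can be written as

$$F(d) = -\log p_\theta(d) + \mathrm{KL}\big[\, q_\phi(h \mid d) \,\big\|\, p_\theta(h \mid d) \,\big] \;\geq\; -\log p_\theta(d).$$

Since the KL divergence is non-negative, $F$ upper-bounds the surprise $-\log p_\theta(d)$; lowering $F$ therefore both reduces the surprise and pulls the recognition distribution toward the true posterior, which is, approximately, what the Wake-Sleep procedure minimizes.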
2 Improving Wake-Sleep
In spite of the WS algorithm being interesting from a neuroscientific perspective, its lack of efficiency [23] and its inability to perform as well as other learning algorithms have led it to be less and less explored in recent years. One of its biggest disadvantages is that as the complexity of the network increases, the algorithm's performance becomes less impressive. If the complexity of the world we are trying to mimic increases, our model needs to capture higher-level abstractions and generalize better, which can be done by increasing the size of its network [6, 7]. However, by increasing the number of neurons in a model's network, the size of the search space also grows. When any model is searching through the energy surface, it can easily get stuck in a sub-optimal local minimum [20], and we believe this is the main problem of the HM with a large hidden network.
Our proposal for overcoming this problem is to provide the algorithm with a heuristic that leads it more consistently to optimal solutions.
Heuristics are ways of navigating the search space that guide the algorithm to find a better solution, to find a solution faster, or both. They can be seen as generic rules that apply to a majority of cases, allowing the agent to avoid exploring search paths that seem unpromising.
2.1 Multi-level Data Representation and Human Image Perception
One thing that might help humans understand what they see in a better and more structured way is the ability to evaluate a given visual image at different scales. Many studies point to the fact that the human brain processes visual input at different resolutions [44, 8]. This multi-level biological visual analysis could be one of the many keys that enable the human brain to capture the world it perceives in such a robust and accurate way, despite the obvious extreme complexity of its neural network.
A way to incorporate this multi-level perception into the HM is to use an Image Pyramid representation of the dataset [35]. The Image Pyramid is a simple way of obtaining a multi-level data representation that enables models to detect patterns at different scales. It consists of creating lower-level representations of the original images in a convolutional fashion, reducing the image by a constant factor each time and creating a “sequence of copies of an original image in which both sample density and resolution are decreased in regular steps” [1], as shown in Fig. 1. Introducing this data representation into the training of the network is in accordance with the high biological plausibility that motivated our interest in the HM model. By doing so, we hope to guide its learning so that it first detects high-level patterns and then, as details are added to the samples, learns correlations at different scales, acting as a heuristic against the exponential growth of the search space that inevitably comes with an increasing number of hidden layers.
Fig. 1. Example of an Image Pyramid representation of a handwritten number 5 gen-
erated by continuously downsampling the original image on the left.
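As a concrete sketch, a pyramid of this kind can be built as follows. We assume, for simplicity, plain 2x2 average pooling as the reduction step; Burt and Adelson's construction [1] convolves with a small Gaussian kernel before subsampling.

```python
import numpy as np

def build_pyramid(image, levels):
    """Return `levels` copies of `image`, halving the resolution each time.

    A minimal stand-in for the Image Pyramid: the reduction here is plain
    2x2 average pooling rather than Gaussian blurring plus subsampling.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]                       # crop to an even size
        # Average each non-overlapping 2x2 block into a single pixel.
        pyramid.append(img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid

# e.g. a 28x28 MNIST digit yields levels of size 28x28, 14x14 and 7x7
```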
2.2 Image Pyramid Heuristic for Helmholtz Machines
One way to guide our model's training is to configure its initial position in the search space to a zone where we believe the probability of finding a lower local minimum is higher, like the one highlighted in Fig. 2.
Fig. 2. Example of a two-dimensional energy landscape described by a blue curve. When traveling the energy surface with a non-stochastic gradient method, our model would move like a sphere dropped into said landscape and pulled by the force of gravity. The starting configuration of our model, that is, the starting position on the landscape, would have a major impact on the value of the minimum achieved. In this case there is an optimal starting zone, highlighted in green: if the initial configuration corresponds to a point in that zone, the minimum reached will generally be better.
Weight initialization is known to have a significant impact on a model's convergence when training deep neural networks [38, 17]. The idea of the heuristic we want to apply to the learning of the HM is to initialize the weights of the network so that the initial configuration contains cues of the image's particularities at different scales.
We propose to create a network with multiple hidden layers of increasing size from top to bottom, where each layer corresponds to the size of one down-sampled image.
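The sketch below illustrates how the layer sizes can be read off the pyramid, reusing build_pyramid and local_delta from the earlier sketches. The warm-start loop is our own hedged reading of the heuristic, clamping each hidden layer to its matching pyramid level; it is not a verbatim reproduction of the procedure.

```python
def init_from_pyramids(images, levels=3, lr=0.01):
    # Layer sizes read off the pyramid: the visible layer matches the full
    # image and each hidden layer matches one down-sampled level,
    # e.g. [784, 196, 49] for 28x28 inputs with levels=3.
    pyramids = [build_pyramid(img, levels) for img in images]
    sizes = [lvl.size for lvl in pyramids[0]]
    weights = [np.zeros((sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
    for pyr in pyramids:
        flat = [lvl.ravel() for lvl in pyr]     # finest level first
        for i in range(len(weights)):
            # Hypothetical warm start: generative weights learn to predict
            # each finer level from the coarser one above it, so the initial
            # configuration already carries visual cues at every scale.
            weights[i] = local_delta(weights[i], flat[i + 1], flat[i], lr)
    return weights
```

After such a warm start, regular Wake-Sleep training would proceed from a configuration that already sits in a more promising zone of the search space, in the spirit of Fig. 2.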