Learning over time using a neuromorphic adaptive control
algorithm for robotic arms
Lazar Supic
lazar.supic@accenture.com
Accenture Labs
San Francisco, California, USA
Terrence C. Stewart
terrence.stewart@nrc-cnrc.gc.ca
National Research Council Canada
Ottawa, Canada
ABSTRACT
In this paper, we explore the ability of a robot arm to learn the
underlying operation space defined by the positions (x, y, z) that the
arm’s end-effector can reach, including disturbances, by deploying
and thoroughly evaluating a Spiking Neural Network (SNN)-based
adaptive control algorithm. While traditional control algorithms
for robotics are limited in their ability to adapt to new and dynamic
environments, we show that the robot arm can learn the operational
space and complete tasks faster over time. We also demonstrate
that the adaptive robot control algorithm based on SNNs enables
a fast response while maintaining energy efficiency. We obtained
these results by performing an extensive search of the adaptive
algorithm parameter space, and evaluating algorithm performance
for different SNN network sizes, learning rates, dynamic robot arm
trajectories, and response times. We show that the robot arm learns
to complete tasks 15% faster in specific experiment scenarios, such
as scenarios with six or nine random target points.
CCS CONCEPTS
• Computing methodologies → Machine learning algorithms;
• Computer systems organization → Robotics;
• Networks → Network performance evaluation.
KEYWORDS
neuronal ensembles, spiking neural networks, PID control, adaptive
control, learning rate
ACM Reference Format:
Lazar Supic and Terrence C. Stewart. 2022. Learning over time using a
neuromorphic adaptive control algorithm for robotic arms. In ICONS ’22:
International Conference on Neuromorphic Systems, July 27–29, 2022, Knoxville,
Tennessee. ACM, New York, NY, USA, 7 pages. https://doi.org/XXXXXXX.XXXXXXX
1 INTRODUCTION
Robotic arms are becoming increasingly prevalent in applications
central to human life, including manufacturing, rehabilitation, and
a growing range of household tasks as assistive devices [5, 6, 11].
Although the capabilities of robotic arms have expanded
quickly in recent years and have reached very strong performance
in repetitive, prescribed tasks, their ability to handle unexpected
situations, be flexible, and adapt is still quite poor compared to
biological organisms. Therefore, one of the main questions in robotics
is how we can enable robotic arms to learn and execute flexibly
and adapt to new and dynamic environments, while preserving the
energy efficiency of robotic arm execution of fixed tasks.
Neurorobotics is an emerging branch in robotics that combines
neuroscience, robotics, and artificial intelligence with the key goal
of embedding brain-inspired algorithms into robots to enable them
to learn better and handle these more complex, dynamic situations.
The most prominent among these brain-inspired algorithms
are Spiking Neural Networks (SNNs), which are artificial neural
networks modeled after principles of biological spike-based brain
processing. It has been shown that adaptive robot arm controllers
implemented using SNNs improve spatial accuracy and energy
efficiency for typical robot arm tasks, such as reaching, compared
to canonical control algorithms [4, 12]. However, open questions
remain about the underlying learning mechanisms of SNNs, including
their ability to learn an operation space, their learning rates
over time, as well as the roles of pretraining and online learning for
handling disturbances in the operation space, such as a change in
the trajectory or a change in the weight of the object being handled
by the robotic arm.
A robot arm operates in a three-dimensional operational control
space defined by positions (x, y, z) that the arm’s end-effector can
reach. The ultimate goal of control algorithms is to bring the robot
arm end-effector (Fig. 1) to the desired target position in this space.
To achieve this, proper control commands must be given to the
motors at the robot arm joints. An inverse kinematics algorithm can
compute the angle values at the robot arm joints to reach a given
target location. In theory, these values would be enough to execute
the reach task flawlessly. However, real-world limitations compromise
the ability of the robot arm to reach the target position accurately.
These limitations include the motors at the joints having their own
dynamics (defined by their transfer functions), friction at the joints,
and the weight of the whole system changing dynamically when the arm
starts carrying a heavy object.
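As an illustration of what an inverse kinematics computation looks like, the following is a minimal sketch for a planar two-link arm, not the 6 DOF Jaco arm studied here. The function names `two_link_ik` and `forward`, the link lengths, and the choice of the positive-elbow solution are all assumptions made for this example.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Joint angles (shoulder, elbow) in radians that place the tip of a
    planar two-link arm with link lengths l1, l2 at the point (x, y).

    Illustrative simplification: a real 6 DOF arm needs a 3-D solver.
    """
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle from the target distance.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is outside the reachable workspace")
    elbow = math.acos(cos_elbow)  # one of the two mirror-image solutions
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, l1, l2):
    """Forward kinematics, used here only to check the IK result."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y
```

Feeding the returned angles back through `forward` recovers the requested point in this idealized model. On a physical arm, motor dynamics, friction, and payload changes mean the commanded angles alone do not guarantee that the end-effector arrives there, which is the gap the controller has to close.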
In this paper, we explore the ability of a robot arm to learn the
underlying operation space, including disturbances, by deploying and
thoroughly evaluating an SNN-based adaptive control algorithm [4].
We find that the neuromorphic adaptive control algorithm can learn
over time and move between the target destination points faster
as it learns. To obtain these results, we varied several parameters
that affect algorithm performance, including SNN size, learning
rate, and reaching task scenarios across the operation space. The
key quantitative result of the paper is that the robot arm learns to
complete tasks 15% faster in specific experimental scenarios. While
the underlying algorithm has been previously presented [3, 8], our
contribution here is to vary these parameters in order to discover
how the system performs across that space.
Figure 1: A. Canonical PID block diagram. B. Spiking Neural Network (SNN) based diagram.
arXiv:2210.01243v1 [cs.RO] 3 Oct 2022
2 RELATED WORK
2.1 Classical PID
A traditional way to deal with unknown aspects of a motor control
problem has been to use a Proportional–Integral–Derivative (PID)
controller, shown in Fig. 2. An error signal is first computed as the
difference between the target and current angle of the robot arm
joint. The PID controller then processes this error signal in three
parallel branches: in the proportional branch, the error signal is
multiplied by the proportional constant $K_p$; in the integral branch,
the error signal is integrated and multiplied by the constant $K_i$; and
in the derivative branch, the first derivative of the error signal is
multiplied by the constant $K_d$ [1]. The constants $K_p$, $K_i$, and
$K_d$ are fixed in advance based on the dynamics of the system and expected
disturbances. For a well-defined, static environment, this control
algorithm would perform reasonably well. However, in dynamic
environments, this type of control would not be able to adapt to
changing conditions. Instead, it is common to manually re-tune
these parameters across different conditions.
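The three branches described above can be sketched as a discrete-time controller. This is a generic textbook form written for illustration, not the controller used in the experiments. The class name `PID`, the time step `dt`, and the use of a backward difference for the derivative are assumptions.

```python
class PID:
    """Discrete-time PID controller with fixed gains (illustrative sketch)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt              # assumed control period in seconds
        self.integral = 0.0       # running integral of the error
        self.prev_error = None    # no derivative estimate on the first step

    def step(self, target, current):
        error = target - current
        # Integral branch: accumulate the error over time.
        self.integral += error * self.dt
        # Derivative branch: backward difference of the error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Sum of the three parallel branches.
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

Because `kp`, `ki`, and `kd` are set once in the constructor, nothing in this loop changes when the load or the environment changes, which is the limitation that motivates the adaptive variant.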
2.2 SNN-based adaptive control
To enable robot arms to learn, we used a published adaptive PID
algorithm where the integral branch of the PID is implemented as
a Spiking Neural Network (SNN) using the dynamics adaptation
approach [4]. We note that we selected this algorithm because it
has the potential to learn over time, which has not been a common
approach in the past. Other algorithms have a fixed architecture and
predefined or tuned parameters. In the SNN-based approach, the
transfer function between the error signal and the integral part of the
control signal is governed by the SNN. SNN connection weights
are learned in real time using the Prescribed Error Sensitivity (PES)
rule [10]. The Neural Engineering Framework (NEF) [2] and Nengo
are used to implement the SNN. Two parameters
of the SNN are of particular interest to this study: the number of
neurons used to implement the SNN and the learning rate for the
PES rule. The proportional coefficient of the adaptive controller is
$K_p$ = 200, and the derivative coefficient is $K_d$ = 10.
The coefficients were selected via iterative tuning, in accordance
with typical PID controller coefficient selection. The adaptive
controller is a patented algorithm from ABR and is available at the
following link [7].
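To give a feel for how an error-driven rule of the PES type adjusts an adaptive term, the following is a rate-based sketch in plain Python. It is not the Nengo API and not the patented controller. The rectified-linear neuron model, the parameter ranges, the sign convention (error = output minus target), and all function names are assumptions chosen for illustration.

```python
import random


def make_ensemble(n_neurons, seed=0):
    """Random (encoder, gain, bias) triples for a 1-D neuron ensemble."""
    rng = random.Random(seed)
    return [(rng.choice((-1.0, 1.0)), rng.uniform(0.5, 2.0),
             rng.uniform(-1.0, 1.0)) for _ in range(n_neurons)]


def activities(ensemble, x):
    """Rectified-linear firing rates for input x (stand-in for spiking)."""
    return [max(0.0, gain * encoder * x + bias)
            for encoder, gain, bias in ensemble]


def decode(decoders, acts):
    """Weighted sum of activities: the ensemble's output signal."""
    return sum(d * a for d, a in zip(decoders, acts))


def pes_step(decoders, acts, error, learning_rate):
    """One PES-style update: move each decoder against the error,
    in proportion to that neuron's activity."""
    scale = learning_rate / len(decoders)
    return [d - scale * error * a for d, a in zip(decoders, acts)]


def learn_constant(target, x, n_neurons=50, learning_rate=0.5, steps=500):
    """Learn to output `target` at input `x`, starting from zero decoders."""
    ensemble = make_ensemble(n_neurons)
    decoders = [0.0] * n_neurons
    acts = activities(ensemble, x)
    for _ in range(steps):
        error = decode(decoders, acts) - target
        decoders = pes_step(decoders, acts, error, learning_rate)
    return decode(decoders, acts)
```

In this sketch the learning rate and the number of neurons jointly set how quickly the decoded output approaches the target, which is why those two parameters are natural candidates for a sweep.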
3 METHODS
We evaluated the online learning capability of the adaptive controller
on the Jaco 6 DOF robotic arm. The key question we want to answer
is whether and how the loaded arm moving between two random
points in the operational space can ‘learn’ to execute this task faster.
We used the MuJoCo robotic simulator to simulate
the physical characteristics of the robot arm, including movement
dynamics. Specifically, the MuJoCo simulator contains a physics-based
model of the Jaco robotic arm. The dynamic and physical
characteristics of the Jaco arm are described in [9]. We impose a firm
time limit: tasks should be completed within 2000 simulation
steps, equivalent to 2 s in the real-world scenario. This time limit
was selected to give the robot arm an upper bound on the time to
complete the task. If the task is not completed, the arm gets a new
task, i.e., a new target.
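The time limit can be pictured as a step budget on each reach. The sketch below assumes a step of 1 ms (implied by 2000 steps corresponding to 2 s) and a hypothetical distance tolerance for deciding that the target is reached; the text does not state the completion criterion. `run_reach` and `ToyPlant` are illustrative names and do not refer to the MuJoCo API.

```python
MAX_STEPS = 2000   # step budget per task, from the text
DT = 0.001         # assumed step size in seconds (2000 steps = 2 s)


def run_reach(step_fn, distance_fn, tolerance, max_steps=MAX_STEPS):
    """Advance the simulation until the end-effector is within `tolerance`
    of the target or the step budget runs out.

    Returns (steps_used, reached).
    """
    for step in range(1, max_steps + 1):
        step_fn()
        if distance_fn() <= tolerance:
            return step, True
    # Budget exhausted: the caller assigns a new target.
    return max_steps, False


class ToyPlant:
    """Stand-in for the simulator: the distance shrinks by a fixed factor."""

    def __init__(self, distance, shrink):
        self.distance = distance
        self.shrink = shrink

    def step(self):
        self.distance *= self.shrink
```

Multiplying the returned step count by `DT` gives the reach time in seconds, the quantity that is compared across training transitions.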
We divided the simulation cycle into two phases: the training
phase and the testing/evaluation phase. In the training phase, we
ran 20 simulations per transition between two points. We focused
on varying the learning rate parameter because it is primarily
responsible for the learning capability of the algorithm. Similarly,
because the focus of the study was the learning capability, the
neuron ensemble size was varied. The simulation task was the reaching
task, which is common in robotics applications. The trajectories
between targets were selected to be random, in order to sample the
operational space. Specifically, the randomly chosen points were
enumerated, and the arm moved between them in increasing
numerical order. For example, if we have five randomly chosen target
points, they are enumerated as 1, 2, 3, 4, and 5. During the training
phase, we move the robot arm’s end-effector in the following order:
1->2->3->4->5->1. The robot arm was loaded with an equivalent
weight of 1 kg. The time to reach the target point was measured and
plotted. During the testing phase, we randomly picked one of the
targets from the training phase, measured the time the arm’s
end-effector needed to reach it, and plotted a histogram
of the resulting times.
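The visiting order used in training can be written down directly. The sketch below only generates the sequence of transitions and a test-phase pick. The function names and the use of Python's `random` module are assumptions for illustration.

```python
import random


def training_cycle(n_targets):
    """Transitions visited in training for targets numbered 1..n_targets:
    1->2, 2->3, ..., n->1, as (start, goal) pairs."""
    ids = list(range(1, n_targets + 1))
    return [(ids[i], ids[(i + 1) % n_targets]) for i in range(n_targets)]


def pick_test_target(n_targets, rng=random):
    """Randomly choose one of the training targets for the testing phase."""
    return rng.randint(1, n_targets)


print(training_cycle(5))
# [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
```

With five targets this reproduces the 1->2->3->4->5->1 loop, so each transition can be repeated and its reach time tracked across repetitions.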
In the evaluation phase, we varied several parameters: the size
of the neuron ensemble, the PES learning rate, and the number of
target positions to determine their impact on learning over time.
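A sweep over these three parameters can be organized as a simple grid. In the sketch below, the ensemble sizes and learning rates are placeholders and are not the values used in the experiments. Only the target counts of six and nine come from the abstract. The function name `sweep` is likewise hypothetical.

```python
from itertools import product


def sweep(ensemble_sizes, learning_rates, target_counts):
    """One configuration dict per combination of the three parameters."""
    return [
        {"n_neurons": n, "pes_learning_rate": lr, "n_targets": k}
        for n, lr, k in product(ensemble_sizes, learning_rates, target_counts)
    ]


# Placeholder values for illustration only.
configs = sweep(
    ensemble_sizes=[100, 1000],
    learning_rates=[1e-5, 1e-4],
    target_counts=[6, 9],
)
```

Each configuration would then be run through the training and testing phases, with the reach times recorded per configuration.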