Learning over time using a neuromorphic adaptive control algorithm for robotic arms

Lazar Supic

lazar.supic@accenture.com

Accenture Labs

San Francisco, California, USA

Terrence C. Stewart

terrence.stewart@nrc-cnrc.gc.ca

National Research Council Canada

Ottawa, Canada

ABSTRACT

In this paper, we explore the ability of a robot arm to learn the underlying operation space defined by the positions (x, y, z) that the arm’s end-effector can reach, including disturbances, by deploying and thoroughly evaluating a Spiking Neural Network (SNN)-based adaptive control algorithm. While traditional control algorithms for robotics are limited in their ability to adapt to new and dynamic environments, we show that the robot arm can learn the operational space and complete tasks faster over time. We also demonstrate that the adaptive robot control algorithm based on SNNs enables a fast response while maintaining energy efficiency. We obtained these results by performing an extensive search of the adaptive algorithm parameter space and evaluating algorithm performance for different SNN network sizes, learning rates, dynamic robot arm trajectories, and response times. We show that the robot arm learns to complete tasks 15% faster in specific experiment scenarios, such as scenarios with six or nine random target points.

CCS CONCEPTS

• Computing methodologies → Machine learning algorithms; • Computer systems organization → Robotics; • Networks → Network performance evaluation.

KEYWORDS

neuronal ensembles, spiking neural networks, PID control, adaptive

control, learning rate

ACM Reference Format:

Lazar Supic and Terrence C. Stewart. 2022. Learning over time using a neuromorphic adaptive control algorithm for robotic arms. In ICONS ’22: International Conference on Neuromorphic Systems, July 27–29, 2022, Knoxville, Tennessee. ACM, New York, NY, USA, 7 pages. https://doi.org/XXXXXXX.XXXXXXX

1 INTRODUCTION

Robotic arms are becoming increasingly prevalent in applications central to human life, including manufacturing, rehabilitation, and a growing range of household tasks as assistive devices [ ]. Although the capabilities of robotic arms have expanded quickly in recent years and have reached very strong performance in repetitive, prescribed tasks, their ability to handle unexpected situations, be flexible, and adapt is still quite poor compared to biological organisms. Therefore, one of the main questions in robotics is how we can enable robotic arms to learn, execute flexibly, and adapt to new and dynamic environments, while preserving the energy efficiency with which they execute fixed tasks.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

ICONS ’22, July 27–29, 2022, Knoxville, Tennessee

ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00

https://doi.org/XXXXXXX.XXXXXXX

Neurorobotics is an emerging branch in robotics that combines neuroscience, robotics, and artificial intelligence with the key goal of embedding brain-inspired algorithms into robots to enable them to learn better and handle these more complex, dynamic situations. The most prominent among these brain-inspired algorithms are Spiking Neural Networks (SNNs), which are artificial neural networks modeled after principles of biological spike-based brain processing. It has been shown that adaptive robot arm controllers implemented using SNNs improve spatial accuracy and energy efficiency for typical robot arm tasks, such as reaching, compared to canonical control algorithms [ ]. However, open questions remain about the underlying learning mechanisms of SNNs, including their ability to learn an operation space, their learning rates over time, and the roles of pretraining and online learning in handling disturbances in the operation space, such as a change in the trajectory or a change in the weight of the object being handled by the robotic arm.

A robot arm operates in a three-dimensional operational control space defined by the positions (x, y, z) that the arm’s end-effector can reach. The ultimate goal of control algorithms is to bring the robot arm end-effector (Fig. 1) to the desired target position in this space. To achieve this, proper control commands must be given to the motors at the robot arm joints. An inverse kinematics algorithm can compute the angle values at the robot arm joints needed to reach a given target location. In theory, these values would be enough to execute the reach task flawlessly. In practice, real-world limitations compromise the ability of the robot arm to reach the target position accurately: the motors at the joints have their own dynamics (defined by their transfer functions), there is friction at the joints, and the weight of the whole system changes dynamically when the arm starts carrying a heavy object.
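To make the inverse kinematics step concrete, the following is a minimal sketch for a planar two-link arm, not for the 6-DOF arm used later in the paper. The function names, the unit link lengths, and the choice of the solution with a non-negative elbow angle are illustrative assumptions.

```python
import math


def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Joint angles (theta1, theta2) that place the tip of a planar
    two-link arm at (x, y). Returns the solution with theta2 >= 0."""
    # Law of cosines gives the elbow angle from the distance to the target.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        raise ValueError("target out of reach")
    theta2 = math.acos(c2)
    # Shoulder angle: direction to the target minus the offset caused by link 2.
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2


def forward(theta1, theta2, l1=1.0, l2=1.0):
    """Forward kinematics, used here only to check the inverse solution."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

Feeding the computed angles back through `forward` recovers the target in this idealized model. Motor dynamics, friction, and payload changes are exactly what such a purely geometric calculation leaves out.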

In this paper, we explore the ability of a robot arm to learn the underlying operation space, including disturbances, by deploying and thoroughly evaluating an SNN-based adaptive control algorithm [ ]. We find that the neuromorphic adaptive control algorithm can learn over time and move between the target destination points faster as it learns. To obtain these results, we varied several parameters that affect algorithm performance, including SNN size, learning rate, and reaching task scenarios across the operation space. The key quantitative result of the paper is that the robot arm learns to complete tasks 15% faster in specific experimental scenarios. While the underlying algorithm has been presented previously ([ ], [ ]), our contribution here is to vary these parameters in order to discover how the system performs across that space.

arXiv:2210.01243v1 [cs.RO] 3 Oct 2022

Figure 1: A. Canonical PID block diagram. B. Spiking Neural Network (SNN) based diagram

2 RELATED WORK

2.1 Classical PID

A traditional way to deal with unknown aspects of a motor control problem has been to use a Proportional–Integral–Derivative (PID) controller, shown in Fig. 2. An error signal is first computed as the difference between the target and current angle of the robot arm joint. The PID controller then processes this error signal in three parallel branches: in the proportional branch, the error signal is multiplied by the proportional constant $K_p$; in the integral branch, the error signal is integrated and multiplied by the constant $K_i$; and the derivative branch takes the first derivative of the error signal, multiplied by the constant $K_d$ [ ]. The constants $K_p$, $K_i$, and $K_d$ are fixed in advance based on the dynamics of the system and expected disturbances. For a well-defined, static environment, this control algorithm would perform reasonably well. However, in dynamic environments, this type of control would not be able to adapt to changing conditions. Instead, it is common to manually re-tune these parameters across different conditions.
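The three branches described above can be sketched as a textbook discrete-time PID loop. This is an illustration under stated assumptions, not the controller evaluated in the paper: the one-joint plant (unit inertia with viscous damping), the gains, and the constant disturbance torque are all toy values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete-time PID controller for a single joint."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, target: float, current: float, dt: float) -> float:
        error = target - current                 # error signal
        self.integral += error * dt              # integral branch state
        # Skip the derivative on the first call to avoid a spurious spike.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, target: float = 1.0, steps: int = 10000,
             dt: float = 0.001, disturbance: float = 0.0) -> float:
    """Drive a toy joint (unit inertia, damping 2.0) toward `target`
    and return the final angle."""
    angle, velocity = 0.0, 0.0
    for _ in range(steps):
        torque = pid.step(target, angle, dt) + disturbance
        accel = torque - 2.0 * velocity
        velocity += accel * dt
        angle += velocity * dt
    return angle


# Without the integral branch, a constant disturbance leaves a steady-state
# offset of disturbance / kp; with it, the offset is driven toward zero.
pd_final = simulate(PID(kp=50.0, ki=0.0, kd=10.0), disturbance=-5.0)
pid_final = simulate(PID(kp=50.0, ki=40.0, kd=10.0), disturbance=-5.0)
```

In this toy setting the fixed integral gain cancels one constant disturbance. A disturbance that depends on the arm's state, such as a changing payload, is where fixed gains need re-tuning.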

2.2 SNN-based adaptive control

To enable robot arms to learn, we used a published adaptive PID algorithm in which the integral branch of the PID is implemented as a Spiking Neural Network (SNN) using the dynamics adaptation approach [ ]. We selected this algorithm because it has the potential to learn over time, which has not been a common approach in the past. Other algorithms have a fixed architecture and predefined or tuned parameters. In the SNN-based approach, the transfer function between the error signal and the integral part of the control signal is governed by the SNN. The SNN connection weights are learned in real time using the Prescribed Error Sensitivity (PES) rule [ ]. The Neural Engineering Framework (NEF) [ ] and Nengo are used to implement the SNN. Two parameters of the SNN are of particular interest to this study: the number of neurons used to implement the SNN and the learning rate for the PES rule. The proportional coefficient of the adaptive controller is $K_p$ = 200, and the derivative branch uses the coefficient $K_d$ = 10. The coefficients were selected via iterative tuning, in accordance with typical PID controller coefficient selection. The adaptive controller is a patented algorithm from ABR and is available at the following link [7].
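The role of the two parameters of interest, ensemble size and PES learning rate, can be illustrated with a deliberately simplified sketch. It is not the patented controller and does not use Nengo: neurons are replaced by rate-based rectified-linear units, the input is a single scalar, and the error signal is taken as the difference between the decoded output and a known target function, all of which are assumptions made for the example. The update has the general PES form, in which each decoder changes in proportion to the error and that neuron's activity, scaled by the learning rate divided by the number of neurons.

```python
import math
import random


def make_ensemble(n_neurons, seed=0):
    """Random rectified-linear tuning curves over a scalar input in [-1, 1]."""
    rng = random.Random(seed)
    return [
        (rng.choice((-1.0, 1.0)),   # encoder: preferred direction
         rng.uniform(0.5, 2.0),     # gain
         rng.uniform(-1.0, 1.0))    # bias
        for _ in range(n_neurons)
    ]


def activities(x, ensemble):
    """Rate-based stand-in for spiking activity: max(0, gain * encoder * x + bias)."""
    return [max(0.0, g * e * x + b) for e, g, b in ensemble]


def decode(acts, decoders):
    """Weighted sum of activities: the ensemble's output."""
    return sum(a * d for a, d in zip(acts, decoders))


def pes_train(ensemble, target_fn, learning_rate=0.1, n_steps=10000, seed=1):
    """Learn decoders online: d_i -= (learning_rate / n) * error * a_i."""
    n = len(ensemble)
    decoders = [0.0] * n
    rng = random.Random(seed)
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)                      # e.g. a normalized joint angle
        acts = activities(x, ensemble)
        error = decode(acts, decoders) - target_fn(x)   # actual minus desired
        step = learning_rate / n * error
        decoders = [d - step * a for d, a in zip(decoders, acts)]
    return decoders


def rmse(ensemble, decoders, target_fn, n_points=101):
    """Root-mean-square decoding error on an even grid over [-1, 1]."""
    xs = [-1.0 + 2.0 * k / (n_points - 1) for k in range(n_points)]
    sq = [(decode(activities(x, ensemble), decoders) - target_fn(x)) ** 2 for x in xs]
    return math.sqrt(sum(sq) / n_points)


# Hypothetical state-dependent disturbance to be learned: sin(x).
ensemble = make_ensemble(n_neurons=50)
untrained = rmse(ensemble, [0.0] * 50, math.sin)
trained = rmse(ensemble, pes_train(ensemble, math.sin), math.sin)
```

Changing `n_neurons` alters how finely the ensemble can represent the target function, and changing `learning_rate` alters how quickly the decoders move toward it. These are the two axes this study varies.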

3 METHODS

We evaluated the online learning capability of the adaptive control on the 6-DOF Jaco robotic arm. The key question we want to answer is whether and how the loaded arm moving between two random points in the operational space can 'learn' to execute this task faster. We used the MuJoCo robotic simulator to simulate the physical characteristics of the robot arm, including movement dynamics. Specifically, the MuJoCo simulator contains a physics-based model of the Jaco robotic arm. The dynamic and physical characteristics of the Jaco arm are described in [ ]. We impose a firm time limit: each task must be completed within 2000 simulation steps, equivalent to 2 s in the real-world scenario. This time limit was selected to give the robot arm an upper bound on the time available to complete the task. If the task is not completed, the arm is given a new task, i.e., a new target.

We divided the simulation cycle into two phases: the training phase and the testing/evaluation phase. In the training phase, we ran 20 simulations per transition between two points. We focused on varying the learning rate parameter because it is primarily responsible for the learning capability of the algorithm. Similarly, because the focus of the study was the learning capability, the neuron ensemble size was varied. The simulation task was the reaching task, which is common in robotics applications. The trajectories between targets were selected to be random in order to sample the operational space. Specifically, the randomly chosen points were enumerated, and the arm moved between them in increasing numerical order. For example, five randomly chosen target points are enumerated as 1, 2, 3, 4, and 5. During the training phase, we move the robot arm’s end-effector in the order 1->2->3->4->5->1. The robot arm was loaded with an equivalent weight of 1 kg. The time to reach the target point was measured and plotted. During the testing phase, we randomly picked one of the targets from the training phase, measured the time the arm’s end-effector needed to reach it, and plotted a histogram of the resulting times.
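The training schedule described above can be sketched as follows. The function names are hypothetical, `step` stands in for one simulator-plus-controller step and is not a MuJoCo call, and the 1 ms step size is inferred from the stated equivalence of 2000 steps to 2 s.

```python
from typing import Callable, List, Optional, Tuple

MAX_STEPS = 2000            # per-reach step limit, from the text
DT = 0.001                  # seconds per step, implied by 2000 steps = 2 s
RUNS_PER_TRANSITION = 20    # training simulations per transition, from the text


def transitions(n_targets: int) -> List[Tuple[int, int]]:
    """Ordered transitions 1->2->...->n->1 over enumerated targets."""
    return [(i, i % n_targets + 1) for i in range(1, n_targets + 1)]


def timed_reach(step: Callable[[], bool],
                max_steps: int = MAX_STEPS,
                dt: float = DT) -> Optional[float]:
    """Call step() until it reports that the target was reached.
    Returns the elapsed simulated time, or None if the limit is hit,
    in which case the arm would simply be given its next target."""
    for k in range(1, max_steps + 1):
        if step():
            return k * dt
    return None


def make_fake_step(steps_needed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for the simulator: 'reaches' after a fixed count."""
    count = 0

    def step() -> bool:
        nonlocal count
        count += 1
        return count >= steps_needed

    return step


order = transitions(5)
reach_time = timed_reach(make_fake_step(500))
timed_out = timed_reach(make_fake_step(5000))
```

With five targets, `order` lists the five transitions of the cycle 1->2->3->4->5->1. A training run would repeat each of them `RUNS_PER_TRANSITION` times and record the value returned by `timed_reach` for each attempt.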

In the evaluation phase, we varied several parameters to determine their impact on learning over time: the size of the neuron ensemble, the PES learning rate, and the number of target positions.
