Nish: A Novel Negative Stimulated Hybrid Activation Function
Yildiray Anagun* and Sahin Isik
Department of Computer Engineering, Eskisehir Osmangazi University, Eskisehir, Turkey
*Corresponding author: yanagun@ogu.edu.tr
Abstract
An activation function has a significant impact on the efficiency and robustness of neural networks. As an alternative to existing functions, we developed a novel non-monotonic activation function, the Negative Stimulated Hybrid Activation Function (Nish). It behaves like a Rectified Linear Unit (ReLU) in the positive region and like a sinus-sigmoidal function in the negative region. In other words, it combines a sigmoid and a sine function and thereby gains new dynamics over the classical ReLU. We analyzed the consistency of Nish across combinations of widely used network architectures and the most common activation functions on several popular benchmarks. The experimental results show that the classification accuracy achieved with Nish is slightly better than that achieved with Mish.
Keywords: Activation Function, Convolutional Neural Network, Deep Learning
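Before the formal definition, the abstract's description, ReLU-like behaviour for positive inputs and a sine-sigmoid response for negative inputs, can be sketched in code. The following is an illustrative assumption of one such hybrid written in PyTorch (our choice of framework); the function name and the particular negative-branch combination are ours and are not necessarily the exact Nish formula defined later in the paper.

```python
import torch

def nish_like(x: torch.Tensor) -> torch.Tensor:
    """Illustrative hybrid: identity for positive inputs, a sine-sigmoid
    combination for negative inputs. This is only a sketch of the idea
    described in the abstract, not necessarily the exact Nish definition."""
    # Assumed form of the negative region: input gated by sigmoid and modulated by sine.
    negative_branch = x * torch.sigmoid(x) * torch.sin(x)
    return torch.where(x > 0, x, negative_branch)
```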
1. Introduction
ReLU (Krizhevsky et al., 2017; Nair and Hinton, 2010) has been used extensively as a non-linear function in deep neural networks; moreover, it is one of the most preferred functions due to its better convergence and gradient propagation compared to tanh, logistic, and sigmoid units. However, ReLU suffers from three important issues: a non-zero mean output, the loss of negative values, and an unbounded output. Although ReLU speeds up the learning phase, the derivative in the zero-valued region is itself zero, so no learning occurs there during backpropagation (the dying ReLU problem). ReLU-n (Krizhevsky, 2012a) is another rectifier-based activation function proposed to enhance the efficiency of convolutional neural networks (CNNs). ReLU-n is a ReLU capped at n; when n is chosen empirically as 6, it is called ReLU6. This allows the model to learn sparse inputs earlier, and is equivalent to assuming that each ReLU unit consists of only 6 replicated bias-shifted Bernoulli units rather than an infinite number (Lu et al., 2019).
Leaky ReLU (LReLU) (Maas, 2013) tries to solve the vanishing gradient problem by multiplying negative inputs of the function by a constant $a$. Randomized Leaky ReLU (RReLU) (Xu et al., 2015) generates the activation of neurons by randomly sampling this slope for negative values. Parametric ReLU (PReLU) (Kaiming He et al., 2015a) is very similar to LReLU, except that the multiplier $a$ is a learnable parameter. The equations of ReLU and its variants are given as follows:
$$f_{ReLU}(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x & \text{if } x > 0 \end{cases} \tag{1}$$

$$f_{LReLU}(x) = \begin{cases} ax & \text{if } x \le 0 \\ x & \text{if } x > 0 \end{cases} \tag{2}$$

$$f_{RReLU}(x_{ij}) = \begin{cases} a_{ij} x_{ij} & \text{if } x_{ij} \le 0 \\ x_{ij} & \text{if } x_{ij} > 0 \end{cases}, \qquad a_{ij} \sim U(k, l), \; l < k, \; k, l \in [0, 1) \tag{3}$$

$$f_{PReLU}(x) = \begin{cases} ax & \text{if } x \le 0 \\ x & \text{if } x > 0 \end{cases} \tag{4}$$
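For reference, Eqs. (1)–(4), together with the ReLU6 cap mentioned above, translate directly into code. The sketch below uses PyTorch purely as an illustrative framework choice; the argument names (`a`, `n`, `lower`, `upper`) are ours.

```python
import torch

def relu(x: torch.Tensor) -> torch.Tensor:
    # Eq. (1): zero for non-positive inputs, identity otherwise.
    return torch.where(x > 0, x, torch.zeros_like(x))

def relu_n(x: torch.Tensor, n: float = 6.0) -> torch.Tensor:
    # ReLU-n: ReLU clipped at n (n = 6 gives ReLU6).
    return torch.clamp(x, min=0.0, max=n)

def leaky_relu(x: torch.Tensor, a: float = 0.01) -> torch.Tensor:
    # Eq. (2): fixed small slope a on the negative side.
    return torch.where(x > 0, x, a * x)

def rrelu(x: torch.Tensor, lower: float = 0.125, upper: float = 0.333) -> torch.Tensor:
    # Eq. (3): negative slope drawn uniformly per element at training time.
    a = torch.empty_like(x).uniform_(lower, upper)
    return torch.where(x > 0, x, a * x)

def prelu(x: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    # Eq. (4): same form as LReLU, but a is a learnable parameter.
    return torch.where(x > 0, x, a * x)
```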
The Exponential Linear Unit (ELU) activation function (Clevert et al., 2016) gradually brings the slope of the curve from a constant threshold toward the origin; in addition, it has a negative saturation region that helps manage the variance of the forward propagation. It takes a binary value and is therefore often used as an output layer. Inspired by the ELU, the Scaled ELU (SELU) (Klambauer et al., 2017) is its parametric variant and does not suffer from the vanishing gradient problem as the ReLU does. The SELU is defined for self-normalizing neural networks and handles internal normalization: each layer preserves the mean and variance coming from the previous layers. It ensures this normalization by adjusting both the mean and variance so that the network converges faster. However, in deeper neural networks the SELU may suffer from exploding gradients or lose its self-normalization property.
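A brief sketch of ELU and SELU as described above may help; PyTorch is our illustrative choice, α = 1 is a common default, and the SELU constants shown are the published self-normalization values.

```python
import torch

def elu(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Identity for positive inputs; saturates smoothly toward -alpha for negative inputs.
    return torch.where(x > 0, x, alpha * (torch.exp(x) - 1.0))

def selu(x: torch.Tensor) -> torch.Tensor:
    # Scaled ELU with the fixed constants derived for self-normalizing networks.
    alpha = 1.6732632423543772
    scale = 1.0507009873554805
    return scale * torch.where(x > 0, x, alpha * (torch.exp(x) - 1.0))
```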
The Gaussian Error Linear Unit (GELU) (Hendrycks and Gimpel, 2016) fires neurons element-wise on a given input tensor. The GELU nonlinearity weights inputs according to their magnitude rather than gating them by their sign as ReLUs do, and is based on the cumulative distribution function of normally distributed inputs.
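The CDF-based weighting described above can be written either exactly, via the Gaussian error function, or with the widely used tanh approximation. The following sketch (PyTorch, our choice) shows both forms.

```python
import math
import torch

def gelu_exact(x: torch.Tensor) -> torch.Tensor:
    # x weighted by the standard normal CDF, Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    # Widely used tanh-based approximation of the same weighting.
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```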
The Sigmoid Linear Unit (SiLU) (Elfwing et al., 2018) is computed as the input multiplied by its sigmoid, while the Swish activation function (Ramachandran et al., 2018) is obtained by adding a trainable β parameter, a slight modification of SiLU. Although all of these cutting-edge activation functions perform significantly better than the classic activation functions when trained in deep models, they converge more slowly than ReLU, Leaky ReLU, and PReLU. Building on Swish's self-gating property, Mish (Misra, 2020), a newer self-regularized non-monotonic activation function, tends to increase performance on computer vision problems. Not only does Mish achieve better empirical results than Swish under most experimental conditions, it also overcomes some of Swish's drawbacks, particularly in large and complex architectures such as models with residual layers.
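To make the relationship between SiLU, Swish, and Mish explicit, a compact sketch follows (PyTorch is our illustrative choice; β is treated here as a plain scalar rather than a trained parameter).

```python
import torch
import torch.nn.functional as F

def silu(x: torch.Tensor) -> torch.Tensor:
    # SiLU: input multiplied by its own sigmoid.
    return x * torch.sigmoid(x)

def swish(x: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # Swish: SiLU with a (trainable) beta inside the sigmoid; beta = 1 recovers SiLU.
    return x * torch.sigmoid(beta * x)

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish: x * tanh(softplus(x)), a self-regularized non-monotonic function.
    return x * torch.tanh(F.softplus(x))
```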
2. Related Works
Research on new heuristics and strategies that can outperform classical activation functions is nowadays one of the key points for finding solutions to image and signal processing problems with advanced deep learning mechanisms (Nwankpa et al., 2020). There is a close and complex relationship between the activation function and the CNN structure. Variants of the family of rectified units have been widely used in popular deep CNNs in recent years (Clevert et al., 2015; Kaiming He et al., 2014; K. He et al., 2015b; Sermanet et al., 2014;