Universal hidden monotonic trend estimation
with contrastive learning
Edouard Pineau
EthiFinance
edouard.pineau@ethifinance.com
Sébastien Razakarivony
Safran
sebastien.razakarivony@safrangroup.com
Mauricio Gonzalez
EthiFinance
mauricio.gonzalez@ethifinance.com
Anthony Schrapffer
EthiFinance
anthony.schrapffer@ethifinance.com
Abstract
In this paper, we describe a universal method for extracting the underlying
monotonic trend factor from time series data. We propose an approach
related to the Mann-Kendall test, a standard monotonic trend detection
method, and call it contrastive trend estimation (CTE). We show that the
CTE method identifies any hidden trend underlying temporal data while
avoiding the standard assumptions used for monotonic trend identification.
In particular, CTE can take any type of temporal data (vector, images,
graphs, time series, etc.) as input. Finally, we illustrate the benefits of our
CTE method through several experiments on different types of data and
problems.
1 Introduction
Our paper focuses on the estimation of a monotonic trend factor underlying temporal data.
Such estimation is interesting in many fields, e.g., health monitoring [38], survival analysis
[35] or climate change monitoring [23]. In all these fields and related trend estimation
problems, we observe samples generated by a monitored system (e.g., an ageing mechanical
system, a credit debtor, Earth's weather and climate conditions) at different times in its life,
and we assume that the state of the system drifts monotonically. These observed samples
may be of any type (e.g., vectors, images, time series, graphs), depending on the monitored
system. Figure 1 illustrates the general context of trend estimation.
More generally, when studying temporal data, it is common to assume the existence of
structural latent factors, assumed to be meaningful, that generated the data [21]. These components are generally allocated into four groups. The trend components are monotonic
long-term signals. The cycle components are factors exhibiting rises and falls that are not
of a fixed frequency. The seasonality components are periodic patterns occurring at a fixed
frequency. The irregularity factors represent the rest of the information (considered as a
noise). We assume independent structural factors. The challenging yet essential task is the
identification of one or several of these factors, a task known as blind source separation [8],
independent component analysis [25] or disentanglement [4]. In this paper, the objective
is to detect, isolate and identify only the trend component. [24] shows that if we know
one hidden component underlying time series data, we can find the others conditionally. Hence,
finding the trend component is not only useful for many monitoring problems; it is also relevant
for further analysis.
arXiv:2210.09817v2 [cs.LG] 23 Apr 2023
Figure 1: Illustration of the context of the paper's contribution. We have a monitored
system $S$ that generates data samples (colored curves) at random times. The hidden trend $\tau$
underlying the system (colors from green to red) represents the hidden state of $S$, which changes
monotonically until a state restoration is applied (tools in hexagons): samples between two
state restorations form a sequence with a monotonic hidden trend. The relation between the trend
and the observed data may be an arbitrary function, yet it is assumed to preserve the information
about the trend.
Often, trend estimation methods seek monotonic variations in the values of the data or in
expert-based statistics computed from data [7, 36]. In practice, the trend can be deeply
hidden in the data or may not be well defined because of a lack of information about the
monitored system. Hence, we may not know which variable or statistic to follow to find
the trend.
In this paper, we learn to infer the trend factor from data (of any type) without labels or
expert supervision, using only samples’ time index. To do so, we develop a general method
based on Contrastive Learning (CL). CL recently received high interest in self-supervised
representation learning [33], in particular for time series data (see, e.g., [11, 3, 45]). Our
CL approach uses a loss inspired by Mann-Kendall test [34], a standard trend detection
method.
The rest of the paper presents our universal trend inference method called Contrastive
Trend Estimation (CTE). Section 2 presents the method. Section 3 analyzes the theoretical
foundation of our method in terms of identifiability. Section 4 lists related works on trend
detection and estimation. Section 5 presents a set of experiments to illustrate the benefits of
our approach for trend estimation and survival analysis. Concluding remarks are presented
in Section 6.
2 Contrastive trend detection
Notations. Let $X$ be a sequence of $N_X \in \mathbb{N}$ observed samples generated by a monitored
system denoted by $S$. We assume that a hidden state of $S$ drifts monotonically. We note
$\mathcal{X}$ the dataset of all sequences $X$ in which there exists a hidden monotonic factor. We note
$t_i$ the time index of the $i$-th observed sample, $i \in \llbracket 1, N_X \rrbracket$. We assume that each sequence
$X \in \mathcal{X}$ has been generated from structural factors through a function $F$, such that at least
the information about the trend is not annihilated (in blind source separation problems, $F$
would be assumed invertible). That is, for each $X$ there exists $Z^X := (\tau^X, c^X, s^X, \epsilon^X)$ such
that $X_{t_i} = F(Z^X_{t_i})$, where $\tau^X$, $c^X$, $s^X$, and $\epsilon^X$ represent respectively the monotonic
trend, the cycle, the seasonality, and the irregularity that generated $X$. The paper's goal is
to estimate the factor $\tau^X$ from $X$.
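To make this generative assumption concrete, here is a minimal synthetic sketch; the functional forms of the four factors and of $F$ are ours, chosen only for illustration (any trend-preserving observation map would do):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)                   # sampling times t_i

tau = t ** 2                                      # monotonic trend factor
c = 0.3 * np.sin(2 * np.pi * 1.3 * t ** 1.5)      # cycle: rises and falls, no fixed frequency
s = 0.2 * np.sin(2 * np.pi * 10.0 * t)            # seasonality: fixed-frequency pattern
eps = 0.05 * rng.standard_normal(t.shape)         # irregularity (noise)

# An arbitrary observation map F; tanh is strictly increasing, so the
# information about the trend is preserved rather than annihilated.
X = np.tanh(tau + c + s + eps)
```

Here the trend is recoverable in principle because tanh is invertible on its range; in practice $F$ is unknown and may be far more complex (e.g., producing images or graphs).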
Our CTE approach. For each $X \in \mathcal{X}$, we select two sampling times $(t_u, t_v) \in
\{t_1, \ldots, t_{N_X}\}^2$, such that, without loss of generality (w.l.o.g.), $t_u < t_v$. The value of the
hidden trend at the sampling time $t$ for $X_t$ is noted $\tau^X_t$. Since we do not have access to the
true hidden trend, we need assumptions about $\tau^X$. We use the natural Assumption 1 to
estimate $\tau^X$.
Assumption 1. (Monotonicity). For each sequence $X \in \mathcal{X}$ and all sample couples
$(X_{t_u}, X_{t_v})$, we have that $t_u \le t_v \Rightarrow \tau^X_{t_u} \le \tau^X_{t_v}$.
To extract the trend component, we use a neural network (NN) $F_\phi$ with parameters $\phi$
that embeds each sample $X_t$ into a $d_e$-dimensional vector space, with which we define
a parametric logistic regressor $g_\beta : \mathbb{R}^{d_e} \times \mathbb{R}^{d_e} \to [0, 1]$ defined as follows:

$$g_\beta(X_{t_u}, X_{t_v}) = \sigma\left(\beta^\top F_\phi(X_{t_v}) - \beta^\top F_\phi(X_{t_u})\right), \quad (1)$$

where $\sigma(x) := (1 + e^{-x})^{-1}$ is the sigmoid function. Let $C_{uv} := \mathbb{1}\{\tau^X_{t_u} \le \tau^X_{t_v}\}$ be the indicator
function that describes the trend direction between $t_u$ and $t_v$ for any sample $X$. Under
Assumption 1, we also have $C_{uv} = \mathbb{1}\{t_u \le t_v\}$, so we can build $C_{uv}$ from the samples'
time indices. We can then learn the posterior distribution $p(C_{uv} \mid X_{t_u}, X_{t_v})$, i.e., learn the
identity:

$$p(C_{uv} = 1 \mid X_{t_u}, X_{t_v}) = g_\beta(X_{t_u}, X_{t_v}). \quad (2)$$
As in common binary classification problems, training is done by minimizing the binary
cross entropy (BCE) between $C_{uv}$ and the regressor $g_\beta(X_{t_u}, X_{t_v})$, for all pairs of time
indices $(t_u, t_v)$ and all $X \in \mathcal{X}$, i.e., by minimizing:

$$\mathcal{R}(\beta, \phi; \mathcal{X}) = -\,\mathbb{E}_{X \in \mathcal{X}}\left[\sum_{i,j=1}^{N_X} C_{ij} \log g_\beta\left(X_{t_i}, X_{t_j}\right)\right]. \quad (3)$$

Note that summing over all ordered pairs $(i, j)$ recovers both BCE terms, since for $t_i \neq t_j$
we have $C_{ji} = 1 - C_{ij}$ and $g_\beta(X_{t_j}, X_{t_i}) = 1 - g_\beta(X_{t_i}, X_{t_j})$.
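A minimal numpy sketch of the objective in eq. (3) for one sequence, assuming the scalar trend scores $\beta^\top F_\phi(X_{t_i})$ have already been computed (the function name and vectorized form are ours, not from the paper):

```python
import numpy as np

def cte_loss(scores, times):
    """Pairwise contrastive trend loss for one sequence.

    scores: shape (N,), scalar trend scores beta^T F_phi(X_{t_i})
    times:  shape (N,), the sampling times t_i
    """
    diff = scores[None, :] - scores[:, None]              # s_j - s_i for all pairs (i, j)
    g = 1.0 / (1.0 + np.exp(-diff))                       # g_beta(X_{t_i}, X_{t_j}), eq. (1)
    C = (times[:, None] <= times[None, :]).astype(float)  # C_ij = 1{t_i <= t_j}
    return -np.sum(C * np.log(g + 1e-12))                 # cross-entropy terms with C_ij = 1
```

Scores that increase with time yield a lower loss than scores that decrease, which is what drives $\beta^\top F_\phi$ toward a monotonic representation of the hidden trend.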
Remark 2. Eq. (3) is similar to the Mann-Kendall statistic of eq. (7), presented in the
related work of Section 4.
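For reference, the Mann-Kendall statistic mentioned in Remark 2 sums the signs of all forward pairwise differences of an observed series; a minimal sketch:

```python
import numpy as np

def mann_kendall_s(x):
    """Mann-Kendall S statistic: sum of sign(x_j - x_i) over all pairs i < j.

    S close to +n(n-1)/2 suggests an increasing trend, close to
    -n(n-1)/2 a decreasing trend, and close to 0 no monotonic trend.
    """
    x = np.asarray(x, dtype=float)
    s = 0.0
    for i in range(len(x) - 1):
        s += np.sign(x[i + 1:] - x[i]).sum()  # compare x_i with all later samples
    return s
```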
Once the parameters $(\phi, \beta)$ are fitted, we build an estimator $\beta^\top F_\phi(X_t)$ of the trend factor
$\tau^X_t$. In the next section, we show to what extent this estimator effectively recovers the
hidden trend factor.
3 Identifiability study
We assume that $F_\phi$ is a universal approximation function (e.g., a sufficiently large NN) and
that the amount of data is large enough (equivalent to infinite data), such that we achieve
the identity of eq. (2).

Definition 1. (Minimal sufficiency). A sufficient statistic $T$ is minimal sufficient if
for any sufficient statistic $U$, there exists a function $h$ such that $T = h(U)$. If $U$ is also
minimal, then $h$ is a bijection.

Proposition 1. $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)$ is a minimal sufficient statistic for the trend label
$C_{uv}$.
Proof. First we recall that logistic regression learns likelihood ratios, i.e., $F_{\beta,\phi}$ is a log-likelihood difference. In fact, using the Bayes rule, we get

$$p(C_{uv} = 1 \mid X_{t_u}, X_{t_v}) = \frac{p(X_{t_u}, X_{t_v} \mid C_{uv} = 1)\, p(C_{uv} = 1)}{p(X_{t_u}, X_{t_v})}. \quad (4)$$

Moreover, using properties of the sigmoid function $\sigma$ and eq. (2), we have

$$e^{\beta^\top \left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)} = \frac{p(C_{uv} = 1 \mid X_{t_u}, X_{t_v})}{p(C_{uv} = 0 \mid X_{t_u}, X_{t_v})}. \quad (5)$$
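The identity of eq. (5) is the standard log-odds property of the sigmoid, $\sigma(d)/(1 - \sigma(d)) = e^d$; a quick numerical check, where $d$ stands in for the score difference $\beta^\top(F_\phi(X_{t_v}) - F_\phi(X_{t_u}))$:

```python
import numpy as np

for d in [-2.0, 0.0, 0.7, 3.5]:
    p1 = 1.0 / (1.0 + np.exp(-d))  # p(C_uv = 1 | X_{t_u}, X_{t_v}), via eq. (2)
    p0 = 1.0 - p1                  # p(C_uv = 0 | X_{t_u}, X_{t_v})
    assert np.isclose(p1 / p0, np.exp(d))  # the odds ratio equals e^d
```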