Oflib: Facilitating Operations with and on
Optical Flow Fields in Python
Claudio S. Ravasio1,2[0000-0002-6453-5376], Lyndon Da Cruz3[0000-0002-7695-6354], and Christos Bergeles2[0000-0002-9152-3194]
1 University College London (UCL), United Kingdom
2 King's College London (KCL), United Kingdom
3 Moorfields Eye Hospital, London, United Kingdom
Abstract. We present a robust theoretical framework for the characterisation and manipulation of optical flow, i.e. 2D vector fields, in the
context of their use in motion estimation algorithms and beyond. The
definition of two frames of reference guides the mathematical deriva-
tion of flow field application, inversion, evaluation, and composition op-
erations. This structured approach is then used as the foundation for
an implementation in Python 3, with the fully differentiable PyTorch
version oflibpytorch supporting back-propagation as required for deep
learning. We verify the flow composition method empirically and pro-
vide a working example for its application to optical flow ground truth
in synthetic training data creation. All code is publicly available.
Keywords: Optical flow; Flow field; Flow vector; Flow composition;
Python; PyTorch; NumPy
1 Introduction
Optical flow as an expression of motion encoding and feature correspondence is
one of the oldest tasks in computer vision, with seminal works such as that by Lucas and Kanade [8] dating back to the early 1980s. After decades of advances using variational methods, the extremely successful convolutional neural network
based FlowNet method in 2015 [3] heralded the arrival of well-performing and
efficient end-to-end deep learning methods, usually implemented in Python. This
has quickly become the dominant approach, with performance continuously im-
proving and ever more complex benchmarks being proposed, such as MPI-Sintel,
KITTI, or FlyingThings3D [13,10,9].
In this context, handling optical flow easily and efficiently is of increasing
importance. Many algorithms or their training protocols involve operations with
or on optical flow fields, such as the creation of complex synthetic data [12] or
working with “cues” calculated from bidirectional flow as proposed by Hofinger et al. [6]. Implementing this from scratch can be laborious and error-prone.
While there are a great number of publicly available algorithm implementa-
tions as well as methods in Python libraries for the estimation of optical flow
fields [2,4,7], no such wealth of resources exists for their further manipulation.
arXiv:2210.05635v2 [cs.CV] 14 Oct 2022
An extensive search brought little Python code to light, all of it either algorithm-dependent or severely limited in scope. Flow visualisation is an
important topic, and the toolboxes flowvid [14] as well as flow-vis [15] have
some interesting capabilities in this regard. In addition to that, there are pack-
ages such as flowpy [17] which also allow for basic flow warping, and add some
utilities with a narrow focus on specific tasks such as reading and writing flows.
The aim of oflib on the other hand is to offer a structured approach to
the concept of flow fields, guided by a framework derived from first principles,
and to provide all methods necessary to perform operations within a reasonable
scope. This involves taking into account the two possible frames of reference for
the flow vectors, as well as tracking undefined areas in outputs. The rigorous
method ensures the mathematically correct implementation of a wide range of
flow operations, including more complex functions – such as flow composition –
not found in any of the previously listed Python packages. Full interoperability
with any Python code using NumPy [5] or PyTorch [11] lends oflib a high
potential for reuse by the larger research community. The option to perform
operations batched and on a GPU in particular yields significant speedups, while
differentiability in the context of the PyTorch autograd module allows for the
use in any deep learning algorithm relying on back-propagation for optimisation.
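As a minimal sketch of this differentiability, a flow field kept in PyTorch's autograd graph receives gradients through a warping operation. The example below uses plain `torch.nn.functional.grid_sample` on toy tensors rather than oflibpytorch's own API; the tensor shapes and variable names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Toy data: a single-channel 8x8 image and a zero flow field whose (x, y)
# pixel displacements are tracked by autograd.
N, C, H, W = 1, 1, 8, 8
image = torch.rand(N, C, H, W)
flow = torch.zeros(N, H, W, 2, requires_grad=True)

# Build the base sampling grid and offset it by the flow, then normalise to
# grid_sample's [-1, 1] coordinate convention (align_corners=True).
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
base = torch.stack([xs, ys], dim=-1).float().unsqueeze(0)   # (1, H, W, 2)
target = base + flow                                        # backward-warp lookup
norm = 2 * target / torch.tensor([W - 1, H - 1]) - 1

warped = F.grid_sample(image, norm, align_corners=True)

# A loss on the warped image back-propagates to the flow field itself.
warped.sum().backward()
assert flow.grad is not None
```

With a zero flow the warp is the identity, so `warped` reproduces `image`; the point is that `flow.grad` is populated after the backward pass, which is what allows flow fields to be optimised inside a deep learning pipeline.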
2 Theory
The theoretical framework underpinning oflib is derived from first principles to
ensure a coherent and rigorous approach. This section will first address the two
possible reference frames for optical flow fields and then present the theoretical
basis for the main functionality provided by oflib, focusing on the concrete
operations needed to eventually translate the mathematical definition into code.
2.1 Optical Flow Definition
An optical flow field is defined as a spatial mapping of coordinates at time $t_1$ to coordinates at time $t_2$:

$$F_1^2 := X_1 \mapsto X_2; \qquad F_1^2 = X_2 - X_1 \qquad (1)$$
where $X_t$ corresponds to the set of continuous feature coordinates $x$ being mapped at time $t$, and $F_1^2$ is the resulting array of flow vectors between the feature sets at times $t_1$ and $t_2$. In the context of image sequences, this is equivalent to creating a mapping between the image feature coordinates ${}^iX_t$ in the frame at time $t_1$ to coordinates in the frame at time $t_2$. We distinguish between two possible frames of reference, “source” and “target”4, illustrated in Figure 1:
4 The terms “forward” and “backward” flow often used in the literature are avoided here, as they can lead to confusion, e.g. in the context of reversed “backward” flows.
Source, “s”: In this case, coordinates on a discretised regular grid at time $t_1$, termed the “source domain”, are mapped to coordinates in continuous space at time $t_2$. Applied to images, this indicates each pixel in the first image is matched with some position in the second image, but not every pixel at time $t_2$ has a known source correspondence at time $t_1$.
Target, “t”: The second option means that for each coordinate on a discretised regular grid at time $t_2$, or the “target domain”, there is a mapping to a coordinate in continuous space at time $t_1$. Each pixel in the second image is matched with some position in the first image, but not every pixel at time $t_1$ has a known target correspondence at time $t_2$.
Equation (1) can therefore be extended as follows:

$$\text{“Source” reference: } F_{\underline{1}}^{2} := G_1 \mapsto X_2; \qquad F_{\underline{1}}^{2} = X_2 - G_1$$
$$\text{“Target” reference: } F_{1}^{\underline{2}} := X_1 \mapsto G_2; \qquad F_{1}^{\underline{2}} = G_2 - X_1 \qquad (2)$$
where the underlined number in $F_{\underline{1}}^{2}$ indicates whether the source or the target of the mapping is on a discretised regular grid $G$ in 2D space, spanning the pixel range from 0 to $H-1$ vertically and 0 to $W-1$ horizontally, where $H$ and $W$ are the flow field height and width, respectively.
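The two definitions in Equation (2) can be checked numerically. The following sketch builds a toy constant-translation flow in NumPy with an assumed (x, y) channel ordering; it illustrates the arithmetic only and does not use oflib's API.

```python
import numpy as np

# Hypothetical toy setup: a 3x4 flow field with a constant translation of
# +1 horizontally and +2 vertically. The (x, y) channel ordering is an
# assumption for this sketch, not a statement about oflib's convention.
H, W = 3, 4
flow = np.zeros((H, W, 2))
flow[..., 0] = 1.0  # horizontal component
flow[..., 1] = 2.0  # vertical component

# The regular grid G: (x, y) pixel coordinates for every position.
xs, ys = np.meshgrid(np.arange(W), np.arange(H))
G = np.stack([xs, ys], axis=-1).astype(float)

# "Source" reference: F = X2 - G1, so the mapped positions are X2 = G + F.
X2 = G + flow

# "Target" reference: F = G2 - X1, so the origin positions are X1 = G - F.
X1 = G - flow

print(X2[0, 0])  # grid point (0, 0) maps forward to (1.0, 2.0)
print(X1[0, 0])  # grid point (0, 0) is mapped back to (-1.0, -2.0)
```

Note how the target-reference origin position falls outside the pixel range here, which is exactly the kind of undefined-area bookkeeping the library has to track.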
Note that while $F_{\underline{1}}^{2} \neq -F_{\underline{2}}^{1}$, as the inverse mapping of $G_1 \mapsto X_2$ is not $G_2 \mapsto X_1$, the following relationships do hold true:

$$F_{2}^{\underline{1}} = (F_{\underline{1}}^{2})_{inv}: \quad (G_1 \mapsto X_2)_{inv} = (X_2 \mapsto G_1) \;\Rightarrow\; F_{2}^{\underline{1}} = -F_{\underline{1}}^{2}$$
$$F_{\underline{2}}^{1} = (F_{1}^{\underline{2}})_{inv}: \quad (X_1 \mapsto G_2)_{inv} = (G_2 \mapsto X_1) \;\Rightarrow\; F_{\underline{2}}^{1} = -F_{1}^{\underline{2}} \qquad (3)$$
Fig. 1: Two frames of reference, points at time $t_1$ in red, at $t_2$ in blue. Left: “source” means all pixels at time $t_1$, i.e. coordinates on a discrete grid $G$, are mapped to a new location at time $t_2$. Right: “target” means all pixels on this grid $G$ at time $t_2$ are matched with a different previous location at time $t_1$.
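For a constant-translation flow, the relationships in Equation (3) reduce to a sign flip that is easy to verify numerically. The sketch below uses plain NumPy with an assumed (x, y) channel ordering, and glosses over the resampling onto the grid that inverting a general, non-constant flow requires.

```python
import numpy as np

# Toy check of the inversion relations: for a flow where every pixel moves by
# the same vector, the inverse flow is simply the negated vector field.
H, W = 4, 5
v = np.array([2.0, -1.0])                  # one flow vector for every pixel
F_1to2 = np.broadcast_to(v, (H, W, 2)).copy()

# Inverting the mapping swaps its direction, so the flow from t2 back to t1
# negates every vector (and switches the frame of reference).
F_2to1 = -F_1to2

# The regular grid G of (x, y) pixel coordinates.
xs, ys = np.meshgrid(np.arange(W), np.arange(H))
G = np.stack([xs, ys], axis=-1).astype(float)

# A round trip G -> X2 -> back must land on the original grid exactly.
round_trip = (G + F_1to2) + F_2to1
assert np.allclose(round_trip, G)
```

For general flows the negated vectors sit at non-integer positions, so a correct inversion must also interpolate them back onto $G$ and track positions with no valid correspondence; handling that is part of what distinguishes oflib's inversion from a naive sign flip.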
2.2 Flow Application
Given data on the spatial grid $G$ at time $t_1$ such as an image ${}^iG_1$, an optical flow field $F_{\underline{1}}^{2}$ on the same grid $G$ can be applied to it to calculate the warped