On Background Bias in Deep Metric Learning
Konstantin Kobs and Andreas Hotho
University of Würzburg, Am Hubland, 97074 Würzburg, Germany
ABSTRACT
Deep Metric Learning trains a neural network to map input images to a lower-dimensional embedding space such
that similar images are closer together than dissimilar images. When used for item retrieval, a query image is
embedded using the trained model and the closest items from a database storing their respective embeddings
are returned as the most similar items for the query. Especially in product retrieval, where a user searches for
a certain product by taking a photo of it, the image background is usually not important and thus should not
influence the embedding process. Ideally, the retrieval process always returns fitting items for the photographed
object, regardless of the environment the photo was taken in. In this paper, we analyze the influence of the
image background on Deep Metric Learning models by utilizing five common loss functions and three common
datasets. We find that Deep Metric Learning networks are prone to so-called background bias, which can lead to
a severe decrease in retrieval performance when changing the image background during inference. We also show
that replacing the background of images during training with random background images alleviates this issue.
Since we use an automatic background removal method for this replacement, no additional manual
labeling work or model changes are required, and inference time stays the same. Qualitative and quantitative
analyses, for which we introduce a new evaluation metric, confirm that models trained with replaced backgrounds
attend more to the main object in the image, benefitting item retrieval systems.
Keywords: Deep Metric Learning, Background Bias, Item Retrieval
1. INTRODUCTION
Deep Metric Learning (DML) is the task of training a neural network to embed input items (in this case, images)
such that embeddings of similar items are closer together than embeddings of dissimilar items [3]. This technique
is often used for face recognition, person reidentification, and item retrieval [4]. For instance, in item retrieval, a
query image of an item is used to find semantically similar images by identifying the closest images in embedding
space. Two images are deemed similar if they show the same item. Given this definition, the background of
the images should not play a role in the embedding process, since objects can be photographed in different
environments and thus appear in front of different backgrounds. Similar desired properties can be defined for
other DML applications such as person reidentification.
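
The retrieval step described above can be sketched as a nearest-neighbor search over stored embeddings. This is a minimal illustration only: the function name `retrieve`, the use of Euclidean distance, and the toy embedding values are assumptions and are not taken from the paper.

```python
import numpy as np

def retrieve(query_emb, db_embs, k=5):
    """Return indices of the k database embeddings closest to the query.

    Euclidean distance is an assumption here; a trained DML model
    may be paired with a different distance (e.g. cosine).
    """
    dists = np.linalg.norm(db_embs - query_emb, axis=1)
    return np.argsort(dists)[:k]

# Toy database of four 2-D embeddings (made-up values for illustration).
db = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
query = np.array([0.9, 0.1])

nearest = retrieve(query, db, k=2)
print(nearest)  # indices of the two closest database items
```

In a real system, `query` would be the model's embedding of the photographed item and `db` the precomputed embeddings of the catalog images.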
Previous analytical work for the different task of content classification shows that neural networks suffer
from so-called background bias, i.e. they use information from the image background to identify the image
category. For example, image classifiers trained to identify ships often focus on the water and not on the ship
itself. As a result, the classifier is unable to identify ships on land [5].
Since DML does not classify images but embeds them, the findings on background bias from the literature are
not directly transferable to these models. If background bias were also present in DML models, image backgrounds
would influence the embedding process. Then, taking a picture of an object on the street or in a studio setup
could lead to different search results from an item retrieval system, degrading its performance.
Figure 1 shows such a situation: Placing the bike in front of a brick
wall or a studio backdrop gives completely different nearest neighbor search results. This is not desirable, since
the retrieval system should only take the main object into account.
Further author information: (Send correspondence to K.K.)
K.K.: E-mail: kobs@informatik.uni-wuerzburg.de
A.H.: E-mail: hotho@informatik.uni-wuerzburg.de
arXiv:2210.01615v1 [cs.CV] 4 Oct 2022
Figure 1. Retrieval results for two query images (first column) based on the distance of embeddings of a DML model
trained on the Stanford Online Products [1] dataset with the Contrastive loss [2]. The second query image shows the exact
same object as the first one, but we exchange the background using image editing software. Ideally, the embeddings
for the two images should be similar since they show the same object, leading to similar retrieval results. However,
both queries result in very different retrieval results mostly based on background similarity. While the first row only
shows images that have a white background, the second one only shows images with patterns resembling the brick wall
background in the query image. This behavior is not desirable in item retrieval systems. In this paper, we investigate the
influence of the background on the retrieval performance of DML models.
In this paper, we investigate background bias in DML by conducting multiple experiments on three standard
DML datasets (Cars196 [6], CUB200 [7], Stanford Online Products [1]) and five different DML loss functions. We design
a test setting where we replace image backgrounds with other images and measure the retrieval’s performance
drop compared to the unmodified images; larger drops in performance indicate that the model relies more on the
background. We show that, depending on the dataset, models can suffer from severe background bias. To combat
this behavior, we apply a simple but effective training strategy that requires no additional manual labeling
work or model changes and keeps inference time unchanged. For this, we extract the main object from each image
during training using a salient object detection method [8] and place it onto a randomly selected background image.
We show that this technique, which we call BGAugment, indeed improves performance in our test setting, even
though no foreground/background segmentation is available during testing, indicating that the model learns to
focus less on the background. To verify this, we qualitatively and quantitatively analyze the resulting models
and show that the model trained with BGAugment attends more to the main object instead of the background,
leading to better performance when backgrounds change. For this, we introduce a metric that quantifies the
focus of the model on the foreground and background.
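
The background replacement used during training can be sketched as a mask-based composite. This is a hedged illustration, not the authors' implementation: the names `replace_background` and `bg_augment` are hypothetical, and the foreground mask is assumed to come from a salient object detection model that is not shown here.

```python
import numpy as np

def replace_background(image, mask, background):
    """Composite the foreground of `image` onto `background`.

    image, background: float arrays of shape (H, W, 3)
    mask: float array of shape (H, W) in [0, 1], where 1 marks foreground.
          Assumed to be produced by a salient object detector.
    """
    alpha = mask[..., None]  # add a channel axis so it broadcasts over RGB
    return alpha * image + (1.0 - alpha) * background

def bg_augment(image, mask, backgrounds, rng):
    """Pick one background at random and paste the foreground onto it."""
    background = backgrounds[rng.integers(len(backgrounds))]
    return replace_background(image, mask, background)

# Toy 2x2 example: white "object" pixels on the diagonal, black background.
image = np.ones((2, 2, 3))
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
backgrounds = [np.zeros((2, 2, 3))]

rng = np.random.default_rng(0)
augmented = bg_augment(image, mask, backgrounds, rng)
```

Because the composite is applied only to training images, the model and the inference pipeline stay unchanged, which matches the property described in the text.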
Our contributions in this paper are threefold:
• We are the first to show that DML models suffer from background bias, depending on the dataset.
• We apply a simple but effective method to alleviate background bias in DML for item retrieval that does
  not require additional labeling work, model changes, or increases in inference time.
• We compare and analyze models trained with and without BGAugment qualitatively and quantitatively using input
  attribution methods and propose a new metric that quantifies the focus of a model on the foreground.
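
The test setting behind the first contribution, measuring how much retrieval performance drops once backgrounds are replaced, can be sketched as follows. The choice of Recall@1 as the retrieval measure, the separate query/database split, and the names `recall_at_1` and `background_drop` are assumptions made for illustration; this section does not specify them.

```python
import numpy as np

def recall_at_1(query_embs, query_labels, db_embs, db_labels):
    """Fraction of queries whose nearest database item has the same label."""
    # Pairwise Euclidean distances, shape (num_queries, num_db).
    dists = np.linalg.norm(query_embs[:, None, :] - db_embs[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return float(np.mean(db_labels[nearest] == query_labels))

def background_drop(orig_embs, swapped_embs, labels, db_embs, db_labels):
    """Performance drop when query backgrounds are replaced (larger = more bias)."""
    original = recall_at_1(orig_embs, labels, db_embs, db_labels)
    swapped = recall_at_1(swapped_embs, labels, db_embs, db_labels)
    return original - swapped

# Toy embeddings (made-up values): two database items, two queries.
db_embs = np.array([[0.0, 0.0], [10.0, 10.0]])
db_labels = np.array([0, 1])
labels = np.array([0, 1])
orig_embs = np.array([[0.1, 0.0], [9.9, 10.0]])    # unmodified query images
swapped_embs = np.array([[0.1, 0.0], [1.0, 1.0]])  # same items, new backgrounds

drop = background_drop(orig_embs, swapped_embs, labels, db_embs, db_labels)
```

In this toy case the second query moves toward the wrong database item after the background swap, so half of the retrieval performance is lost.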
2. RELATED WORK
In recent years, a large corpus of literature has investigated background bias in classification neural networks.
They find that neural networks often use indicators from the background of images, such as the environment,
Our code is available at https://github.com/LSX-UniWue/background-bias-in-dml