The Inventory is Dark and Full of Misinformation:
Understanding Ad Inventory Pooling in the Ad-Tech Supply Chain
Yash Vekaria
University of California, Davis
Rishab Nithyanand
University of Iowa
Zubair Shafiq
University of California, Davis
Abstract—Ad-tech enables publishers to programmatically sell
their ad inventory to millions of demand partners through a
complex supply chain. The complexity and opacity of the ad-
tech supply chain can be exploited by low-quality publishers
(e.g., misinformation websites) to deceptively monetize their ad
inventory. To combat such deception, the ad-tech industry has
developed transparency standards and brand safety products.
In this paper, we show that these developments still fall short
of preventing deceptive monetization. Specifically, we focus on
how publishers can exploit the ad-tech supply chain, subvert
ad-tech transparency standards, and undermine brand safety
protections by pooling their ad inventory with unrelated sites.
This type of deception is referred to as “dark pooling.” Our
study shows that dark pooling is commonly employed by
misinformation publishers on various major ad exchanges, and
allows misinformation publishers to deceptively sell their ad
inventory to reputable brands. Our work suggests the need for
improved vetting of ad exchange supply partners, the adoption
of new ad-tech transparency standards that enable end-to-end
validation of the ad-tech supply chain, and the widespread
deployment of independent audits like ours.
1. Introduction
The complexity of online advertising lends itself to fraud.
A key to the success of online advertising is the ability
of advertisers and publishers to programmatically buy and
sell ad inventory across hundreds of millions of websites
in real-time [1]. Notably, Real-Time Bidding (RTB) allows
publishers to list their ad inventory for auction at an ad
exchange [2]. The ad exchange then asks its demand partners
to bid on the ad inventory listed by its supply partners, based
on the associated contextual and behavioral information.
The ad-tech supply chain is complex because it relies on
hundreds of specialized entities to effectively buy and sell
the ad inventory in real-time and at scale [3]. Adding to this
complexity, each ad impression often gets sold and resold
through multiple parallel or waterfall auctions [4]. Such
scale and complexity, combined with the opaque nature of
the ad-tech supply chain, make it a ripe target for fraud and
abuse [5]–[13]. One of the most common types of ad fraud
involves creating low-quality websites and monetizing their
ad inventory. Fraudsters attempt to drive large volumes of
traffic to their website through various illicit means such as
bots, underground marketplaces, traffic exchanges, or even
by driving legitimate traffic through clickbait and viral propaganda
[14]–[16]. A notable example that motivated our work
is that of the “Macedonian fake news complex” [17]–[19].
In this scheme, fraudsters created misinformation websites
with misleading and clickbait headlines, aiming to go viral
on social media, which led to tens of millions of monetized
ad impressions.
Advertisers are invested in preventing fraud. Ad-tech has
safeguards to protect against this type of ad fraud by blocking
the ad inventory of low-quality websites even when the
ad impressions might be from legitimate users. Specifically,
brand safety features supported by demand-side platforms
aim to allow advertisers to block ad inventory of web pages
that contain hardcore violence, hate speech, pornography,
or other types of potentially objectionable content [20].
All the effort of fraudsters would be wasted if they were
unable to monetize their ad inventory through programmatic
advertising due to these brand safety features. Fraudsters
are therefore known to exploit the opaque nature of the complex
ad-tech supply chain to undermine brand safety protections
by misrepresenting their ad inventory [21]. For example,
in domain spoofing [22], low-quality publishers mimic the
URLs of reputable publishers in their ad inventory, thus
deceiving reputable brands into purchasing their ad space
even when their original domain is blocked due to brand
safety concerns [23]–[25]. To combat ad fraud resulting
from misrepresented ad inventory, the Interactive Advertising
Bureau (IAB) introduced two transparency standards.
ads.txt [26] requires publishers to disclose all authorized
sellers of their ad inventory. sellers.json [27] requires
ad exchanges to disclose all publishers and intermediate
sellers involved in selling the ad inventory. Together, when
correctly implemented, these standards can reduce ad fraud
by enabling buyers to verify the sources of the inventory
they are purchasing.
Transparency mechanisms to prevent fraud are falling
short. There is increasing concern that the ads.txt and
sellers.json standards are either not widely adopted,
implemented in ways that do not facilitate effective supply-
chain validation, or intentionally subverted by malicious
actors in a variety of ways. In this paper, we empirically
investigate these concerns. We find that the ads.txt and
sellers.json disclosures are plagued by a large number
of compliance issues and misrepresentations. Most notably,
we find extensive evidence of “pooling” of ad inventory
arXiv:2210.06654v3 [cs.CR] 14 Oct 2023
from unrelated websites — a practice known in the industry
as “dark pooling.” This makes it impossible for a buyer
to reliably identify the sources of the ad inventory (i.e.,
where their ad will ultimately be placed). Dark pooling
effectively enables low-quality publishers to “launder” their
ad inventory, making it indistinguishable from that of well-
reputed publishers. To gain insight into how low-quality
publishers might circumvent the transparency required by
the ads.txt and sellers.json standards, we selected
a set of well-known misinformation websites as a case
study. This choice is motivated by the known instances
where ads from reputable brands have inadvertently ended
up on such websites in the past [28]–[33]. Focusing on these
misinformation websites, we confirm: (1) their widespread
failure to comply with the ads.txt and sellers.json
standards; and (2) widespread prevalence of ad inventory
pooling. We also find instances of reputable brands buying
ad impressions on these misinformation websites, perhaps
unintentionally. Taken together, we make three key contributions.
Measuring compliance with the transparency standards of
ads.txt and sellers.json. We study a set of control
and well-known misinformation websites to compare their
compliance with ads.txt and sellers.json. We find
that although compliance issues are widespread even in the
control set of websites, they are significantly more prevalent
on misinformation websites.
Measuring the prevalence of (dark) pooling. We measure
the prevalence of ad inventory pooling among our control
and misinformation websites and find it to be high. By analyzing
the ads.txt and sellers.json files, we identified nearly 80 thousand
instances of pooling. We find that pools containing a
misinformation website are significantly more likely (more than
twice as likely) to pool ad inventory from unrelated websites
than pools that do not contain a
misinformation website. Upon further analysis of ad-related
metadata in network traffic, we confirmed the use of 297
pools across 38 ad exchanges by misinformation websites.
Measuring the (in)effectiveness of brand safety tools. We
find ads from 55 reputable brands, including Forbes, GoDaddy,
Harvard, Intel, Microsoft, Nike, Samsung, Tumblr,
Yahoo!, Verizon, and Wayfair, on misinformation websites.
We investigate the correlation between the prevalence of
pooling and ads from reputable brands on misinformation
websites. We find that misinformation websites that are part
of at least one dark pool are nearly 20% more likely to
attract ads from reputable brands than those that are not part
of a dark pool. The responses to our disclosures indicate
that reputable brands are generally unaware of their ads
appearing on misinformation websites despite several using
a brand safety service.
While there is some anecdotal evidence of a general
lack of compliance with the ad-tech transparency standards
and of dark pooling [34], [35], these reports do not systematically study
the issues at scale. To the best of our knowledge, our
work is the first to systematically study compliance with ad
transparency standards and (dark) pooling at scale.
[Figure 1 diagram: Advertisers, DSP, Ad exchange, SSP, Publisher, User; labels: sellers.json, ads.txt; steps 1 to 4]
Figure 1: Programmatic advertising ecosystem: When a user
visits a publisher website (Step 1), the publisher puts its
ad-inventory for sale on ad exchanges via SSPs in real-time
(Step 2). Advertisers bid for these slots via DSPs
(Step 3). Advertisement of the winning bid is displayed
to the user on the publisher website (Step 4). To mitigate
fraud, advertisers use sellers.json of ad exchanges and
ads.txt of publishers to verify who is and who is not an
authorized seller of a given inventory.
2. Background
In this section, we provide a high-level overview of the
mechanisms behind the supply of programmatic ads (§2.1)
and the vulnerabilities in the ad supply chain (§2.2).
2.1. Programmatic advertising
Although there are a variety of mechanisms for programmatic
advertising (e.g., real-time bidding, header bidding,
exchange bidding) and the participating organizations might
differ, the types of entities involved in the supply chain
remain the same for each mechanism.
The programmatic advertising supply chain. Programmatic
advertising is made possible by the following entities
illustrated in Figure 1: supply-side platforms (SSPs) for publishers
to list their ad inventory in real-time, ad exchanges
(AdX) which aggregate the inventory of multiple SSPs and
facilitate bidding on individual ad slots, and demand-side
platforms (DSPs) which allow advertisers and brands to
identify targets for their ad creatives by suitably bidding
on the inventory listed at ad exchanges. These entities work
together to create a supply chain for ads as follows: When
a user visits a publisher, the ad inventory associated with
that visit is put up for auction at an AdX by the SSP. DSPs,
operating on behalf of advertisers and brands, then make
bids on the ad inventory available at the AdX. These bids
are informed by what is known (to the DSP) about the user
and the publisher. The winner of the auction is then notified
by the AdX and the associated ad creative is used to fill the
ad slot on the publisher’s website.
Transparency in the supply chain. Crucial to the operation
of the ad supply chain is that the participating organizations
can trust that publishers and AdXs are not misrepresenting
their inventories or their relationships with other entities. For
example, DSPs need to confirm that the ad inventory that
they are bidding on is actually associated with a particular
publisher. Similarly, DSPs also need to confirm that the
AdXs that they are purchasing ad inventory from are actually
authorized to (re)sell that inventory. The absence of trust in
this supply chain can lead to situations where DSPs place
premium bids for ad slots that are actually associated with
non-premium publishers, ultimately leading to a brand’s
ad creative appearing on websites that it may not want
to be associated with. To foster trust and enable DSPs
to perform basic verification of
the ad inventory, the Interactive Advertising Bureau (IAB)
introduced two standards: ads.txt and sellers.json.
The ads.txt standard. The ads.txt standard¹
(introduced in 2017) aims to address ad inventory
fraud by requiring each publisher domain to maintain
an ads.txt file at the root level directory (e.g.,
publisher.example/ads.txt). The ads.txt file is
supposed to contain entries for all AdXs that are authorized
to sell or resell the ad inventory of the publisher. Each entry
in the ads.txt file contains the following fields:
• the authorized AdX,
• the publisher ID assigned to the publisher domain within
  the AdX network, and
• the authorized relationship between the publisher and the
  authorized AdX, i.e., whether the AdX is authorized
  as a DIRECT seller or RESELLER of inventory for the
  domain.
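To make the three fields concrete, the following is a minimal sketch of parsing an ads.txt file into entries. The sample file contents, the domain names, and the names `AdsTxtEntry` and `parse_ads_txt` are illustrative assumptions; the sketch reads only the three fields described above and ignores anything else on a line.

```python
from typing import NamedTuple


class AdsTxtEntry(NamedTuple):
    adx_domain: str    # the authorized AdX
    seller_id: str     # publisher ID assigned by that AdX
    relationship: str  # "DIRECT" or "RESELLER"


def parse_ads_txt(text: str) -> list[AdsTxtEntry]:
    """Parse comma-separated ads.txt records, skipping comments and malformed lines."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # not a seller record (or malformed)
        relationship = fields[2].upper()
        if relationship not in ("DIRECT", "RESELLER"):
            continue
        entries.append(AdsTxtEntry(fields[0].lower(), fields[1], relationship))
    return entries


# Hypothetical ads.txt served at publisher.example/ads.txt
SAMPLE_ADS_TXT = """\
# hypothetical file for publisher.example
adx.example, 12345, DIRECT
reseller.example, abc-678, RESELLER
"""

for entry in parse_ads_txt(SAMPLE_ADS_TXT):
    print(entry)
```

Keeping the entries as plain tuples makes them easy to compare across domains, which is what the pooling analysis later relies on.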
How ads.txt helps prevent fraud. When an ad request is
sent by a publisher to an AdX (which issues bid requests
to DSPs), the request contains the publisher ID and the
domain associated with the inventory being listed. Importantly,
because publisher IDs are typically associated with
an organization and not a domain, it is possible for multiple
domains to share the same publisher ID. ads.txt enables
verification that a website is not spoofing the domain in its
ad requests. More specifically, ads.txt allows:
• AdXs to verify that the publisher ID in the ad request
  matches the publisher ID associated with the domain in
  the ad request, and
• DSPs to verify that the AdX claiming to (re)sell the
  inventory of a domain is authorized by the domain to
  do so.
Before the ads.txt standard, there were no mechanisms
to facilitate such checks and the sale of fraudulent inventory
was widespread [21].
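The two checks described above reduce to a lookup in the ads.txt entries of the domain named in the ad request. The sketch below assumes those entries have already been fetched and parsed into a dictionary; `ADS_TXT`, `is_authorized`, and all domain names and IDs are hypothetical.

```python
# Hypothetical, already-parsed ads.txt data:
# domain -> set of (AdX, seller ID, relationship) triples.
ADS_TXT = {
    "publisher.example": {
        ("adx.example", "12345", "DIRECT"),
        ("reseller.example", "abc-678", "RESELLER"),
    },
}


def is_authorized(domain: str, adx: str, seller_id: str, ads_txt=ADS_TXT) -> bool:
    """True if the domain's ads.txt lists this seller ID for this AdX."""
    return any(
        listed_adx == adx and listed_id == seller_id
        for listed_adx, listed_id, _relationship in ads_txt.get(domain, set())
    )


# An ad request claiming publisher.example with an ID that the domain did not
# publish is rejected, which is how a spoofed domain would be caught.
print(is_authorized("publisher.example", "adx.example", "12345"))
print(is_authorized("publisher.example", "adx.example", "99999"))
```

Note that this check says nothing about whether other, unrelated domains list the same seller ID, which is the gap that pooling exploits.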
The sellers.json standard. Similar to the
ads.txt standard, sellers.json aims to mitigate
ad inventory fraud and misrepresentation. The
sellers.json standard² requires each AdX and SSP
to maintain a sellers.json file at the root level
directory (e.g., adx.example/sellers.json).³ This
sellers.json file must contain an entry for each entity
that may be paid for inventory purchased through the AdX,
i.e., one entry for each partner that is an inventory
source for the AdX. Each entry in the sellers.json
file contains the following fields:
• the seller type, which indicates whether the entry is associated
  with a PUBLISHER, an INTERMEDIARY (i.e., an
  inventory reseller AdX), or BOTH (i.e., the entity has
  its own inventory and also resells other inventory);
• the seller ID associated with the inventory source (the same
  as the publisher ID in ads.txt if the entry is associated
  with a publisher; from this point onwards, we refer
  to both the seller ID and the publisher ID as the seller ID); and
• the name and domain associated with the seller ID (these
  fields may be marked as “confidential” by AdXs to
  protect the privacy of publishers).
1. “ads” in ads.txt stands for Authorized Digital Sellers. The full
specification of the ads.txt standard is available at:
https://iabtechlab.com/wp-content/uploads/2021/03/ads.txt-1.0.3.pdf
2. The full specification of the sellers.json standard is available at:
https://iabtechlab.com/wp-content/uploads/2019/07/Sellers.json_Final.pdf
3. We observed that several AdXs, including Google, use non-standard
paths, e.g., Google’s sellers.json is located at
https://storage.googleapis.com/adx-rtb-dictionaries/sellers.json
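The sellers.json entry fields listed above can be illustrated with a small sketch that loads a file and indexes it by seller ID. The JSON key names (`sellers`, `seller_id`, `seller_type`, `name`, `domain`, `is_confidential`) follow the IAB specification as we understand it; the sample contents and the function name `index_sellers` are assumptions made for illustration.

```python
import json

# Hypothetical sellers.json contents for adx.example.
SAMPLE_SELLERS_JSON = """
{
  "sellers": [
    {"seller_id": "12345", "seller_type": "PUBLISHER",
     "name": "Publisher Inc.", "domain": "publisher.example"},
    {"seller_id": "678", "seller_type": "INTERMEDIARY",
     "name": "Reseller LLC", "domain": "reseller.example"},
    {"seller_id": "999", "seller_type": "BOTH", "is_confidential": 1}
  ]
}
"""


def index_sellers(sellers_json_text: str) -> dict[str, dict]:
    """Map each seller ID to its sellers.json entry."""
    data = json.loads(sellers_json_text)
    return {entry["seller_id"]: entry for entry in data.get("sellers", [])}


sellers = index_sellers(SAMPLE_SELLERS_JSON)
for seller_id, entry in sellers.items():
    # Confidential entries omit the name and domain, so use .get().
    print(seller_id, entry["seller_type"], entry.get("domain"))
```

The third entry shows why confidentiality matters for verification: a buyer can see that seller ID 999 exists but cannot tell which organization or domain it belongs to.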
How sellers.json helps prevent fraud. When a bid
request is received by a DSP from an AdX that is compliant
with the sellers.json standard, it must contain information
about the provenance of the inventory in a Supply
Chain Object (SCO).⁴ At a high level, the sellers.json
file provides a mechanism for DSPs to identify and verify
all the entities listed in this SCO. This is done as follows:
• When a bid request is received by the DSP, it should use
  the AdX’s sellers.json file to verify that the final
  AdX has an authorized relationship with the prior holder
  (an SSP or another AdX) of the inventory.
• The previous step is applied recursively (on all intermediate
  neighbors in the SCO) to verify the end-to-end
  authenticity of the inventory.
• The DSP then uses the sellers.json files of all
  intermediaries and the ads.txt file of the publisher to
  verify that the publisher is legitimate and that the (re)sellers who
  handle the publisher’s inventory are authorized to do so.
This capability for end-to-end validation of the SCO
allows DSPs to identify instances where the
ad inventory originates from low-quality publishers using
fraudulent ads.txt files or is being sold by malicious
intermediaries.
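The validation steps above can be sketched as a walk over the supply chain from the final AdX back to the publisher. This is a simplified illustration, not the IAB-specified procedure: the SCO is reduced to a list of (system domain, seller ID) pairs, the "authorized relationship with the prior holder" is approximated by comparing the `domain` field of the sellers.json entry with the prior holder's domain, and all data and names (`SELLERS`, `ADS_TXT`, `validate_chain`) are hypothetical.

```python
# Hypothetical sellers.json data: system domain -> {seller ID -> entry}.
SELLERS = {
    "adx.example": {"678": {"seller_type": "INTERMEDIARY", "domain": "ssp.example"}},
    "ssp.example": {"12345": {"seller_type": "PUBLISHER", "domain": "publisher.example"}},
}

# Hypothetical ads.txt data: publisher domain -> set of (system domain, seller ID).
ADS_TXT = {
    "publisher.example": {("ssp.example", "12345"), ("adx.example", "678")},
}


def validate_chain(publisher_domain, nodes, sellers=SELLERS, ads_txt=ADS_TXT):
    """nodes: (system domain, seller ID) pairs ordered from the publisher side
    to the final AdX. Returns True only if every hop checks out."""
    authorized = ads_txt.get(publisher_domain, set())
    # Start at the final AdX and move back toward the publisher.
    for i in range(len(nodes) - 1, -1, -1):
        system, seller_id = nodes[i]
        entry = sellers.get(system, {}).get(seller_id)
        if entry is None:
            return False  # seller ID is not declared in this system's sellers.json
        prior_holder = nodes[i - 1][0] if i > 0 else publisher_domain
        if entry.get("domain") != prior_holder:
            return False  # declared seller is not the prior holder of the inventory
        if (system, seller_id) not in authorized:
            return False  # the publisher's ads.txt does not authorize this seller
    return True


chain = [("ssp.example", "12345"), ("adx.example", "678")]
print(validate_chain("publisher.example", chain))
```

Under this simplification, a confidential sellers.json entry (no `domain`) fails the walk, which mirrors why confidential or missing entries make end-to-end validation impractical.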
2.2. Supply chain vulnerabilities
Despite the introduction of the ads.txt and
sellers.json standards, there remain various vulnerabilities
in the ad inventory supply chain. Our investigation
focuses on the vulnerabilities that enable low-quality
publishers to monetize their ad inventory by misrepresenting
or obscuring its source. Some of these vulnerabilities
arise from misrepresentations in the ads.txt and
sellers.json files, while others arise from pooling their
low-quality inventory with the inventory of unrelated high-quality
publishers. We refer to the former as inventory
misrepresentation and the latter as dark pooling.
4. Supply Chain Object (SCO) contains an ordered list of all the entities
involved in the ad transaction (e.g., publisher → SSP → reseller → AdX).
Inventory misrepresentation. Inventory misrepresentation
arises from misrepresentations of ad inventory by publishers.
It can be identified by discrepancies in the publisher’s
ads.txt file and is possible when DSPs and AdXs do
not follow the ads.txt and sellers.json standards.
Some examples of these misrepresentations include:
• a publisher’s ads.txt file might incorrectly use seller
  IDs of other publishers to suggest an authorized relationship
  with an AdX to boost the perception of its inventory.
  (Misrepresentations #1 and #2)
• a publisher’s ads.txt file might incorrectly indicate
  that a popular AdX is an authorized (re)seller of its
  inventory to boost its reputation with other AdXs.
  (Misrepresentation #3)
• a publisher’s ads.txt file might have more than
  one entry of the same seller type for an AdX, or
  sellers.json files might associate a seller ID with
  multiple publishers or sellers, making ads.txt and
  sellers.json verification unreliable. (Misrepresentations
  #4 and #9)
• a publisher’s ads.txt file might list authorized relationships
  with (re)sellers that do not have sellers.json
  files, making end-to-end verification impossible.
  (Misrepresentation #8)
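Two of the discrepancies listed above lend themselves to simple mechanical checks. The sketch below is an illustration under assumed inputs, not the detection pipeline used in this study: it flags repeated entries of the same relationship type for one AdX and flags listed (re)sellers for which no sellers.json file is known. The function names and sample data are hypothetical.

```python
from collections import Counter


def duplicate_relationships(entries):
    """entries: iterable of (adx, seller_id, relationship) from one ads.txt file.
    Returns the (adx, relationship) pairs that appear more than once."""
    counts = Counter((adx, relationship) for adx, _seller_id, relationship in entries)
    return sorted(pair for pair, count in counts.items() if count > 1)


def sellers_missing_json(entries, adx_with_sellers_json):
    """Returns the listed AdXs for which no sellers.json file is known."""
    return sorted({adx for adx, _seller_id, _relationship in entries
                   if adx not in adx_with_sellers_json})


# Hypothetical ads.txt entries for a single publisher.
ENTRIES = [
    ("adx.example", "1", "DIRECT"),
    ("adx.example", "2", "DIRECT"),      # second DIRECT entry for the same AdX
    ("ssp.example", "9", "RESELLER"),    # ssp.example publishes no sellers.json
]

print(duplicate_relationships(ENTRIES))
print(sellers_missing_json(ENTRIES, adx_with_sellers_json={"adx.example"}))
```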
Dark pooling. Pooling is a common strategy to share
resources in online advertising. Consider, for example, the
case where two or more publishers are owned by the same
parent organization. In such scenarios, the ability to share
advertising infrastructure and AdX accounts allows for more
efficient operation and management. One way to identify
the occurrence of pooling is by noting a single AdX-
issued ‘seller ID’ shared by multiple publisher websites.
Dark pools are pools in which seller IDs are shared by
organizationally-unrelated publishers (possibly of differing
reputation). Note that “dark pooling” is a term of art that
is commonly used in industry. While pooling is not itself a
“dark” practice, pooling seller IDs of unrelated publishers
is considered a “dark” practice because it deceives potential
buyers about the actual source of the ad inventory [34], [35].
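The identification rule described above (a single AdX-issued seller ID shared by multiple publisher websites) can be sketched as a grouping over ads.txt entries. Deciding whether a pool is "dark" additionally requires knowing which organization owns each domain; the `owner_of` mapping below is a hypothetical stand-in for that information, and all domains and IDs are invented for illustration.

```python
from collections import defaultdict


def find_pools(ads_txt_by_domain):
    """ads_txt_by_domain: {domain: iterable of (adx, seller_id)}.
    Returns {(adx, seller_id): set of domains} for IDs listed by 2+ domains."""
    domains_by_id = defaultdict(set)
    for domain, entries in ads_txt_by_domain.items():
        for adx, seller_id in entries:
            domains_by_id[(adx, seller_id)].add(domain)
    return {key: domains for key, domains in domains_by_id.items() if len(domains) > 1}


def is_dark(pool_domains, owner_of):
    """A pool is treated as dark if its domains span more than one organization."""
    return len({owner_of[domain] for domain in pool_domains}) > 1


# Hypothetical crawl results and ownership data.
ADS_TXT_BY_DOMAIN = {
    "news-a.example": [("adx.example", "111")],
    "news-b.example": [("adx.example", "111"), ("adx.example", "222")],
    "unrelated.example": [("adx.example", "111")],
}
OWNER_OF = {
    "news-a.example": "OrgA",
    "news-b.example": "OrgA",
    "unrelated.example": "OrgB",
}

for key, domains in find_pools(ADS_TXT_BY_DOMAIN).items():
    print(key, sorted(domains), "dark" if is_dark(domains, OWNER_OF) else "not dark")
```

The grouping step is mechanical; the ownership mapping is the hard part in practice, since organizational relationships between domains are rarely published.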
The seller ID defined in the ads.txt and
sellers.json standards is also defined in the RTB
protocol [36], [37]. The payment after successful
completion of an RTB auction is made to the publisher
(i.e., the seller) associated with the seller ID [38]. Hence,
simply using another domain’s
seller ID in ad requests from a website will result in any
ad-related payments being made to the owner of the seller
ID. Therefore, for revenue sharing, the creation of these
pools needs to be facilitated either through intermediaries
(e.g., SSPs) or by collaboration between publishers.
End-to-end validation of pooled supply chains. Pooling
leads to a breakdown of any brand’s or DSP’s ability to
perform end-to-end verification of the ad inventory supply
chain. Specifically, the final step of verification highlighted
in §2.1 cannot be meaningfully completed unless all domains
associated with a publisher’s account are publicly
known (and unfortunately, this is not the case). This is
because the end-to-end verification of the ad inventory
supply chain, as specified by the IAB, implicitly relies on
trust that seller IDs are actually associated with specific
organizations and that these associations are verified by
AdXs. We illustrate this with an example.
• Consider a publisher website sportsnews.example
  which has a legitimate subsidiary: nbanews.example.
  The publisher registers for an account with a popular
  AdX (adx) and is issued the seller ID sellerid after
  being vetted by adx. It is expected that this website
  can now share this seller ID with its subsidiaries. Both
  websites will now list adx as a DIRECT seller through
  the sellerid account in their ads.txt files.
• The publisher now decides to share its adx-issued
  seller ID with fakesportsnews.example, another
  sports news website but of low quality, for
  a cut of the revenue generated from ads shown on
  fakesportsnews.example. In its ads.txt file,
  fakesportsnews.example now adds adx as a
  DIRECT seller and also lists sellerid as its seller ID.
  Note that fakesportsnews.example would otherwise
  be unable to get directly listed on adx and monetize
  its ad inventory due to its low quality.
• When an ad request for some inventory is sent from
  fakesportsnews.example, all basic supply chain
  validation checks are successful because the seller ID
  sellerid is in fact registered by adx in their
  sellers.json file. Any bidding DSP will therefore
  operate under the assumption that the website receiving
  their ads has been vetted by adx and is associated with
  sportsnews.example.
• Complications only arise if the verifier notices that
  sellerid was only registered to the owner of
  sportsnews.example and the bid request actually
  originated at fakesportsnews.example. However,
  invalidating the bid request simply because of
  this inconsistency will mean that even legitimate subsidiaries
  such as nbanews.example cannot pool their
  inventory. Instead, additional checks are required to
  ascertain whether fakesportsnews.example and
  sportsnews.example are related or whether adx
  vetted fakesportsnews.example as well. This issue
  remains unaddressed by current validation mechanisms.
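The example can be replayed in a few lines to show why neither the basic check nor a naive stricter check resolves the problem. The data structures and the function names `basic_check` and `strict_domain_match` are illustrative assumptions; the domains, the AdX name adx, and the seller ID sellerid are taken from the example.

```python
# sellers.json of adx: sellerid is registered to the owner of sportsnews.example.
ADX_SELLERS_JSON = {
    "sellerid": {"seller_type": "PUBLISHER", "domain": "sportsnews.example"},
}

# ads.txt files of the three websites in the example.
ADS_TXT = {
    "sportsnews.example": {("adx", "sellerid", "DIRECT")},
    "nbanews.example": {("adx", "sellerid", "DIRECT")},
    "fakesportsnews.example": {("adx", "sellerid", "DIRECT")},
}


def basic_check(domain, seller_id="sellerid"):
    """Seller ID is declared by adx and listed in the domain's ads.txt."""
    declared_by_adx = seller_id in ADX_SELLERS_JSON
    listed_by_domain = ("adx", seller_id, "DIRECT") in ADS_TXT.get(domain, set())
    return declared_by_adx and listed_by_domain


def strict_domain_match(domain, seller_id="sellerid"):
    """Additionally require that sellers.json names this exact domain."""
    return basic_check(domain, seller_id) and ADX_SELLERS_JSON[seller_id]["domain"] == domain


for site in ADS_TXT:
    print(site, basic_check(site), strict_domain_match(site))
```

The basic check accepts all three websites, including fakesportsnews.example, while the strict check rejects the legitimate subsidiary nbanews.example along with it. Distinguishing the two would require ownership or vetting information that neither file provides.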
Caveat. The example described assumes collaboration
between publishers (sportsnews.example and
fakesportsnews.example). However, such pooling might be inadvertent
in some cases, e.g., if sportsnews.example and
fakesportsnews.example are both assigned the same
seller ID through a common intermediary (for example, an SSP,
as shown in Figure 2).
In sum, by pooling various unrelated websites under a
single seller ID, low-quality publishers can “launder” their
ad inventory, rendering it indistinguishable from the inventory
of high-quality publishers. Moreover, this can occur
when an AdX provides the seller ID to a trusted publisher
(or an SSP), which then inadequately vets the low-quality
publishers whose inventory it pools. Figure 2 illustrates this
scenario of syndication-based pooling by some intermediary
SSP. As we show later, such pooling is common. In fact, we