

The Inventory is Dark and Full of Misinformation:

Understanding Ad Inventory Pooling in the Ad-Tech Supply Chain

Yash Vekaria

University of California, Davis

Rishab Nithyanand

University of Iowa

Zubair Shafiq

University of California, Davis

Abstract: Ad-tech enables publishers to programmatically sell their ad inventory to millions of demand partners through a complex supply chain. The complexity and opacity of the ad-tech supply chain can be exploited by low-quality publishers (e.g., misinformation websites) to deceptively monetize their ad inventory. To combat such deception, the ad-tech industry has developed transparency standards and brand safety products. In this paper, we show that these developments still fall short of preventing deceptive monetization. Specifically, we focus on how publishers can exploit the ad-tech supply chain, subvert ad-tech transparency standards, and undermine brand safety protections by pooling their ad inventory with unrelated sites. This type of deception is referred to as “dark pooling.” Our study shows that dark pooling is commonly employed by misinformation publishers on various major ad exchanges, and allows misinformation publishers to deceptively sell their ad inventory to reputable brands. Our work suggests the need for improved vetting of ad exchange supply partners, the adoption of new ad-tech transparency standards that enable end-to-end validation of the ad-tech supply chain, and the widespread deployment of independent audits like ours.

1. Introduction

The complexity of online advertising lends itself to fraud. A key to the success of online advertising is the ability of advertisers and publishers to programmatically buy and sell ad inventory across hundreds of millions of websites in real time [1]. Notably, Real-Time Bidding (RTB) allows publishers to list their ad inventory for auction at an ad exchange [2]. The ad exchange then asks its demand partners to bid on the ad inventory listed by its supply partners, based on the associated contextual and behavioral information. The ad-tech supply chain is complex because it relies on hundreds of specialized entities to effectively buy and sell the ad inventory in real time and at scale [3]. Adding to this complexity, each ad impression often gets sold and resold through multiple parallel or waterfall auctions [4]. Such scale and complexity, combined with the opaque nature of the ad-tech supply chain, make it a ripe target for fraud and abuse [5]–[13]. One of the most common types of ad fraud involves creating low-quality websites and monetizing their ad inventory. Fraudsters attempt to drive large volumes of traffic to their websites through various illicit means such as bots, underground marketplaces, and traffic exchanges, or even by driving legitimate traffic through click-bait and viral propaganda [14]–[16]. A notable example that motivated our work is that of the “Macedonian fake news complex” [17]–[19]. In this scheme, fraudsters created misinformation websites with misleading and clickbait headlines, aiming to go viral on social media, which led to tens of millions of monetized ad impressions.

Advertisers are invested in preventing fraud. Ad-tech has safeguards to protect against this type of ad fraud by blocking the ad inventory of low-quality websites even when the ad impressions might be from legitimate users. Specifically, brand safety features supported by demand-side platforms aim to allow advertisers to block ad inventory of web pages that contain hardcore violence, hate speech, pornography, or other types of potentially objectionable content [20]. All the effort of fraudsters would be wasted if they were unable to monetize their ad inventory through programmatic advertising due to these brand safety features. Fraudsters are therefore known to exploit the opaque nature of the complex ad-tech supply chain to undermine brand safety protections by misrepresenting their ad inventory [21]. For example, in domain spoofing [22], low-quality publishers mimic the URLs of reputable publishers in their ad inventory, thus deceiving reputable brands into purchasing their ad space even when their original domain is blocked due to brand safety concerns [23]–[25]. To combat ad fraud resulting from misrepresented ad inventory, the Interactive Advertising Bureau (IAB) introduced two transparency standards. ads.txt [26] requires publishers to disclose all authorized sellers of their ad inventory. sellers.json [27] requires ad exchanges to disclose all publishers and intermediate sellers involved in selling the ad inventory. Together, when correctly implemented, these standards can reduce ad fraud by enabling buyers to verify the sources of the inventory they are purchasing.

arXiv:2210.06654v3 [cs.CR] 14 Oct 2023

Transparency mechanisms to prevent fraud are falling short. There is increasing concern that the ads.txt and sellers.json standards are either not widely adopted, implemented in ways that do not facilitate effective supply-chain validation, or intentionally subverted by malicious actors in a variety of ways. In this paper, we empirically investigate these concerns. We find that the ads.txt and sellers.json disclosures are plagued by a large number of compliance issues and misrepresentations. Most notably, we find extensive evidence of “pooling” of ad inventory from unrelated websites, a practice known in the industry as “dark pooling.” This makes it impossible for a buyer to reliably identify the sources of the ad inventory (i.e., where their ad will ultimately be placed). Dark pooling effectively enables low-quality publishers to “launder” their ad inventory, making it indistinguishable from that of well-reputed publishers. To gain insight into how low-quality publishers might circumvent the transparency required by the ads.txt and sellers.json standards, we selected a set of well-known misinformation websites as a case study. This choice is motivated by known instances where ads from reputable brands have inadvertently ended up on such websites in the past [28]–[33]. Focusing on these misinformation websites, we confirm: (1) their widespread failure to comply with the ads.txt and sellers.json standards; and (2) the high prevalence of ad inventory pooling. We also find instances of reputable brands buying ad impressions on these misinformation websites, perhaps unintentionally. Taken together, we make three key contributions.

Measuring compliance with the transparency standards of ads.txt and sellers.json. We study a set of control and well-known misinformation websites to compare their compliance with ads.txt and sellers.json. We find that although compliance issues are widespread even in the control set of websites, they are significantly more prevalent on misinformation websites.

Measuring the prevalence of (dark) pooling. We measure the high prevalence of ad inventory pooling by our control and misinformation websites. By analyzing the ads.txt and sellers.json files, we identified nearly 80 thousand instances of pooling. We find that pools containing a misinformation website are significantly more likely (more than twice as likely) to pool ad inventory from unrelated websites than pools that do not contain a misinformation website. Upon further analysis of ad-related metadata in network traffic, we confirmed the use of 297 pools across 38 ad exchanges by misinformation websites.

Measuring the (in)effectiveness of brand safety tools. We find ads from 55 reputable brands, including Forbes, GoDaddy, Harvard, Intel, Microsoft, Nike, Samsung, Tumblr, Yahoo!, Verizon, and Wayfair, on misinformation websites. We investigate the correlation between the prevalence of pooling and ads from reputable brands on misinformation websites. We find that misinformation websites that are part of at least one dark pool are nearly 20% more likely to attract ads from reputable brands than those that are not part of a dark pool. The responses to our disclosures indicate that reputable brands are generally unaware of their ads appearing on misinformation websites despite several using a brand safety service.

While there is some anecdotal evidence of a general lack of compliance with the ad-tech transparency standards and of dark pooling [34], [35], this evidence does not amount to a systematic study of these issues at scale. To the best of our knowledge, our work is the first to systematically study compliance with ad transparency standards and (dark) pooling at scale.

[Figure 1 diagram. Entities shown: Advertisers, DSP, SSP, Ad exchange, Publisher, User. Annotations: sellers.json, ads.txt.]

Figure 1: Programmatic advertising ecosystem: When a user visits a publisher website (Step ❶), the publisher puts its ad inventory up for sale on ad exchanges via SSPs in real time (Step ❷). Advertisers bid for these slots via DSPs (Step ❸). The advertisement of the winning bid is displayed to the user on the publisher website (Step ❹). To mitigate fraud, advertisers use the sellers.json of ad exchanges and the ads.txt of publishers to verify who is and who is not an authorized seller of a given inventory.

2. Background

In this section, we provide a high-level overview of the mechanisms behind the supply of programmatic ads (§2.1) and the vulnerabilities in the ad supply chain (§2.2).

2.1. Programmatic advertising

Although there are a variety of mechanisms for programmatic advertising (e.g., real-time bidding, header bidding, exchange bidding) and the participating organizations might differ, the types of entities involved in the supply chain remain the same for each mechanism.

The programmatic advertising supply chain. Programmatic advertising is made possible by the following entities illustrated in Figure 1: supply-side platforms (SSPs) for publishers to list their ad inventory in real time, ad exchanges (AdX) which aggregate the inventory of multiple SSPs and facilitate bidding on individual ad slots, and demand-side platforms (DSPs) which allow advertisers and brands to identify targets for their ad creatives by suitably bidding on the inventory listed at ad exchanges. These entities work together to create a supply chain for ads as follows: When a user visits a publisher, the ad inventory associated with that visit is put up for auction at an AdX by the SSP. DSPs, operating on behalf of advertisers and brands, then make bids on the ad inventory available at the AdX. These bids are informed by what is known (to the DSP) about the user and the publisher. The winner of the auction is then notified by the AdX and the associated ad creative is used to fill the ad slot on the publisher’s website.

Transparency in the supply chain. Crucial to the operation of the ad supply chain is that the participating organizations can trust that publishers and AdXs are not misrepresenting their inventories or their relationships with other entities. For example, DSPs need to confirm that the ad inventory that they are bidding on is actually associated with a particular publisher. Similarly, DSPs also need to confirm that the AdXs that they are purchasing ad inventory from are actually authorized to (re)sell that inventory. The absence of trust in this supply chain can lead to situations where DSPs place premium bids for ad slots that are actually associated with non-premium publishers, ultimately leading to a brand’s ad creative appearing on websites that it may not want to be associated with. To foster trust and enable DSPs to perform basic verification of the ad inventory, the Interactive Advertising Bureau (IAB) introduced two standards: ads.txt and sellers.json.

The ads.txt standard. The ads.txt¹ standard (introduced in 2017) aims to address ad inventory fraud by requiring each publisher domain to maintain an ads.txt file at the root level directory (e.g., publisher.example/ads.txt). The ads.txt file is supposed to contain entries for all AdXs that are authorized to sell or resell the ad inventory of the publisher. Each entry in the ads.txt file contains the following fields:

• the authorized AdX,
• the publisher ID assigned to the publisher domain within the AdX network, and
• the authorized relationship between the publisher and the authorized AdX, i.e., whether the AdX is authorized as a DIRECT seller or RESELLER of inventory for the domain.
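The three fields above can be read out of an ads.txt file with a few lines of Python. This is a minimal sketch, not the authors' tooling: it assumes the comma-separated record syntax and `#` comments of the IAB ads.txt specification, and the names `AdsTxtEntry`, `parse_ads_txt`, and the sample file contents are hypothetical.

```python
from typing import List, NamedTuple


class AdsTxtEntry(NamedTuple):
    adx_domain: str    # the authorized AdX
    seller_id: str     # publisher ID issued by that AdX
    relationship: str  # "DIRECT" or "RESELLER"


def parse_ads_txt(text: str) -> List[AdsTxtEntry]:
    """Extract (AdX, publisher ID, relationship) records from ads.txt text."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # variable declaration (e.g. contact=...) or malformed line
        relationship = fields[2].upper()
        if relationship not in ("DIRECT", "RESELLER"):
            continue  # unknown relationship type, skipped in this sketch
        entries.append(AdsTxtEntry(fields[0].lower(), fields[1], relationship))
    return entries


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# ads.txt for publisher.example
adx.example, 12345, DIRECT
reseller.example, abc-678, reseller  # relationship given in lower case
contact=ads@publisher.example
"""
```

Calling `parse_ads_txt(SAMPLE)` yields two records; the comment line and the `contact=` declaration are ignored.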

How ads.txt helps prevent fraud. When an ad request is sent by a publisher to an AdX (which issues bid requests to DSPs), the request contains the publisher ID and the domain associated with the inventory being listed. Importantly, because publisher IDs are typically associated with an organization and not a domain, it is possible for multiple domains to share the same publisher ID. ads.txt enables verification that a website is not spoofing the domain in its ad requests. More specifically, ads.txt allows:

• AdXs to verify that the publisher ID in the ad request matches the publisher ID associated with the domain in the ad request, and
• DSPs to verify that the AdX claiming to (re)sell the inventory of a domain is authorized by the domain to do so.

Before the ads.txt standard, there were no mechanisms to facilitate such checks and the sale of fraudulent inventory was widespread [21].
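Both checks reduce to the same lookup: does the ads.txt file of the domain named in the request list this AdX with this publisher ID? The sketch below assumes the ads.txt files have already been fetched and parsed into a dictionary; `ADS_TXT`, `authorized_relationship`, and all sample values are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical pre-parsed ads.txt data:
# domain -> {(AdX domain, publisher ID, relationship)}
ADS_TXT: Dict[str, Set[Tuple[str, str, str]]] = {
    "publisher.example": {("adx.example", "12345", "DIRECT")},
}


def authorized_relationship(domain: str, adx: str, publisher_id: str,
                            ads_txt: Dict[str, Set[Tuple[str, str, str]]] = ADS_TXT
                            ) -> Optional[str]:
    """Return DIRECT/RESELLER if the domain's ads.txt authorizes this
    (AdX, publisher ID) pair, or None if it does not."""
    for entry_adx, entry_id, relationship in ads_txt.get(domain, set()):
        if entry_adx == adx and entry_id == publisher_id:
            return relationship
    return None
```

A request that claims `publisher.example` but carries a publisher ID not listed in that domain's ads.txt returns `None`, which is the signal a verifier would treat as possible domain spoofing.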

The sellers.json standard. Similar to the ads.txt standard, sellers.json aims to mitigate ad inventory fraud and misrepresentation. The sellers.json standard² requires each AdX and SSP to maintain a sellers.json file at the root level directory (e.g., adx.example/sellers.json).³ This sellers.json file must contain an entry for each entity that may be paid for inventory purchased through the AdX, i.e., one entry for each partner that is an inventory source for the AdX. Each entry in the sellers.json file contains the following fields:

• the seller type, which indicates whether the entry is associated with a PUBLISHER, an INTERMEDIARY (i.e., inventory reseller AdX), or BOTH (i.e., this entity has its own inventory and also resells other inventory);
• the seller ID associated with the inventory source (same as the publisher ID in ads.txt if this entry is associated with a publisher; from this point onwards we will refer to both the seller ID and the publisher ID as the seller ID); and
• the name and domain associated with the seller ID (these fields may be marked as “confidential” by AdXs to protect the privacy of publishers).

1. “ads” in ads.txt stands for Authorized Digital Sellers. The full specification of the ads.txt standard is available at: https://iabtechlab.com/wp-content/uploads/2021/03/ads.txt-1.0.3.pdf
2. The full specification of the sellers.json standard is available at: https://iabtechlab.com/wp-content/uploads/2019/07/Sellers.json_Final.pdf
3. We observed that several AdXs, including Google, use non-standard paths, e.g., Google’s sellers.json is located at https://storage.googleapis.com/adx-rtb-dictionaries/sellers.json
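The sellers.json fields listed above map naturally onto a small lookup table keyed by seller ID. The following sketch assumes the JSON key names used in the IAB sellers.json specification (`sellers`, `seller_id`, `seller_type`, `name`, `domain`, `is_confidential`); the `Seller` type, the `index_sellers` helper, and the sample document are hypothetical.

```python
import json
from typing import Dict, NamedTuple, Optional


class Seller(NamedTuple):
    seller_id: str
    seller_type: str        # PUBLISHER, INTERMEDIARY, or BOTH
    name: Optional[str]     # None when withheld as confidential
    domain: Optional[str]   # None when withheld as confidential
    confidential: bool


def index_sellers(sellers_json_text: str) -> Dict[str, Seller]:
    """Build a seller ID -> Seller index from sellers.json text.

    If a seller ID appears more than once, the last entry wins; a real
    audit would record such duplicates instead of overwriting them.
    """
    data = json.loads(sellers_json_text)
    index = {}
    for item in data.get("sellers", []):
        seller = Seller(
            seller_id=str(item["seller_id"]),
            seller_type=str(item.get("seller_type", "")).upper(),
            name=item.get("name"),
            domain=item.get("domain"),
            confidential=bool(item.get("is_confidential", 0)),
        )
        index[seller.seller_id] = seller
    return index


# Hypothetical sellers.json content for adx.example.
SAMPLE = """
{
  "sellers": [
    {"seller_id": "12345", "seller_type": "PUBLISHER",
     "name": "Publisher Inc.", "domain": "publisher.example"},
    {"seller_id": "777", "seller_type": "INTERMEDIARY", "is_confidential": 1}
  ]
}
"""
```

The second sample entry shows why confidential entries hinder verification: the seller ID is listed, but there is no name or domain to compare against.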

How sellers.json helps prevent fraud. When a bid request is received by a DSP from an AdX that is compliant with the sellers.json standard, it must contain information about the provenance of the inventory in a Supply Chain Object (SCO).⁴ At a high level, the sellers.json file provides a mechanism for DSPs to identify and verify all the entities listed in this SCO. This is done as follows:

• When a bid request is received by the DSP, it should use the AdX’s sellers.json file to verify that the final AdX has an authorized relationship with the prior holder (an SSP or another AdX) of the inventory.
• The previous step is applied recursively (on all intermediate neighbors in the SCO) to verify the end-to-end authenticity of the inventory.
• The DSP then uses the sellers.json files of all intermediaries and the ads.txt file of the publisher to verify that the publisher is legitimate and (re)sellers who handle the publisher’s inventory are authorized to do so.

This capability for end-to-end validation of the SCO allows DSPs to identify instances where the ad inventory originates from low-quality publishers using fraudulent ads.txt files or is being sold by malicious intermediaries.
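The three verification steps above can be sketched as a single walk over the supply chain, starting at the final AdX and moving back toward the publisher. This is a simplified illustration under stated assumptions: each chain node is reduced to an (ad system domain, seller ID) pair, sellers.json files are reduced to sets of seller IDs, and the publisher's ads.txt is assumed to list every node. The function `validate_chain` and all data shapes are hypothetical, not the IAB's reference procedure.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]  # (ad system domain, seller ID that system paid)


def validate_chain(publisher_domain: str,
                   chain: List[Node],
                   sellers_json: Dict[str, Set[str]],
                   ads_txt: Dict[str, Set[Node]]) -> List[str]:
    """Return a list of problems; an empty list means every check passed.

    chain is ordered from the publisher side to the final AdX.
    """
    problems = []
    # Walk from the final AdX back toward the publisher.
    for ad_system, seller_id in reversed(chain):
        known_ids = sellers_json.get(ad_system)
        if known_ids is None:
            problems.append(f"{ad_system}: no sellers.json available")
        elif seller_id not in known_ids:
            problems.append(f"{ad_system}: seller ID {seller_id} not listed")
        # The publisher's ads.txt must authorize this (ad system, seller ID).
        if (ad_system, seller_id) not in ads_txt.get(publisher_domain, set()):
            problems.append(
                f"{publisher_domain}: ads.txt does not authorize "
                f"{ad_system} / {seller_id}"
            )
    return problems


# Hypothetical data: publisher.example sells via ssp.example into adx.example.
SELLERS_JSON = {"adx.example": {"ssp-1"}, "ssp.example": {"pub-9"}}
ADS_TXT = {"publisher.example": {("ssp.example", "pub-9"),
                                 ("adx.example", "ssp-1")}}
GOOD_CHAIN = [("ssp.example", "pub-9"), ("adx.example", "ssp-1")]
BAD_CHAIN = [("ssp.example", "pub-9"), ("rogue.example", "x")]
```

Returning a list of problems rather than a boolean keeps the reason for a failed validation visible, which matters when the goal is auditing rather than simply dropping bid requests.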

2.2. Supply chain vulnerabilities

Despite the introduction of the ads.txt and sellers.json standards, there remain various vulnerabilities in the ad inventory supply chain. Our investigation focuses on the vulnerabilities that enable low-quality publishers to monetize their ad inventory by misrepresenting or obscuring its source. Some of these vulnerabilities arise from misrepresentations in the ads.txt and sellers.json files, while others arise from pooling their low-quality inventory with the inventory of unrelated high-quality publishers. We refer to the former as inventory misrepresentation and the latter as dark pooling.

4. Supply Chain Object (SCO) contains an ordered list of all the entities involved in the ad transaction (e.g., publisher → SSP → reseller → AdX).

Inventory misrepresentation. Inventory misrepresentation arises from misrepresentations of ad inventory by publishers. It can be identified by discrepancies in the publisher’s ads.txt file and is possible when DSPs and AdXs do not follow the ads.txt and sellers.json standards. Some examples of these misrepresentations include:

• a publisher’s ads.txt file might incorrectly use seller IDs of other publishers to suggest an authorized relationship with an AdX to boost the perception of its inventory (Misrepresentations #1 and #2);
• a publisher’s ads.txt file might incorrectly indicate that a popular AdX is an authorized (re)seller of its inventory to boost its reputation with other AdXs (Misrepresentation #3);
• a publisher’s ads.txt file might have more than one entry of the same seller type for an AdX, or sellers.json files might associate a seller ID with multiple publishers or sellers, making ads.txt and sellers.json verification unreliable (Misrepresentations #4 and #9);
• a publisher’s ads.txt file might list authorized relationships with (re)sellers that do not have sellers.json files, making end-to-end verification impossible (Misrepresentation #8).
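Two of the misrepresentations above lend themselves to simple mechanical checks. The sketch below shows one possible reading of Misrepresentations #4 and #9, not the paper's actual detection rules: it flags an AdX that appears more than once with the same relationship type in an ads.txt file, and a seller ID that a sellers.json file maps to more than one domain. The function names and input shapes are hypothetical.

```python
from collections import Counter, defaultdict


def duplicate_relationships(ads_txt_entries):
    """Return (AdX, relationship) pairs listed with more than one seller ID.

    ads_txt_entries: iterable of (AdX domain, seller ID, relationship).
    Exact duplicate lines are collapsed first, so only distinct seller IDs count.
    """
    counts = Counter((adx, rel) for adx, _seller_id, rel in set(ads_txt_entries))
    return sorted(pair for pair, n in counts.items() if n > 1)


def ambiguous_seller_ids(sellers):
    """Return seller IDs that one sellers.json maps to multiple domains.

    sellers: iterable of (seller ID, domain) pairs from a single sellers.json.
    """
    domains = defaultdict(set)
    for seller_id, domain in sellers:
        domains[seller_id].add(domain)
    return {sid: sorted(doms) for sid, doms in domains.items() if len(doms) > 1}
```

For example, an ads.txt with two DIRECT entries for `adx.example` under different seller IDs would be flagged by the first function, and a sellers.json listing seller ID `1` under both `a.example` and `b.example` by the second.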

Dark pooling. Pooling is a common strategy to share resources in online advertising. Consider, for example, the case where two or more publishers are owned by the same parent organization. In such scenarios, the ability to share advertising infrastructure and AdX accounts allows for more efficient operation and management. One way to identify the occurrence of pooling is by noting a single AdX-issued ‘seller ID’ shared by multiple publisher websites. Dark pools are pools in which seller IDs are shared by organizationally-unrelated publishers (possibly of differing reputation). Note that “dark pooling” is a term of art that is commonly used in industry. While pooling is not itself a “dark” practice, pooling seller IDs of unrelated publishers is considered a “dark” practice because it deceives potential buyers about the actual source of the ad inventory [34], [35].
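The identification rule described above (one AdX-issued seller ID shared by several publisher websites) can be sketched as a grouping step over parsed ads.txt files. The code is illustrative only: the input shape, the `find_pools` and `is_dark` helpers, and the `owner_of` ownership mapping are assumptions, and establishing real ownership relationships is a separate problem this sketch does not solve.

```python
from collections import defaultdict


def find_pools(ads_txt_by_domain):
    """Map (AdX, seller ID) -> sorted domains, for IDs shared by 2+ domains.

    ads_txt_by_domain: domain -> iterable of (AdX, seller ID, relationship).
    """
    members = defaultdict(set)
    for domain, entries in ads_txt_by_domain.items():
        for adx, seller_id, _relationship in entries:
            members[(adx, seller_id)].add(domain)
    return {key: sorted(doms) for key, doms in members.items() if len(doms) > 1}


def is_dark(pool_domains, owner_of):
    """Treat a pool as 'dark' if its domains have more than one owner.

    owner_of: hypothetical domain -> parent organization mapping; a domain
    with no known owner is treated as its own organization.
    """
    return len({owner_of.get(domain, domain) for domain in pool_domains}) > 1


# Hypothetical data: three sites list the same seller ID at the same AdX.
ADS_TXT = {
    "sportsnews.example": [("adx", "sellerid", "DIRECT")],
    "nbanews.example": [("adx", "sellerid", "DIRECT")],
    "fakesportsnews.example": [("adx", "sellerid", "DIRECT")],
    "solo.example": [("adx", "other-id", "DIRECT")],
}
OWNER_OF = {"sportsnews.example": "SportsCo", "nbanews.example": "SportsCo"}
```

With this data, `find_pools(ADS_TXT)` reports a single pool of three domains, and `is_dark` marks it as dark because `fakesportsnews.example` has no known shared owner with the other two.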

The seller ID defined in the ads.txt and sellers.json standards is also defined in the RTB protocol [36], [37]. The payment after successful completion of an RTB auction is made to the publisher (i.e., the seller) associated with the seller ID [38]. Hence, simply using another domain’s seller ID in ad requests from a website will result in any ad-related payments being made to the owner of the seller ID. Therefore, for revenue sharing, the creation of these pools needs to be facilitated either through intermediaries (e.g., SSPs) or by collaboration between publishers.

End-to-end validation of pooled supply chains. Pooling leads to a breakdown of any brand or DSP’s ability to perform end-to-end verification of the ad inventory supply chain. Specifically, the final step of verification highlighted in §2.1 cannot be meaningfully completed unless all domains associated with a publisher’s account are publicly known (and unfortunately, this is not the case). This is because the end-to-end verification of the ad inventory supply chain, as specified by the IAB, implicitly relies on trust that seller IDs are actually associated with specific organizations and that these associations are verified by AdXs. We illustrate this with an example.

• Consider a publisher website sportsnews.example which has a legitimate subsidiary: nbanews.example. The publisher registers for an account with a popular AdX (adx) and is issued the seller ID sellerid after being vetted by adx. It is expected that this website can now share this seller ID with its subsidiaries. Both websites will now list adx as a DIRECT seller through the sellerid account in their ads.txt files.
• The publisher now decides to share its adx-issued seller ID with fakesportsnews.example, another sports news website but of low quality, for a cut of the revenue generated from ads shown on fakesportsnews.example. In its ads.txt file, fakesportsnews.example now adds adx as a DIRECT seller and also lists sellerid as its seller ID. Note that fakesportsnews.example would otherwise be unable to get directly listed on adx and monetize its ad inventory due to its low quality.
• When an ad request for some inventory is sent from fakesportsnews.example, all basic supply chain validation checks are successful because the seller ID sellerid is in fact registered by adx in its sellers.json file. Any bidding DSP will therefore operate under the assumption that the website receiving its ads has been vetted by adx and is associated with sportsnews.example.
• Complications only arise if the verifier notices that sellerid was only registered to the owner of sportsnews.example and the bid request actually originated at fakesportsnews.example. However, invalidating the bid request simply because of this inconsistency will mean that even legitimate subsidiaries such as nbanews.example cannot pool their inventory. Instead, additional checks are required to ascertain whether fakesportsnews.example and sportsnews.example are related or whether adx vetted fakesportsnews.example as well. This issue remains unaddressed by current validation mechanisms.
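The example above can be replayed in a few lines to show where validation stops being informative. The sketch encodes the three websites' ads.txt entries and the single sellers.json registration from the example; the `basic_check` and `strict_check` functions are hypothetical simplifications of what a verifier might do, not checks defined by the IAB standards.

```python
# ads.txt of each website in the example: all three list adx / sellerid as DIRECT.
ADS_TXT = {
    "sportsnews.example": {("adx", "sellerid", "DIRECT")},
    "nbanews.example": {("adx", "sellerid", "DIRECT")},
    "fakesportsnews.example": {("adx", "sellerid", "DIRECT")},
}

# sellers.json of adx: seller ID -> the one domain it was registered to.
SELLERS_JSON = {"adx": {"sellerid": "sportsnews.example"}}


def basic_check(domain, adx, seller_id):
    """Pass if the site lists the seller ID and the AdX has registered it."""
    listed_by_site = (adx, seller_id, "DIRECT") in ADS_TXT.get(domain, set())
    listed_by_adx = seller_id in SELLERS_JSON.get(adx, {})
    return listed_by_site and listed_by_adx


def strict_check(domain, adx, seller_id):
    """Additionally require the registered domain to equal the request domain."""
    return (basic_check(domain, adx, seller_id)
            and SELLERS_JSON[adx][seller_id] == domain)
```

Under these assumptions, `basic_check` passes for all three websites, including fakesportsnews.example, while `strict_check` rejects fakesportsnews.example and the legitimate subsidiary nbanews.example alike. Neither check can tell the related site from the unrelated one, which is the gap described in the last bullet.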

Caveat. The example described assumes collaboration between publishers: sportsnews.example and fakesportsnews.example. The pooling might be inadvertent in some cases, e.g., if sportsnews.example and fakesportsnews.example are both assigned the same seller ID through a common intermediary (an SSP, for example, as shown in Figure 2).

In sum, by pooling various unrelated websites under a single seller ID, low-quality publishers can “launder” their ad inventory, rendering it indistinguishable from the inventory of high-quality publishers. Moreover, this can occur when an AdX provides the seller ID to a trusted publisher (or an SSP), which then inadequately vets the low-quality publishers whose inventory it pools. Figure 2 illustrates this scenario of syndication-based pooling by some intermediary SSP. As we show later, such pooling is common. In fact, we
