BLAB Reporter: Automated journalism covering the Blue Amazon
Yan Vianna Sym
Escola Politécnica
Universidade de São Paulo
São Paulo, Brazil
yan.sym@usp.br
João Gabriel Moura Campos
Escola Politécnica
Universidade de São Paulo
São Paulo, Brazil
joaogcampos@usp.br
Fabio Gagliardi Cozman
Escola Politécnica
Universidade de São Paulo
São Paulo, Brazil
fgcozman@usp.br
Abstract
This demo paper introduces the BLAB Reporter, a robot-journalist covering the Brazilian Blue Amazon. The Reporter is based on a pipeline architecture for Natural Language Generation; it offers daily reports, news summaries and curious facts in Brazilian Portuguese. By collecting, storing and analysing structured data from publicly available sources, the robot-journalist uses domain knowledge to generate texts and publish them on Twitter. Code and corpus are publicly available¹.
1 Introduction
Data-to-text Natural Language Generation (NLG) is the computational process of generating meaningful and coherent natural language text or speech to describe non-linguistic input data (Reiter and Dale, 2000). Successful examples of data-to-text systems can be found in both academia and industry, with applications in weather forecasting (Belz, 2008), image captioning and chatbots (Adamopoulou and Moussiades, 2020). Amongst NLG applications, robot-journalism is one of the most prominent endeavors thanks to the high volume of structured data streams available, which enables automated systems to report recurrent information with high fidelity and lexical variety (Teixeira et al., 2020).
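As a rough illustration of reporting recurrent information with both fidelity and lexical variety, the sketch below renders one structured record through several interchangeable templates. The templates, field names and values are hypothetical and are not taken from the BLAB Reporter corpus.

```python
import random

# Hypothetical templates: every variant carries the same facts in different wording.
TEMPLATES = [
    "Sea surface temperature reached {temp_c:.1f} degrees Celsius on {date}.",
    "On {date}, the sea surface temperature was {temp_c:.1f} degrees Celsius.",
    "A sea surface temperature of {temp_c:.1f} degrees Celsius was recorded on {date}.",
]


def describe(record: dict, rng: random.Random) -> str:
    """Render one structured record as a sentence, picking a template at random."""
    return rng.choice(TEMPLATES).format(**record)


# Illustrative record (made-up values).
record = {"date": "2022-10-08", "temp_c": 24.3}
rng = random.Random(0)  # seeded so that runs are repeatable
sentences = [describe(record, rng) for _ in range(5)]

for sentence in sentences:
    print(sentence)
```

Whichever template is chosen, the values come straight from the record, so the wording varies while the reported facts do not.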
An interesting domain for data-to-text generation is ocean monitoring. For instance, global attention was drawn in 2021 to a container ship that obstructed the Suez Canal for six consecutive days, causing a global shortage of essential commodities, including medical supplies and medicines during the coronavirus pandemic. Accurate, low-latency reports can be very helpful in these situations, but communicating them to general audiences usually demands coverage by specialized human journalists. To address this issue, we present our robot-journalist, named BLAB Reporter, an NLG system based on a pipeline architecture that generates daily reports, news, content summaries and curious facts about the Blue Amazon and publishes them on Twitter in Brazilian Portuguese².
¹ https://github.com/C4AI/blab-reporter
The Blue Amazon is the exclusive economic zone (EEZ) of Brazil, an offshore area of 3.6 million square kilometers along the Brazilian coast that is rich in marine biodiversity and energy resources (Wiesebron, 2013). The BLue Amazon Brain (BLAB) is a project that aims to address complex questions about the marine ecosystem and integrates a number of services for disseminating information about the Blue Amazon.
2 System overview
Our system follows a pipeline architecture that converts non-linguistic data into text in six steps: Content Selection, Discourse Ordering, Text Structuring, Lexicalization, Referring Expression Generation and Textual Realization (Ferreira et al., 2019). Our system also comprises two additional steps: Data Acquisition (for extracting information from multiple data streams and storing it in a structured format) and Summarization (for summarizing news in the form of short consecutive tweets). This kind of architecture, depicted in Figure 1, allows for trustworthy output as well as easy access to and maintenance of sub-modules.
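The six core steps can be pictured as a chain of small functions, each consuming the previous step's output. The following is a minimal sketch under stated assumptions, not the BLAB Reporter implementation: the record fields, rules, templates and English wording are all hypothetical (the actual system writes in Brazilian Portuguese), and Data Acquisition and Summarization are left out.

```python
# Toy data-to-text pipeline; every rule and template here is an illustrative assumption.

def content_selection(data):
    """Turn raw readings into messages worth reporting."""
    messages = []
    if "wave_height_m" in data:
        messages.append({"intent": "WAVE", "value": data["wave_height_m"]})
    if "wind_speed_kmh" in data:
        messages.append({"intent": "WIND", "value": data["wind_speed_kmh"]})
    return messages


def discourse_ordering(messages):
    """Fix the order in which messages are mentioned (toy rule: wind first)."""
    rank = {"WIND": 0, "WAVE": 1}
    return sorted(messages, key=lambda m: rank[m["intent"]])


def text_structuring(messages):
    """Group messages into sentences (toy rule: one message per sentence)."""
    return [[m] for m in messages]


# Hypothetical templates; ENTITY is resolved later by referring expression generation.
TEMPLATES = {
    "WIND": "ENTITY registers winds of {value} km/h",
    "WAVE": "waves in ENTITY reach {value} m",
}


def lexicalization(sentences):
    """Choose the words for each message by filling its template."""
    return [[TEMPLATES[m["intent"]].format(value=m["value"]) for m in s]
            for s in sentences]


def referring_expressions(sentences, entity="the Blue Amazon", short="the region"):
    """Use the full name on first mention and a shorter form afterwards."""
    resolved, mentioned = [], False
    for sentence in sentences:
        clauses = []
        for clause in sentence:
            if "ENTITY" in clause:
                clause = clause.replace("ENTITY", short if mentioned else entity)
                mentioned = True
            clauses.append(clause)
        resolved.append(clauses)
    return resolved


def textual_realization(sentences):
    """Join clauses, capitalize the first letter and add final punctuation."""
    texts = []
    for sentence in sentences:
        text = " and ".join(sentence)
        texts.append(text[0].upper() + text[1:] + ".")
    return " ".join(texts)


PIPELINE = [content_selection, discourse_ordering, text_structuring,
            lexicalization, referring_expressions, textual_realization]


def generate(data):
    """Run the record through every pipeline step in order."""
    result = data
    for step in PIPELINE:
        result = step(result)
    return result


report = generate({"wave_height_m": 2.5, "wind_speed_kmh": 30})
print(report)
# prints: The Blue Amazon registers winds of 30 km/h. Waves in the region reach 2.5 m.
```

Because each step is a separate function with an inspectable intermediate result, a single module can be tested or replaced without touching the others, which is the maintenance benefit the pipeline design is meant to provide.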
The grammar used by the model was built by first running the content selection step on previous data and generating 30 non-linguistic reports. These non-linguistic reports were then manually verbalized, and the input and output representations for each pipeline module were manually annotated. When deployed, each module draws on the selected combination of templates using rule-based approaches. Because we deal with a sensitive domain, we opted to use the pipeline architecture
² https://twitter.com/BLAB_Reporter
arXiv:2210.06431v1 [cs.CL] 8 Oct 2022