Expertise diversity of teams predicts originality and long-term impact in science and
technology
Weihua Li
LMIB, NLSDE, BDBC, and School of Artificial Intelligence, Beihang University, Beijing, China
Department of Advanced Interdisciplinary Research, Pengcheng Laboratory, Shenzhen, China
Zhongguancun Laboratory, Beijing, China and
Qianyuan Laboratory, Hangzhou, China
Hongwei Zheng
Beijing Academy of Blockchain and Edge Computing, Beijing, China
(Dated: March 25, 2025)
Despite the growing importance of collaboration networks in producing innovative science and
technology, it remains unclear how expertise diversity among team members relates to the original-
ity and impact of the work they produce. Here, drawing on statistical physics, we develop a new
computational method to quantify the expertise distance of researchers based on their prior career
histories and apply it to 23 million scientific publications and 4 million patents. We find that across
science and technology, teams with expertise diversity tend to produce work with greater original-
ity. Teams with more diverse expertise exhibit substantially higher long-term impact (10 years),
increasingly attracting larger cross-disciplinary influence. This impact premium of expertise diver-
sity among team members becomes especially pronounced when other dimensions of team diversity
are missing, as teams within the same institution or country appear to disproportionately reap the
benefits of expertise diversity. While gender-diverse teams have relatively higher impact on average,
teams with varied levels of gender diversity all seem to benefit from increased expertise diversity.
Given the growing knowledge demands on individual researchers, implementation of incentives for
innovative research, and the tradeoffs between short-term and long-term impacts, these results may
have implications for funding, assembling, and retaining teams with originality and long-lasting
impacts.
Keywords: Collaboration networks | Diversity | Science of science | Innovation
I. INTRODUCTION
People are increasingly likely to form collaboration networks to produce innovative work across a wide range
of creative domains[1–7]. The accumulation of knowledge[8] and the specialization of individual researchers[9, 10] un-
derscore the necessity of forming interdisciplinary teams to tackle complex challenges. In an era where narrow expertise
from a single field may prove insufficient to address pressing societal issues, collaborative efforts of teams spanning
traditional disciplinary boundaries become indispensable[11–15]. Teamwork also allows researchers to establish profes-
sional connections, cultivate networks, and engage with a broader audience, both within and beyond their immediate
research community[16]. As growing scholarly attention has been paid to increasing diversity, equity, and inclusion
in science, the scientific community demands more efforts to improve representational diversity in the composition of
research teams[17, 18]. A better understanding of how to assemble teams with diverse expertise is therefore essential
for coordinating collective actions, fostering interdisciplinary thinking, and integrating existing expertise to tackle
new challenges across scientific and technological domains[19–25].
However, it remains unclear how prior expertise diversity among team members is related to the originality and
the impact of the work the team produces. Some indicators have focused on quantifying the diversity of prior
expertise among individuals, for example by calculating the Pearson correlation of distributions across technological
classes[26], a research overlap score using medical subject headings (MeSH) terms[21], a term
frequency-inverse document frequency (TF-IDF) method for research content similarity[27], latent semantic analysis for the
topic similarity between researchers[24], a cosine distance estimate for prior expertise disparity[28], and a distance
metric based on the Jaccard dissimilarity of researchers’ references[29]. Although these methods can approximate
expertise diversity among team members, an effective measure should also explicitly account for the relatedness of
research disciplines[30, 31]. For instance, ecology has more interactions with environmental science and evolutionary
biology than with condensed matter physics. Similarly, artificial intelligence is more deeply influenced by mathematics
and computer science than by organic chemistry. Therefore, ecologists should on average have higher expertise
similarity to environmental scientists and evolutionary biologists than to condensed matter physicists, and the knowledge
possessed by artificial intelligence researchers may be more related to the knowledge of mathematicians and computer
scientists than to that of organic chemists.
Corresponding author: hwzheng@pku.edu.cn
arXiv:2210.04422v3 [physics.soc-ph] 22 Mar 2025
In part to tackle these challenges, researchers have developed several methods to infer the diversity of knowledge
scope from the product the team has developed. These methods draw on a paper’s references or citations and use measurements
such as the distinction of fields, entropy, the Gini coefficient, and the Rao-Stirling (RS) index to quantify the
interdisciplinarity of the work and its association with originality and impact[30, 32–36]. As proxies for knowledge
diversity, however, these measures depend on the references or citations of the paper that a team has produced, which are
only available after the fact, i.e., after the paper has been published. Moreover, these metrics are designed to quantify the diversity of
knowledge scope implemented in a specific piece of work, rather than a comparative measure to estimate the disparity
of expertise among individual scholars. It remains unclear how the composition of team expertise is related to the
originality and impact of research a team is about to produce. Understanding the association between the expertise
diversity among team members and the outcome the team produces is crucial for funding and investment decisions,
and highlights the importance of developing measures to quantify and combine diverse prior knowledge to inform
fruitful collaboration strategies.
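To make the product-based measures concrete, the following is a minimal sketch of the Rao-Stirling index in one common form, RS = sum over i != j of p_i p_j d_ij, where p_i is the share of a paper's references falling in field i and d_ij is a distance between fields. The function name, the toy distance matrix, and the reference counts are illustrative assumptions, not values or code from this study; some formulations sum over unordered pairs instead, which halves the value.

```python
def rao_stirling(counts, distance):
    """Rao-Stirling diversity: sum over ordered pairs i != j of p_i * p_j * d_ij.

    counts   -- number of references a paper makes to each field
    distance -- symmetric field-to-field distance matrix (hypothetical here)
    """
    total = sum(counts)
    shares = [c / total for c in counts]
    n = len(shares)
    return sum(
        shares[i] * shares[j] * distance[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Made-up distances: fields 0 and 1 are close, field 2 is far from both.
FIELD_DISTANCE = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]

rs_single = rao_stirling([10, 0, 0], FIELD_DISTANCE)   # all references in one field
rs_related = rao_stirling([5, 5, 0], FIELD_DISTANCE)   # split across related fields
rs_distant = rao_stirling([5, 0, 5], FIELD_DISTANCE)   # split across distant fields
```

The inputs are the references of a finished paper, which illustrates why such indices cannot be evaluated before the work exists.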
To address these challenges, we propose a new metric to identify and quantify the diversity of prior expertise
among team members that extends beyond single disciplines. Our expertise distance metric explicitly accounts for the
relatedness of scientific fields and draws on the disciplinary distributions of prior career histories among collaborators.
The expertise diversity of a team is then obtained as the average distance of all possible pairwise coauthorships, which
correlates with a broad range of indicators of scholarly diversity in terms of the combination of past knowledge, and
other dimensions of diversity among team members regarding their affiliations, nationality, and gender.
A large body of work has focused on understanding the interplay between team composition and outcomes, probing
dimensions of team diversity around affiliations[2], ethnicity[37], gender[38], expertise[12], technical background[39],
problem-solving ability[40], intelligence[41], and more. In the technology sector, inventors from distinct social groups
tend to generate patents with greater collaborative creativity[42]. In the online knowledge-sharing community, po-
larized teams composed of a balanced proportion of ideologically diverse Wikipedia editors produce articles of higher
quality than homogeneous teams[43]. Some scholars have found that multidisciplinarity exhibits an inverted U-shaped
relationship with scientific impact, although the vast majority of papers indicate a positive correlation with
multidisciplinarity[44]. Furthermore, the value of multidisciplinary research might require a longer period to be
recognized by the scientific community, leading to delayed impact[45]. Overall, studies from diverse domains have
demonstrated that the impact of team outcomes improves with team diversity. Thus, creative output in science and
technology may be heavily rooted in the composition of team members’ expertise and background, which is largely
determined during the team assembly process.
Originality is often regarded as a core goal in science and technology. Recent work has suggested that creative ideas
often emerge from new and unconventional combinations of knowledge from diverse disciplines, research methods, or
frameworks[16, 46]. Innovation can be spurred when proven methodologies in one domain are introduced to solve
problems in a fresh area[47]. Creativity is more likely to emerge from diverse teams as researchers integrate concepts
and methods drawn from diverse disciplines, forging connections between seemingly disparate concepts or bodies
of knowledge[1, 48]. Although studies have suggested that high multidisciplinarity is associated with high impact,
our findings suggest that scientific teamwork with multidisciplinary approaches has little correlation with originality,
quantified by the disruption score, and has negative correlations with originality in technology. Research by expertise-
diverse teams, in contrast, is positively correlated with high originality in science and technology. Therefore, despite
its close relation with multidisciplinarity, the expertise diversity of teams not only presents a new quantitative measure
to understand science but also provides new perspectives for the originality of team outcomes.
Moreover, while scholars have argued that multidisciplinarity is associated with the impact of research[49, 50], we
find that research teams with high expertise diversity exhibit no significant impact advantage in the short- (2 years)
or mid-term (5 years). This pattern persists for teams spanning both scientific and technological domains. We find
that, instead, teams with high expertise diversity enjoy a substantive impact premium of their work in the long-term
(10 years), increasingly attracting cross-disciplinary influence in the longer run. The long-term effect of expertise
diversity becomes more prominent as team size and citation time window grow. In particular, when other dimensions
of diversity are missing, teams formed in the same institution or country disproportionately harness the benefit of
expertise diversity. These results may have implications for fostering and retaining innovative teams with more diverse
knowledge composition among team members.
II. EXPERTISE DISTANCE METRIC
We use the field distribution of the publication history of individual authors to quantify the expertise distance
among coauthors. Suppose for field $f$, the cumulative reference vector $v_f = (v_f^1, \ldots, v_f^n)$ records the numbers of
references of papers from field $f$ made to all fields, where $n$ is the number of fields[15, 30]. We then define the unit
vector of field $f$ as $e_f = v_f / \|v_f\|$, where $\|v_f\| = \sqrt{\sum_m (v_f^m)^2}$.
For an author $i$, let the publication vector $p_i$ record the number of publications author $i$ had in each field up to
year $t$. The goal is to introduce a definition of expertise distance between a pair of authors that accounts for the
relatedness between fields. An explicit form of $p_i$ is $p_i = a_i^1 e_1 + \ldots + a_i^f e_f + \ldots + a_i^n e_n$, where $a_i^f$ is the number
of papers author $i$ published in field $f$. We first normalize $p_i$ to obtain the expertise vector $q_i$ of unit length,
$q_i = p_i / \|p_i\| = q_i^1 e_1 + \ldots + q_i^f e_f + \ldots + q_i^n e_n$, with coefficients $q_i = (q_i^1, \ldots, q_i^f, \ldots, q_i^n)$. Note that, different from the definition of
field vector length, here we explicitly account for the heterogeneous relatedness between fields and define
$$\|p_i\|^2 = (a_i^1 e_1 + \ldots + a_i^f e_f + \ldots + a_i^n e_n)^2 = \sum_{j,k} a_i^j a_i^k \, (e_j \cdot e_k). \tag{1}$$
Similarly, the cosine similarity between the expertise vectors of authors $i$ and $j$ can be calculated as $q_i \cdot q_j = q_i M q_j^T$,
and, because both expertise vectors have unit length, the distance metric can be obtained as
$$d_{ij} = \sqrt{\|q_i - q_j\|^2} = \sqrt{2 - 2\, q_i \cdot q_j} = \sqrt{2 - 2\, q_i M q_j^T}, \tag{2}$$
where $M = (e_j \cdot e_k)_{j,k}$ is the matrix indicating the closeness between fields using cosine similarity. Analogously, if we
let $p_i = (p_i^1, \ldots, p_i^f, \ldots, p_i^n)$, the length of the publication vector of author $i$ can be obtained as
$$\|p_i\| = \sqrt{p_i M p_i^T}. \tag{3}$$
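The definitions above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and the 3-field reference-count table are made up, and the only inputs taken from the text are the unit field vectors, the closeness matrix M, the norm in Eq. (3), and the distance in Eq. (2).

```python
import math


def unit(v):
    """Scale a vector to unit Euclidean length (the field unit vector e_f)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def field_closeness(ref_counts):
    """M[j][k] = e_j . e_k, the cosine similarity of the reference profiles of fields j and k."""
    e = [unit(row) for row in ref_counts]
    return [[sum(a * b for a, b in zip(ej, ek)) for ek in e] for ej in e]


def bilinear(x, M, y):
    """Compute the bilinear form x M y^T."""
    return sum(x[j] * M[j][k] * y[k] for j in range(len(x)) for k in range(len(y)))


def expertise_vector(p, M):
    """q = p / ||p|| with ||p|| = sqrt(p M p^T), as in Eq. (3)."""
    norm = math.sqrt(bilinear(p, M, p))
    return [a / norm for a in p]


def expertise_distance(p_i, p_j, M):
    """d_ij = sqrt(2 - 2 q_i M q_j^T), as in Eq. (2)."""
    q_i, q_j = expertise_vector(p_i, M), expertise_vector(p_j, M)
    # Clamp at zero to absorb tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 2.0 - 2.0 * bilinear(q_i, M, q_j)))


# Hypothetical reference counts: row f lists how often field f cites each field.
# Fields 0 and 1 cite each other more than either cites field 2.
REF_COUNTS = [
    [80, 15, 5],
    [20, 70, 10],
    [5, 5, 90],
]
M = field_closeness(REF_COUNTS)

d_near = expertise_distance([10, 0, 0], [0, 10, 0], M)  # authors in related fields
d_far = expertise_distance([10, 0, 0], [0, 0, 10], M)   # authors in weakly related fields
d_same = expertise_distance([10, 0, 0], [3, 0, 0], M)   # same field, different productivity
```

In this toy setting two authors who never published in the same field still differ in distance depending on how related their fields are, which an ordinary cosine distance on raw publication counts would not capture. The distance is also insensitive to productivity, since only the normalized vectors enter Eq. (2).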
To estimate the prior expertise distance of a pair of authors $(i, j)$ who coauthored a paper $s$ in year $t$, we first
build their respective publication vectors using all papers up to year $t$. We use the level 1 field classification in the
MAG dataset, where each article is assigned to at least one scientific field. For the crude publication vector $p_i(t)$
of author $i$, $a_i^f$ denotes the number of papers $i$ published in field $f$ up to $t$. To allow for a meaningfully
long prior publication record, we include only productive authors who have published at least 5 papers up to
$t$, i.e., by the time of collaboration. The mean expertise distance of the paper, $d_s$, is the average distance over all
possible coauthor pairs $(i, j)$ among the selected productive authors.
Papers with at least two productive authors are eligible for a measure of team expertise distance, and those
written by solitary authors are not considered in this study. As a result of the 5-paper threshold, a proportion of less
productive or early-career researchers are dropped, and we obtain a subset of team-authored papers eligible for a distance metric.
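The team-level procedure can be sketched as follows. This is a minimal illustration, not the authors' code: the author records (a paper count plus a per-field publication vector), the function names, and the 3-field closeness matrix are hypothetical. The paper count is stored separately from the field vector because an article may be assigned to more than one field.

```python
import math
from itertools import combinations

MIN_PAPERS = 5  # productivity threshold stated in the text


def bilinear(x, M, y):
    """Compute the bilinear form x M y^T."""
    return sum(x[j] * M[j][k] * y[k] for j in range(len(x)) for k in range(len(y)))


def expertise_distance(p_i, p_j, M):
    """sqrt(2 - 2 * cos), where cos is the M-weighted cosine of two publication vectors."""
    n_i = math.sqrt(bilinear(p_i, M, p_i))
    n_j = math.sqrt(bilinear(p_j, M, p_j))
    cos = bilinear(p_i, M, p_j) / (n_i * n_j)
    return math.sqrt(max(0.0, 2.0 - 2.0 * cos))


def team_expertise_distance(authors, M, min_papers=MIN_PAPERS):
    """Mean pairwise distance over productive coauthors, or None if fewer than two qualify."""
    productive = [a["p"] for a in authors if a["n_papers"] >= min_papers]
    if len(productive) < 2:
        return None
    pairs = list(combinations(productive, 2))
    return sum(expertise_distance(a, b, M) for a, b in pairs) / len(pairs)


# Hypothetical field closeness matrix (symmetric, ones on the diagonal).
M = [
    [1.00, 0.45, 0.13],
    [0.45, 1.00, 0.20],
    [0.13, 0.20, 1.00],
]

team = [
    {"n_papers": 12, "p": [12, 0, 0]},
    {"n_papers": 8, "p": [0, 8, 0]},
    {"n_papers": 6, "p": [0, 0, 6]},
    {"n_papers": 2, "p": [0, 0, 2]},  # below the threshold, so excluded
]

d_team = team_expertise_distance(team, M)
d_solo = team_expertise_distance(team[:1], M)  # only one productive author
d_homogeneous = team_expertise_distance(
    [{"n_papers": 9, "p": [9, 0, 0]}, {"n_papers": 5, "p": [5, 0, 0]}], M
)
```

Returning `None` for papers with fewer than two productive authors mirrors the eligibility rule in the text, so that ineligible papers are excluded rather than assigned a distance of zero.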
III. DATA
We use the publication and citation data in 1950-2019 from the Microsoft Academic Graph (MAG) dataset[51]. We
select papers published in journals for science, technology, and social sciences, namely biology, business, chemistry,
computer science, economics, engineering, environmental science, geography, geology, materials science, mathematics,
medicine, physics, political science, sociology, and papers published in conferences for computer science. We also
extract patent data from the MAG archive.
There are some known limitations and deficiencies of MAG, particularly regarding citation data[52]. To address
this issue, we implement a rigorous data filtering process before the analysis (see also Supplementary Information).
In particular, we exclusively include papers sourced from journals or conference proceedings within MAG, and ensure
that all papers contain author affiliation information. After these filtering procedures, we retain a total of
22.8 million papers and 354 million citations from papers to papers. We also extract 4.4 million patents, which make
29.9 million citations among themselves and 5.6 million citations to the selected subset of research articles.
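The two filtering criteria named above can be expressed as a simple predicate. This sketch is only illustrative: the record layout (`venue_type`, `authors`, `affiliation`) is a hypothetical schema rather than the actual MAG format, and requiring an affiliation for every author is an assumption, since the text does not specify whether all authors or at least one must have one. The full procedure is described in the Supplementary Information.

```python
ALLOWED_VENUES = {"journal", "conference"}  # venue types named in the text


def keep_paper(paper):
    """Return True if the paper has an allowed venue and every author lists an affiliation."""
    venue_ok = paper.get("venue_type") in ALLOWED_VENUES
    authors = paper.get("authors", [])
    # Assumption: every author must carry non-empty affiliation information.
    affiliations_ok = bool(authors) and all(a.get("affiliation") for a in authors)
    return venue_ok and affiliations_ok


# Toy records in the hypothetical schema.
papers = [
    {"id": 1, "venue_type": "journal",
     "authors": [{"name": "A", "affiliation": "Univ X"}, {"name": "B", "affiliation": "Lab Y"}]},
    {"id": 2, "venue_type": "repository",
     "authors": [{"name": "C", "affiliation": "Univ Z"}]},
    {"id": 3, "venue_type": "conference",
     "authors": [{"name": "D", "affiliation": ""}]},
]

kept_ids = [p["id"] for p in papers if keep_paper(p)]
```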