

Using Deep Learning to Find the Next Unicorn: A Practical Synthesis

Lele Cao

Motherbrain, EQT

lele.cao@eqtpartners.com caolele@gmail.com

Vilhelm von Ehrenheim

Motherbrain, EQT

vilhelm.vonehrenheim@eqtpartners.com

Sebastian Krakowski

House of Innovation, Stockholm School of Economics

sebastian.krakowski@hhs.se

Xiaoxue Li

Department of Political Science, Stockholm University

xiaoxue.li@statsvet.su.se

Alexandra Lutz

Motherbrain, EQT

alexandra.lutz@eqtpartners.com

A condensed version [1] was peer reviewed and published at the IJCAI 2023 (The 32nd International Joint Conference on Artificial Intelligence) Workshop: https://aclanthology.org/2023.finnlp-1.6.

Chicago Citation Format:

Cao, Lele, Vilhelm von Ehrenheim, Sebastian Stan, Xiaoxue Li, and Alexandra Lutz. "Using Deep Learning to Find the Next Unicorn: A Practical Synthesis on Optimization Target, Feature Selection, Data Split and Evaluation Strategy." Proceedings of the IJCAI Joint Workshop on the 5th Financial Technology and Natural Language Processing (FinNLP) and the 2nd Multimodal AI for Financial Forecasting (Muffin), pp. 63-73, 2023.

Please send correspondence to the first author, Lele Cao, Motherbrain AI Research, EQT Group, Regeringsgatan 25, 11153 Stockholm, Sweden; e-mail: caolele@gmail.com.

arXiv:2210.14195v2 [q-fin.CP] 10 Jun 2024

Abstract

Startups often represent newly established business models associated with disruptive innovation and high scalability. They are commonly regarded as powerful engines for economic and social development. Meanwhile, startups are heavily constrained by many factors such as limited financial funding and human resources. Therefore, the chance for a startup to eventually succeed is as rare as “spotting a unicorn in the wild”. Venture Capital (VC) strives to identify and invest in unicorn startups during their early stages, hoping to gain a high return. To avoid relying entirely on human domain expertise and intuition, investors usually employ data-driven approaches to forecast the success probability of startups. Over the past two decades, the industry has gone through a paradigm shift from conventional statistical approaches towards machine-learning (ML) based ones. Notably, the rapid growth of data volume and variety is quickly ushering in deep learning (DL), a subset of ML, as a potentially superior approach in terms of capacity and expressivity. In this work, we carry out a literature review and synthesis on DL-based approaches, covering the entire DL life cycle. The objectives are (a) to obtain a thorough and in-depth understanding of the methodologies for startup evaluation using DL, and (b) to distil valuable and actionable learning for practitioners. To the best of our knowledge, our work is the first of this kind.

Keywords: Startup, Success Prediction, Unicorn, Deep Learning, Machine Learning, Venture Capital, Investment, Big Data, Practical Synthesis

Using Deep Learning to Find the Next Unicorn: A Practical Synthesis Cao et al.

1 Introduction

The term “startup” has many definitions, and to date there is no consensus on a standard one. Santisteban and Mauricio [2] synthesized many popular definitions and identified some common labels such as “new”, “small”, “rapid growth”, and “high risk”, where “small” is often approximated by limited financial funds and human resources [3]. Much of the literature, e.g., [4], associates startups with disruptive innovation and high scalability. As a result,

“A startup is a dynamic, flexible, high risk, and recently established company that typically represents a reproducible and scalable business model. It provides innovative products and/or services, and has limited financial funds and human resources.” [2–

Since startups stimulate growth, generate jobs and tax revenues, and promote many other socioeconomically beneficial factors [5], they are commonly regarded as powerful engines for economic and social development, especially after economic, environmental, and epidemic crises such as COVID-19 [6]. As startups continue to develop, they often rely increasingly on external funds (as opposed to internal funds from founders and co-founders), from either domestic or foreign capital markets, to unlock a high rate of growth that usually corresponds to a “hockey stick” growth curve (i.e. a straight line on a log scale) [7].
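To see why a “hockey stick” curve becomes a straight line on a log scale, consider an illustrative model in which a startup metric (revenue, say) grows at a constant rate. The exponential form and the symbols $v_0$ (initial value) and $r$ (growth rate) are assumptions made here for illustration; they are not taken from [7].

```latex
v(t) = v_0 \, e^{r t}
\quad\Longrightarrow\quad
\log v(t) = \log v_0 + r\,t
```

Under this assumption, $\log v(t)$ is linear in time with slope $r$, so a steeper line on the log plot corresponds to a higher growth rate.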

Startups may receive funds from multiple sources such as Venture Capital (VC) and debt financing; to date, the dominant source has been VC. As an industry, VC seeks opportunities to invest in startups with great potential (in the sense of financial returns) to grow and successfully exit. The risk-return trade-off tells us that the potential return rises with a corresponding increase in risk.

Coronavirus disease 2019 (COVID-19) is a contagious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first case was identified in 2019.

Statistics revealing the high risk of funding startups: on average, only around 60% of startups survive for more than 3 years after being founded [8]; the top 2% of VC funds receive 95% of the returns in the entire industry [9]; VC typically has only a 10% rate of achieving an ROI (return on investment) of 100% or more [10,11].


As a consequence, VC firms usually strive to mitigate this risk by improving their 1) deal sourcing and screening and 2) value-add process [12]. In this survey, we will focus on the published work around the former approach, i.e. finding the startup unicorn as accurately as possible during the deal sourcing phase.

Finding the unicorn among candidate startups is a complex task with great uncertainty because of many factors, such as vague and prone-to-change business ideas, no proof-of-concept prototype (when applicable), and no organic revenue. This creates a low-information situation, where VC firms often have to make investment decisions based on insufficient information (e.g. lack of financial data) [14]. Therefore, a VC’s deal sourcing process has traditionally been manual and empirical, leaving estimations of the ROI (return on investment) heavily dependent on the human investors’ decisions. As pointed out in [15], human investors are inherently biased, and intuition alone cannot consistently drive good decisions. A better approach should leverage big data to

• debias the decisions, so that the individual investment decision made for a particular startup is expected to drive lower risk and higher ROI;

• enable automation, so that more startups can be evaluated without requiring extra time.

To that end, over the past two decades, data-driven approaches have dominated the research on startup success prediction (i.e. identifying startups that eventually turn into unicorns). However, the majority are analytical and statistical as opposed to ML (machine learning) approaches. Conventional statistical work (e.g. [2,16–33]) mostly starts by defining some hypotheses, followed by testing them using statistical tools; the outcome of this work is often conclusions about correlation and/or causality between some factors and the success likelihood of startups.

Deal sourcing is the process by which investors identify investment opportunities.

Unicorn and near-unicorn startups are private, venture-backed firms with a valuation of at least $500 million at some point [13].

A hypothesis often assumes a certain impact of some factors on startup success. For example, “the founder’s past entrepreneurial experience influences the likelihood of success” [32].

[Figure 1 diagram: Startup (input data) → ML Model → Invest? y = {0: bad, 1: good}]

Figure 1: High-level overview of ML (machine learning) based startup sourcing. The ML model is trained to approximate a function $f(\cdot)$ so that the input data $\mathbf{x}$ describing a startup can be mapped to an output variable $y$ indicating the recommended investment propensity, which can be either discrete (good vs. bad) or continuous (success probability).
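The mapping shown in Figure 1, from input data describing a startup to an investment propensity, can be made concrete with a minimal sketch. The example below uses logistic regression as the simplest possible stand-in for the ML model (it is not a deep network and is not a method taken from the surveyed work). The two features, the synthetic labels, and all function names are hypothetical and exist only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def f(x, w, b):
    """Map a startup's feature vector x to a success probability."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(X, y, lr=0.1, epochs=300):
    """Fit weights w and bias b by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [f(xi, w, b) - yi for xi, yi in zip(X, y)]
        grad_w = [sum(e * xi[j] for e, xi in zip(errors, X)) / n
                  for j in range(len(w))]
        grad_b = sum(errors) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Synthetic data (assumption): two standardized, hypothetical features per
# startup, e.g. founder experience and funding raised.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
# Synthetic annotation y: 1 = good investment, 0 = bad investment.
y = [1 if 1.5 * x0 + 1.0 * x1 + random.gauss(0, 0.5) > 0 else 0
     for x0, x1 in X]

w, b = train(X, y)

# Continuous output: success probability for each startup.
probs = [f(x, w, b) for x in X]
# Discrete output: good (1) vs. bad (0), using a 0.5 threshold.
labels = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(int(l == t) for l, t in zip(labels, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

The same trained function yields both output types mentioned in the caption: `probs` is the continuous success probability and `labels` is the discrete good-vs-bad recommendation. Accuracy is measured on the training data here only to keep the sketch short; a real evaluation would use held-out data.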

In conventional statistical research, good research hypotheses need to be simple, concise, precise, and testable; most importantly, they should be grounded in past knowledge, gained from the literature review or from theory [34]. Therefore, it is not an easy task to come up with good hypotheses. Over the last few years, researchers have started investigating the possibility of performing hypothesis mining from data using ML algorithms, to avoid manually defining hypotheses upfront. Hypothesis mining aims to summarize (instead of manually define) hypotheses by carrying out explainability analysis (cf. Section 9) on the trained ML models [35]. For example, with a labeled dataset (i.e. one where it is known which startups eventually become unicorns) containing many attributes for many companies, one can directly start by training an ML model to predict unicorns (i.e. the prediction target) using the entire dataset (all companies and attributes). By explaining and quantifying how a change in certain attributes would change the prediction target, one may distil hypotheses that describe the relation between the attributes in scope and the prediction target. In comparison to exploratory data analysis, hypothesis mining is a much more structured procedure that trains an ML model using the entire dataset at hand. As illustrated in Figure 1, the ML-based approaches [35–55] require practitioners to define the input data $\mathbf{x}$ and annotation $y$ (labeling good or bad investment according to some criteria) before training a model $f(\cdot)$ that maps $\mathbf{x}$ to $y$, i.e. $y = f(\mathbf{x})$. There are already a
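The hypothesis-mining procedure described above (train a model on the full labeled dataset, then quantify how changing an attribute changes the prediction) can be sketched as follows. This is a minimal illustration under stated assumptions: the model is a plain logistic regression, the explainability step is a simple perturbation-based sensitivity analysis (one of many possible techniques, and not necessarily the one used in [35]), and the attribute names and synthetic data are hypothetical.

```python
import math
import random

# Hypothetical attribute names; the third has no effect by construction.
ATTRS = ["founder_experience", "funding_raised", "unrelated_noise"]


def predict(x, w, b):
    """Predicted probability that a company becomes a unicorn."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, lr=0.1, epochs=300):
    """Fit a logistic regression by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [predict(xi, w, b) - yi for xi, yi in zip(X, y)]
        grad_w = [sum(e * xi[j] for e, xi in zip(errors, X)) / n
                  for j in range(len(w))]
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * sum(errors) / n
    return w, b


def mean_effect(X, w, b, j, delta=1.0):
    """Average change in prediction when attribute j is raised by delta."""
    total = 0.0
    for x in X:
        shifted = list(x)
        shifted[j] += delta
        total += predict(shifted, w, b) - predict(x, w, b)
    return total / len(X)


# Synthetic labeled dataset (assumption): standardized attributes, with the
# label driven by the first two attributes only.
random.seed(1)
X = [[random.gauss(0, 1) for _ in ATTRS] for _ in range(300)]
y = [1 if 2.0 * x[0] + 1.0 * x[1] + random.gauss(0, 0.5) > 0 else 0
     for x in X]

# Step 1: train on the entire dataset (all companies and attributes).
w, b = train(X, y)

# Step 2: quantify how a change in each attribute moves the prediction.
effects = {name: mean_effect(X, w, b, j) for j, name in enumerate(ATTRS)}

# Step 3: rank attributes; the largest effects suggest candidate hypotheses.
for name, eff in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {eff:+.3f}")
```

An attribute with a large positive effect would suggest a candidate hypothesis such as "higher founder experience increases the likelihood of success", which can then be examined further. The sensitivity scores describe the trained model's behaviour, not causality in the real world.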
