Learnware: Small Models Do Big
Zhi-Hua Zhou, Zhi-Hao Tan
National Key Laboratory for Novel Software Technology
Nanjing University, Nanjing 210023, China
zhouzh@nju.edu.cn
Abstract
There are complaints about current machine learning techniques, such as the requirement of
a huge amount of training data and proficient training skills, the difficulty of continual
learning, the risk of catastrophic forgetting, and the leakage of private/proprietary data.
Most research efforts have focused on one of these issues separately, paying less attention
to the fact that most of them are entangled in practice. The prevailing big model paradigm,
which has achieved impressive results in natural language processing and computer vision
applications, has not yet addressed these issues, while itself becoming a serious source of
carbon emissions. This article offers an overview of the learnware paradigm, which attempts
to free users from having to build machine learning models from scratch, in the hope of
reusing small models to do things even beyond their original purposes. The key ingredient
is the specification, which enables a trained model to be adequately identified for reuse
according to the requirements of future users who know nothing about the model in advance.
1. Introduction
Machine learning has achieved great success, yet there are many complaints about the
requirement of a huge amount of training data (particularly labeled data), the difficulty
of adapting a trained model to changing environments, and the embarrassment of catastrophic
forgetting when a trained model must be refined incrementally. Great efforts have been made,
such as weakly supervised learning [29] trying to reduce the requirement of labeled training
data, open-environment machine learning [30] trying to enable learning models to adapt to
changing environments, and continual learning [4] trying to help deep neural networks resist
forgetting; however, these issues are still far from solved.
Indeed, most efforts have focused on one of these issues separately, paying less attention
to the fact that most of them are entangled in practice. For example, a well-studied
technique in weakly supervised learning for reducing the requirement of labeled training
data is to collect and exploit a huge amount of unlabeled data drawn from the same
distribution as the labeled training data, overlooking the fact that in changing
environments the data distributions are inherently subject to change. For another example,
an effective approach to coping with changing environments is to emphasize data received in
very recent timeslots, since the changes have not yet caused significant differences; yet
this emphasis on very recent data may aggravate the severity of catastrophic forgetting.
There are many other issues. For example, most ordinary users can hardly produce
well-performed models from scratch, due to the lack of proficient training skills; in many
real-world tasks, data privacy/propriety concerns may prevent data sharing, making it
difficult to share experience among different users; and in really big data applications, it
is generally unaffordable or even infeasible to hold the whole dataset to support many
passes of scanning.
The prevailing deep learning big model paradigm, which has achieved impressive results in
natural language processing and computer vision applications [20, 3], has not yet addressed
the above issues. Note that each big model is targeted at a task (or task class) planned in
advance and is generally of little help to other tasks; e.g., a big model trained for face
recognition can hardly be helpful for financial futures trading. It would be too ambitious
to build a pre-trained big model for every possible task, because the number of possible
tasks can be unimaginably big or even infinite. In addition, sadly, the training of big
models is becoming a serious source of carbon emissions threatening our environment.
While admitting the usefulness of big models for their specifically targeted tasks, is there
any paradigm that offers the possibility of tackling the above issues simultaneously?
This article overviews the progress of learnware, a paradigm offering a promising answer to
the above question. It attempts to systematically reuse small models to do things that may
even be beyond their original purposes, and to free users from having to build their machine
learning models from scratch.
2. The Learnware Proposal
The learnware paradigm was proposed in [28]. A learnware is a well-performed trained
machine learning model with a specification which enables it to be adequately identified
for reuse according to the requirements of future users who know nothing about the learnware
in advance.
The developer or owner¹ of a trained machine learning model (no matter whether the
model is a deep neural network, a support vector machine, or a decision tree, etc.) can
spontaneously submit her trained model to a learnware market. If the learnware market
decides to accept the model, it assigns a specification to the model and accommodates it
in the market. The learnware market should not be small, as otherwise it can hardly offer
help for various tasks; it would be common for it to accommodate thousands or millions of
well-performed models submitted by different developers, for different tasks, using
different data, optimizing different objectives, etc.
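As a hedged illustration of this developer-side workflow, the following Python sketch shows how a trained model might be packaged with a specification and submitted to a market. The `Specification`, `Learnware`, and `LearnwareMarket` names are illustrative assumptions rather than an actual API; in particular, a real market would construct a much richer specification than the simple metadata placeholder shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Specification:
    """Illustrative specification: task metadata plus a statistical summary
    of the training data (represented here only as a placeholder dict)."""
    task_description: str
    input_dim: int
    output_type: str                      # e.g., "classification" or "regression"
    data_summary: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Learnware:
    """A learnware bundles a well-performed trained model with its specification."""
    model: Any                            # a neural net, SVM, decision tree, ...
    spec: Specification


class LearnwareMarket:
    """Hypothetical market that accepts models submitted by developers."""

    def __init__(self) -> None:
        self.learnwares: List[Learnware] = []

    def submit(self, model: Any, spec: Specification) -> bool:
        # A real market would verify model quality and build/refine the
        # specification itself before accommodating the model.
        self.learnwares.append(Learnware(model=model, spec=spec))
        return True
```

A developer who has trained, say, a scikit-learn classifier could then call `market.submit(clf, spec)`; no training data is shared at any point.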
Once the learnware market has been built, a user who is going to tackle a machine learning
task can proceed in the following way rather than building her model from scratch. As the
comic in Figure 1 illustrates, she can submit her requirement to the learnware market, and
the market will then identify and deploy some helpful learnware(s) by considering the
learnware specifications. The learnware can be applied by the user directly, adapted/polished
with the user's own data for better usage, or exploited in other ways to help improve the
model built from the user's own data. No matter which mechanism for model reuse is adopted,
the whole process can be much less expensive and more efficient than building a model from
scratch by herself.
¹ There are situations where the developer and owner of a trained machine learning model are different.
Here, for simplicity, we do not distinguish them and assume that the developer holds all rights to the model.
Figure 1: An analogy of learnware.
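Continuing the hypothetical sketch above, the user-side workflow in Figure 1 might look as follows: the user describes her requirement, the market identifies matching learnwares via their specifications, and the user applies a returned learnware directly or polishes it with her own small dataset. The matching rule below is a deliberately naive placeholder for whatever specification-matching mechanism a real market would use.

```python
def match_score(spec: Specification, requirement: Dict[str, Any]) -> float:
    """Toy matching rule: a real market would compare statistical summaries
    of data distributions, not just simple metadata."""
    same_task = spec.task_description == requirement.get("task_description")
    same_dim = spec.input_dim == requirement.get("input_dim")
    return float(same_task and same_dim)


def identify(market: LearnwareMarket, requirement: Dict[str, Any], top_k: int = 3):
    """Identification step: rank accommodated learnwares by how well their
    specifications match the user's requirement."""
    ranked = sorted(market.learnwares,
                    key=lambda lw: match_score(lw.spec, requirement),
                    reverse=True)
    return ranked[:top_k]


def reuse(learnware: Learnware, X_user, y_user=None):
    """Apply the learnware directly, or 'polish' it with the user's own small
    dataset when labels are available and the model supports further fitting."""
    if y_user is not None and hasattr(learnware.model, "fit"):
        learnware.model.fit(X_user, y_user)
    return learnware.model.predict(X_user)
```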
The learnware proposal offers the possibility of addressing most of the issues raised in
Section 1:
Lack of training data: Strong machine learning models can be attained even for tasks
with small data, because the models are built upon well-performed learnwares, and in most
cases only a small amount of data is needed for adaptation or refinement.
Lack of training skills: Strong machine learning models can be attained even by ordinary
users with little training skill, because the users can get help from well-performed
learnwares rather than building a model from scratch by themselves.
Catastrophic forgetting: A learnware will always be accommodated in the learnware
market once it is accepted, unless every aspect of its function can be replaced by other
learnwares. Thus, the old knowledge in the learnware market is always retained; nothing is
forgotten.
Continual learning: The learnware market naturally realizes continual and lifelong
learning, because with the constant submission of well-performed learnwares trained on
diverse tasks, the knowledge held in the learnware market is continually enriched.
Data privacy/proprietary: Developers only submit their models without sharing their own
data, and thus data privacy and proprietary information can be well preserved. Although one
cannot deny the possibility of reverse-engineering the models, the risk is small compared
with many other privacy-preserving solutions.
Unplanned tasks: The learnware market is open to all legal developers. Thus, helpful
learnwares are likely to exist in the market unless a task is new to all of them. Moreover,
some new tasks, though no developer has built models for them specifically, could be
addressed by selecting and assembling some existing learnwares (a minimal assembling sketch
is given after this list).
Carbon emission: Assembling small models may offer good-enough performance for most
applications; thus, there may be less interest in training too many big models. The
possibility of reusing other developers' models can help reduce repetitive development.
Besides, a not-so-good model for one user may be very helpful for another user, so no
training cost is wasted.
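As a minimal illustration of the "unplanned tasks" point above, the sketch below assembles several identified learnwares by majority voting or averaging. This is only one conceivable reuse mechanism, chosen here for illustration rather than prescribed by the learnware proposal, and it again builds on the hypothetical classes defined earlier.

```python
from collections import Counter

import numpy as np


def assemble_predict(learnwares, X, task: str = "classification"):
    """Combine several learnwares for a task none of them was built for
    specifically: averaging for regression, per-sample majority voting for
    classification."""
    all_preds = np.array([lw.model.predict(X) for lw in learnwares])
    if task == "regression":
        return all_preds.mean(axis=0)
    # all_preds has shape (n_learnwares, n_samples); vote over each column
    return np.array([Counter(column).most_common(1)[0][0]
                     for column in all_preds.T])
```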
Though the learnware proposal shows a bright future, much work remains to be done to make
it a reality. Sections 3-5 will present some of our progress.
3. The Design
There are three important entities: developers, users, and the market. The developers are
usually machine learning experts who produce and want to share/sell their well-performed
trained machine learning models. The users need machine learning services but usually
have only limited data and lack machine learning knowledge and skills. The learnware
market accepts/buys well-performed trained models from developers, accommodates them