Tolstikhin et al. proposed the Wasserstein Autoencoder (WAE), which minimizes a penalized form of the Wasserstein distance between the model distribution and the target data distribution [7]. WAE shares many of the properties of VAEs, such as stable training, an encoder-decoder architecture, and a nice latent manifold structure, while generating samples of better quality as measured by the FID score.
A generative adversarial network (GAN) is a class of machine learning frameworks designed by Goodfellow et al. in 2014 [2]. In a GAN, the generative model learns to map from a latent space to a data distribution of interest, while the discriminative model distinguishes candidates produced by the generator from samples of the true data distribution. The generative network's training objective is to increase the error rate of the discriminative network. Generative adversarial networks have applications in many fields such as fashion, art, advertising, science, video games, and audio synthesis. There is a veritable zoo of GAN variants. Conditional GANs [2] are similar to standard GANs, except that they allow the model to generate samples conditioned on additional information. For example, if we want to generate a cat face given a dog picture, we could use a conditional GAN. The GAN game is a general framework and can be run with any reasonable parametrization of the generator $G$ and discriminator $D$. In the original paper, the authors demonstrated it using multilayer perceptrons and convolutional neural networks. Many alternative architectures have been tried, such as the deep convolutional GAN [6], the self-attention GAN [1], and Flow-GAN [3].
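The GAN game described above can be made concrete in a few lines. The following is a minimal PyTorch sketch of one training step using small multilayer perceptrons for $G$ and $D$; all architecture and optimizer choices (layer widths, latent dimension, learning rates) are illustrative assumptions, not a prescription from [2] or from this paper.

```python
# Minimal sketch of one GAN training step with MLP generator/discriminator.
# All sizes and optimizer settings are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # e.g. flattened 28x28 images (assumption)

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))  # outputs logits

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real_batch):
    b = real_batch.size(0)
    # Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
    fake = G(torch.randn(b, latent_dim)).detach()
    loss_d = (bce(D(real_batch), torch.ones(b, 1))
              + bce(D(fake), torch.zeros(b, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: increase the discriminator's error rate by pushing
    # D(G(z)) toward 1 (the non-saturating form of the original objective).
    loss_g = bce(D(G(torch.randn(b, latent_dim))), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

Calling gan_step repeatedly on batches of real data alternates one discriminator update with one generator update, the alternating scheme used in [2].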
1.2 Motivations
Some recent variants of GAN, such as the conditional GAN, allow the use of multiple data distributions when generating new ones. However, these variants require at least two different training sets to generate a new one. In many practical applications, we would like to generate a new dataset which has the same characteristics as a single reference one. In this work, we aim to develop a new variant of GAN which can perform this task. Our work is motivated by applications in generating new kinds of rice which have characteristics similar to those of a good reference rice.
More specifically, assume that we have $L$ datasets with unknown distributions $p_1, p_2, \dots, p_L$ for some $L \ge 1$. We aim to generate a new dataset which has a different distribution from the training datasets. In addition, the Jensen-Shannon divergence between the distribution $p_g$ of the generated dataset and a mixture of the data distributions can be controlled, i.e., $\mathrm{JSD}\big(\sum_{l=1}^{L} \alpha_l p_l, p_g\big) \le \delta$ for some given non-negative tuple $(\alpha_1, \alpha_2, \dots, \alpha_L)$ satisfying $\sum_{l=1}^{L} \alpha_l = 1$ and some $\delta \in [0,1]$. For $L = 1$, our algorithm generates a new dataset such that the Jensen-Shannon divergence between the distributions of the generated and the training data is upper bounded by some target $\delta \in [0,1]$.
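To make this criterion concrete, the toy sketch below (our own illustration with made-up discrete distributions, not part of the proposed algorithm) evaluates $\mathrm{JSD}\big(\sum_{l=1}^{L} \alpha_l p_l, p_g\big)$ and checks it against a target $\delta$; using base-2 logarithms keeps the divergence in $[0,1]$, matching the range of $\delta$.

```python
# Toy check of JSD(sum_l alpha_l * p_l, p_g) <= delta for discrete
# distributions; all numbers below are made up for illustration.
import numpy as np

def kl(p, q):
    # KL divergence in bits, with the 0 * log(0/q) = 0 convention.
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def jsd(p, q):
    # Jensen-Shannon divergence; lies in [0, 1] with base-2 logarithms,
    # matching the range required of delta.
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# L = 2 hypothetical data distributions and mixture weights summing to 1.
p1 = np.array([0.7, 0.2, 0.1])
p2 = np.array([0.1, 0.3, 0.6])
alpha = np.array([0.5, 0.5])
mixture = alpha[0] * p1 + alpha[1] * p2

p_g = np.array([0.4, 0.3, 0.3])   # a candidate generated distribution
delta = 0.1
print(jsd(mixture, p_g), jsd(mixture, p_g) <= delta)
```

In the actual setting the distributions are high-dimensional and accessible only through samples, so this divergence must be estimated rather than computed in closed form; the sketch only illustrates the inequality being enforced.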
This additional “controllable property” is very important in many applications. For example, we sometimes need to generate images of a new kind of cat which retains most of the properties of an existing kind. In many other applications, we can increase the number of newly generated images by relaxing the requirement on the distance between the data distribution and the generated distribution, compared with the standard GAN or conditional GANs.
1.3 Contributions
Our main contributions include:
• We develop a new technique which allows us to control the total variation distance between the distributions of the random vectors $x$ and $y$, where $y = x + z$ and $z$ is a sparse random vector with a fixed distribution.
• We propose a mechanism which allows us to loosen the Jensen-Shannon divergence between the generated distribution and the data distribution in Goodfellow et al.'s model [2].
• We extend this new model to allow the use of multiple data distributions, as in the conditional GAN.
• We illustrate our ideas on the CIFAR-10 and CIFAR-100 datasets, and generate new datasets based on only one dataset or a mixture of these two datasets for different values of $\delta$.