Data types as a more ergonomic frontend for Grammar-Guided Genetic Programming_2

2025-04-27 0 0 576.83KB 9 页 10玖币

侵权投诉

Data types as a more ergonomic frontend for

Grammar-Guided Genetic Programming

Guilherme Espada

gjespada@ciencias.ulisboa.pt

LASIGE, Faculdade de Ciências da

Universidade de Lisboa

Portugal

Leon Ingelse

lingelse@lasige.di.fc.ul.pt

LASIGE, Faculdade de Ciências da

Universidade de Lisboa

Portugal

Paulo Canelas

pacsantos@ciencias.ulisboa.pt

School of Computer Science

Carnegie Mellon University

USA

LASIGE, Faculdade de Ciências da

Universidade de Lisboa

Portugal

Pedro Barbosa

psbarbosa@ciencias.ulisboa.pt

LASIGE, Faculdade de Ciências da

Universidade de Lisboa

Portugal

Instituto de Medicina Molecular

Portugal

Alcides Fonseca

alcides@ciencias.ulisboa.pt

LASIGE, Faculdade de Ciências da

Universidade de Lisboa

Portugal

Abstract

Genetic Programming (GP) is an heuristic method that can

be applied to many Machine Learning, Optimization and

Engineering problems. In particular, it has been widely used

in Software Engineering for Test-case generation, Program

Synthesis and Improvement of Software (GI).

Grammar-Guided Genetic Programming (GGGP) approaches

allow the user to rene the domain of valid program solu-

tions. Backus Normal Form is the most popular interface for

describing Context-Free Grammars (CFG) for GGGP. BNF

and its derivatives have the disadvantage of interleaving the

grammar language and the target language of the program.

We propose to embed the grammar as an internal Domain-

Specic Language in the host language of the framework.

This approach has the same expressive power as BNF and

EBNF while using the host language type-system to take

advantage of all the existing tooling: linters, formatters, type-

checkers, autocomplete, and legacy code support. These tools

have a practical utility in designing software in general, and

GP systems in particular.

Permission to make digital or hard copies of all or part of this work for

personal or classroom use is granted without fee provided that copies are not

made or distributed for prot or commercial advantage and that copies bear

this notice and the full citation on the rst page. Copyrights for components

of this work owned by others than ACM must be honored. Abstracting with

credit is permitted. To copy otherwise, or republish, to post on servers or to

redistribute to lists, requires prior specic permission and/or a fee. Request

permissions from permissions@acm.org.

Conference acronym ’XX, June 03–05, 2018, Woodstock, NY

ACM ISBN 978-1-4503-XXXX-X/18/06... $15.00

hps://doi.org/XXXXXXX.XXXXXXX

We also present Meta-Handlers, user-dened overrides

of the tree-generation system. This technique extends our

object-oriented encoding with more practicability and ex-

pressive power than existing CFG approaches, achieving the

same expressive power of Attribute Grammars, but without

the grammar vs target language duality.

Furthermore, we evidence that this approach is feasible,

showing an example Python implementation as proof. We

also compare our approach against textual BNF-representations

w.r.t. expressive power and ergonomics. These advantages

do not come at the cost of performance, as shown by our

empirical evaluation on 5 benchmarks of our example imple-

mentation against PonyGE2. We conclude that our approach

has better ergonomics with the same expressive power and

performance of textual BNF-based grammar encodings.

CCS Concepts: •Computer systems organization →Em-

bedded systems

;Redundancy; Robotics;

•Networks →

Network reliability.

Keywords:

Grammar-guided Genetic Programming, Strongly-

Typed Genetic Programming, Genetic Programming Frame-

work

ACM Reference Format:

Guilherme Espada, Leon Ingelse, Paulo Canelas, Pedro Barbosa,

and Alcides Fonseca. 2018. Data types as a more ergonomic frontend

for Grammar-Guided Genetic Programming. In Proceedings of Make

sure to enter the correct conference title from your rights conrmation

emai (Conference acronym ’XX). ACM, New York, NY, USA, 9 pages.

hps://doi.org/XXXXXXX.XXXXXXX

arXiv:2210.04826v1 [cs.PL] 10 Oct 2022

Conference acronym ’XX, June 03–05, 2018, Woodstock, NY Espada et al.

1 Introduction

Genetic Programming (GP) has been applied to several do-

mains [

], including test-case generation [

], program syn-

thesis [

] and automatic optimization of programs [

Being a search-based method, GP often requires the user to

encode domain-specic restrictions to the possible solutions.

Throughout the Genetic Programming procedure, many

solutions (also called programs or trees) are generated, mod-

ied and combined in order to search the one that has the

best performance or solves the problem. One of the main

challenges of using GP is to dene the boundaries of the

search-space in a way that is performant and ecient. En-

coding domain-knowledge in the denition of the search

space is desirable, so many candidate, but invalid solutions

are removed from the search.Grammar-Guided and Strongly-

Typed Genetic Programming (GGGP [

] and STGP [

])

are two popular families of approaches to guide the random

solution generation by the user. To restrict valid solutions,

GGGP relies on user-dened grammars while STGP relies

on user-dened type annotations.

Within each family of approaches, there are dierent trade-

os between expressive power and computational complex-

ity. Context-Free Grammars [

] can be used with little over-

head in processing, while Attribute Grammars [

] can en-

code more complex restrictions. Similarly, polymorphic type

systems [

] are more expressive than monomorphic ones,

but they have a higher overhead during the evolutionary

process.

However, in practice, popular implementations of these

systems are limited to Context-Free grammars (CFG) (e.g.,

PonyGE2 [

]) and base types (e.g., DEAP [

] and Jenet-

ics [36]).

Problem Statement

GGGP requires users to express the search-space of valid pro-

grams in the Backus Normal Form (BNF) or one of its deriva-

tions, such as the Extended Backus Normal Form (EBNF).

We consider BNF and its derivatives to be a textual editing

medium of the target language, as concepts in the target

language (like types, functions, literals) have no meaning in

the grammar language. As such, users are treating the target

language as character streams, instead of treating them as

semantic code units, as they do in code editors.

The BNF approach is not ergonomic nor error-prone due

to the interleaving of two languages: the BNF meta-syntax

and the syntax of the target language, i.e., the language of

the generated programs. As an example, Listing 1 depicts

two common errors that both novices and experts do while

writing grammars. These are not detected immediately, and

are typically only revealed during execution.

Ideally, these two bugs would be detected immediately in

most development environments: the type-checker would

identify a missing import, and the type mismatch in the

1<S> ::= <S> "+" <S> | "x" | 1 | np.array([<A>])

2<A> ::= [0-9]+,<A> | [0-9]+

Listing 1.

Example of a grammar with two possible bugs:

the

module may not be imported, and the

operator may

be applied to a string and integer, causing a crash.

1<Real> ::= <RealLit> | DATASET[GE_RANGE:3] | <

RealLit> + <RealLit> | np.average(<Vector>)

2<RealLit> ::= <d>.<d>

3<d> ::= GE_RANGE:10<d> |

4<Vector> ::= DATASET[3 + GE_RANGE:3]

Listing 2. PonyGE2 grammar for a subset of VectorialGP

operator. Is it really necessary that, to implement a grammar,

developers have to lose access to an array of tooling that

has shown to be useful in software engineering? Or is there

a more ergonomic interface that is integrated into the host

language ecosystem, providing access to code editors, syntax

highlighting, linting, formatting, test suites, etc...?

If there is, such an approach could reduce the barrier to

adoption of GGGP, STGP, and similar systems, increasing

the productivity of practitioners and reducing the number of

possible bugs that are not caught earlier due the mismatch

between the host language and the grammar language.

It is important to notice that usability has been identied

as a relevant open challenge for GP both in 2010 [

], and

again in 2020, when it was identied as one of the 10 major

challenges for the following 15 years [

]. This work ad-

dresses exactly that problem, especially in the case where

practitioners encode domain information to improve the

performance of GP.

Approach

Our rst contribution is an object-oriented encoding of gram-

mars to be used in GGGP, regardless of representation. In

short, non-terminals are represented as abstract classes (or

interfaces, or traits) and productions are represented as con-

crete classes that implement the corresponding abstract class.

As a running example, we will use grammars to describe

the VectorialGP example in Azzali et al

. [2]

. Listing 2 de-

scribes that subset using a PonyGE2’s avor of BNF, while

Listing 3 encodes the same grammar using the proposed

approach, both implemented in Python.

There is a direct correspondence between the abstract and

concrete types, and the non-terminals and productions of

the grammar. This direct correspondence is an advantage of

this approach, that does not add more cognitive overhead

when using the Object-Oriented encoding.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

DatatypesasamoreergonomicfrontendforGrammar-GuidedGeneticProgrammingGuilhermeEspadagjespada@ciencias.ulisboa.ptLASIGE,FaculdadedeCiênciasdaUniversidadedeLisboaPortugalLeonIngelselingelse@lasige.di.fc.ul.ptLASIGE,FaculdadedeCiênciasdaUniversidadedeLisboaPortugalPauloCanelaspacsantos@ciencias.ulisboa....

展开>> 收起<<

Data types as a more ergonomic frontend for Grammar-Guided Genetic Programming_2.pdf

共9页,预览2页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Data types as a more ergonomic frontend for Grammar-Guided Genetic Programming_2

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: