ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications

Alex Gu gua@mit.edu

Tamara Mitrovska* tamaram@mit.edu

Daniela Velez* dvelez@mit.edu

Jacob Andreas jda@mit.edu

Armando Solar-Lezama asolar@csail.mit.edu

Massachusetts Institute of Technology, Cambridge, MA, USA

Abstract

We introduce ObSynth, an interactive system leveraging the domain knowledge embedded in large language models (LLMs) to help users design object models from high-level natural language prompts. This is an example of specification reification, the process of taking a high-level, potentially vague specification and reifying it into a more concrete form. We evaluate ObSynth via a user study, leading to three key findings. First, object models designed using ObSynth are more detailed, showing that it often synthesizes fields users might have otherwise omitted. Second, a majority of objects, methods, and fields generated by ObSynth are kept by the user in the final object model, highlighting the quality of generated components. Third, ObSynth altered the workflow of participants: they focused on checking that synthesized components were correct rather than generating them from scratch, though ObSynth did not reduce the time participants took to generate object models.

1 Introduction

Recent years have seen several applications of large language models (LLMs) to support software development. For example, GitHub’s Copilot has demonstrated the potential of LLMs to help programmers during the development process, and AlphaCode (Li et al., 2022) has demonstrated the possibility of solving programming competition problems using LLMs. However, both of these systems focus on very local problems: writing the next few lines of code, or a single self-contained algorithm. Creating software requires much more than implementing functions from well-defined specifications. In particular, an important part of software development is leveraging domain knowledge to turn high-level application requirements into a detailed description of all the components and interfaces that will make up the application.

We propose this specification reification task as a new challenge at the intersection of human-computer interaction and program synthesis. Specification reification is the problem of taking a high-level, potentially vague specification of a problem and reifying it into a more concrete form. For example, consider a developer who is designing a classroom management application in an object-oriented language. Existing program synthesis systems can implement specific functions in this application (for example, a function to search for students who have not submitted an assignment), but before a developer gets to that point, they first need to design the application itself. This involves deciding which objects they need, and for each object deciding on its fields and methods, and for each method deciding what its specification should be. Specification reification is about deriving this design from the high-level description of the application. Because of the vague nature of this task, it is important that systems addressing it involve humans in the loop.

*Tamara Mitrovska and Daniela Velez contributed equally and are ordered alphabetically by last name.

arXiv:2210.11468v1 [cs.SE] 20 Oct 2022

Figure 1: A sample object model for a restaurant management application, designed with the help of ObSynth. ObSynth is our interactive tool for designing object models consisting of objects, fields, types, and methods.

In addition to introducing the new task, this paper presents ObSynth, a prototype interactive system for specification reification. ObSynth focuses on a key sub-task of the full specification reification problem: designing an object model that will make up an application from a natural language specification alone. We define an object model to be a set of objects, fields, and methods. Each field has a name and a type; types may be primitives (int, boolean, float, string, datetime), other objects, or lists of either. In our model, each method simply has a name (though we hope to extend this in future work). As an example, consider the specification “I want a restaurant management app tracking customers, their reservations, their orders, and menu items.” An example of an object model created via interaction with ObSynth is shown in Fig. 1. In this example model, the objects are named customer, reservation, order, menu item, and menu. The customer object has a field named address of type string and a field named reservation of type List[reservation], with the list type indicated by the many icon. Two of the customer object’s methods are named makeReservation and updateContactInfo.
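The definition above maps naturally onto a small data structure. The following is a minimal sketch, not ObSynth's actual implementation: the class names `Field`, `ObjectDef`, and `ObjectModel` and the `validate` helper are assumptions introduced here for illustration. Only the primitive type names and the customer example come from the text.

```python
from dataclasses import dataclass, field
from typing import List

# Primitive type names listed in the paper's definition of an object model.
PRIMITIVES = {"int", "boolean", "float", "string", "datetime"}


@dataclass
class Field:
    name: str
    type_name: str          # a primitive name or the name of another object
    is_list: bool = False   # True corresponds to the "many" icon in Fig. 1


@dataclass
class ObjectDef:
    name: str
    fields: List[Field] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)  # methods are names only


@dataclass
class ObjectModel:
    objects: List[ObjectDef] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return one message per field whose type is neither a primitive
        nor the name of an object declared in this model."""
        known = PRIMITIVES | {obj.name for obj in self.objects}
        return [
            f"{obj.name}.{f.name}: unknown type {f.type_name!r}"
            for obj in self.objects
            for f in obj.fields
            if f.type_name not in known
        ]


# The part of the Fig. 1 customer object that the text describes.
customer = ObjectDef(
    name="customer",
    fields=[
        Field("address", "string"),
        Field("reservation", "reservation", is_list=True),
    ],
    methods=["makeReservation", "updateContactInfo"],
)
model = ObjectModel(objects=[customer, ObjectDef(name="reservation")])
```

Representing a list type as a flag on the field, rather than as a separate type, mirrors the one/many distinction in the text; a nested representation would work equally well.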

The design of object models is challenging for a few reasons. First, the initial description is not a complete specification for the object model, so relevant pieces of the object model must be inferred. For example, in Fig. 1, the menu object is not specified in the prompt, nor is the customer object’s phone number field or makeReservation method. The existence of these fields must be inferred by the system from its background knowledge of objects in the world and their relevant attributes. Second, the model must understand the relationship between objects, e.g., that a customer has one name and a list of reservations but no price. Even with a sophisticated language model, getting every detail right in one shot is challenging, so it is important for such a system to be designed for interactivity. This way, the tool can enhance the user’s creativity rather than attempting to substitute for it.

We present ObSynth, our synthesis-based interactive tool for solving this task. First, in Sec. 3, we present the ObSynth UI, describing the workflow when interacting with our system. Next, in Sec. 4, we discuss in depth how we use LLMs to equip ObSynth with automation capabilities. Finally, in Sec. 5, we discuss our user study, highlighting the ways ObSynth improves the process of creating object models. Our contributions are as follows:

1. We introduce and highlight a new task in program synthesis: specification reification. We also introduce the object model synthesis task as an important sub-problem of specification reification.

2. We design an interactive system, ObSynth, that assists humans in completing this task by automating parts of the process. Instead of requiring the user to design the object model purely from scratch, ObSynth synthesizes an initial object model that the user can then build on. Users can also ask ObSynth to automatically add relevant objects, methods, and fields at any point in the design process.

3. We conduct a user study (n = 11) that helps us understand how ObSynth can help participants design better object models. This user study yielded three key findings. First, object models designed using ObSynth are more detailed, showing that it often synthesizes fields users might have otherwise omitted. Second, a majority of objects, methods, and fields generated by ObSynth were kept by the user in the final object model, highlighting the quality of generated components. Third, ObSynth altered the workflow of participants: they focused on checking that synthesized components were correct rather than generating them from scratch. However, ObSynth did not reduce the time participants took to generate object models.

2 Related Work

Program Synthesis: The field of program synthesis has a long history, with a variety of approaches summarized by Gulwani et al. (2017). The earliest approaches mostly focused on inductive synthesis (matching a set of input-output examples), using techniques such as bottom-up search (Alur et al., 2015), top-down search (Feser et al., 2015), type-directed search (Osera and Zdancewic, 2015), and constraint-solving (Singh and Solar-Lezama, 2016). Later, however, richer forms of program specifications were used for synthesis.

In recent years, with new developments in machine learning, a growing number of works have explored the potential of augmenting traditional synthesis techniques with neural networks; Chaudhuri et al. (2021) provide a complete survey. These include approaches to learn abstractions and libraries from scratch (Ellis et al., 2020; Wong et al., 2021; Nye et al., 2020b), execution-guided approaches that evaluate partial program states (Nye et al., 2020a; Gupta et al., 2020; Chen et al., 2018), and approaches guided by natural language information (Wong et al., 2021; Ye et al., 2020b,a; Nye et al., 2019; Polosukhin and Skidanov, 2018).

Ontologies and Knowledge Graphs: There has also been a body of work that aims to build ontologies and knowledge graphs of natural language concepts, such as Yago (Suchanek et al., 2007), WordNet (Miller, 1995), and DBpedia (Auer et al., 2007). While these knowledge graphs have been applied in traditional NLP tasks such as question answering (Boiński et al., 2020), they are unable to provide specific insights for our synthesis task such as synthesizing fields for a certain object. As an example, when searching for nearest neighbors related to student, WordNet comes up with synonyms such as pupil, educatee, and scholar, while Yago provides a Wikipedia page for a student, a definition of a student in Spanish, and an image containing many students. In addition, our synthesis task is very contextual: the fields of a student object would be very different if we were designing an app for teachers to manage the classroom vs. a social app for students to make friends with one another. It is difficult to capture this form of context via ontologies and knowledge graphs.

Large Language Models: Recent years have also seen new works leveraging large language models (LLMs) like GPT-3 (Brown et al., 2020) to perform program synthesis. A few months ago, GitHub released a powerful code autocompletion tool called GitHub Copilot, which uses context such as natural language comments and previous code in order to generate code. Copilot is built on OpenAI’s machine learning model Codex (Chen et al., 2021), which translates natural language to code in almost a dozen programming languages. CodeBERT (Feng et al., 2020) learns representations of code and natural language for downstream tasks like code search and code documentation generation. Heyman et al. (2021) use GPT-2 trained on a corpus of well-documented and commented code to synthesize programs for data science and machine learning. Building on LLMs, Austin et al. (2021) incorporate human feedback to repair generated code.

There have been other works combining traditional program synthesis techniques with large language models. Verbruggen et al. (2021) use traditional inductive synthesis techniques with GPT-3 to learn small intermediate functions that cannot be represented symbolically. Jigsaw (Jain et al., 2021) uses LLMs to synthesize code but uses program analysis techniques for post-processing. Rahmani et al. (2021) take a component-based synthesis approach guided by LLMs which, for example, help rank candidate programs. Our approach differs from existing works on LLMs in that we approach synthesis from a global view, generating the overall structure of applications rather than the local structure of code itself.

3 The ObSynth Frontend

In this section, we describe our vision of how users interact with ObSynth to generate a final specification from a text prompt. Fig. 2 shows the steps of the ObSynth workflow at a high level, while Fig. 3 shows the concrete UI users work with at each step. In Sec. 3.1, we explain steps (1)-(4), where the user specifies an initial text prompt and works with ObSynth to obtain an initial full object model. Then, in Sec. 3.2, we explain step (5), where users use ObSynth to tweak this object model to fit their use case.

3.1 From Text to Initial Object Model

The user starts by specifying a text prompt as shown in Fig. 3, step (1). As a running example, we use the prompt “I want a restaurant management app tracking customers, their reservations, their orders, and menu items.” When the user enters this prompt, ObSynth synthesizes a list of object names without fields or methods (in this case customer, reservation, order, menu item, table, and waiter), as shown in Fig. 3, step (2). The user can then add, edit, or delete these object names in the same UI. ObSynth also provides an Auto Add Object button (shown in purple), which attempts to synthesize additional relevant objects. This feature could suggest objects that the user may not have thought of themselves. After the user has specified a list of object names, ObSynth automatically synthesizes a set of fields, types, and method names for each object, as seen in step (4). This generates an initial object model specification for the text prompt.
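The flow described in this subsection can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function `build_initial_model`, its three callback parameters, and the canned stand-ins below are hypothetical names introduced here, and the canned outputs are placeholders rather than anything ObSynth produces. In the real system, the two synthesis steps are backed by an LLM (Sec. 4) and the review step is the user acting in the UI.

```python
from typing import Callable, Dict, List, Tuple

# One object's members: (field name, type name, is_list) triples plus method names.
Members = Tuple[List[Tuple[str, str, bool]], List[str]]


def build_initial_model(
    prompt: str,
    propose_objects: Callable[[str], List[str]],
    review_names: Callable[[List[str]], List[str]],
    propose_members: Callable[[str, str], Members],
) -> Dict[str, Members]:
    """Run the text-to-initial-model flow with pluggable (hypothetical) steps."""
    names = propose_objects(prompt)   # step (2): object names only
    names = review_names(names)       # the user adds, edits, or deletes names
    # step (4): fields, types, and method names for every remaining object
    return {name: propose_members(prompt, name) for name in names}


# Canned stand-ins so the sketch runs without a language model.
def canned_objects(prompt: str) -> List[str]:
    return ["customer", "reservation", "order", "menu item", "table", "waiter"]


def drop_waiter(names: List[str]) -> List[str]:
    # Hypothetical user edit: delete one suggested object.
    return [n for n in names if n != "waiter"]


def canned_members(prompt: str, name: str) -> Members:
    # Placeholder output, not what ObSynth actually generates.
    return ([("name", "string", False)], [])


PROMPT = (
    "I want a restaurant management app tracking customers, "
    "their reservations, their orders, and menu items."
)
initial_model = build_initial_model(PROMPT, canned_objects, drop_waiter, canned_members)
```

Passing the steps in as callables keeps the user's review between the two synthesis calls, which is the point of the design: object names are settled before any fields or methods are generated.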

3.2 From Initial Object Model to Final Specification

Since different users may have different use cases, ObSynth gives users the flexibility to edit the object model as desired. After generating a full object model, users see the UI shown in the bottom panel of Fig. 3. We first explain the UI features (grey buttons), and then move to the synthesis features (blue buttons).

Figure 2: Complete workflow of a user synthesizing an object model from a text specification using ObSynth. Orange indicates user interaction, blue indicates ObSynth automation.

ObSynth UI features: As shown in Fig. 3, ObSynth’s UI allows users to easily add, delete, and edit their own objects, fields, and methods at all stages of the process. Deleted objects, fields, and methods can also be restored. Users can toggle the multiplicity of an object field by clicking the one/many button. ObSynth also has a one-way/two-way button that adds reverse object-field relationships: if a student object has a teacher field and there is a teacher object, the button will toggle whether the teacher object has a student field. Finally, ObSynth ensures that when the user changes the name of an object, all fields having that object type are updated to the new name. All these buttons are shown in grey in Fig. 3.
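The three editing behaviours described above (one/many, one-way/two-way, and rename propagation) can be sketched as operations on a simple in-memory model. This is a hedged illustration, not ObSynth's code: the `Field` and `ObjectDef` classes and the three function names are assumptions, as are two details the text leaves open, namely that the reverse field is named after the source object and that a rename updates a field's type reference while leaving the field's own name unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Field:
    name: str
    type_name: str
    is_list: bool = False


@dataclass
class ObjectDef:
    name: str
    fields: List[Field] = field(default_factory=list)


Model = Dict[str, ObjectDef]  # object name -> definition


def toggle_multiplicity(f: Field) -> None:
    """One/many button: flip a field between a single value and a list."""
    f.is_list = not f.is_list


def toggle_two_way(model: Model, source: str, target: str) -> None:
    """One-way/two-way button: add or remove the reverse field on `target`.

    Naming the reverse field after `source` is an assumption of this sketch.
    """
    target_obj = model[target]
    if any(f.type_name == source for f in target_obj.fields):
        target_obj.fields = [f for f in target_obj.fields if f.type_name != source]
    else:
        target_obj.fields.append(Field(name=source, type_name=source))


def rename_object(model: Model, old: str, new: str) -> None:
    """Rename an object and update every field whose type refers to it."""
    obj = model.pop(old)
    obj.name = new
    model[new] = obj
    for other in model.values():
        for f in other.fields:
            if f.type_name == old:
                f.type_name = new


model: Model = {
    "student": ObjectDef("student", [Field("teacher", "teacher")]),
    "teacher": ObjectDef("teacher"),
}
toggle_two_way(model, "student", "teacher")   # teacher gains a student field
rename_object(model, "student", "pupil")      # that field's type becomes pupil
```

Keeping type references as plain names makes rename propagation a single pass over all fields; a design that stored direct object references would make renaming free but complicate deletion and restoration.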

ObSynth synthesis features: What makes ObSynth unique, however, is its synthesis capabilities, shown as the blue buttons in Fig. 3. When the user clicks Begin, a set of initial object names is automatically synthesized for them. When the user clicks Generate Fields and Methods, fields and methods for each object are likewise automatically populated. The blue Auto Add buttons allow users to synthesize specific parts of the object model: Auto Add Field synthesizes a new field, type, and multiplicity for the current object. Auto Add Method synthesizes a new method name for an existing object. Auto Add Object synthesizes a new relevant object and fully populates it with fields (including types and multiplicity) and method names. These synthesis tools help the user develop a suitable object model for their use case, in particular by suggesting objects, fields, and methods they might have overlooked.

4 The ObSynth Backend

In Sec. 3, we described the frontend of ObSynth and how users can interact with the synthesis features of ObSynth to design an object model. In this section, we go into the precise technical details of how these synthesis components work.
