ObSynth: An Interactive Synthesis System for
Generating Object Models from Natural Language
Specifications
Alex Gu gua@mit.edu
Tamara Mitrovska* tamaram@mit.edu
Daniela Velez* dvelez@mit.edu
Jacob Andreas jda@mit.edu
Armando Solar-Lezama asolar@csail.mit.edu
Massachusetts Institute of Technology, Cambridge, MA, USA
Abstract
We introduce ObSynth, an interactive system leveraging the domain knowledge embedded
in large language models (LLMs) to help users design object models from high-level
natural language prompts. This is an example of specification reification, the process
of taking a high-level, potentially vague specification and reifying it into a more
concrete form. We evaluate ObSynth via a user study, leading to three key findings.
First, object models designed using ObSynth are more detailed, showing that it often
synthesizes fields users might have otherwise omitted. Second, a majority of objects,
methods, and fields generated by ObSynth are kept by the user in the final object
model, highlighting the quality of generated components. Third, ObSynth altered the
workflow of participants: they focused on checking that synthesized components were
correct rather than generating them from scratch, though ObSynth did not reduce the
time participants took to generate object models.
1 Introduction
Recent years have seen several applications of large language models (LLMs) to support software
development. For example, GitHub’s Copilot has demonstrated the potential of LLMs to help
programmers during the development process, and AlphaCode (Li et al., 2022) has demonstrated
the possibility of solving programming competition problems using LLMs. However, both of these
systems focus on very local problems: writing the next few lines of code, or a single self-contained
algorithm. Creating software requires much more than implementing functions from
well-defined specifications. In particular, an important part of software development is leveraging
domain knowledge to turn high-level application requirements into a detailed description of all
the components and interfaces that will make up the application.
We propose this specification reification task as a new challenge at the intersection of human-
computer interaction and program synthesis. Specification reification is the problem of taking a
high-level, potentially vague specification of a problem and reifying it into a more concrete form.
For example, consider a developer who is designing a classroom management application in an
object-oriented language.
*Tamara Mitrovska and Daniela Velez contributed equally and are ordered alphabetically by last name.
arXiv:2210.11468v1 [cs.SE] 20 Oct 2022
Figure 1: A sample object model for a restaurant management application, designed with the
help of ObSynth. ObSynth is our interactive tool for designing object models consisting of objects,
fields, types, and methods.
Existing program synthesis systems can implement specific functions
in this application—for example, a function to search for students who have not submitted an
assignment—but before a developer gets to that point, they first need to design the application
itself. This involves deciding which objects they need, deciding on the fields and methods of
each object, and deciding what the specification of each method should be. Specification
reification is about deriving this design from the high-level description of the application. Because
of the vague nature of this task, it is important that systems addressing these tasks involve humans
in the loop.
In addition to introducing the new task, this paper presents ObSynth, a prototype interactive
system for specification reification. ObSynth focuses on a key sub-task of the full specification
reification problem: designing an object model that will make up an application from a natural
language specification alone. We define an object model to be a set of objects, fields, and methods.
Each field has a name and a type; types may be primitives (int, boolean, float, string, datetime),
other objects, or lists of either. In our model, each method simply has a name (though we hope
to extend this in future work). As an example, consider the specification “I want a restaurant
management app tracking customers, their reservations, their orders, and menu items.” An example of
an object model created via interaction with ObSynth is shown in Fig. 1. In this example model,
the objects are named customer, reservation, order, menu item, and menu. The customer object
has a field named address of type string and a field named reservation of type List[reservation], the
list type indicated by the many icon. Two of the customer object’s methods are named
makeReservation and updateContactInfo.
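The object model definition above can be made concrete with a small data-structure sketch. The Python below is illustrative only and is not ObSynth's implementation: the class names (Field, ObjectDef, ObjectModel), the many flag, and the undefined_types helper are assumptions introduced here. Only the primitive type list and the customer example come from the text.

```python
from dataclasses import dataclass, field
from typing import List

# Primitive types listed in the paper's definition of an object model.
PRIMITIVES = {"int", "boolean", "float", "string", "datetime"}


@dataclass
class Field:
    name: str
    type_name: str       # a primitive, or the name of another object
    many: bool = False   # True means the field holds a list of type_name

    def render_type(self) -> str:
        return f"List[{self.type_name}]" if self.many else self.type_name


@dataclass
class ObjectDef:
    name: str
    fields: List[Field] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)  # methods are names only


@dataclass
class ObjectModel:
    objects: List[ObjectDef] = field(default_factory=list)

    def undefined_types(self) -> List[str]:
        """Return field types that are neither primitives nor known objects."""
        known = PRIMITIVES | {o.name for o in self.objects}
        return sorted(
            {f.type_name for o in self.objects for f in o.fields
             if f.type_name not in known}
        )


# The customer object as described for Fig. 1 (other fields omitted).
customer = ObjectDef(
    name="customer",
    fields=[
        Field("address", "string"),
        Field("reservation", "reservation", many=True),
    ],
    methods=["makeReservation", "updateContactInfo"],
)
model = ObjectModel(objects=[customer, ObjectDef(name="reservation")])
```

A check such as undefined_types is one way to notice a field whose type refers to an object that has not been added to the model yet.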
The design of object models is challenging for a few reasons. First, the initial description is
not a complete specification for the object model, so relevant pieces of the object model must be
inferred. For example, in Fig. 1, the menu object is not specified in the prompt, nor is the customer
object’s phone number field or makeReservation method. The existence of these components must be
inferred by the system from its background knowledge of objects in the world and their relevant
attributes. Second, the model must understand the relationship between objects, e.g., that a customer
has one name and a list of reservations but no price. Even with a sophisticated language
model, getting every detail right in one shot is challenging, so it is important for such a system
to be designed for interactivity. This way, the tool can enhance the user’s creativity rather than
attempting to substitute for it.
We present ObSynth, our synthesis-based interactive tool for solving this task. First, in Sec. 3,
we present the ObSynth UI, describing the workflow when interacting with our system. Next, in
Sec. 4, we discuss in depth how we use LLMs to equip ObSynth with automation capabilities.
Finally, in Sec. 5, we discuss our user study, highlighting the ways ObSynth improves the process
of creating object models. Our contributions are as follows:
1. We introduce and highlight a new task in program synthesis: specification reification. We
also introduce the object model synthesis task as an important sub-problem of specification
reification.
2. We design an interactive system, ObSynth, that assists humans in completing this task by
automating parts of the process. Instead of designing the object model purely from scratch,
ObSynth synthesizes an initial object model that the user can then build on. Users
can also ask ObSynth to automatically add relevant objects, methods, and fields at any point
in the design process.
3. We conduct a user study (n = 11) that helps us understand how ObSynth can help participants
design better object models. Through this user study, we arrived at three key
findings. First, object models designed using ObSynth are more detailed, showing that it
often synthesizes fields users might have otherwise omitted. Second, a majority of objects,
methods, and fields generated by ObSynth were kept by the user in the final object model,
highlighting the quality of generated components. Third, ObSynth altered the workflow of
participants: they focused on checking that synthesized components were correct rather than
generating them from scratch. However, ObSynth did not reduce the time participants took
to generate object models.
2 Related Work
Program Synthesis: The field of program synthesis has a long history, with a variety of approaches
summarized by Gulwani et al. (2017). The earliest approaches mostly focused on inductive
synthesis (matching a set of input-output examples), using techniques such as
bottom-up search (Alur et al., 2015), top-down search (Feser et al., 2015), type-directed search
(Osera and Zdancewic, 2015), and constraint solving (Singh and Solar-Lezama, 2016). Later, however,
richer forms of program specifications were used for synthesis.
In recent years, with new developments in machine learning, a growing body of work has
explored the potential of augmenting traditional synthesis techniques with neural networks;
Chaudhuri et al. (2021) provide a complete survey. These include approaches to learn
abstractions and libraries from scratch (Ellis et al., 2020; Wong et al., 2021; Nye et al., 2020b),
execution-guided approaches that evaluate partial program states (Nye et al., 2020a; Gupta et al.,
2020; Chen et al., 2018), and approaches guided by natural language information (Wong et al.,
2021; Ye et al., 2020b,a; Nye et al., 2019; Polosukhin and Skidanov, 2018).
Ontologies and Knowledge Graphs: There has also been a body of work that aims to build
ontologies and knowledge graphs of natural language concepts, such as Yago (Suchanek et al.,
2007), WordNet (Miller, 1995), and DBpedia (Auer et al., 2007). While these knowledge graphs
have been applied in traditional NLP tasks such as question answering (Boiński et al., 2020),
they are unable to provide specific insights for our synthesis task such as synthesizing fields for a
certain object. As an example, when searching for nearest neighbors related to student, WordNet
comes up with synonyms such as pupil, educatee, and scholar, while Yago provides a Wikipedia
page for a student, a definition of a student in Spanish, and an image containing many students.
In addition, our synthesis task is very contextual: the fields of a student object would be very
different if we were designing an app for teachers to manage the classroom versus a social app for
students to make friends with one another. It is difficult to capture this form of context via
ontologies and knowledge graphs.
Large Language Models: Recent years have also seen new work leveraging large
language models (LLMs) like GPT-3 (Brown et al., 2020) to perform program synthesis. A few
months ago, GitHub released a powerful code autocompletion tool called GitHub Copilot which
uses context such as natural language comments and previous code in order to generate code.
Copilot is built off of OpenAI’s powerful machine learning model Codex (Chen et al., 2021),
which translates natural language to code in almost a dozen programming languages. CodeBERT
(Feng et al., 2020) learns representations of code and natural language for downstream tasks like
code search and code documentation generation. Heyman et al. (2021) use GPT-2 trained on a
corpus of well-documented and commented code to synthesize programs for data science and
machine learning. Building off of LLMs, Austin et al. (2021) incorporate human feedback to
repair generated code.
There have been other works combining traditional program synthesis techniques with large
language models. Verbruggen et al. (2021) use traditional inductive synthesis techniques with
GPT-3 to learn small intermediate functions that cannot be represented symbolically. Jigsaw
(Jain et al., 2021) uses LLMs to synthesize code but uses program analysis techniques for
post-processing. Rahmani et al. (2021) take a component-based synthesis approach guided by LLMs
which, for example, help rank candidate programs. Our approach differs from existing work
on LLMs in that we approach synthesis from a global view, generating the overall structure of
applications rather than the local structure of code itself.
3 The ObSynth Frontend
In this section, we describe our vision of how users interact with ObSynth to generate a final
specification from a text prompt. Fig. 2 shows the steps of the ObSynth workflow at a high level,
while Fig. 3 shows the concrete UI users work with at each step. In Sec. 3.1, we explain steps
(1)-(4), where the user specifies an initial text prompt and works with ObSynth to obtain an initial
full object model. Then, in Sec. 3.2, we explain step (5), where users use ObSynth to tweak this
object model to fit their use case.
3.1 From Text to Initial Object Model
The user starts by specifying a text prompt as shown in Fig. 3, step (1). As a running example,
we use the prompt “I want a restaurant management app tracking customers, their reservations, their
orders, and menu items.” When the user enters this prompt, ObSynth synthesizes a list of object
names without fields or methods—in this case customer, reservation, order, menu item, table, waiter,
as shown in Fig. 3, step (2). The user can then add, edit, or delete these object names in the same
UI. ObSynth also provides an additional functionality, which will attempt to synthesize additional
relevant objects (the purple Auto Add Object button). This feature could potentially suggest objects
that the user may not have thought of themselves. After the user specifies a list of object names, ObSynth
automatically synthesizes a set of fields, types, and method names for each object, as seen in
step (4). This generates an initial object model specification for the text prompt.
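The flow from prompt to initial object model can be sketched as a small pipeline. This is a hedged illustration, not ObSynth's code: run_workflow, stub_object_names, stub_members, and example_review are hypothetical names, and the two stubs stand in for the LLM-backed synthesis whose actual design is described in Sec. 4. The object names returned by the stub are the ones quoted in the running example, and the empty field and method lists are placeholders.

```python
from typing import Callable, Dict, List

Members = Dict[str, List[str]]


def stub_object_names(prompt: str) -> List[str]:
    """Hypothetical stand-in for the synthesis call that proposes object names."""
    return ["customer", "reservation", "order", "menu item", "table", "waiter"]


def stub_members(prompt: str, object_name: str) -> Members:
    """Hypothetical stand-in for field/method synthesis; returns placeholders."""
    return {"fields": [], "methods": []}


def run_workflow(
    prompt: str,
    propose_names: Callable[[str], List[str]],
    review_names: Callable[[List[str]], List[str]],
    propose_members: Callable[[str, str], Members],
) -> Dict[str, Members]:
    names = propose_names(prompt)   # system proposes object names
    names = review_names(names)     # user adds, edits, or deletes names
    # System populates every remaining object with fields and methods.
    return {name: propose_members(prompt, name) for name in names}


def example_review(names: List[str]) -> List[str]:
    """An example user edit: drop 'waiter' and add 'menu'."""
    kept = [n for n in names if n != "waiter"]
    return kept + ["menu"]


PROMPT = (
    "I want a restaurant management app tracking customers, "
    "their reservations, their orders, and menu items."
)
initial_model = run_workflow(PROMPT, stub_object_names, example_review, stub_members)
```

Passing the review step in as a function keeps the human decision between the two automated steps, mirroring the alternation of user interaction and automation in the workflow.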
3.2 From Initial Object Model to Final Specification
Since different users may have different use cases, ObSynth gives users the flexibility to edit the
object model as desired. After generating a full object model, users see the UI shown in the bottom
panel of Fig. 3. We first explain the UI features (grey buttons), and then move to the synthesis
features (blue buttons).
Figure 2: Complete workflow of a user synthesizing an object model from a text specification
using ObSynth. Orange indicates user interaction, blue indicates ObSynth automation.
ObSynth UI features: As shown in Fig. 3, ObSynth’s UI allows users to easily add, delete, and
edit their own objects, fields, and methods at all stages of the process. Deleted objects, fields, and
methods can also be restored. Users can easily toggle the multiplicity of an object field by clicking
the one/many button. ObSynth also has a one-way/two-way button that adds reverse object-field
relationships: if a student object has a teacher field and there is a teacher object, the button will
toggle whether the teacher object has a student field. Finally, ObSynth ensures that when the
user changes the name of an object, all fields of that object type are updated to match. All these
buttons are shown in grey in Fig. 3.
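The one-way/two-way toggle and the rename behaviour can be illustrated with a minimal sketch. It assumes a simplified model represented as a dictionary from object name to a mapping of field name to type name, and it ignores multiplicity. The function names toggle_two_way and rename_object are hypothetical, and naming the reverse field after the owning object is an assumption based on the student/teacher example, not a documented ObSynth rule.

```python
from typing import Dict

# Simplified model: object name -> {field name -> type name}.
Model = Dict[str, Dict[str, str]]


def toggle_two_way(model: Model, owner: str, field_name: str) -> None:
    """Add or remove the reverse field on the object that owner.field_name refers to."""
    target = model[owner][field_name]
    if target not in model:
        raise KeyError(f"{target!r} is not an object in the model")
    reverse_fields = model[target]
    if reverse_fields.get(owner) == owner:
        del reverse_fields[owner]      # two-way -> one-way
    else:
        reverse_fields[owner] = owner  # one-way -> two-way


def rename_object(model: Model, old: str, new: str) -> None:
    """Rename an object and update every field whose type refers to it."""
    model[new] = model.pop(old)
    for fields in model.values():
        for fname, tname in fields.items():
            if tname == old:
                fields[fname] = new    # only the type changes; field names stay


model: Model = {
    "student": {"name": "string", "teacher": "teacher"},
    "teacher": {"name": "string"},
}
toggle_two_way(model, "student", "teacher")     # teacher gains a student field
rename_object(model, "teacher", "instructor")   # student.teacher now has type instructor
```

In this sketch a rename only rewrites field types, so the student object keeps a field called teacher whose type is now instructor. Whether field names should follow the rename as well is a design choice the text leaves open.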
ObSynth synthesis features: However, what makes ObSynth unique is its synthesis capabilities,
shown in the blue buttons in Fig. 3. When the user clicks Begin, a set of initial object names is
automatically synthesized for them. When the user clicks Generate Fields and Methods, fields and
methods for each object are likewise automatically populated. The blue Auto Add buttons allow
users to synthesize specific parts of the object model: Auto Add Field synthesizes a new field, type,
and multiplicity for the current object. Auto Add Method synthesizes a new method name for an
existing object. Auto Add Object synthesizes a new relevant object and fully populates it with
fields (including types and multiplicity) and method names. These synthesis tools ease the
development of a suitable object model for the user’s use case, in particular by suggesting
objects, fields, and methods the user might have overlooked.
4 The ObSynth Backend
In Sec. 3, we described the frontend of ObSynth and how users can interact with the synthesis
features of ObSynth to design an object model. In this section, we go into the precise technical
details of how these synthesis components work.