Learning Visual Semantics of Programs for Code Understanding and Reuse
Haipeng Cai (Washington State University)
1 Abstract
Modern software is rarely developed from scratch; instead, it is commonly built by reusing existing code.
While this reuse-oriented development paradigm enhances productivity and quality, identifying reusable
code with the desired functionality is challenging. A major obstacle lies in the common absence of functional
specifications, along with the difficulty of automatically understanding code semantics. This
project proposes to overcome these challenges by learning visual semantics of programs from their graphical
outputs. A corpus of graphical programs and their outputs will be used to curate an incrementally growing
knowledge base of reusable components, which will offer reusable code search against natural-language queries.
To that end, the latest advances in computer vision and image understanding, empowered by deep learning,
will be exploited to compute graphical features and hence semantic descriptions of programs. Beyond code reuse, the
graphical features will immediately support semantic program comprehension and comparison independent
of the programming languages used, among other software engineering tasks.
3 Proposed Work
We propose to innovate code understanding and reuse by learning code semantics of programs from their
execution outputs that are graphical, referred to as visual semantics. The key insight is that graphical
representations (e.g., a 3D tube) are much easier for humans to understand, and hence intuitively also
easier for a computer program to learn, than the algorithmic logic of the code (e.g., complex geometric
computations) that produces those representations.

Figure 1: Overview of the proposed cross-language visual semantics learning approach. (A graphical program
and its test inputs feed ① a visualization engine; its graphical outputs feed ② a visual understanding
engine; the resulting graphical features feed ③ semantics learning, which supports code understanding and
reuse among other applications.)
that produces those representations. As illustrated in Figure 1, the proposed approach will consist of three
key technical modules, as numbered in the figure and elaborated below.
Visualization engine. Given a graphical program P , we will generate the graphical outputs of P through a
visualization engine. Build utilities/resources (e.g., makefiles and other build configurations) and test inputs
shipped with P will be utilized to automate its execution, and advanced image acquisition/manipulation
frameworks (e.g., Processing and VTK) will be used to capture and index the graphical outputs.
For each program, two levels of graphical outputs will be archived: application-level output as a result
of executing the whole program (e.g., against an integration/system test), and unit-level output resulting
from the execution of a functional unit (e.g., a method) of the program (e.g., against a unit test). The first
will help identify the entire program as a reusable component, while the second will help identify a particular
reusable routine
(e.g., the method). When adequate test inputs are lacking, we will utilize state-of-the-art automated
test generators to augment the existing test suite associated with the program. In particular, targeted input
generation (e.g., [10]) techniques will be used to trigger graphical outputs when necessary (e.g., with graphical
APIs set as the targets). We will leverage our prior experience in automatic project build management [3]
and data visualization [2, 1] to develop the holistic visualization engine.
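As a rough illustration of the capture step, the following minimal sketch runs a program under a virtual
display and snapshots its window at fixed intervals. It assumes a Linux host with Xvfb and ImageMagick
installed; the command line, display number, frame count, and interval are hypothetical placeholders, not
the engine's final design.

    # A minimal sketch of the capture step, assuming Xvfb and ImageMagick
    # are available; the program under test and timing are hypothetical.
    import os
    import subprocess
    import time

    def capture_graphical_outputs(cmd, out_dir, display=":99", frames=5, interval=1.0):
        os.makedirs(out_dir, exist_ok=True)
        # Start a virtual X display so the program can render headlessly.
        xvfb = subprocess.Popen(["Xvfb", display, "-screen", "0", "1280x1024x24"])
        env = dict(os.environ, DISPLAY=display)
        prog = subprocess.Popen(cmd, env=env)
        try:
            for i in range(frames):
                time.sleep(interval)
                # ImageMagick's `import` grabs the root window of the virtual display.
                subprocess.run(["import", "-window", "root",
                                os.path.join(out_dir, f"frame_{i:03d}.png")],
                               env=env, check=True)
        finally:
            prog.terminate()
            xvfb.terminate()

    # Example (hypothetical program): capture_graphical_outputs(["./demo_3d_tube"], "outputs/demo")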
Visual understanding engine. Once the program is transformed from code to graphical representations
via the visualization engine, the next step is to transform each representation into low-level visual segments
(elements) that are amenable to machine learning. To that end, we will build a dedicated engine based
on state-of-the-art visual-understanding techniques [7] to accurately segment the graphical outputs and
recognize separable visual elements therein for each program P . The engine will treat the outputs of P as
a sequence S of images and label identified objects in each image (e.g., a rectangle). In addition, graphical
attributes of each object corresponding to common visual encoding variables will also be computed, such
as color, size, orientation, and shape. These attributes and labels from all images in S will then constitute
the graphical features and be synthesized into a single feature vector representing P . Thanks to the
innovations in deep learning, the accuracy of current visual object recognition approaches [6] will be
sufficient for learning program code semantics in this project. For instance, convolutional neural networks
have attained 95% accuracy in recognizing objects in real-world scenes. The graphical outputs of most
programs are two-dimensional and thus even less complex than such scene pictures.
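To make the attribute computation concrete, the following minimal sketch extracts shape, size, orientation,
color, and position for each separable object in one captured frame using classical OpenCV operations. The
binarization threshold and the polygon-based shape heuristic are simplifying assumptions standing in for
the deep-learning-based recognition [7] envisioned above.

    # A minimal sketch of per-object attribute extraction (OpenCV 4); the
    # threshold value and shape labels are illustrative assumptions.
    import cv2
    import numpy as np

    def extract_object_features(image_path):
        img = cv2.imread(image_path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Separate drawn objects from an (assumed light) background.
        _, binary = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        features = []
        for c in contours:
            area = cv2.contourArea(c)
            if area < 25:  # skip specks/noise
                continue
            (cx, cy), _, angle = cv2.minAreaRect(c)
            # Crude shape label from the approximated polygon's vertex count.
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            shape = {3: "triangle", 4: "rectangle"}.get(len(approx), "ellipse/other")
            mask = np.zeros(gray.shape, np.uint8)
            cv2.drawContours(mask, [c], -1, 255, -1)
            b, g, r, _ = cv2.mean(img, mask=mask)  # mean color inside the object
            features.append({"shape": shape, "size": area, "orientation": angle,
                             "position": (cx, cy), "color_rgb": (r, g, b)})
        return features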
The resulting graphical features will empower a range of applications. For example, a natural-language
description of visual semantics can be generated from the feature vector for a given program (e.g., via an
NLG engine), which directly supports code-semantics understanding. Another application is to use such
vectors for cross-language semantic code differencing, a capability not currently available. Relevant libraries
and frameworks (e.g., OpenCV and TensorFlow) will be employed to develop the visual understanding engine.
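As a small illustration of such differencing, the sketch below scores the visual similarity of two programs
by the cosine of their feature vectors; the vectors shown are hypothetical placeholders for the engine's
actual output.

    # A minimal sketch of cross-language semantic differencing on learned
    # feature vectors; the two vectors below are hypothetical placeholders.
    import numpy as np

    def visual_similarity(vec_a, vec_b):
        # Cosine similarity: 1.0 means visually identical behavior.
        return float(np.dot(vec_a, vec_b) /
                     (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))

    # E.g., a Java renderer vs. a C++ renderer of the same 3D tube:
    java_prog = np.array([0.9, 0.1, 0.8, 0.3])
    cpp_prog = np.array([0.85, 0.15, 0.75, 0.35])
    print(visual_similarity(java_prog, cpp_prog))  # high score => semantically similar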
Semantics clustering. To support automated code understanding and reuse, we propose to build a
searchable knowledge base of graphical programs through unsupervised learning. For practical use, this
knowledge base would need to include a sizable and diverse set of code samples. To start, we propose
to mine public software repositories (e.g., Apache, GitHub) to garner a reasonable number of open-source
projects. The visualization engine will aid in selecting graphical programs out of these projects (e.g., by
checking whether the program attempts to open the display device). In addition, example and illustrative
programs for graphics and visualization programming will be mined and included in the starting sample set.
Next, we will apply a clustering algorithm on this sample set to divide it into visual semantics clusters.
Each cluster will represent a set of programs similar in their graphical outputs, and will be labeled according
to an aggregation of the graphical features that represent its member programs. The clustered sample set
will form the initial knowledge base, which will continuously grow by incorporating new graphical programs
and their features through incremental clustering [5]. When searching this knowledge base for reusable
components, the natural-language description of a user request will be decomposed into parts of speech,
mapping to primitive object labels (names) and visual encoding variables. The cluster whose label best
matches the user request will be retrieved, and all programs in that cluster will be returned to the user for
reference. Further matching against per-program feature vectors will then identify the best-matching program
within the cluster, which will be returned as the optimal answer. Implementing the knowledge base and natural-language
querying will be eased by leveraging existing libraries for each task (e.g., scikit-learn and Apache OpenNLP).
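The following minimal sketch outlines the clustering and query workflow with scikit-learn, using
MiniBatchKMeans as a readily available stand-in for the incremental clustering of [5]; the feature matrix,
program identifiers, and the query's vectorization into the same feature space are assumed to come from
the preceding engines.

    # A minimal sketch of the semantics-clustering and search steps; the
    # inputs (feature_vectors, program_ids, query_vector) are assumed given.
    import numpy as np
    from sklearn.cluster import MiniBatchKMeans

    def build_knowledge_base(feature_vectors, n_clusters=10):
        # Initial clustering of the mined sample set into visual semantics clusters.
        km = MiniBatchKMeans(n_clusters=n_clusters, random_state=0)
        km.fit(feature_vectors)
        return km

    def add_programs(km, new_vectors):
        # Incrementally fold new graphical programs into the knowledge base.
        km.partial_fit(new_vectors)
        return km

    def search(km, query_vector, feature_vectors, program_ids):
        # Retrieve the best-matching cluster, then rank its member programs
        # by feature-vector distance to the query.
        cluster = int(km.predict(query_vector.reshape(1, -1))[0])
        members = np.where(km.predict(feature_vectors) == cluster)[0]
        ranked = sorted(members,
                        key=lambda i: np.linalg.norm(feature_vectors[i] - query_vector))
        return [program_ids[i] for i in ranked]  # first entry = optimal answer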
References Cited
[1] H. Cai. Parallel rendering for legible illustrative visualizations of dense geometries on commodity CPUs.
International Journal of Image and Graphics, 16(1), 2016.
[2] H. Cai, J. Chen, A. P. Auchus, and D. H. Laidlaw. InShape: In-situ shape-based interactive multiple-view
exploration of diffusion MRI visualizations. In International Symposium on Visual Computing, pages
706–715, 2012.
[3] H. Cai and R. Santelices. A comprehensive study of the predictive accuracy of dynamic change-impact
analysis. Journal of Systems and Software, 103:248–265, 2015.
[4] X. Cai, M. R. Lyu, K.-F. Wong, and R. Ko. Component-based software engineering: technologies,
development frameworks, and quality assurance schemes. In Asia-Pacific Software Engineering
Conference, pages 372–379, 2000.
[5] M. Charikar, C. Chekuri, T. Feder, and R. Motwani. Incremental clustering and dynamic information
retrieval. SIAM Journal on Computing, 33(6):1417–1440, 2004.
[6] K. Grauman and B. Leibe. Visual object recognition. Synthesis Lectures on Artificial Intelligence and
Machine Learning, 5(2):1–181, 2011.
[7] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew. Deep learning for visual understanding:
A review. Neurocomputing, 187:27–48, 2016.
[8] A. Podgurski and L. A. Clarke. A formal model of program dependences and its implications for software
testing, debugging, and maintenance. IEEE Transactions on Software Engineering, 16(9):965–979, 1990.
[9] R. S. Pressman. Software engineering: a practitioner’s approach. Palgrave Macmillan, 2005.
[10] S. Rasthofer et al. Making Malory behave maliciously: Targeted fuzzing of Android execution
environments. In International Conference on Software Engineering, pages 300–311, 2017.
[11] S. Shoham, E. Yahav, S. J. Fink, and M. Pistoia. Static specification mining using automata-based
abstractions. IEEE Transactions on Software Engineering, 34(5):651–666, 2008.