Learning Visual Semantics of Programs for Code Understanding and Reuse
Haipeng Cai (Washington State University)
1 Abstract
Modern software is rarely developed from scratch; instead, it is commonly built by reusing existing code.
While this reuse-oriented development paradigm enhances productivity and quality, identifying reusable
code with the desired functionality is challenging. A major obstacle lies in the common absence of functional
specifications, along with the difficulty of automatically understanding code semantics. This
project proposes to overcome these challenges by learning visual semantics of programs from their graphical
outputs. A corpus of graphical programs and their outputs will be used to curate an incrementally growing
knowledge base of reusable components, which will offer reusable code search against natural-language queries.
To that end, the latest advances in computer vision and image understanding, empowered by deep learning,
will be exploited to compute graphical features and hence semantic descriptions of programs. Beyond code reuse, the
graphical features will immediately support semantic program comprehension and comparison independent
of the programming languages used, among other software engineering tasks.
3 Proposed Work
We propose to innovate code understanding and reuse by learning code semantics of programs from their
execution outputs that are graphical, referred to as visual semantics. The key insight is that graphical
representations (e.g., a 3D tube) are much easier for humans to understand, and hence intuitively also
easier for a computer program to learn, than the algorithmic logic of the code (e.g., complex geometric
computations) that produces those representations.

Figure 1: Overview of the proposed cross-language visual semantics learning approach. (A graphical program
and its test inputs feed ① a visualization engine; its graphical outputs feed ② a visual understanding
engine; the resulting graphical features feed ③ semantics learning, which supports code understanding and
reuse among other applications.)
that produces those representations. As illustrated in Figure 1, the proposed approach will consist of three
key technical modules, as numbered in the figure and elaborated below.
Visualization engine. Given a graphical program P , we will generate the graphical outputs of P through a
visualization engine. Build utilities/resources (e.g., makefiles and other build configurations) and test inputs
shipped with P will be utilized to automate its execution, and advanced image acquisition/manipulation
frameworks (e.g., Processing and VTK) will be used to capture and index the graphical outputs.
For each program, two levels of graphical outputs will be archived: application-level output as a result
of executing the whole program (e.g., against an integration/system test), and unit-level output resulting
from the execution of a functional unit (e.g., a method) of the program (e.g., against a unit test). The first
will help identify the entire program as a reusable component, while the second will help identify a particular
reusable routine
(e.g., the method). When adequate test inputs are lacking, we will utilize state-of-the-art automated
test generators to augment the existing test suite associated with the program. In particular, targeted input
generation (e.g., [10]) techniques will be used to trigger graphical outputs when necessary (e.g., with graphical
APIs set as the targets). We will leverage our prior experience in automatic project build management [3]
and data visualization [2, 1] to develop the holistic visualization engine.
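As a rough illustration of the capture step, the following minimal sketch runs a program under a virtual
display and snapshots its window at fixed intervals. It assumes a Linux host with Xvfb and ImageMagick
installed; the command line, display number, frame count, and interval are hypothetical placeholders, not
the engine's final design.

    # A minimal sketch of the capture step, assuming Xvfb and ImageMagick
    # are available; the program under test and timing are hypothetical.
    import os
    import subprocess
    import time

    def capture_graphical_outputs(cmd, out_dir, display=":99", frames=5, interval=1.0):
        os.makedirs(out_dir, exist_ok=True)
        # Start a virtual X display so the program can render headlessly.
        xvfb = subprocess.Popen(["Xvfb", display, "-screen", "0", "1280x1024x24"])
        env = dict(os.environ, DISPLAY=display)
        prog = subprocess.Popen(cmd, env=env)
        try:
            for i in range(frames):
                time.sleep(interval)
                # ImageMagick's `import` grabs the root window of the virtual display.
                subprocess.run(["import", "-window", "root",
                                os.path.join(out_dir, f"frame_{i:03d}.png")],
                               env=env, check=True)
        finally:
            prog.terminate()
            xvfb.terminate()

    # Example (hypothetical program): capture_graphical_outputs(["./demo_3d_tube"], "outputs/demo")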
Visual understanding engine. Once the program is transformed from code to graphical representations
via the visualization engine, the next step is to transform each representation into low-level visual segments
(elements) that are amenable to machine learning. To that end, we will build a dedicated engine based
on state-of-the-art visual-understanding techniques [7] to accurately segment the graphical outputs and
recognize separable visual elements therein for each program P . The engine will treat the outputs of P as
a sequence S of images and label identified objects in each image (e.g., a rectangle). In addition, graphical
attributes of each object corresponding to common visual encoding variables will also be computed, such
as color, size, orientation, and shape. These attributes and labels from all images in S will then constitute
the graphical features and be synthesized into a single feature vector representing P . Thanks to the
innovations in deep learning, the accuracy of current visual object recognition approaches [6] will be
sufficient for learning program code semantics in this project. For instance, convolutional neural networks
have attained 95% accuracy in recognizing objects in real-world scenes. The graphical outputs of most
programs are two-dimensional and thus even less complex than such scene pictures.
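To make the attribute computation concrete, the following minimal sketch extracts shape, size, orientation,
color, and position for each separable object in one captured frame using classical OpenCV operations. The
binarization threshold and the polygon-based shape heuristic are simplifying assumptions standing in for
the deep-learning-based recognition [7] envisioned above.

    # A minimal sketch of per-object attribute extraction (OpenCV 4); the
    # threshold value and shape labels are illustrative assumptions.
    import cv2
    import numpy as np

    def extract_object_features(image_path):
        img = cv2.imread(image_path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Separate drawn objects from an (assumed light) background.
        _, binary = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        features = []
        for c in contours:
            area = cv2.contourArea(c)
            if area < 25:  # skip specks/noise
                continue
            (cx, cy), _, angle = cv2.minAreaRect(c)
            # Crude shape label from the approximated polygon's vertex count.
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            shape = {3: "triangle", 4: "rectangle"}.get(len(approx), "ellipse/other")
            mask = np.zeros(gray.shape, np.uint8)
            cv2.drawContours(mask, [c], -1, 255, -1)
            b, g, r, _ = cv2.mean(img, mask=mask)  # mean color inside the object
            features.append({"shape": shape, "size": area, "orientation": angle,
                             "position": (cx, cy), "color_rgb": (r, g, b)})
        return features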
The resulting graphical features will empower a range of applications. For example, a natural-language
description of visual semantics can be generated from the feature vector for a given program (e.g., via an
NLG engine), which directly supports code-semantics understanding. Another application is to use such
vectors for cross-language semantic code differencing, a capability not currently available. Relevant libraries
and frameworks (e.g., OpenCV and TensorFlow) will be employed to develop the visual understanding engine.
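As a small illustration of such differencing, the sketch below scores the visual similarity of two programs
by the cosine of their feature vectors; the vectors shown are hypothetical placeholders for the engine's
actual output.

    # A minimal sketch of cross-language semantic differencing on learned
    # feature vectors; the two vectors below are hypothetical placeholders.
    import numpy as np

    def visual_similarity(vec_a, vec_b):
        # Cosine similarity: 1.0 means visually identical behavior.
        return float(np.dot(vec_a, vec_b) /
                     (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))

    # E.g., a Java renderer vs. a C++ renderer of the same 3D tube:
    java_prog = np.array([0.9, 0.1, 0.8, 0.3])
    cpp_prog = np.array([0.85, 0.15, 0.75, 0.35])
    print(visual_similarity(java_prog, cpp_prog))  # high score => semantically similar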
Semantics clustering. To support automated code understanding and reuse, we propose to build a
searchable knowledge base of graphical programs through unsupervised learning. For practical use, this
knowledge base would need to include a sizable and diverse set of code samples. To start, we propose
to mine public software repositories (e.g., Apache, GitHub) to garner a reasonable number of open-source
projects. The visualization engine will aid in selecting graphical programs out of these projects (e.g., by
checking whether the program attempts to open the display device). In addition, example and illustrative
programs for graphics and visualization programming will be mined and included in the starting sample set.
Next, we will apply a clustering algorithm on this sample set to divide it into visual semantics clusters.
Each cluster will represent a set of programs similar in their graphical outputs, and will be labeled according
to an aggregation of the graphical features that represent its member programs. The clustered sample set
will form the initial knowledge base, which will continuously grow by incorporating new graphical programs
and their features through incremental clustering [5]. When searching this knowledge base for reusable
components, the natural-language description of a user request will be decomposed into parts of speech,
mapping to primitive object labels (names) and visual encoding variables. The cluster whose label best
matches the user request will be retrieved, and all programs in that cluster will be returned to the user for
reference. Further matching against per-program feature vectors will then identify the best-matching program
within the cluster, which will be returned as the optimal answer. Implementing the knowledge base and natural-language
querying will be eased by leveraging existing libraries for each task (e.g., scikit-learn and Apache OpenNLP).
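The following minimal sketch outlines the clustering and query workflow with scikit-learn, using
MiniBatchKMeans as a readily available stand-in for the incremental clustering of [5]; the feature matrix,
program identifiers, and the query's vectorization into the same feature space are assumed to come from
the preceding engines.

    # A minimal sketch of the semantics-clustering and search steps; the
    # inputs (feature_vectors, program_ids, query_vector) are assumed given.
    import numpy as np
    from sklearn.cluster import MiniBatchKMeans

    def build_knowledge_base(feature_vectors, n_clusters=10):
        # Initial clustering of the mined sample set into visual semantics clusters.
        km = MiniBatchKMeans(n_clusters=n_clusters, random_state=0)
        km.fit(feature_vectors)
        return km

    def add_programs(km, new_vectors):
        # Incrementally fold new graphical programs into the knowledge base.
        km.partial_fit(new_vectors)
        return km

    def search(km, query_vector, feature_vectors, program_ids):
        # Retrieve the best-matching cluster, then rank its member programs
        # by feature-vector distance to the query.
        cluster = int(km.predict(query_vector.reshape(1, -1))[0])
        members = np.where(km.predict(feature_vectors) == cluster)[0]
        ranked = sorted(members,
                        key=lambda i: np.linalg.norm(feature_vectors[i] - query_vector))
        return [program_ids[i] for i in ranked]  # first entry = optimal answer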
References Cited
[1] H. Cai. Parallel rendering for legible illustrative visualizations of dense geometries on commodity CPUs.
International Journal of Image and Graphics, 16(1), 2016.
[2] H. Cai, J. Chen, A. P. Auchus, and D. H. Laidlaw. InShape: In-situ shape-based interactive multiple-view
exploration of diffusion MRI visualizations. In International Symposium on Visual Computing, pages
706–715, 2012.
[3] H. Cai and R. Santelices. A comprehensive study of the predictive accuracy of dynamic change-impact
analysis. Journal of Systems and Software, 103:248–265, 2015.
[4] X. Cai, M. R. Lyu, K.-F. Wong, and R. Ko. Component-based software engineering: technologies,
development frameworks, and quality assurance schemes. In Asia-Pacific Software Engineering
Conference, pages 372–379, 2000.
[5] M. Charikar, C. Chekuri, T. Feder, and R. Motwani. Incremental clustering and dynamic information
retrieval. SIAM Journal on Computing, 33(6):1417–1440, 2004.
[6] K. Grauman and B. Leibe. Visual object recognition. Synthesis Lectures on Artificial Intelligence and
Machine Learning, 5(2):1–181, 2011.
[7] Y. Guo, Y. Liu, A. Oerlemans, S. Lao, S. Wu, and M. S. Lew. Deep learning for visual understanding:
A review. Neurocomputing, 187:27–48, 2016.
[8] A. Podgurski and L. A. Clarke. A formal model of program dependences and its implications for software
testing, debugging, and maintenance. IEEE Transactions on Software Engineering, 16(9):965–979, 1990.
[9] R. S. Pressman. Software engineering: a practitioner’s approach. Palgrave Macmillan, 2005.
[10] S. Rasthofer et al. Making Malory behave maliciously: Targeted fuzzing of Android execution
environments. In International Conference on Software Engineering, pages 300–311, 2017.
[11] S. Shoham, E. Yahav, S. J. Fink, and M. Pistoia. Static specification mining using automata-based
abstractions. IEEE Transactions on Software Engineering, 34(5):651–666, 2008.