VividGraph: Learning To Extract and Redesign Network Graphs From Visualization Images
Abstract—Network graphs are common visualization charts. They often appear in the form of bitmaps in papers, web pages,
magazine prints, and designer sketches. People often want to modify graphs because of their poor design, but it is difficult to obtain
their underlying data. In this paper, we present VividGraph, a pipeline for automatically extracting and redesigning graphs from static
images. We propose using convolutional neural networks to solve the problem of graph data extraction. Our method is robust to
hand-drawn graphs, blurred graph images, and large graph images. We also present a graph classification module to make it effective
for directed graphs. We propose two evaluation methods to demonstrate the effectiveness of our approach. It can be used to quickly
transform designer sketches, extract underlying data from existing graphs, and interactively redesign poorly designed graphs.
Index Terms—Information visualization, Network graph, Data extraction, Chart recognition, Semantic segmentation, Redesign
Fig. 1. VividGraph can be used in many practical applications. The input is a bitmap. Through our semantic segmentation and connection algorithm, we can obtain its underlying data. Using the extracted data, we can reconstruct the vector of the graph and redesign the chart, such as recoloring, re-layout, and data modification.

(1) We propose using convolutional neural networks to solve the problem of graph data extraction.
(2) We propose a pipeline with semantic segmentation to accurately identify the characteristics of graphs. The method is robust to hand-drawn graphs, blurred images, and large images.
(3) We propose two methods to evaluate the effectiveness of our methods through structure similarity and image similarity.

2 RELATED WORK
Our work is related to three technologies: chart extraction, graph perception with CNNs (Convolutional Neural Networks), and chart interaction.

2.1 Chart Extraction
Some inverse visualization studies extract data from charts. Harper et al. [13] proposed a method for extracting underlying data from a visualization chart built with the D3 library [11]. This method depends on web page code, such as HTML and SVG. The extracted data can be used for chart redesign or color mapping redefinition. WebPlotDigitizer [14] is another graph data extraction tool based on web pages. It can extract four types of graph data, including bar charts, line charts, polar charts, and ternary charts. However, the accuracy of this tool is not high, and users generally add information manually to improve accuracy. Another tool, Ycasd [15], requires the user to provide the position of all points on the line to extract the line chart data.

Static bitmaps are encountered more often. There are some studies based on image processing and machine learning that solve the problem of data extraction in bitmap charts. ReVision [6] is a data extraction framework for bitmap charts, which automatically divides charts into ten categories and focuses on data extraction for pie charts and bar charts. FigureSeer [7] focuses on extracting data from line charts. Poco et al. [8, 16] proposed a data extraction method with legends and extended the research to heat maps. They focused on the role of the legend text in the charts and added an OCR module to solve the data extraction problem for charts with legends. Zhou et al. [17] proposed a network with an attention mechanism to detect bar charts, in which deep learning techniques have been well applied.

Some researchers have focused on methods for semiautomatically extracting bitmap chart data. Jung et al. [5] introduced ChartSense, a system that increases data extraction accuracy by manually adding information. DataThief [18] is another semiautomatic tool for extracting data from line charts. The users need to provide information such as the coordinates of the start point and endpoint of the line and the positions of the horizontal and vertical axes. iVoLVER [19] integrates data extraction of bitmap images and SVG objects on the web, providing a semiautomatic data extraction framework. These semiautomatic methods rely on a large quantity of user interaction, such as specifying the data type (e.g., color and shape) and providing the dividing line location. While this improves data extraction accuracy, it also reduces efficiency and requires considerable manual intervention. However, to the best of our knowledge, these frameworks do not support data extraction and redesign of graph bitmaps.

2.2 Graph Perception with CNNs
Haehn et al. [9] reproduced Cleveland and McGill's [20] graphical perception evaluation experiment with CNNs. They compared the recognition capabilities of four networks, MLP, LeNet [21], VGG [22], and Xception [23], on nine basic perception tasks. They reported that the graphical perception ability of VGG19 is the best among these networks. They proposed that graphs are advanced graphical coding, so the task of extracting data from graphs is challenging. Haleem et al. [24] evaluated the readability of force-directed graph layouts with CNNs. Giovannangeli et al. [25] continued this experiment and used CNNs to evaluate the image perception ability of graphs. However, their evaluation task indicators were only the number of edges, the number of nodes, and the maximum degree of the graph, not the topological relations, the most critical data in the graph.

These pattern recognition methods all regress the graph numerically. The graphs are too complex to make these methods feasible. Therefore, we turned to semantic segmentation, a method commonly used in the field of computer vision for natural images. The fully convolutional network (FCN) [26] was the earliest method to obtain segmentation results equal to the input image size through a series of convolutional layers and deconvolution. SegNet [27] is similar to FCN. While deepening the network, it uses pooling indices to save the contour image information. PSPNet [28] uses a pyramid pooling module to simultaneously feed the feature map through four parallel pooling layers. It then obtains and upsamples four outputs of different sizes. These semantic segmentation networks are applied to natural image data, such as VOC datasets. Deep feature extraction networks cause the target to lose small features while leaving basically correct segmentation results. However, for graph images, these small features, such as edges, are important. The network parameters need to be simplified compared to natural images.
Fig. 2. The VividGraph pipeline includes five steps: (1) input a graph image, (2) classification of directed and undirected graphs, (3) semantic segmentation network, (4) algorithm to reconnect the nodes, and (5) interactive chart redesign. (Extracted node attributes: ID, coordinate, radius, color.)

2.3 Chart Interaction
Chart redesign can maximize the value of the original data and help readers understand these data quickly and accurately [29]. Good chart visualization is made up of a clear layout, an easily distinguishable color scheme, and interactive application scenarios. First, the layout gives readers the first impression of a chart. Takayuki et al. [30] combined a force-directed layout with a hybrid space-filling method to simultaneously represent both the connectivity and the categories of multiple-category graphs. Arc diagrams [31] were proposed to display complex patterns of string data with overlapping parts. Ka-Ping et al. [32] added animation techniques to radial layouts so that users can interactively explore the dynamic evolution of topological relations. Second, the similarity and comparison of color schemes help readers understand directly. Lujin Wang et al. [33] used a knowledge-based system to learn color design rules and apply them to illustrative visualization. Jorge et al. [8] extracted color encodings from a heatmap and recolored it with different color schemes to make the heatmap more comprehensible. Third, deeper information can be acquired by interactive operations. Thinkbase and Thinkpedia [34] are used to excavate semantic graphs with interactive operations of large knowledge repositories so that web content can be explored more easily. NR [35] provided an interactive database used for visual interactive analytics based on the web. Lai et al. [36] combined the chart extraction technology of bar charts and pie charts with natural language processing technology and proposed a visual interactive automatic annotation method. Kim et al. [37] proposed a pipeline that can answer questions about the chart.

3 METHODS
The extraction of graphs faces three major problems: it is difficult to label the dataset; the number of topological relations is large (if a graph has n nodes, there are on the order of n² topological relations); and the traditional method [25] has difficulty extracting edges. Therefore, we establish a graph dataset with automatically generated pixel labels and propose VividGraph to automatically extract the data of graphs. VividGraph is a framework composed of four modules: (1) classification of directed and undirected graphs, (2) semantic segmentation network, (3) algorithm of node reconnection, and (4) interactive chart redesign. First, we classify the graph into a directed graph or an undirected graph so that the second step uses different parameters for data extraction. Second, VividGraph uses a semantic segmentation network to locate the node and edge pixels. Third, we design an algorithm to reconnect these nodes, which can calculate topological relations. Fourth, VividGraph uses the extracted data to redesign graphs according to user needs.
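To make the output of these modules concrete, the sketch below shows one possible in-memory representation of the extracted data. The field names follow the node attributes listed in Figure 2 (ID, coordinate, radius, color), but the class layout itself is our own illustration and not the paper's code.

```python
# A minimal sketch of the data model the pipeline recovers from a bitmap.
# The dataclass layout is hypothetical; only the attribute names come from Fig. 2.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Node:
    node_id: int
    center: Tuple[float, float]   # (x, y) pixel coordinate
    radius: float                 # in pixels
    color: Tuple[int, int, int]   # sRGB

@dataclass
class Edge:
    source: int                   # node_id of one endpoint
    target: int                   # node_id of the other endpoint
    color: Tuple[int, int, int]
    directed: bool = False        # set by the classification module

@dataclass
class ExtractedGraph:
    directed: bool
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

# Example: a two-node directed graph as the pipeline might emit it.
g = ExtractedGraph(
    directed=True,
    nodes=[Node(0, (80.0, 120.0), 12.0, (31, 119, 180)),
           Node(1, (240.0, 90.0), 12.0, (255, 127, 14))],
    edges=[Edge(0, 1, (90, 90, 90), directed=True)],
)
```

Storing edges by endpoint IDs keeps the topological relation explicit, which is what the redesign step ultimately consumes.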
3.1 Training Dataset Generation
Considering that our data extraction algorithm uses a semantic segmentation neural network, we need graph datasets with pixel-level labels. We classify pixel labels into three categories: background, nodes, and edges. When we obtain the category of each pixel, we can calculate all network attributes, including relations, node size, edge width (thickness), color, etc. However, it is difficult to label the graphs generated by common visualization frameworks with pixels. Inspired by some studies [9, 10, 38] that use image synthesis as datasets, we synthesize graph images together with their pixel-level label masks.
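A minimal sketch of this synthesis idea is shown below, assuming Pillow for rendering; the label mask is drawn in the same pass as the bitmap, so every pixel is annotated as background (0), node (1), or edge (2). The layout, color, and size distributions here are placeholders rather than the paper's actual generator settings.

```python
# Hedged sketch: synthesize one training image and its pixel-level label mask.
# Pillow (PIL) is assumed; the actual VividGraph generator may differ.
import random
from PIL import Image, ImageDraw

W = H = 320                      # matches the 320x320 network input
BACK, NODE, EDGE = 0, 1, 2       # pixel label categories

def synth_sample(n_nodes=6, p_edge=0.3, seed=0):
    random.seed(seed)
    img = Image.new("RGB", (W, H), "white")
    lab = Image.new("L", (W, H), BACK)
    draw_img, draw_lab = ImageDraw.Draw(img), ImageDraw.Draw(lab)

    centers = [(random.randint(30, W - 30), random.randint(30, H - 30))
               for _ in range(n_nodes)]
    radii = [random.randint(8, 16) for _ in range(n_nodes)]

    # Draw edges first so that nodes are rendered on top of them.
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if random.random() < p_edge:
                draw_img.line([centers[i], centers[j]], fill=(80, 80, 80), width=3)
                draw_lab.line([centers[i], centers[j]], fill=EDGE, width=3)

    for (cx, cy), r in zip(centers, radii):
        box = [cx - r, cy - r, cx + r, cy + r]
        draw_img.ellipse(box, fill=(random.randint(0, 255),
                                    random.randint(0, 255),
                                    random.randint(0, 255)))
        draw_lab.ellipse(box, fill=NODE)
    return img, lab   # (bitmap, per-pixel ground-truth labels)

image, label_mask = synth_sample()
```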
Fig. 5. Illustration of the semantic segmentation and reconnection algorithm. In (b), white pixels indicate that the pixels are detected as background, blue pixels are nodes, and orange pixels are edges. There are noise pixels detected as edges near the nodes, but this does not reduce the efficiency of our algorithm.

3.3 Semantic Segmentation Network
Giovannangeli et al. [25] noted that it is difficult to recognize edges with traditional image processing methods and evaluated the possibility of using CNNs to perceive the network. However, their evaluation task indicators were only the number of edges, the number of nodes, and the maximum degree of the graph. Therefore, they could not extract the topological relation in the images. Many inverse visualizations in the past were based on simple visual coding, and their attributes were often relatively simple. For example, the attribute of bar charts is the length of the bar, the attribute of scatter graphs is the coordinate value of a point, and the attribute of point cloud graphs is the number of points. As a high-level visual coding, the attributes of networks are not only the size, location, and number of its nodes but also the relations of these nodes. If we used the adjacency matrix of the graph as the annotation for training, the data extraction task would become a deep learning regression task. This method is unrealistic, and the regression data dimension is too large. Therefore, we split the data extraction task into two parts. The first part uses the semantic segmentation network to locate the nodes and edges, and the second part reconnects the nodes through our algorithm.

In traditional convolutional neural networks, an input image often has only one output label. However, in a semantic segmentation network, each pixel has an output label. In this task, we use the elements of the graph as pixel-level labels (background, node, edge). We choose U-Net [45], a semantic segmentation convolutional network applied to medical images, as our semantic segmentation network model. We normalize the image to a dimension of 320 × 320 × 3 (sRGB space). We choose VGG16 as the U-Net backbone network. We use the popular deep learning framework Keras [46] to quickly implement our model.
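The following is a compact sketch of such a model in tf.keras; the skip-connection choices, decoder widths, and training settings are our assumptions and may differ from the exact configuration used in VividGraph.

```python
# Hedged sketch of a U-Net-style segmentation model with a VGG16 encoder.
# Input 320x320x3, per-pixel softmax over {background, node, edge}.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

def conv_block(x, filters):
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet_vgg16(input_shape=(320, 320, 3), n_classes=3):
    encoder = VGG16(include_top=False, weights="imagenet",
                    input_shape=input_shape)
    # Skip connections taken from the end of each VGG16 stage (an assumption).
    skips = [encoder.get_layer(name).output for name in
             ("block1_conv2", "block2_conv2", "block3_conv3", "block4_conv3")]
    x = encoder.get_layer("block5_conv3").output        # 20x20 bottleneck

    for skip, filters in zip(reversed(skips), (512, 256, 128, 64)):
        x = layers.UpSampling2D()(x)                    # double the resolution
        x = layers.Concatenate()([x, skip])             # U-Net skip connection
        x = conv_block(x, filters)

    outputs = layers.Conv2D(n_classes, 1, activation="softmax")(x)
    return Model(encoder.input, outputs, name="vividgraph_unet_sketch")

model = build_unet_vgg16()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Training would pair the synthesized bitmaps with their integer label masks; the sparse categorical cross-entropy loss shown here is one reasonable choice for per-pixel three-class labels.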
3.4 Reconnection Algorithm
When we obtain the pixel-level labels output by the semantic segmentation network, we design an algorithm to reconnect nodes and calculate the topological relation of the graph. First, we extract the connected components labeled as nodes in the image. We perform an erosion operation on these connected components to remove noise, where the size of the kernel is determined by the area of the connected components. Erosion and dilation are two fundamental morphological operations of image processing [47]. Then, we use a kernel of the same size to perform a dilation operation on the connected components to maintain the node radius. These connected components are the nodes of the graph.

Notation:
• W, H: image width and height
• C_{x,y}: input image color at (x, y), storing 3D data (R, G, B)
• Label_{x,y} ∈ {0 (back), 1 (node), 2 (edge)}: semantic segmentation result of the pixel located at (x, y)
• O_i, R_i: center coordinate and radius of Node i
• C^j_{i1,i2}: color of Edge j
• CC: connected component
• Area_i: rectangular area surrounding CC i
• k: erosion or dilation kernel size
• γ: threshold of edge pixels between two nodes

Algorithm 1 Node Reconnection Algorithm
Input: {C_{x,y} | x ∈ [0, W], y ∈ [0, H]}, {Label_{x,y} | x ∈ [0, W], y ∈ [0, H]}
Output: O_i, R_i, C^j_{i1,i2}
  Extract the CCs with Label_{x,y} = 1
  for all CC do
      k = (1/3) · √(Area_i)
      Use a (k, k) kernel to perform morphological opening on the CC
  end for
  Extract the CCs again
  for all CC do
      O_i = the coordinates of the center pixel of the CC
      R_i = (1/2) · √(Area_i)
  end for
  for each Node i1 and Node i2, i1 ≠ i2 do
      Draw a line connecting Node i1 and Node i2
      Check A = {(x, y) | Label_{x,y} = 2, (x, y) ∈ line}
      Perform a dilation operation
      Set γ ∝ length of the line
      if |A| > γ then
          Node i1 and Node i2 are connected
          C^j_{i1,i2} = mean of C_{x,y} over (x, y) ∈ A
      end if
  end for
  return O_i, R_i, C^j_{i1,i2}

We draw a line between every two nodes and check the number of pixels detected as edges on this line. If the number of such pixels exceeds the threshold, the two nodes are considered to be connected. The size of the threshold is proportional to the length of the line. The result of semantic segmentation and the process of the reconnection algorithm are shown in Figure 5.
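A minimal Python sketch of this reconnection step, using scikit-image [39] and NumPy, is given below. The threshold constant and the use of a single opening kernel derived from the largest node component are our simplifications; Algorithm 1 scales the kernel per component and only states that γ is proportional to the line length.

```python
# Hedged sketch of the node-reconnection step (cf. Algorithm 1).
import numpy as np
from skimage.measure import label as cc_label, regionprops
from skimage.morphology import binary_opening, binary_dilation, square
from skimage.draw import line as raster_line

BACK, NODE, EDGE = 0, 1, 2
GAMMA_PER_PIXEL = 0.5   # assumed: gamma = 0.5 * line length ("proportional" in the paper)

def reconnect(label_map, image):
    """label_map: HxW int array over {0, 1, 2}; image: HxWx3 uint8 bitmap."""
    node_mask = label_map == NODE
    if not node_mask.any():
        return [], []
    # Morphological opening to remove noise; kernel k ~ sqrt(Area) / 3.
    area = max(r.area for r in regionprops(cc_label(node_mask)))
    k = max(1, int(np.sqrt(area) / 3))
    node_mask = binary_opening(node_mask, square(k))

    nodes = []
    for region in regionprops(cc_label(node_mask)):
        cy, cx = region.centroid                       # (row, col)
        nodes.append({"center": (int(cx), int(cy)),
                      "radius": float(np.sqrt(region.area) / 2)})

    edge_mask = binary_dilation(label_map == EDGE, square(3))
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            (x1, y1), (x2, y2) = nodes[i]["center"], nodes[j]["center"]
            rr, cc = raster_line(y1, x1, y2, x2)       # pixels on the segment
            hits = edge_mask[rr, cc]
            if hits.sum() > GAMMA_PER_PIXEL * len(rr):
                color = image[rr[hits], cc[hits]].mean(axis=0)
                edges.append((i, j, tuple(int(c) for c in color)))
    return nodes, edges
```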
Fig. 10. The reconstruction results of large graphs. VividGraph can extract complex networks with dense nodes and edges. We show some details in the lower right corner of each picture. We also label some errors in the figure.

4.3 Visualization Redesign
Many existing graphs have poor designs because they are not designed by professional visualization workers. VividGraph provides automatic and interactive chart redesign.
The questionnaire data have been included in the appendix for further study.

Fig. 12. The questionnaire scoring results. The maximum score is 100. In (a), our topological relation and location extraction results are more satisfactory. The extraction of node size, edge thickness, and color is basically accurate but still needs improvement. In (b), the graphs from the D3 library have the most satisfactory performance. The sketch has room for improvement due to some color bias. (Scored dimensions in (a): color, edge thickness, position, topological relation, node radius.)

Interview: After the survey, we invited participants to use our system to solve some actual problems in their work. We obtained their feedback through interviews to improve the system. We divide the feedback into three categories according to the users' occupations. Ordinary people who are not computer majors think it is helpful to convert hand-drawn graphs into images. In the interviews, a middle school teacher who teaches natural sciences and a primary school teacher who teaches mathematics both stated that graphs often appear in elementary education. When they need to make courseware and examination papers, their electronic drawing ability is so weak that they spend considerable time drawing graphs. They proposed enhancing the accuracy and robustness of hand-drawn graph extraction.

Professional designers feel that the function of converting hand-drawn graphs into images is not practical. They can use drawing software proficiently, and the time consumption of hand drawing is greater than drawing directly on the computer. However, they find it helpful to convert existing graphs into vectors. They need to use materials found on the internet in their design work. The quality of these materials is uneven, and many of them are bitmaps. They proposed the need for more professional automatic color schemes and more customized data modification options, such as node size, node color, edge color, and edge width. After adding some customization options to the system, we again invited these professional designers to use the system. They also proposed adding some functions to improve the human-computer interaction experience, such as color-picking pens, auxiliary design lines, and canvas changes. Embedding our system into drawing software on iPads or into professional drawing software can also be considered. In the future, we will continue to explore this direction.

The last group of people is scientific researchers. They are as proficient in computer software as designers. However, their demand lies more in quoting and modifying graphs in papers. Most of the existing graphs in papers are bitmaps or sketches. In these interviews, more than half of the studies were related to deep learning. When modifying other neural network structures (e.g., changing Inception V1 to Inception V2), they can save drawing time by using VividGraph. In addition, many researchers are not specialized in visualization. Our system can also help them generate graphs with reasonable layout and color matching, making the charts pleasant and their features easy to perceive. This type of user feedback was the most positive among the three types of users. They also hoped that some auxiliary interactive functions can be added in the future.

6 PERCEPTUAL EVALUATION
We propose two methods to prove the effectiveness of our method. The first method is NetSimile [55], which is proven to be a scalable and effective method for measuring the structural similarity between two networks. NetSimile features integrate the degree of nodes, the clustering coefficient, two-hop-away neighbors, the ego net, etc. It generates a 35-dimensional signature vector from the average, median, and standard deviation of these features. It defines the similarity of two networks by the Canberra distance of two signature vectors. Given one graph signature vector x_i and the other graph signature vector y_i, where i = 1, 2, 3, ..., 35, the Canberra distance between x and y is defined as:

    Σ_{i=1}^{35} ‖x_i − y_i‖ / (‖x_i‖ + ‖y_i‖)        (2)

The lower the Canberra distance is, the more similar the two networks are. This method does not require a one-to-one correspondence of nodes between the two networks. Therefore, when there are missing nodes in our evaluation experiment, we can still output evaluation indicators. NetSimile is an indicator that varies with the number of nodes. To show the range of NetSimile, we traverse the NetSimile between each graph and the graph structure with the same number of nodes. We choose the maximum value from them as the maximum value of NetSimile. When NetSimile reaches this value, the two graphs are significantly different. In our datasets, this value is approximately 24, and we choose it as the maximum value of the Y-axis.
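Equation (2) is straightforward to evaluate once the two 35-dimensional NetSimile signatures are available; the short sketch below shows the computation, while the signature extraction itself (per [55]) is assumed to be done elsewhere.

```python
# Canberra distance between two NetSimile signature vectors (Eq. 2).
import numpy as np

def canberra(x: np.ndarray, y: np.ndarray) -> float:
    """Sum over i of |x_i - y_i| / (|x_i| + |y_i|), skipping 0/0 terms."""
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    terms = np.divide(num, den, out=np.zeros_like(num, dtype=float),
                      where=den != 0)
    return float(terms.sum())

# Example with two random 35-dimensional signatures; 0 means identical signatures.
rng = np.random.default_rng(0)
sig_a, sig_b = rng.random(35), rng.random(35)
print(canberra(sig_a, sig_b))
```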
The second method is SSIM [50], which is widely used to evaluate the similarity of two images. We use the extracted data (topological relation; node coordinates, color, and size; background color; edge color) to reconstruct the image of the graph. We measure the similarity of the two graphs by calculating the SSIM of the reconstructed image and the original image. Given the images x and y of two graphs, we compute SSIM as defined in [50], where µ_x is the average of x, σ_x² is the variance of x, σ_xy is the covariance of x and y, and c is a constant used to maintain stability.
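A minimal sketch of this comparison using scikit-image [39] is shown below; whether the original and reconstructed bitmaps are compared in RGB or grayscale is not stated in the text, so grayscale is assumed here.

```python
# Hedged sketch: SSIM [50] between an original graph bitmap and the image
# reconstructed from the extracted data, via scikit-image.
import numpy as np
from skimage.color import rgb2gray
from skimage.metrics import structural_similarity

def graph_ssim(original_rgb: np.ndarray, reconstructed_rgb: np.ndarray) -> float:
    """Both inputs: HxWx3 uint8 arrays of the same size."""
    a = rgb2gray(original_rgb)        # floats in [0, 1]
    b = rgb2gray(reconstructed_rgb)
    return structural_similarity(a, b, data_range=1.0)

# 1 - SSIM is the dissimilarity reported in the evaluation plots (Fig. 15).
```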
Fig. 15. Evaluation experiment results. In (a), the results demonstrate that VividGraph can deal with graphs of various image resolutions and different scale nodes. In (b), the results are similar to those of Simple I. In (c), the results demonstrate that although 1-SSIM rises, VividGraph is still effective in extreme cases. In (d), the results demonstrate that VividGraph is also suitable for data extraction of directed graphs. (Image sources: hand-drawn, academic paper, E-Charts gallery, D3 gallery, Shutterstock; panels: (a) Simple I, (b) Simple II, (c) Complex Dataset, (d) Directed Graph Dataset.)

Fig. 16. Time performance comparisons of evaluation experiments. (Image resolutions tested: 320×320, 480×480, 640×640, and 800×800; panels: (c) Complex Dataset, (d) Directed Graph Dataset.)

VividGraph shows good time performance for graphs with fifty nodes in general.
As shown in Figure 10(b), the segmentation result of a node in the detail is smaller than the ground truth.

Second, the reconnection algorithm can be further improved. The segmentation results are not completely accurate. When pixels of the edge category are identified as the background category, the edge width will be smaller than in the original image, as shown in Figure 8(c). When pixels of the background category are identified as the edge category, the color of the edges is closer to the background, as shown in Figure 8(b). Besides, these conditions will result in an offset of the edge, as shown in Figure 8(c). For the sketches with shadows shown in Figure 7(c,d), our algorithm cannot infer the original color of shadowed nodes. We plan to increase the accuracy of the segmentation model or optimize the heuristic rules to improve this in the future.

Third, when three nodes are collinear, if we have no prior knowledge and cannot distinguish them with our eyes, our algorithm will consider any two of these nodes to be connected. As shown in Figure 11(c), the collinear nodes are considered to be connected with each other.

8 CONCLUSION
We proposed a method to extract the underlying data of graph images. We also proposed a pipeline called VividGraph, which combines a semantic segmentation model and a node connection algorithm. This framework is suitable for undirected graphs, directed graphs, blurred graph images, hand-drawn graphs, large graph images, and other graphs. VividGraph can be used to quickly transform designer sketches, restore blurred graph pictures, obtain the underlying data of bitmaps to generate vectors, modify graph data, redesign graphs, etc.

In the future, we plan to improve the time efficiency and accuracy of the pipeline for large-scale networks by optimizing the networks and algorithms. We will combine the model of this paper with OCR technology to improve our system. Cooperating with designers to improve the human-computer interaction experience of the system is also under our consideration.

ACKNOWLEDGMENTS
We would like to acknowledge the support from NSFC under Grant No. 61802128 and 62072183.

REFERENCES
[1] J. Yang, Y.-G. Jiang, A. G. Hauptmann, and C.-W. Ngo, "Evaluating bag-of-visual-words representations in scene classification," in Proceedings of the international workshop on Workshop on multimedia information retrieval, 2007, pp. 197–206.
[2] Y. Liu, X. Lu, Y. Qin, Z. Tang, and J. Xu, "Review of chart recognition in document images," in Visualization and Data Analysis 2013, vol. 8654. International Society for Optics and Photonics, 2013, pp. 384–391.
[3] E. Brynjolfsson and K. McElheran, "The rapid adoption of data-driven decision-making," American Economic Review, vol. 106, no. 5, pp. 133–139, 2016.
[4] P. Zhang, C. Li, and C. Wang, "Viscode: Embedding information in visualization images using encoder-decoder network," IEEE TVCG, 2020.
[5] D. Jung, W. Kim, H. Song, J.-i. Hwang, B. Lee, B. Kim, and J. Seo, "ChartSense: Interactive data extraction from chart images," in Proceedings of the 2017 CHI conference on human factors in computing systems, 2017, pp. 6706–6717.
[6] M. Savva, N. Kong, A. Chhajta, L. Fei-Fei, M. Agrawala, and J. Heer, "ReVision: Automated classification, analysis and redesign of chart images," in Proceedings of the 24th annual ACM symposium on User interface software and technology, 2011, pp. 393–402.
[7] N. Siegel, Z. Horvitz, R. Levin, S. Divvala, and A. Farhadi, "FigureSeer: Parsing result-figures in research papers," in European Conference on Computer Vision. Springer, 2016, pp. 664–680.
[8] J. Poco, A. Mayhua, and J. Heer, "Extracting and retargeting color mappings from bitmap images of visualizations," IEEE TVCG, vol. 24, no. 1, pp. 637–646, 2017.
[9] D. Haehn, J. Tompkin, and H. Pfister, "Evaluating 'graphical perception' with CNNs," IEEE TVCG, vol. 25, no. 1, pp. 641–650, 2018.
[10] L. Yuan, W. Zeng, S. Fu, Z. Zeng, H. Li, C.-W. Fu, and H. Qu, "Deep colormap extraction from visualizations," IEEE TVCG, 2021.
[11] M. Bostock, V. Ogievetsky, and J. Heer, "D3: Data-driven documents," IEEE TVCG, vol. 17, no. 12, pp. 2301–2309, 2011.
[12] D. Li, H. Mei, Y. Shen, S. Su, W. Zhang, J. Wang, M. Zu, and W. Chen, "ECharts: a declarative framework for rapid construction of web-based visualization," Visual Informatics, vol. 2, no. 2, pp. 136–146, 2018.
[13] J. Harper and M. Agrawala, "Deconstructing and restyling D3 visualizations," in Proceedings of the 27th annual ACM symposium on User interface software and technology, 2014, pp. 253–262.
[14] A. Rohatgi, "WebPlotDigitizer," 2017.
[15] A. Gross, S. Schirm, and M. Scholz, "Ycasd – a tool for capturing and scaling data from graphical representations," BMC Bioinformatics, vol. 15, no. 1, p. 219, 2014.
[16] J. Poco and J. Heer, "Reverse-engineering visualizations: Recovering visual encodings from chart images," in Computer Graphics Forum, vol. 36, no. 3. Wiley Online Library, 2017, pp. 353–363.
[17] F. Zhou, Y. Zhao, W. Chen, Y. Tan, Y. Xu, Y. Chen, C. Liu, and Y. Zhao, "Reverse-engineering bar charts using neural networks," Journal of Visualization, pp. 491–435, 2021.
[18] A. Flower, J. W. McKenna, and G. Upreti, "Validity and reliability of GraphClick and DataThief III for data extraction," Behavior Modification, vol. 40, no. 3, pp. 396–413, 2016.
[19] G. G. Méndez, M. A. Nacenta, and S. Vandenheste, "iVoLVER: Interactive visual language for visualization extraction and reconstruction," in Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016, pp. 4073–4085.
[20] W. S. Cleveland and R. McGill, "Graphical perception: Theory, experimentation, and application to the development of graphical methods," Journal of the American Statistical Association, vol. 79, no. 387, pp. 531–554, 1984.
[21] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[22] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," pp. 1–14, 2015.
[23] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proceedings of the IEEE CVPR, 2017, pp. 1251–1258.
[24] H. Haleem, Y. Wang, A. Puri, S. Wadhwa, and H. Qu, "Evaluating the readability of force directed graph layouts: A deep learning approach," Computer Graphics and Applications, IEEE, 2019.
[25] L. Giovannangeli, R. Bourqui, R. Giot, and D. Auber, "Toward automatic comparison of visualization techniques: Application to graph visualization," Visual Informatics, 2020.
[26] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE CVPR, 2015, pp. 3431–3440.
[27] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.
[28] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," in Proceedings of the IEEE CVPR, 2017, pp. 2881–2890.
[29] N. Kong and M. Agrawala, "Graphical overlays: Using layered elements to aid chart reading," IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 12, pp. 2631–2638, 2012.
[30] T. Itoh, C. Muelder, K.-L. Ma, and J. Sese, "A hybrid space-filling and force-directed layout method for visualizing multiple-category graphs," in 2009 IEEE Pacific Visualization Symposium. IEEE, 2009, pp. 121–128.
[31] M. Wattenberg, "Arc diagrams: Visualizing structure in strings," in IEEE Symposium on Information Visualization, 2002. INFOVIS 2002. IEEE, 2002, pp. 110–116.
[32] K.-P. Yee, D. Fisher, R. Dhamija, and M. Hearst, "Animated exploration of graphs with radial layout," in Proc. IEEE InfoVis 2001, 2001, pp. 43–50.
[33] L. Wang, J. Giesen, K. T. McDonnell, P. Zolliker, and K. Mueller, "Color design for illustrative visualization," IEEE TVCG, vol. 14, no. 6, pp. 1739–1754, 2008.
[34] C. Hirsch, J. Hosking, and J. Grundy, "Interactive visualization tools for exploring the semantic graph of large knowledge spaces," in Workshop on Visual Interfaces to the Social and the Semantic Web (VISSW2009), vol. 443, 2009, pp. 11–16.
[35] R. Rossi and N. Ahmed, "The network data repository with interactive graph analytics and visualization," in Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
[36] C. Lai, Z. Lin, R. Jiang, Y. Han, C. Liu, and X. Yuan, "Automatic annotation synchronizing with textual description for visualization," in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 2020, pp. 1–13.
[37] D. H. Kim, E. Hoque, and M. Agrawala, "Answering questions about charts and generating visual explanations," in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 2020, pp. 1–13.
[38] Y. Ma, A. K. Tung, W. Wang, X. Gao, Z. Pan, and W. Chen, "ScatterNet: A deep subjective similarity model for visual analysis of scatterplots," IEEE TVCG, vol. 26, no. 3, pp. 1562–1576, 2018.
[39] S. Van der Walt, J. L. Schönberger, J. Nunez-Iglesias, F. Boulogne, J. D. Warner, N. Yager, E. Gouillart, and T. Yu, "scikit-image: image processing in Python," PeerJ, vol. 2, p. e453, 2014.
[40] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[41] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE CVPR. IEEE, 2009, pp. 248–255.
[42] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE CVPR, 2016, pp. 770–778.
[43] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE CVPR, 2016, pp. 2818–2826.
[44] L. Bottou, "Stochastic gradient descent tricks," in Neural Networks: Tricks of the Trade. Springer, 2012, pp. 421–436.
[45] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[46] F. Chollet et al., "Keras," https://github.com/keras-team/keras, 2015.
[47] J. Serra, "Image analysis and mathematical morphology," 1982.
[48] B. Caldwell, M. Cooper, L. G. Reid, G. Vanderheiden, W. Chisholm, J. Slatin, and J. White, "Web content accessibility guidelines (WCAG) 2.0," WWW Consortium (W3C), 2008.
[49] S. Leijnen and F. v. Veen, "The neural network zoo," in Multidisciplinary Digital Publishing Institute Proceedings, vol. 47, no. 1, 2020, p. 9.
[50] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[51] S. E. Palmer, K. B. Schloss, and J. Sammartino, "Visual aesthetics and human preference," Annual Review of Psychology, vol. 64, pp. 77–107, 2013.
[52] B. Alper, B. Bach, N. Henry Riche, T. Isenberg, and J.-D. Fekete, "Weighted graph comparison techniques for brain connectivity analysis," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2013, pp. 483–492.
[53] M. Okoe, R. Jianu, and S. Kobourov, "Node-link or adjacency matrices: Old question, new insights," IEEE TVCG, vol. 25, no. 10, pp. 2940–2952, 2018.
[54] M. Ghoniem, J.-D. Fekete, and P. Castagliola, "On the readability of graphs using node-link and matrix-based representations: a controlled experiment and statistical analysis," Information Visualization, vol. 4, no. 2, pp. 114–135, 2005.
[55] M. Berlingerio, D. Koutra, T. Eliassi-Rad, and C. Faloutsos, "NetSimile: A scalable approach to size-independent network similarity," arXiv preprint arXiv:1209.2684, 2012.
[56] Y. Wang, G. Baciu, and C. Li, "Smooth animation of structure evolution in time-varying graphs with pattern matching," in SIGGRAPH Asia 2017 Symposium on Visualization, ser. SA '17, New York, NY, USA, 2017.
[57] Z. Bylinskii, N. W. Kim, P. O'Donovan, S. Alsheikh, S. Madan, H. Pfister, F. Durand, B. Russell, and A. Hertzmann, "Learning visual importance for graphic designs and data visualizations," in Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, 2017, pp. 57–69.

Sicheng Song received his B.Eng. from Hangzhou Dianzi University, China, in 2019. He is working toward the Ph.D. degree with East China Normal University, Shanghai, China. His main research interests include information visualization and visual analysis.

Chenhui Li received his Ph.D. from the Department of Computing at Hong Kong Polytechnic University in 2018. He is an associate professor with the School of Computer Science and Technology at East China Normal University. He received the ICCI*CC Best Paper Award (2015) and the SIGGRAPH Asia Symposium on Visualization Best Paper Award (2017). He has served as a local chair of VINCI 2019. He works on the research of information visualization and computer graphics.

Yujing Sun received her B.Eng. from East China Normal University in 2020. She is working toward the Master's degree with East China Normal University, Shanghai, China. Her main research interests include information visualization and visual analysis.

Changbo Wang is a professor with the School of Computer Science and Technology, East China Normal University. He received his Ph.D. degree at the State Key Lab of CAD&CG of Zhejiang University in 2006. He was a postdoctoral researcher at the State University of New York in 2010. His research interests mainly include computer graphics, information visualization, visual analytics, etc. He is serving as the Young AE of Frontiers of Computer Science, and a PC member for several international conferences.