Studies in Computational Intelligence 643

Kun Ma
Ajith Abraham
Bo Yang
Runyuan Sun

Intelligent Web Data Management: Software Architectures and Emerging Technologies

Studies in Computational Intelligence

Volume 643

Series editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
About this Series
The series “Studies in Computational Intelligence” (SCI) publishes new developments
and advances in the various areas of computational intelligence, quickly and
with a high quality. The intent is to cover the theory, applications, and design
methods of computational intelligence, as embedded in the fields of engineering,
computer science, physics and life sciences, as well as the methodologies behind
them. The series contains monographs, lecture notes and edited volumes in
computational intelligence spanning the areas of neural networks, connectionist
systems, genetic algorithms, evolutionary computation, artificial intelligence,
cellular automata, self-organizing systems, soft computing, fuzzy systems, and
hybrid intelligent systems. Of particular value to both the contributors and the
readership are the short publication timeframe and the worldwide distribution,
which enable both wide and rapid dissemination of research output.

More information about this series at http://www.springer.com/series/7092


Kun Ma
Shandong Provincial Key Laboratory of Network Based Intelligent Computing,
School of Information Science and Engineering
University of Jinan
Jinan, Shandong
China

Bo Yang
Shandong Provincial Key Laboratory of Network Based Intelligent Computing,
School of Information Science and Engineering
University of Jinan
Jinan, Shandong
China

Ajith Abraham
Scientific Network for Innovation and Research Excellence
Machine Intelligence Research Labs (MIR Labs)
Auburn, WA
USA

Runyuan Sun
Shandong Provincial Key Laboratory of Network Based Intelligent Computing,
School of Information Science and Engineering
University of Jinan
Jinan, Shandong
China

Studies in Computational Intelligence
ISSN 1860-949X
ISSN 1860-9503 (electronic)
ISBN 978-3-319-30191-4
ISBN 978-3-319-30192-1 (eBook)
DOI 10.1007/978-3-319-30192-1

Library of Congress Control Number: 2016931841

© Springer International Publishing Switzerland 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by SpringerNature


The registered company is Springer International Publishing AG Switzerland
Preface

The goal of this book is to present methods of intelligent Web data management,
including novel software architectures and emerging technologies, and then to
validate these architectures using experimental data and real-world applications.
Extensibility mechanisms are also discussed. The book is organized around the
research findings of the authors over the past few years.
The contents of this book focus on four popular thematic categories of
intelligent Web data management: cloud computing, social networking, monitoring
and literature management. There are a number of applications in these areas, but
mature software architectures are lacking. Having participated in more than 20
software projects in the past 10 years, we have some interesting experience to share
with readers. This book therefore introduces some new intelligent Web data
management methods, including software architectures and emerging technologies.
The book is organized into four parts as detailed below.

Part I: Cloud Computing

Part I introduces intelligent Web data management in the area of cloud computing.
This part emphasizes some software architectures of cloud computing.
Chapter 1 deals with intelligent Web data management of multi-tenant data
middleware. This chapter introduces a transparent data middleware that supports
multi-tenancy. The approach is transparent to the developers of cloud applications.
Chapter 2 presents intelligent Web data management of a NoSQL data warehouse.
This chapter introduces a NoSQL data warehouse that addresses the problem of
building a data warehouse without redundant data and with a small amount of
storage space, using the MapReduce framework. The experiments show that the
resulting NoSQL data warehouse reduces data redundancy compared with the
document-with-timestamp and lifecycle-tag solutions.


Part II: Social Networking

Part II of this book introduces intelligent Web data management in the area of social
networking. This part emphasizes some software architectures for social
networking.
Chapter 3 presents intelligent Web data management of social question
answering. This chapter introduces a question answering system that aims to
improve the success ratio of the question answering process with a multi-tenant
architecture.
Chapter 4 deals with intelligent Web data management of content syndication
and recommendation. This chapter introduces a content syndication and
recommendation system. The experimental results show that the developed
architecture speeds up the search and synchronization process and provides a
friendly user experience.

Part III: Monitoring

Part III of this book introduces intelligent Web data management in the area of
monitoring. This part emphasizes some software architectures for intelligent
monitoring.
Chapter 5 presents intelligent Web data management of infrastructure and software
monitoring. This chapter introduces a lightweight, module-centralized and
aspect-oriented monitoring system. This framework performs end-to-end
measurements at the infrastructure and software levels in the cloud. It monitors
the quality of service (QoS) parameters of the Infrastructure as a Service (IaaS)
and Software as a Service (SaaS) layers in the form of plug-in bundles. The
experiments provide insight into the modules of cloud monitoring. Together, the
modules constitute the proposed framework, which is intended to improve
performance in hybrid clouds.
Chapter 6 deals with intelligent Web data management of WebSocket-based
real-time monitoring. This chapter introduces a WebSocket-based real-time
monitoring system for remote intelligent buildings. The monitoring experiments
show that the average latency of the developed WebSocket monitoring is generally
lower than that of the polling, FlashSocket and Socket solutions, and the storage
experiments show that our storage model has a low redundancy rate, small storage
space and low latency.

Part IV: Literature Management

Part IV of this book introduces intelligent Web data management in the area of
literature management. This part emphasizes some software architectures of
literature management.
Chapter 7 illustrates intelligent Web data management for literature validation.
This chapter introduces a literature validation system, which validates literature
using the author name from a third-party integrated system and the metadata from
a DOI content negotiation proxy. An analysis of the application’s effect shows
that the system can verify the authenticity of literature using the author name
from the system and the metadata from our DOI content negotiation proxy.
Chapter 8 presents intelligent Web data management for literature sharing. This
chapter introduces a bookmarklet-triggered unified literature sharing system. This
architecture simplifies literature sharing and academic exchange, which are
frequent and often necessary in scientific activities such as research, writing
chapters and dissertations, and preparing reports.
This book is written primarily for academic researchers who are interested in
intelligent Web data management of emerging software systems, and for software
architects who are interested in developing intelligent software architectures for
Web data management. It was also written with postgraduates studying Web data
management in mind. We assume basic familiarity with the concepts of Web data
management, but also provide pointers to sources of information to fill in the
background.
Many people have collaborated to shape the technical contents of this book. We
thank our colleagues for their wonderful feedback, which helped us to enhance
the quality of the manuscript. We also thank the editorial team of the Springer
series Studies in Computational Intelligence: Prof. Dr. Janusz Kacprzyk,
Dr. Thomas Ditzinger and Mr. Holger Schaepe, for their wonderful support in
publishing this book very quickly.
We hope readers will enjoy the contents, and we await feedback to further
improve the work.

Kun Ma
Ajith Abraham
Bo Yang
Runyuan Sun
Contents

Part I Cloud Computing

1 Intelligent Web Data Management of Multi-tenant Data Middleware
1.1 Introduction
1.1.1 Background
1.1.2 Challenges and Contributions
1.2 Related Work and Emerging Techniques
1.2.1 Software as a Service Maturity Model
1.2.2 Software as a Service Data Models
1.3 Requirements
1.3.1 Criteria of Multi-tenant Data Middleware
1.3.2 Requirements of Multi-tenant Data Middleware
1.4 Architecture
1.4.1 SQL Interceptor
1.4.2 SQL Parser
1.4.3 SQL Restorer
1.4.4 SQL Router
1.4.5 Data Node
1.4.6 Cache
1.4.7 Tenant Context
1.5 Evaluation
1.5.1 Cost Analysis
1.6 Discussion
1.6.1 Extensibility
1.6.2 Scalability
1.6.3 Disaster Recovery
1.7 Conclusions
References


2 Intelligent Web Data Management of NoSQL Data Warehouse
2.1 Introduction
2.1.1 Background
2.1.2 Challenges and Contributions
2.2 Related Works and Emerging Techniques
2.2.1 Slowly Changing Dimensions of RDBMS
2.2.2 Slowly Changing Dimensions of NoSQL
2.2.3 MapReduce Framework
2.3 Requirements
2.4 Architecture
2.4.1 Deployment Architecture
2.4.2 Capture-Map-Reduce Procedure
2.4.3 Log-Based Capture
2.4.4 MapReduce
2.5 Evaluation
2.5.1 Redundancy Rate
2.5.2 Storage Space
2.5.3 Query Time of Track of History
2.5.4 Execution Time of Creation
2.6 Discussion
2.6.1 Effective Lifecycle Tag
2.6.2 Cell with Effective Lifecycle Tag
2.6.3 Extreme Data Storage Principles
2.7 Conclusions
References

Part II Social Networking


3 Intelligent Web Data Management of Social Question Answering
3.1 Introduction
3.1.1 Background
3.1.2 Challenges and Contributions
3.2 Related Work and Emerging Techniques
3.2.1 Social Question Answering
3.2.2 Multi-tenancy
3.2.3 NoSQL Storage
3.2.4 RESTful Web Service
3.3 Requirements
3.4 Architecture

3.4.1 Helper Recommendation Algorithm
3.4.2 Help Feed Propagation Method
3.4.3 Multi-tenancy
3.4.4 Data Customizing of Tenants
3.4.5 RESTful Web Service API
3.5 Evaluation
3.5.1 High Success Ratio
3.5.2 Propagation Time
3.5.3 Propagation Space
3.5.4 Data Customizing of Tenants
3.6 Discussions
3.6.1 Preprocessing of Successful Knowledge Base
3.6.2 Expert Discovery
3.7 Conclusions
References

4 Intelligent Web Data Management of Content Syndication and Recommendation
4.1 Introduction
4.1.1 Background
4.1.2 Challenges and Contributions
4.2 Related Work and Emerging Techniques
4.2.1 RSS Specification
4.2.2 RSS Products
4.2.3 Feed Synchronization
4.2.4 RSS Recommendation
4.3 Requirements
4.4 Architecture
4.4.1 Source Listener
4.4.2 Feed Search
4.4.3 Feed Recommendation
4.4.4 OAuth2-Authorization RESTful Feed Sharing APIs
4.5 Evaluation
4.5.1 Low Latency of Search
4.5.2 Incremental Synchronization
4.5.3 User Experience
4.6 Discussions
4.6.1 RSS Feeds Storage
4.6.2 Interaction with Social Networking Website
4.7 Conclusions
References

Part III Monitoring


5 Intelligent Web Data Management of Infrastructure and Software Monitoring
5.1 Introduction
5.1.1 Background
5.1.2 Challenges and Contributions
5.2 Related Work and Emerging Techniques
5.2.1 Cloud Monitoring Categories
5.2.2 Cloud Monitoring Methods
5.2.3 Cloud Monitoring Methods
5.2.4 Cloud Web Service Monitoring in the Cloud
5.2.5 Aspect-Oriented Programming
5.3 Requirements
5.3.1 Hierarchy of Resource Entity Models
5.3.2 Requirements of Monitoring
5.4 Architecture
5.4.1 Cloud Monitoring Architecture
5.4.2 Manager-Agent Architecture
5.5 Evaluation
5.5.1 Virtualization Monitoring
5.5.2 Service Availability Monitoring
5.5.3 Performance Monitoring Module via SNMP
5.5.4 Application Monitoring
5.5.5 User Experience Tracker
5.5.6 Over-Commit Monitoring
5.6 Discussions
5.6.1 Monitoring Client of AOP Service
5.6.2 Monitoring Server of AOP Service
5.7 Conclusions
References

6 Intelligent Web Data Management of WebSocket-Based Real-Time Monitoring
6.1 Introduction
6.1.1 Background
6.1.2 Challenges and Contributions
6.2 Related Work and Emerging Techniques
6.2.1 Networking of Intelligent Building
6.2.2 Classical Monitoring Methods
6.2.3 Storage of Monitoring Data
6.3 Requirements
6.4 Architecture
6.4.1 Overview of System Architecture
6.4.2 WSN of Intelligent Buildings

6.5 Evaluation
6.5.1 Fast Loading
6.5.2 Low Latency
6.5.3 High Concurrency
6.5.4 Low Consumption
6.6 Discussions
6.6.1 Redundancy Rate
6.6.2 Storage Space
6.6.3 Query Time
6.7 Conclusions
References

Part IV Literature Management


7 Intelligent Web Data Management of Literature Validation
7.1 Introduction
7.1.1 Background
7.1.2 Challenges and Contributions
7.2 Related Work and Emerging Techniques
7.2.1 Literature Bibliography Acquisition
7.2.2 DOI Content Negotiation and Resolver
7.2.3 Bibliographic Model
7.3 Requirements
7.4 Architecture
7.4.1 Bibliography Acquisition Architecture
7.4.2 DOI Content Negotiation Proxy
7.5 Evaluation
7.5.1 DOI Content Service
7.5.2 DOI Resolver Proxy
7.5.3 DOI Presentation
7.5.4 BibTeX Parser
7.5.5 Bibliography Validation Service
7.5.6 BibModel Transformation Engine
7.5.7 Terminal UI
7.6 Discussions
7.6.1 Bibliographic Model: BibModel
7.6.2 Transformation from BibModel to Bibliographic Records
7.7 Conclusions
References

8 Intelligent Web Data Management of Literature Sharing
8.1 Introduction
8.1.1 Background
8.1.2 Challenges and Contributions

8.2 Related Architecture and Emerging Techniques
8.2.1 Emerging Web Technologies
8.2.2 Literature Sharing
8.3 Requirements
8.4 Architecture
8.4.1 Hierarchical Model of Bookmarklet
8.4.2 System Architecture
8.4.3 Literature Sharing Process
8.5 Evaluation
8.5.1 Bookmarklet Versus Third-Party Platform
8.5.2 WebSocket Versus FlashSocket
8.5.3 NoSQL Versus RDBMS
8.6 Discussions
8.6.1 Bookmarklet
8.6.2 Cloud DOI Resolver
8.6.3 Cloud Storage Engine
8.6.4 Scopus API
8.6.5 Academic Exchange WebSocket Server
8.6.6 Sidebar
8.6.7 Academic Exchange WebSocket Client
8.7 Conclusions
References

Part I
Cloud Computing
Chapter 1
Intelligent Web Data Management
of Multi-tenant Data Middleware

1.1 Introduction

1.1.1 Background

Software as a service (SaaS) is a software delivery model in which software is
deployed as a hosted service and accessed over the Internet [1]. SaaS has become a
widely discussed topic and is expected to have a major impact on the software
industry. According to a market report from International Data Corporation (IDC),
SaaS will grow at a compound annual rate of 26.6 % from 2014 to 2018 [2].
Today, SaaS applications are expected to take advantage of the benefits of
centralization through a single-instance as well as multi-tenant architecture, and to
provide a feature-rich experience competitive with comparable applications.
A typical SaaS application is offered either directly by the vendor or by a service
provider. In contrast to the one-time licensing model commonly used for traditional
software, SaaS application access is frequently sold using a subscription model, with
customers paying an ongoing fee to use the application. Fee structures vary from
application to application. For example, some providers charge a flat rate for
unlimited access to some or all of the application's features, while others charge
varying rates that are based on usage.
Multi-tenancy is the paradigm that most clearly distinguishes SaaS from
traditional software [3]. Multi-tenancy refers to a principle in software architecture
where a single instance of the software runs on a server, serving multiple tenants. It
contrasts with multi-instance architectures where separate software instances
operate on behalf of different client organizations. In cloud computing, the meaning
of multi-tenant architecture has broadened because of new service models that take
advantage of virtualization and remote access. A SaaS provider, for example, can
run one instance of its application on one instance of a database and provide Web
access to multiple tenants. In such a scenario, each tenant’s data is isolated and
remains invisible to other tenants.

© Springer International Publishing Switzerland 2016 3


K. Ma et al., Intelligent Web Data Management: Software Architectures
and Emerging Technologies, Studies in Computational Intelligence 643,
DOI 10.1007/978-3-319-30192-1_1

Multi-tenant applications are regarded as a promising segment of enterprise
applications, such as enterprise resource planning, office automation, and
e-business [4]. However, it is difficult to manage the Web data of multi-tenant
applications. In this chapter, we introduce intelligent Web data management of a
multi-tenant application using a transparent data middleware.

1.1.2 Challenges and Contributions

Current multi-tenant data techniques face two challenges. First, the data
middleware should be transparent to developers; that is, a legacy application
should be able to migrate to a multi-tenant one with minimal modification of its
source code. Second, the cost of the migration and its impact on database
performance should be minimized.
To address these challenges, we introduce the architecture of a transparent data
middleware that supports multi-tenancy, and discuss this architecture in detail.
The contributions of this data middleware are twofold. First, the data middleware
is transparent to developers: a legacy application can be made to support
multi-tenancy without re-architecting the entire system from the ground up.
Second, auxiliary optimizations are added to the architecture to make the data
middleware more extensible and scalable.

1.2 Related Work and Emerging Techniques

This section introduces related work and techniques on multi-tenant data
middleware.

1.2.1 Software as a Service Maturity Model

Broadly speaking, SaaS application maturity can be expressed using a model with
four distinct levels [5]. Each level is distinguished from the previous one by the
addition of scalability, multi-tenancy, and configuration. Figure 1.1 shows the SaaS
maturity model.
Level 1: Ad Hoc/Custom
The first level of maturity is similar to the traditional application service provider
(ASP) model of software delivery, dating back to the 1990s. At this level, each
tenant has its own customized version of the hosted application, and runs its own
instance of the application on the host’s servers. Architecturally, software at this
maturity level is very similar to traditionally-sold line-of-business software.
Fig. 1.1 SaaS maturity model. Panels: (1) each tenant on its own customized
instance; (2) each tenant on its own instance; (3) all tenants on a single
instance; (4) all tenants behind a tenant load balancer in front of several
instances

Typically, traditional client-server applications can be moved to a SaaS model at
the first level of maturity, with relatively little development effort, and without
re-architecting the entire system from the ground up. Although this level offers few
of the benefits of a fully mature SaaS solution, it does allow vendors to reduce costs
by consolidating server hardware and administration.
Level 2: Configurable
At the second level of maturity, the vendor hosts a separate instance of the appli-
cation for each tenant. Whereas in the first level each instance is individually
customized for the tenant, at this level, all instances use the same code imple-
mentation, and the vendor meets tenants’ needs by providing detailed configuration
options that allow the tenant to change how the application looks and behaves to its
users. Despite being identical to one another at the code level, each instance
remains wholly isolated from all the others.
Moving to a single code base for all of a vendor’s tenants greatly reduces a SaaS
application’s service requirements, because any changes made to the code base can
be easily provided to all of the vendor’s tenants at once, thereby eliminating the
need to upgrade or slipstream individual customized instances. However, reposi-
tioning a traditional application as SaaS at the second maturity level can require
significantly more re-architecting than at the first level, if the application has been
designed for individual customization rather than configuration metadata. Similarly
to the first maturity level, the second level requires that the vendor provide sufficient
hardware and storage to support a potentially large number of application instances
running concurrently.

Level 3: Configurable and Multi-tenant


At the third level of maturity, the vendor runs a single instance that serves every
tenant, with configurable metadata providing a unique user experience and feature
set for each one. Authorization and security policies ensure that each tenant's data is
kept logically separate from that of other tenants. From the end user's perspective,
there is no indication that the application instance is being shared among
multiple tenants.
This mature model eliminates the need to provide server space for as many
instances as the vendor has customers, allowing for much more efficient use of
computing resources than the second level, which translates directly to lower costs.
A significant disadvantage of this approach is that the scalability of the application
is limited. Unless partitioning is used to manage database performance, the appli-
cation can be scaled only by moving it to a more powerful server (scaling up), until
diminishing returns make it impossible to add more power cost-effectively.
Level 4: Scalable, Configurable, and Multi-tenant
At the fourth and final level of maturity, the vendor hosts multiple tenants on a
load-balanced farm of identical instances, with each tenant’s data kept separate, and
with configurable metadata providing a unique user experience and feature set for
each tenant. A SaaS system is scalable to an arbitrarily large number of tenants,
because the number of servers and instances on the back end can be increased or
decreased as necessary to match demand, without requiring additional
re-architecting of the application, and changes or fixes can be rolled out to thou-
sands of tenants as easily as a single tenant.

1.2.2 Software as a Service Data Models

The distinction between shared data and isolated data is not binary. Instead, it is
more of a continuum, with many variations that are possible between the two
extremes. Depending on the balance between isolation and sharing, there are
mainly three SaaS data models [6–8]. Figure 1.2 shows the current SaaS data
models.
Model A: Separate application and separate database
This model uses a separate application and a separate database for each tenant,
which is the simplest approach to the data model. Unfortunately, this approach
tends to lead to higher costs for maintaining equipment and backing up tenant
data. The number of tenants that can be housed on a given database server is
limited by the number of databases that the server can support.
Model B: Shared application and separate database
In this model, computing resources and application code are generally shared
between all the tenants on a server, but each tenant has its own set of data that
remains logically isolated from data that belongs to all other tenants. Metadata
associates each database with the correct tenant, and database security prevents any
tenant from accidentally or maliciously accessing other tenants' data.
Giving each tenant its own database makes it easy to extend the application's
data model to meet tenants' individual needs, and restoring a tenant's data from
backups in the event of a failure is a relatively simple procedure. Unfortunately, this
approach tends to lead to higher costs for maintaining equipment and backing up
tenant data. Hardware costs are also higher than they are under alternative
approaches, as the number of tenants that can be housed on a given database server
is limited by the number of databases that the server can support.

Fig. 1.2 Current SaaS solutions, ordered from isolated to shared: (a) separate
application, separate database; (b) shared application, separate database;
(c) shared application, shared database, with (c 1) separate schema and
(c 2) shared schema

Model C: Shared application and shared database
At a finer granularity, the shared data model comes in two variants: separate
schema and shared schema.
Model C 1: Shared database and separate schema
This data model involves housing multiple tenants in the same database, with each
tenant having its own set of tables that are grouped into a schema created
specifically for the tenant. Figure 1.3 shows this data model. The provisioning
database creates a discrete set of tables for the tenant and associates it with the
tenant's own schema. Although the tenants' data reside in the same database, each
tenant has a discrete set of tables, views, stored procedures, and triggers. Like the
isolated approach, the separate schema approach is relatively easy to implement.
This approach offers a moderate degree of logical data isolation for
security-conscious tenants, though not as much as a completely isolated system
would. It can support a larger number of tenants per database server. A significant
drawback of the separate schema approach is that tenant data is harder to restore in
the event of a failure. If each tenant has its own database, restoring a single tenant's
data means simply restoring the database from the most recent backup. With a
separate schema application, restoring the entire database would mean overwriting
the data of every tenant on the same database with backup data, regardless of
whether each one has experienced any loss or not. Therefore, to restore a single
customer's data, the database administrator may have to restore the database to a
temporary server, and then import the customer's tables into the production server.

Fig. 1.3 Shared database and separate schema (Tenant 1, Tenant 2, and Tenant 3
each own a schema in the same database)

Model C 2: Shared database and shared schema
A second approach involves using the same database and the same schema, which is
composed of a set of tables, to host multiple tenants' data. Figure 1.4 shows this data
model. A given table can include records from multiple tenants stored in any order.
Therefore, a tenant ID column is added to associate every record with the
appropriate tenant.

Fig. 1.4 Shared database and shared schema (every table carries a tenantID
column, with values such as 1, 2, and 3)

Of the two approaches explained here, the shared schema approach has the
lowest hardware and backup costs, because it allows you to serve the largest
number of tenants per database server. However, because multiple tenants share the
same database tables, this approach may incur additional development effort in the
area of security to ensure that tenants can never access other tenants' data, even in
the event of unexpected bugs or attacks. In this context, a multi-tenant data
middleware is designed to minimize this development work. That is the motivation
of the proposed multi-tenant data middleware.
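
The shared schema idea can be illustrated with a minimal sketch. It assumes an in-memory SQLite database, and the table name `orders`, its columns, and the helper names are hypothetical; only the tenantID discriminator column comes from the text above.

```python
import sqlite3


def build_shared_schema_db():
    """Create an in-memory database with one table shared by all tenants."""
    conn = sqlite3.connect(":memory:")
    # tenantID is the discriminator column; the other columns are made up.
    conn.execute(
        "CREATE TABLE orders (tenantID INTEGER NOT NULL, id INTEGER, item TEXT)"
    )
    rows = [(1, 1, "pen"), (2, 1, "desk"), (1, 2, "ink"), (3, 1, "lamp")]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def orders_for_tenant(conn, tenant_id):
    """Return only the rows that belong to the given tenant."""
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE tenantID = ? ORDER BY id",
        (tenant_id,),
    )
    return cur.fetchall()


conn = build_shared_schema_db()
print(orders_for_tenant(conn, 1))
```

Every query in this sketch has to remember the tenantID predicate by hand, which is the development burden that a data middleware is meant to take over.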

1.3 Requirements

1.3.1 Criteria of Multi-tenant Data Middleware

To define what might be called a mature multi-tenant data middleware, we must
introduce some additional criteria. From a data architect's point of view, there are
three key differentiators that separate a well-designed multi-tenant data middleware
from a poorly designed one. A well-designed multi-tenant data middleware is
scalable, multi-tenant, and configurable [5].
Scaling the database means maximizing concurrency and using database
resources more efficiently. For example, optimizing data storage, caching reference
data, and partitioning large databases are all acceptable ways to scale.
Multi-tenancy may be the most significant paradigm shift that an architect
accustomed to designing isolated, single-tenant applications has to make. For
example, when a user at one company accesses a customer database by using an
ERP application service, the application instance that the user connects to may be
accommodating users from dozens or even hundreds of other companies. This
requires an architecture that maximizes the sharing of data resources across tenants,
but that is still able to differentiate data belonging to different customers.
For configurability, the challenge for the architect is to ensure that the task of
configuring databases is simple and easy for the customers, without incurring extra
development or operation costs for each configuration. Configuration permits the
personalization of tenants' data. For example, the structure of the shared
multi-tenant database might differ slightly from tenant to tenant.

1.3.2 Requirements of Multi-tenant Data Middleware

We have the following requirements while designing a multi-tenant data
middleware [9].
Transparency: We want to enable legacy applications to support multi-tenancy
with minimal modification of the source code.
Extensibility: We want to ensure extensibility to support the personalization of
tenants' data.
Scalability: We want the middleware to be able to cache tenants' data to
optimize the architecture.
Disaster recovery: We want the database to be recoverable in the case of
any unexpected disaster.

1.4 Architecture

In this section, we propose a transparent multi-tenant data middleware, which is
shown in Fig. 1.5. The architecture of the data middleware comprises an SQL
interceptor, an SQL parser, an SQL restorer, and an SQL router. Additional
components (such as the data node and the cache) help optimize this architecture.
The multi-tenant data middleware is a kind of data proxy that supports
multi-tenancy without storing any physical data, which eases the transition. It is
transparent to the developers, exposing a logical set of tables and views similar to
that of the physical database. Therefore, the legacy application can connect to the
multi-tenant data middleware transparently. The multi-tenant data middleware
provides logical data isolation for tenants with higher security demands.

1.4.1 SQL Interceptor

The SQL interceptor intercepts SQL statements and passes them to the SQL parser.
A simple implementation of the SQL interceptor uses a JDBC proxy, which captures
all SQL statements at the database driver layer.
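
As a rough analogue of a driver-level proxy, the sketch below wraps a Python DB-API connection instead of a JDBC driver. The class `InterceptingConnection` and the callback `record` are hypothetical names introduced for illustration; the callback stands in for handing the statement to the SQL parser.

```python
import sqlite3


class InterceptingConnection:
    """Hypothetical proxy around a DB-API connection that sees every SQL string."""

    def __init__(self, inner, on_sql):
        self._inner = inner
        self._on_sql = on_sql  # stand-in for "hand the SQL to the parser"

    def execute(self, sql, params=()):
        # The hook may return the statement unchanged or a rewritten one.
        rewritten = self._on_sql(sql)
        return self._inner.execute(rewritten, params)

    def __getattr__(self, name):
        # Delegate everything else (commit, close, ...) to the real connection.
        return getattr(self._inner, name)


captured = []


def record(sql):
    """Pass-through hook that only records what the application sent."""
    captured.append(sql)
    return sql


proxy = InterceptingConnection(sqlite3.connect(":memory:"), record)
proxy.execute("CREATE TABLE t (a INTEGER)")
proxy.execute("INSERT INTO t VALUES (?)", (7,))
rows = proxy.execute("SELECT a FROM t").fetchall()
proxy.commit()
```

Because the application only ever talks to the proxy, the interception stays invisible to application code, which is the transparency property the middleware aims for.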

1.4.2 SQL Parser

SQL parser is used to parse the fine-grained predicates of SQL statements, such as
select predicates, aggregation predicates, where predicates, order predicates, and
group predicates.
Fig. 1.5 Architecture of multi-tenant data middleware (tenants send data requests
to the SQL interceptor; the SQL passes through the SQL parser, SQL restorer, and
SQL router; the router checks the cache for a hit or miss and reads from the data
node on a miss; within the data node, a master replicates to slaves)

1.4.3 SQL Restorer

The SQL restorer rewrites the parsed SQL into a new SQL statement that runs
against the physical shared data. The new SQL is reorganized from the original SQL
predicates and the tenantID discriminator. The restoring process is denoted as a
mapping

$$\mathrm{sql}(u) \rightarrow \mathrm{pre}(\mathit{TenantID}) \cup T(\mathrm{sql}(u), \mathit{TenantID}) \cup \mathrm{post}(\mathit{TenantID})$$

where pre(TenantID) is the pre-personalized operation, post(TenantID) is the
post-personalized operation, and T is the transformation function. Figure 1.6 gives
examples of the transformation function for select, insert, update, and delete
statements, where tenantID indicates the tenant ID column, and V means the value
of tenantID in the context.

Fig. 1.6 Restoring transformation function

SELECT
  original: SELECT * FROM table1 where … ORDER BY … GROUP BY …
  restored: SELECT * FROM table1 where … and tenantID=V ORDER BY tenantID ASC, … GROUP BY …
INSERT
  original: INSERT INTO table1 (A1, A2, ..., An) VALUES(V1, V2, ..., Vn)
  restored: INSERT INTO table1 (tenantID, A1, A2, ..., An) VALUES(V, V1, V2, ..., Vn)
UPDATE
  original: UPDATE tables set … where …
  restored: UPDATE tables set … where … and tenantID=V
DELETE
  original: DELETE from table1 where …
  restored: DELETE from table1 where … and tenantID=V
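
A toy version of the transformation function T can make the rewriting concrete. This is a sketch under strong assumptions, not the middleware's implementation: it handles only simple single-table statements through regular expressions rather than a real SQL parser, it inlines an integer tenant ID, it does not add tenantID to the ORDER BY clause, and the function name `restore_sql` is hypothetical. It also wraps the original WHERE condition in parentheses so that an OR in the original predicate cannot bypass the tenant filter.

```python
import re


def restore_sql(sql, tenant_id):
    """Toy T(sql(u), TenantID) for simple single-table statements."""
    v = int(tenant_id)  # integers only, so inlining the value is safe here
    text = sql.strip().rstrip(";")
    verb = text.split(None, 1)[0].upper()

    if verb == "INSERT":
        # INSERT INTO t (A1, ...) VALUES (V1, ...): prepend tenantID and V.
        m = re.match(
            r"(?is)^(insert\s+into\s+\w+\s*)\((.*?)\)(\s*values\s*)\((.*)\)$",
            text,
        )
        if m is None:
            raise ValueError("unsupported INSERT form")
        head, cols, values_kw, vals = m.groups()
        return f"{head}(tenantID, {cols}){values_kw}({v}, {vals})"

    if verb in ("SELECT", "UPDATE", "DELETE"):
        # Split off a trailing GROUP BY / ORDER BY so the filter lands in WHERE.
        m = re.search(r"(?i)\s+(group\s+by|order\s+by)\s+", text)
        body, tail = (text[:m.start()], text[m.start():]) if m else (text, "")
        where = re.search(r"(?i)\swhere\s", body)
        if where:
            cond = body[where.end():]
            body = f"{body[:where.end()]}({cond}) AND tenantID = {v}"
        else:
            body = f"{body} WHERE tenantID = {v}"
        return body + tail

    raise ValueError(f"unsupported statement: {verb}")


print(restore_sql("SELECT * FROM table1 WHERE a > 1 ORDER BY a", 7))
```

A production restorer would work on the parsed predicates and bind V as a parameter; string matching like this breaks on subqueries, joins, and literals that contain SQL keywords.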

1.4.4 SQL Router

The SQL router sends the reorganized SQL requests to the data node or the cache.
The cache is deployed to accelerate reads. If the data of a queried column hits the
cache, it is obtained from the cache. If the data of a queried column misses the
cache, it is obtained from the master/slave data nodes.

1.4.5 Data Node

Read/write splitting techniques are applied to improve the scalability and
performance of the database. The basic concept is that a master data node handles
the transactional operations, and slaves handle the non-transactional queries.
Whether a statement is transactional is determined by parsing the SQL.
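
Read/write splitting can be sketched as a small routing function. The sketch below classifies a statement by its leading keyword only, which is an assumption made for brevity; the set `WRITE_VERBS`, the class `ReadWriteRouter`, the round-robin choice of slave, and the node names are all hypothetical.

```python
import itertools

# Statements that must go to the master in this sketch (an assumed list).
WRITE_VERBS = {
    "INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP",
    "BEGIN", "COMMIT", "ROLLBACK",
}


class ReadWriteRouter:
    """Send transactional statements to the master and queries to slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in WRITE_VERBS or self._slaves is None:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
```

Calling `router.route("UPDATE t SET a = 1")` returns the master, while successive SELECT statements alternate between the two slaves. A query issued inside an open transaction would also need to go to the master, which this keyword check does not track.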
The master/slave nodes are applied in the architecture. Replication enables data
from the master to be replicated to one or more slaves. Replication is based on the
master server keeping track of all changes to its databases in its binary log. The
binary log serves as a written record of all events that modify database structure or
data from the moment the server was started. The before and after images are both
recorded in the binary log with low impact on the performance of the database.
Each slave that connects to the master requests a copy of the binary log. That is, it
pulls the data from the master, rather than the master pushing the data to the slave.
The slave also executes the events from the binary log that it receives. This has the
effect of repeating the original changes just as they were made on the master. Tables
are created or their structure modified, and data is inserted, deleted, and updated
according to the changes that were originally made on the master.

Fig. 1.7 Architecture of the cache (columns flow from the data node to the cache
through a log-based listener; an access counter tracks column usage)
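
The pull-based replication described above can be mimicked with an in-memory model. This is only an illustration of the idea: the `Master` and `Slave` classes, the event tuples, and the key/value data are hypothetical and far simpler than a real binary log.

```python
class Master:
    """Keeps its data plus an ordered log of every change."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of change events

    def put(self, key, value):
        self.data[key] = value
        self.binlog.append(("put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.binlog.append(("delete", key, None))


class Slave:
    """Pulls log events from the master and replays them in order."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.position = 0  # index of the next unread log event

    def pull(self):
        events = self.master.binlog[self.position:]
        for op, key, value in events:
            if op == "put":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.position += len(events)
        return len(events)


master = Master()
slave = Slave(master)
master.put("a", 1)
master.put("b", 2)
master.delete("a")
applied = slave.pull()
```

After the pull, the slave holds the same data as the master, and a second pull applies nothing because the slave remembers its position in the log.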

1.4.6 Cache

The architecture of the cache, which is part of the data middleware, is shown in
Fig. 1.7. In our solution, we adopt the cache to optimize the architecture of the
data middleware.
Step 1: Log-based replication from the data node to the cache
Changes to the data node are parsed from the binary log. This process is called
log-based replication in the presence of updates. For a new type of data node, the
only component that needs to change in this architecture is the concrete parser of
the binary log. For an insert operation on the data node, the new data of a column
is inserted into the cache if that column exists in the cache. For a delete operation,
the corresponding cached data is deleted if the column exists in the cache. For an
update operation, the data of the column is updated in the cache if the column
exists there.
Step 2: Cache listener
The access counter observes how often each column in the cache is used. If the
current column access frequency is lower than the average column access
frequency, the column access frequency count is decreased. If the current column
access frequency minus the average falls below a negative threshold, the data of
this column is removed from the cache.
Step 3: Cache replacement strategies
If the data of a queried column hits the cache, the results are returned from the
column-oriented NoSQL cache. Otherwise, the results are returned from the
original data node. If the current column access frequency minus the average
exceeds a threshold, the data of this column is dynamically loaded from the data
node into the cache. On both hits and misses, the column access frequency count is
updated.
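
The threshold rules of steps 2 and 3 can be sketched as follows. The sketch makes several assumptions that the text does not fix: access frequency is taken to be a plain access count, the average is computed over all columns seen so far, and the class `ColumnCache` with its parameters `promote_at` and `evict_at` is hypothetical. The log-based listener of step 1 is left out.

```python
from collections import Counter


class ColumnCache:
    """Promote hot columns into the cache and evict cold ones."""

    def __init__(self, data_node, promote_at=2.0, evict_at=-2.0):
        self.data_node = data_node  # dict: column name -> list of values
        self.cache = {}
        self.counts = Counter()     # access count per column
        self.promote_at = promote_at
        self.evict_at = evict_at

    def _gap(self, column):
        """Current column access count minus the average over all columns."""
        average = sum(self.counts.values()) / len(self.counts)
        return self.counts[column] - average

    def read(self, column):
        self.counts[column] += 1  # updated on hits and misses alike
        hit = column in self.cache
        values = self.cache[column] if hit else self.data_node[column]
        if not hit and self._gap(column) > self.promote_at:
            self.cache[column] = list(values)  # load a hot column
        for cached in list(self.cache):
            if self._gap(cached) < self.evict_at:
                del self.cache[cached]         # drop a cold column
        return values, hit


node = {"name": ["ann", "bob"], "city": ["oslo", "rome"]}
cache = ColumnCache(node, promote_at=1.0, evict_at=-1.0)
```

With these settings, a column is served from the data node until its count runs more than 1.0 ahead of the average, and it is evicted again once it falls more than 1.0 behind.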

1.4.7 Tenant Context

In order to isolate the tenants' data, the tenant ID column is used as a
discriminator along with the context session. The legacy application is revised to
pass the tenantID discriminator once a user has been authenticated successfully.
For example, if the application is built on Spring Security, the acquisition of
tenantID can be configured as an authentication rule without any modification of
the source code.
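
One way to carry a tenant ID alongside a session is a context-local variable. The sketch below uses Python's `contextvars` module rather than Spring Security; the names `tenant_session` and `current_tenant` are hypothetical, and the authentication step that would supply the tenant ID is assumed to have happened already.

```python
import contextvars
from contextlib import contextmanager

# Holds the tenant ID of the current session; unset outside a session.
_current_tenant = contextvars.ContextVar("current_tenant")


@contextmanager
def tenant_session(tenant_id):
    """Bind a tenant ID for the duration of a session, then restore the old one."""
    token = _current_tenant.set(tenant_id)
    try:
        yield
    finally:
        _current_tenant.reset(token)


def current_tenant():
    """Return the tenant ID bound to the current session."""
    try:
        return _current_tenant.get()
    except LookupError:
        raise RuntimeError("no tenant bound to this session") from None


with tenant_session(42):
    inside = current_tenant()
```

A rewriting layer could call `current_tenant()` to obtain the value V for the tenantID predicate, and it fails loudly when no tenant has been bound, which is safer than silently running an unfiltered query.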

1.5 Evaluation

1.5.1 Cost Analysis

We evaluate the proposed multi-tenant data middleware using cost analysis [10].
The goal is to find the cost-minimal solution for the considered multi-tenant
application. Different reengineering measures of varying complexity are necessary
for fulfilling this requirement.
The cost of the different multi-tenant data models is mainly composed of two
parts: the initial reengineering cost and the monthly ongoing cost. The break-even
point of a data model is calculated as:

$$\mathit{TimeToBreakEven} = \frac{\mathit{InitRECosts}}{\sum \mathit{MonthlyOngoingCosts}}$$

where InitRECosts is the initial reengineering cost, and MonthlyOngoingCosts is
the monthly ongoing cost. Therefore, every reengineering activity that reduces the
monthly ongoing costs will sooner or later be amortized.
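The initial reengineering cost of the different data models is calculated as follows.

$$\begin{aligned}
\mathit{InitCost}_{shared} &= \mathit{Cost}_{App} + \mathit{Cost}_{Database} + n \times (\mathit{Cost}_{AppTenant} + \mathit{Cost}_{DatabaseTenant}) \\
\mathit{InitCost}_{separate} &= n \times (\mathit{Cost}_{App} + \mathit{Cost}_{Database}) \\
\mathit{InitCost}_{middleware} &= \mathit{Cost}_{App} + \mathit{Cost}_{DataMiddleware} + n \times \mathit{Cost}_{AppTenant}
\end{aligned}$$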

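The monthly ongoing cost of the different data models is calculated as follows.

$$\begin{aligned}
\mathit{MonthlyCost}_{shared} &= \mathit{MonthlyCost}_{Database} + n \times (\mathit{MonthlyCost}_{AppTenant} + \mathit{MonthlyCost}_{DatabaseTenant}) \\
\mathit{MonthlyCost}_{separate} &= n \times (\mathit{MonthlyCost}_{Database} + \mathit{MonthlyCost}_{App}) \\
\mathit{MonthlyCost}_{middleware} &= \mathit{MonthlyCost}_{DataMiddleware} + n \times \mathit{MonthlyCost}_{AppTenant}
\end{aligned}$$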
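
The cost formulas above can be evaluated directly. The sketch below mirrors them term by term, where n is the number of tenants; the function names, the linear `cumulative_cost` helper, and every numeric value are hypothetical and chosen only to show the arithmetic, not to reflect measured costs.

```python
def init_costs(n, app, database, data_middleware, app_tenant, database_tenant):
    """Initial reengineering cost of each data model for n tenants."""
    return {
        "shared": app + database + n * (app_tenant + database_tenant),
        "separate": n * (app + database),
        "middleware": app + data_middleware + n * app_tenant,
    }


def monthly_costs(n, app, database, data_middleware, app_tenant, database_tenant):
    """Monthly ongoing cost of each data model for n tenants."""
    return {
        "shared": database + n * (app_tenant + database_tenant),
        "separate": n * (database + app),
        "middleware": data_middleware + n * app_tenant,
    }


def cumulative_cost(initial, monthly, months):
    """Total cost after a number of months, assuming a linear cost function."""
    return initial + monthly * months


# Made-up cost units for ten tenants.
init = init_costs(10, app=100, database=50, data_middleware=80,
                  app_tenant=5, database_tenant=3)
monthly = monthly_costs(10, app=10, database=8, data_middleware=9,
                        app_tenant=1, database_tenant=1)
after_a_year = {
    model: cumulative_cost(init[model], monthly[model], 12) for model in init
}
```

With these invented inputs the separate model is the most expensive both up front and per month, but the ranking depends entirely on the inputs chosen.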

If the service lived forever, the data model with the lowest incremental
monthly ongoing cost would always be the best. However, we rather assume that a
service gets replaced sooner or later. Thus, the question is whether the time period
is sufficiently long to justify huge investments (i.e. service usage time > time to
break even).

Fig. 1.8 Cost for different multi-tenant data models (cost plotted against elapsed
time for the shared, data middleware, and separate models)

Figure 1.8 shows the empirical cost of the different multi-tenant models. It
indicates that a higher initial reengineering cost is reasonable for the sake of low
monthly ongoing costs. Note that these cost functions might not be linear in
reality. Due to the relative complexity of developing a shared architecture,
applications that are optimized for a shared approach tend to require a larger
development effort than applications that are designed using an isolated approach.
The monthly ongoing costs of shared approaches tend to be lower, since they can
support more tenants per server. Our approach using the multi-tenant data
middleware is a transparent and loosely coupled solution for SaaS applications.
The cost of our data middleware lies between those of the isolated and shared data
models at the beginning, but it approaches that of the shared model in the long
term.

1.6 Discussion

1.6.1 Extensibility

Requirements vary from tenant to tenant, although the tenants share a similar
database structure. The data middleware is designed to support the
personalization of the tenants. There are three approaches to support this
personalization [9].
An intuitive approach is to use a single wide table, which is shown in Fig. 1.9.
A single wide table, as its name suggests, stores all the tenant data in the same table
with the maximum number of fields. The data model is simply extended to create a
preset number of custom fields in every table you wish. This approach often wastes
columns if a tenant does not customize them. This issue is generally called the
schema null issue. In this solution, each tenant that uses at least one custom field
gets a row in the combined table, with null fields representing available custom
fields that the tenant has not used.

Fig. 1.9 Single wide table (columns: tenantID, Key, A1, A2, …, An)

An improved version of the single wide table is the single wide table with vertical
scalability. This model extracts the personalized data from the wide table, and then
describes it using extended vertical metadata. Each row in the extended vertical
metadata is a key/value pair, which is used to store the personalization of tenants to
fulfill the requirements of different tenants. The single wide table with vertical
scalability is shown in Fig. 1.10. If the personalization of all tenants is identical, the
extended vertical metadata can be omitted. The advantage of this approach is that it
reduces the waste of data resources.

Fig. 1.10 Single wide table with vertical scalability (core horizontal part:
tenantID, Key, A1, A2, …, An; extended vertical part: tenantID, rowKey,
columnKey, columnValue)
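
A small sketch can show how a core table and a key/value extension table fit together. It assumes SQLite; the table names `customer` and `customer_ext`, the sample data, and the helper `load_row` are hypothetical, while the extension columns tenantID, rowKey, columnKey, and columnValue follow the extended vertical part described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (tenantID INTEGER, rowKey INTEGER, name TEXT);
CREATE TABLE customer_ext (tenantID INTEGER, rowKey INTEGER,
                           columnKey TEXT, columnValue TEXT);
INSERT INTO customer VALUES (1, 1, 'Ann'), (2, 1, 'Bob');
INSERT INTO customer_ext VALUES (1, 1, 'loyalty_tier', 'gold');
""")


def load_row(conn, tenant_id, row_key):
    """Merge the core columns with the tenant's key/value extensions."""
    core = conn.execute(
        "SELECT name FROM customer WHERE tenantID = ? AND rowKey = ?",
        (tenant_id, row_key),
    ).fetchone()
    if core is None:
        return None
    row = {"name": core[0]}
    ext = conn.execute(
        "SELECT columnKey, columnValue FROM customer_ext "
        "WHERE tenantID = ? AND rowKey = ?",
        (tenant_id, row_key),
    )
    # Each extension row contributes one custom field.
    row.update(dict(ext.fetchall()))
    return row


print(load_row(conn, 1, 1))
```

Tenant 1 gets an extra `loyalty_tier` field while tenant 2 stores nothing extra, so no null-filled custom columns are needed in the core table.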
The last approach is multiple wide tables with vertical scalability, which is
shown in Fig. 1.11. In the context of multiple wide tables, tenants’ data are spread
over different single wide tables. That is to say that multiple wide tables with

Fig. 1.11 Multiple wide tables with vertical scalability (core horizontal part:
several wide tables with columns tenantID, Key, A1 … An; tenantID, Key,
B1 … Bm; tenantID, Key, C1 … Ck; extended vertical part: tenantID, rowKey,
columnKey, columnValue)


Exploring the Variety of Random
Documents with Different Content
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law
in the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name associated
with the work. You can easily comply with the terms of this
agreement by keeping this work in the same format with its attached
full Project Gutenberg™ License when you share it without charge
with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the
terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived


from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted


with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this
electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for
the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,
the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.

The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.

Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.