DB2 9 PureXML Guide For Beginners
Whei-Jen Chen Art Sammartino Dobromir Goutev Felicity Hendricks Ippei Komi Ming-Pang Wei Rav Ahuja
ibm.com/redbooks
SG24-7315-01
Note: Before using this information and the product it supports, read the information in "Notices".
First Edition (January 2007) This edition applies to DB2 9 for Linux, UNIX, and Windows.
Copyright International Business Machines Corporation 2007. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents

Notices
Trademarks
Preface
  The team that wrote this redbook
  Acknowledgements
  Become a published author
  Comments welcome
Chapter 1. Introducing DB2 9: pureXML
  1.1 Growing importance of XML data
    1.1.1 Growth of XML
    1.1.2 The value of XML data
  1.2 pureXML overview
    1.2.1 Traditional methods for managing XML data
    1.2.2 XML data management with DB2 9
    1.2.3 Setting up databases for XML
    1.2.4 XML optimized storage and XML data type
    1.2.5 Getting XML data into the database
    1.2.6 Querying XML data
    1.2.7 Query optimization and indexes for XML
    1.2.8 XML schema repository and validation
    1.2.9 Full text search for XML
    1.2.10 Annotated schema decomposition
    1.2.11 Application development support
    1.2.12 Tools and utilities
    1.2.13 Benefits of DB2 pureXML technology
  1.3 pureXML usage scenarios
    1.3.1 Integration of diverse data sources
    1.3.2 Forms and their processing
    1.3.3 Document storage and querying
    1.3.4 XML for transactions
    1.3.5 Syndication and XML feeds
    1.3.6 XML as a better data model
  1.4 Summary
  1.5 References
Chapter 2. Sample scenario description
  2.1 Business requirements
    2.1.1 Data modeling
  2.2 Application description
    2.2.1 Loan application
    2.2.2 Loan processing
    2.2.3 Loan management
  2.3 Application setup
Chapter 3. XML database design
  3.1 Architecture overview
  3.2 Logical database design
    3.2.1 XML data type
    3.2.2 Relational structure versus XML structure
    3.2.3 XML indexes
    3.2.4 Views
    3.2.5 XML schema
    3.2.6 XML schema design
    3.2.7 Industry standards and XML schemas
    3.2.8 XML data validation
  3.3 Physical database design
  3.4 Creating a database
Chapter 4. Working with XML
  4.1 XPath
    4.1.1 XQuery/XPath data model
    4.1.2 Location paths
    4.1.3 Using location paths to retrieve nodes of an XML document
    4.1.4 Predicates
  4.2 XQuery
    4.2.1 Types, expressions, and functions
    4.2.2 FLWOR and selecting XML data
    4.2.3 Updating XML data
  4.3 XQuery and SQL/XML
    4.3.1 XQuery with embedded SQL
    4.3.2 SQL/XML
    4.3.3 When to use what
  4.4 When and how to use namespaces
  4.5 Getting XML data in and out of database
  4.6 XML full-text search
    4.6.1 DB2 Net Search Extender
    4.6.2 Preparing the instance for text search
    4.6.3 Full-text searching using DB2 NSE
    4.6.4 Taking advantage of Net Search Extender text search features
    4.6.5 Full-text search considerations
    4.6.6 The NSE document model
Chapter 5. Managing XML data
  5.1 XML indexes
    5.1.1 XML index types
    5.1.2 Creating XML indexes
    5.1.3 How to look up information for XML indexes
    5.1.4 Access plan
    5.1.5 Best practices
  5.2 Schema management
    5.2.1 XML Schema Repository
    5.2.2 XML schema registration/dropping
    5.2.3 Querying XSR
    5.2.4 XSR support on the Control Center
    5.2.5 Schema evolution
  5.3 IMPORT, EXPORT, and RUNSTATS
    5.3.1 IMPORT
    5.3.2 EXPORT
    5.3.3 RUNSTATS
  5.4 XML data security
    5.4.1 LBAC
    5.4.2 Row and column-level access control
    5.4.3 Node-level access control
Chapter 6. Application development
  6.1 The database application development environment
  6.2 Application development tools
    6.2.1 Developer Workbench
    6.2.2 Developer Workbench: Visual Query Builder overview
  6.3 Accessing pureXML from application overview
    6.3.1 Application programming language support for XML
    6.3.2 Considerations when updating and inserting XML data
    6.3.3 Considerations when retrieving XML data
  6.4 DB2 application development with CLI and ODBC
    6.4.1 Setting up the CLI environment
    6.4.2 Building CLI applications
    6.4.3 XML data handling in CLI applications
    6.4.4 Embedded SQL applications: overview
  6.5 Building applications in C or C++
    6.5.1 Building C/C++ applications with the sample build script
    6.5.2 Declaring XML host variables
    6.5.3 Referencing XML host variables
    6.5.4 Declaring large object type host variables
    6.5.5 Referencing LOB type host variables
    6.5.6 Executing XQuery expressions in embedded SQL applications
  6.6 Java application programming
    6.6.1 Setting up the DB2 JDBC and SQLJ development environment
    6.6.2 Building JDBC applications
    6.6.3 Building SQLJ applications
  6.7 Building DB2 applications with PHP
    6.7.1 Setting up the PHP application development environment
    6.7.2 Introduction to PHP application development for DB2
  6.8 The DB2 .NET environment
    6.8.1 Building sample applications for the DB2 .NET data provider
    6.8.2 XML support in Visual Studio.NET: overview
    6.8.3 XML data type support in Visual Studio .NET
    6.8.4 XQuery support in Visual Studio.NET
  6.9 XML and stored procedures
    6.9.1 XML and XQuery support in SQL procedures
    6.9.2 XML support in external routines
    6.9.3 XML Schema Repository object registration
  6.10 Web services
    6.10.1 Components of Web Services
    6.10.2 Web services in DB2 9
Appendix A. Sample data
  A.1 Creating XMLoan database
    A.1.1 Creating database
    A.1.2 Creating tables
  A.2 contactInfo.xsd
  A.3 Sample XML data
Appendix B. Additional material
  Locating the Web material
  Using the Web material
    System requirements for downloading the Web material
    How to use the Web material
Abbreviations and acronyms
Related publications
  IBM Redbooks
  Other publications
  Online resources
  How to get IBM Redbooks
  Help from IBM
Index
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. 
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: DB2, DB2 Universal Database, developerWorks, IBM, ibm.com, IMS, Informix, iSeries, pureXML, Rational, Redbooks, Redbooks (logo), WebSphere, Workplace, Workplace Forms, z/OS.
The following terms are trademarks of other companies: Oracle, JD Edwards, PeopleSoft, and Siebel are registered trademarks of Oracle Corporation and/or its affiliates. Snapshot, and the Network Appliance logo are trademarks or registered trademarks of Network Appliance, Inc. in the U.S. and other countries. eXchange, Java, JDBC, JDK, JVM, J2SE, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Expression, Microsoft, Visual Basic, Visual Studio, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM Redbook, intended for IT managers, IT architects, DBAs, application developers, and other data server professionals, offers a broad understanding of the DB2 9 feature pureXML. The book is organized as follows:

Chapter 1, "Introducing DB2 9: pureXML", explores the importance of XML data and the necessity for managing it as a strategic business asset. We give you an overview of the DB2 9 pureXML technology and its features, and illustrate examples and scenarios for utilizing DB2 9 and pureXML.

Chapter 2, "Sample scenario description", introduces the sample online unsecured loan application, XMLoan. We cover the business requirements, data modeling, application descriptions, and application setup.

Chapter 3, "XML database design", provides information about the hybrid database design. We describe the DB2 9 database architecture, as well as logical and physical database design.

Chapter 4, "Working with XML", discusses how to work with XML. The topics covered include XPath, XQuery, SQL/XML, when and how to use namespaces, getting XML data in and out of the database, and XML full-text search.

Chapter 5, "Managing XML data", explains how to manage XML data stored in XML columns. We introduce the pureXML index features and schema management, and illustrate how to move data, including XML documents, in and out of a table using the DB2 9 IMPORT and EXPORT utilities. We also show how RUNSTATS works with pureXML features. Finally, we describe some security solutions that correspond to pureXML features.

Chapter 6, "Application development", covers various aspects of application development using DB2. The chapter features topics and examples that are specific to application development with XML. The subjects covered include the database application development environment and tools, how to access pureXML from within an application, XML and stored procedures, and Web services.
Rav Ahuja is a worldwide DB2 program manager based at the IBM Toronto Lab. He has been working with DB2 for Linux, UNIX, and Windows since version 1 and has held various roles in DB2 development, technical support, marketing, and product strategy. He works with customers and partners around the globe, helping them build and benefit from DB2 and services-based solutions. Rav is a frequent contributor to DB2 papers, articles, and books. He holds a Computer Engineering degree from McGill University and an MBA from the University of Western Ontario.
Left to right: Ming-Pang, Dobromir, Felicity, Ippei, Whei-Jen, Rav, and Art
Acknowledgements
The authors express their deep gratitude for the help they received from Susan Malaika from the IBM Silicon Valley Laboratory.

Thanks to the following people for their contributions to this project:

Grant Hutchison, Budi Surjanto, Prashant Juttukonda, Ro Omro, Samir Kapoor
IBM Toronto Laboratory

Matthias Nicola, Cindy Saracco, Bert Van Der Linden, Christina Lee, Mayank Pradhan, Ted Wasserman, Henrik Loeser
IBM Silicon Valley Laboratory

Barry Faust
IBM Software Migration Project Office

Brian Williams
IBM Software Group

Denise Pirro
IBM Sales and Distribution

Many thanks to our support staff for their help in the preparation of this book: Emma Jacobs, Sangam Racherla, Deanna Polm, and Yvonne Lyon.
International Technical Support Organization, San Jose Center
Comments welcome
Your comments are important to us! We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:

Use the online "Contact us" review redbook form found at: ibm.com/redbooks
Send your comments in an e-mail to: [email protected]
Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. HYTD Mail Station P099, 2455 South Road, Poughkeepsie, NY 12601-5400
Chapter 1. Introducing DB2 9: pureXML
Portions of this chapter are excerpted from the papers listed in 1.5, "References".
XBRL - Business Reporting / Accounting: http://www.xbrl.org/
NewsML - News / Publication: http://www.newsml.org/

These standards facilitate purposes such as the exchange of information between the various players within these industries and their value chain members, data definitions for ongoing operations, and document specifications. More and more companies are adopting such XML standards or are being compelled to adopt them in order to stay competitive, improve efficiencies, communicate with their trading partners or suppliers, or just to perform everyday tasks.
Content for these feeds is rendered as XML files and can contain links, summaries, full articles, and even attached multimedia files such as podcasts. Syndication and Web feeds are transforming the Web as we know it. New business models are emerging around these technologies. As a consequence, XML data now exists not only in companies adopting XML industry standards, or enterprises implementing SOAs, but also on virtually every Web-connected desktop.
Stuffing involves storing an XML document as a whole in a single VARCHAR or large object (CLOB or BLOB) column within a relational database. This approach works well as long as all you have to do is store and retrieve the XML documents in their entirety. However, if you have to query the contents of XML documents or retrieve fragments or specific elements, attributes, or sub-trees, each document must be scanned at run time, which incurs significant performance overhead.

Shredding, or decomposing, involves mapping XML data to and from relational columns and tables. To store XML in the database, the XML document is shredded into its various pieces (elements and attributes), which are stored in separate columns, a process that involves some overhead. The complexity of the XML data and normalization rules might cause XML documents to span hundreds of columns in dozens of tables. Similarly, to reconstruct the document, all of these columns and tables must be accessed using complex queries and multitable joins that introduce unnecessary complexity. It might even become impossible to reconstruct some documents or preserve the original fidelity of the data, as is required for digital signatures, for example. Furthermore, this approach imposes the rigidity of relational data models on the flexible nature of XML data formats.
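The difference between the two traditional approaches can be sketched in plain SQL. The following is only an illustration under stated assumptions: all table and column names are hypothetical (they do not come from the book), and the document being stored is assumed to be a simple order with an order date, one customer, and a shipping note.

```sql
-- Hypothetical tables for illustration only; names are not from the book.

-- "Stuffing": the whole document is stored as one opaque character value.
CREATE TABLE orders_stuffed (
    orderid  INTEGER NOT NULL PRIMARY KEY,
    orderdoc CLOB(1M)
);

-- "Shredding": elements and attributes are mapped to relational columns.
CREATE TABLE orders_shredded (
    orderid   INTEGER NOT NULL PRIMARY KEY,
    orderdate DATE,
    shipnote  VARCHAR(200)
);

CREATE TABLE order_customers (
    orderid INTEGER NOT NULL REFERENCES orders_shredded (orderid),
    custid  INTEGER NOT NULL,
    name    VARCHAR(100),
    zip     CHAR(5)
);

-- Reading one order back from the shredded form already requires a join.
SELECT o.orderid, o.orderdate, c.name, c.zip, o.shipnote
FROM orders_shredded o
JOIN order_customers c ON c.orderid = o.orderid
WHERE o.orderid = 5;
```

Even for this small document the shredded form needs two tables and a join, while the stuffed form offers no way to filter on the order date without reading and parsing every CLOB value.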
During recent years, a number of specialized DBMSs for XML data have been introduced that are aware of the XML data structures and allow more efficient management of XML data. However, most of these XML DBMSs are relatively new and introduce a largely unproven environment into an IT infrastructure, raising concerns about integration with traditional data, staff skills, and long-range viability.
The pureXML technology in DB2 9 includes the following capabilities:

pureXML data type and storage techniques for efficient management of hierarchical structures common in XML documents.
pureXML indexing technology to speed searches of subsets of XML documents.
New query language support (for XQuery and SQL/XML) based on industry standards, and new query optimization techniques.
Industry-leading support for managing, validating, and evolving XML schemas.
Comprehensive administrative capabilities, including extensions to popular database utilities.
Integration with popular application programming interfaces (APIs) and development environments.
XML shredding and publishing facilities for working with existing relational models.
Enterprise-proven reliability, availability, scalability, performance, security, and maturity that you have come to expect from DB2.

In the next few sections we take a closer look at some of these capabilities and how to use them.
INSERT INTO orders (orderid, orderinfo)
VALUES (5,
  '<order>
     <orderdate>2006-07-07</orderdate>
     <customer id="8">
       <name>XYZ</name>
       <zip>12345</zip>
     </customer>
     <shipnote>Fragile Contents</shipnote>
   </order>');
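The INSERT statement above presupposes a table named orders with an integer key and an XML column. A minimal sketch of such a table follows; the exact definition is an assumption, inferred only from the column names used in the INSERT. In DB2 9, XML columns also require a Unicode (UTF-8) database.

```sql
-- Assumed definition, inferred from the column names in the INSERT above.
CREATE TABLE orders (
    orderid   INTEGER NOT NULL PRIMARY KEY,  -- relational key
    orderinfo XML                            -- document stored natively
);
```

Because orderinfo is of type XML, the string literal in the INSERT is parsed implicitly, and a document that is not well-formed is rejected at insert time.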
Consider the XQuery shown in Example 1-3. This XQuery returns the entire XML document inserted in Example 1-1.
Example 1-3 Using XQuery to access XML data
XQUERY db2-fn:xmlcolumn('ORDERS.ORDERINFO');
The XQuery in Example 1-4 retrieves the orderdate from XML documents in the orders table.
Example 1-4 Using XQuery to access part of XML document
XQUERY
for $d in db2-fn:xmlcolumn('ORDERS.ORDERINFO')/order/orderdate
return $d;

Result:
<orderdate>2006-07-07</orderdate>

Finally, Example 1-5 shows code samples that combine SQL and XQuery. The first is an SQL/XML statement, and the next one embeds SQL within XQuery.
Example 1-5 Combining SQL and XQuery
-- retrieve the orderid of orders for a specific customer 'XYZ'
SELECT orderid
FROM orders
WHERE xmlexists('$o[order/customer/name="XYZ"]' PASSING orderinfo AS "o");

-- retrieve info for orders matching specified criteria
XQUERY
db2-fn:sqlquery("SELECT orderinfo FROM orders WHERE orderid > 3")/order/customer[zip = "12345"];
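A further way to combine the two languages is to return an XML fragment alongside relational columns. The sketch below uses the SQL/XML function XMLQUERY in the select list; it assumes the orders table and the order document structure shown in the earlier examples, and the result column name custname is chosen here for illustration only.

```sql
-- Return the relational key together with the customer name element
-- of every order whose customer has the given zip code.
SELECT orderid,
       XMLQUERY('$o/order/customer/name' PASSING orderinfo AS "o") AS custname
FROM orders
WHERE XMLEXISTS('$o/order/customer[zip = "12345"]' PASSING orderinfo AS "o");
```

XMLEXISTS in the WHERE clause filters the rows, while XMLQUERY only shapes what is returned for each qualifying row. Putting the filter in XMLQUERY alone would not eliminate rows; it would return an empty sequence for non-matching documents.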
We discuss XQuery, SQL, and SQL/XML in detail in Chapter 4, "Working with XML".
CREATE INDEX odindex ON orders(orderinfo)
  GENERATE KEY USING XMLPATTERN '/order/orderdate'
  AS SQL DATE;
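As a sketch of how such an index might be exploited, the query below filters on the indexed path and compares against a value of a matching type. It assumes the orders table from the earlier examples. Whether the optimizer actually chooses odindex depends on statistics and on the access plan, so treat this as an illustration of a candidate query rather than a guarantee.

```sql
-- The predicate path matches the XMLPATTERN of odindex, and the
-- comparison value is cast to xs:date to agree with AS SQL DATE.
XQUERY
for $o in db2-fn:xmlcolumn('ORDERS.ORDERINFO')/order
where $o/orderdate = xs:date("2006-07-07")
return $o;
```

The explicit xs:date cast is a deliberate choice here: comparing orderdate with a plain string literal would be a string comparison, which an index defined AS SQL DATE is generally not able to serve.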
Take, for instance, DB2 support for XML in JDBC: the Universal DB2 driver for JDBC has been enhanced for XML data. XML data for query results and for input and output parameters can be bound using Java data types such as strings, byte arrays, and streams. Because JDBC 3.0 currently does not define a native XML data type, DB2 provides an extension XML type known as com.ibm.db2.jcc.DB2Xml. The DB2Xml extension has a number of very useful methods that make working with XML data easy. In Example 1-8, a column is retrieved as a DB2Xml object. Then the getDB2String method returns the serialized representation of the XML value (without an XML declaration) as a string object. The method getDB2XmlBinaryStream("UTF-16") then returns a binary stream with the XML value encoded in UTF-16, including a matching XML declaration.
Example 1-8 Using DB2 provided JDBC extension to access XML data
Reduce development time through code simplification and by avoiding XML-relational transformations in your applications.
Increase agility through versatile XML schema evolution, allowing you to quickly modify applications as a result of changing or introducing new services, products, or business processes.
Improve insight by harnessing previously unmanaged XML data and providing quicker query processing through XML-optimized storage and indexing.
For instance, Storebrand Group, a large financial services company in Europe, is seeing dramatic benefits as a result of using DB2 9 pureXML technology for powering their SOA solution. Development tasks that took them anywhere from two to eight hours with relational databases now take them less than 30 minutes with DB2 pureXML. Schema changes using a relational data model that could take up to one week to implement can now be done in a matter of minutes with DB2 9. Long-running queries over shredded XML data that previously took days to complete now execute in seconds or minutes with pureXML.

"With the pure XML support available in IBM DB2 9, it is far easier, faster and less expensive to run queries, share and retrieve data, and make document changes in response to new business requirements without impacting applications."
- Thore Thomassen, Senior Enterprise Architect, Storebrand Group
[Figure: diagram with the labels Customer, WAP, WWW, Call Center, Financial Advisor, Business Services, Process Management, Integration Database (DB2 9), Life Insurance, YTP Pensions, ITP Pensions, Investments, Banking, Mortgage, Archive, and Data Warehouse, connected by XML flows.]
[Figure: DB2 9 storing XML documents, with the labels Approve, Audit, Insight, Create, Update, Manage, and Contract.]
Another example is that of a manufacturer using DB2 9 to store technical manuals. Their finished products are made up of mechanical parts, each consisting of other parts or subassemblies. The subparts and subassemblies have their own instructions or manuals. Each manual has some structured sections and some free-form sections. By using DB2 9 to store these manuals as XML documents, they are able to build complete manuals of the finished products using the hierarchy of manuals, while being able to quickly drill down and access instructions for subassemblies. They are also able to easily adapt and update the manuals for new models of their products.
DB2 9 is also suitable for simple or application managed document processing. Library bibliography and online documentation are examples of simple document processing applications. These applications can leverage the power of XQuery for search or dynamic composition of required document elements from the underlying XML documents. Applications such as Wikis and Blogs are also examples that require simple document object management and processing capabilities. These applications store, update, search, and retrieve text and other fragments in XML. Administrators and power users are likely to perform some additional query tasks on such application driven simple document systems. DB2 9 provides the necessary capabilities for these simple requirements. There are numerous other examples where DB2 9 is an excellent fit for storing document-centric XML. Many business objects are now being generated as XML documents, such as orders, invoices, and even spreadsheets and word-processed documents. Storing these documents in a DB2 database allows for reliable management, fast subdocument level access, and the ability to derive deep insight. DB2 9 can also be used as a building block for other content and document management applications.
As business-critical transactions are conducted using XML, DB2 with pureXML capabilities becomes a natural fit as well. There are numerous drivers behind XML transactions. Message-based transactions within service-oriented architectures are one such driver. Companies want to capture critical business data, route it appropriately, prioritize processing (such as by customer value) and perform analysis on these data items. For example, a goods distributor is moving towards a message-based infrastructure to manage business interactions with other value-chain members. XML data is used as the data objects of the transactions (such as for purchase and sale of goods). This XML data (such as buy/sell records) can be retrieved, updated, searched according to filtering criteria, and analyzed using the XML capabilities in DB2 9. Industry standards are also driving XML-based transactions. For example, FIXML is being used for trading stocks and other financial instruments. DB2 9 can serve as the transactional engine behind such financial transactions and store the resulting XML data in a native format for fast queries and updates, but also provides the flexibility to evolve schemas rapidly to handle new types of financial instruments and changes in standards. As a case in point, FIXML has had new versions every one to two years.
[Figure: Web servers with an ATOM/RSS reader and an ATOM/RSS provider connected to DB2 9.]
Consider a commercial scenario involving syndication benefits. A traditional method for a goods manufacturer or distributor looking to unload excess inventory is to sell it to a local overstock inventory buyer, who typically offers only a few cents on the dollar. There are now a number of online exchanges and trading platforms (such as Amazon, eBay, and overstock.com) that open up new markets and channels for selling inventory. A goods manufacturer or distributor using DB2 9 can have the database automatically analyze and post listings for overstock goods on such online marketplaces using Web services. Furthermore, the goods vendor can use feeds to update information such as inventory data and pricing in real time, in response to supply and demand variations and changing inventory levels. The goods vendor can also use DB2 to store feeds for similar and competitive products, analyze data from these feeds, and adjust their offering tactics dynamically, often automatically using rule-based criteria.
Data with changing or evolving schemas:
   Forms
   Changing industry standard documents
   New product versions
Data with null or multiple values:
   Addresses
   Phone numbers (home, office, mobile)
   Such data in patient records
If you decide to use XML as the data model for your applications, DB2 9 is a great choice for storing and managing this XML data for all of the reasons discussed previously.
1.4 Summary
In this chapter we reviewed the pervasiveness of XML data, its relevance for all kinds of organizations, and the importance of managing it well. We introduced pureXML technology in DB2 9 and showed how it can unlock the latent potential of XML with performance and development time and cost savings. We also examined several examples and business scenarios where DB2 9 with its pureXML technology is highly applicable.
1.5 References
This chapter contains references or excerpts from the papers and articles indicated below:
Saracco, C. M. Managing XML for Maximum Return, IBM White Paper, October 2005. ftp://ftp.software.ibm.com/software/data/pubs/papers/managingxml.pdf
Nicola, Matthias and Bert Van der Linden. Native XML Support in DB2 Universal Database, Proceedings of the 31st Annual VLDB, 2005. http://www.vldb2005.org/program/paper/thu/p1164-nicola.pdf
Saracco, C. M. What's New in DB2 Viper: XML to the Core, IBM developerWorks article, February 2006. http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0602saracco/
Chapter 2. Sample scenario description
[Figure: loan application flow. The customer submits a new loan application. The loan officer either approves it (create new account and send loan approval notification) or rejects it (send loan rejected notification).]
The application should have the following functions:
   The customer can select the loan products and submit the loan application.
   The customer can make a payment.
   The customer can send feedback.
   The loan officer can process the loan.
   The management team can create reports to analyze the loan, payment, and customer feedback.
[Figure: XMLoan data model]
LOAN APPLICATION table: APPL_ID (PK), APPL_STATUS (FK), PROD_ID (FK), APPL_DOC (XML)
LOAN table: LOAN_ID (FK), START_DATE, PYMT_STATUS, PYMT_COUNT
PAYMENT table: APPL_ID (FK), PAYMENT_DATE
FEEDBACK table: APPL_ID (FK), COMMENT (XML)
CAMPAIGN table: CAMPAIGN_ID (PK), CAMPAIGN_DESC
Tables
The XMLoan sample application uses the following DB2 9 tables:

CAMPAIGN
The CAMPAIGN table contains information about predefined advertising campaign sources and their descriptions. This information helps the bank to identify the successful campaigns for future advertising. The primary key is CAMPAIGN_ID.

FEEDBACK
The FEEDBACK table saves the information entered by the customer on the bank Web site. The information includes the loan application ID, a rating of the loan process, and comments about the loan products. The COMMENT column is defined with the XML data type to store customer feedback. The foreign key LOAN_APPL_ID references the LOAN_APPLICATION table.

LOAN
The LOAN table contains records for approved loans. The loan ID, loan start date, and loan status ID are stored in this table. The foreign key LOAN_ID references the PAYMENT table.

LOAN_APPLICATION
The LOAN_APPLICATION table contains information pertaining to the loan application process, such as the loan application ID, the loan application status, and the application documents, which are stored in an XML column. The loan application ID is generated by our application after the customer submits the application. APPL_DOC is the XML column that stores the loan application. The primary key is LOAN_APPL_ID. The foreign key PROD_ID references the PRODUCT table.

PAYMENT
The PAYMENT table contains payment records. The information includes the application ID and payment date. The foreign key APPL_ID references the LOAN table.

PRODUCT
The PRODUCT table contains information about all unsecured loan products offered by the bank. The information includes the loan product ID, a description of each product, the interest rate, the loan amount, and the loan term. The primary key is PROD_ID.

APPL_STATUS
The APPLICATION_STATUS table contains information about the loan application status. The primary key is STATUS_ID.
[Figure: customer page flow with three functions: Apply Loan (select a loan product, enter personal information, "How did you hear about us?"), Make Payment (enter Loan ID), and Feedback.]
Apply Loan
The loan application form is launched when the customer selects Apply Loan, as shown in Figure 2-5. The customer selects a loan product from the Product drop-down menu, and enters the required information such as name, address, and financial data.
After having successfully submitted their unsecured loan information, the customer receives a confirmation message and is assigned a loan application identification. The confirmation page is shown in Figure 2-6.
[Figure: Apply Loan process flow. Index.html, loanForm.jsp, and GenXML.java, with SELECT against the PRODUCT and CAMPAIGN tables and INSERT into the LOAN_APPLICATION table.]
Make Payment
In addition to applying for a new loan, the customer can make a payment on an existing loan account. To make a payment, the customer can select the hyperlink Make Payment from the home page. The customer is presented with a payment page where they can enter a loan identification number to bring up their record. By entering the Loan ID and selecting the Enter button, a new page is launched to allow the customer to view payment information and make a monthly payment on their loan account.
By selecting the Make Payment button, the customer confirms that the payment will be sent to the bank, and receives a confirmation message with the payment history displayed. The Make Payment process flow is illustrated in Figure 2-8.
[Figure 2-8: Make Payment process flow. Index.html, makePayment.html, makePayment.jsp, and updatePayment.jsp, with the LOAN_APPLICATION, LOAN, and PAYMENT tables and an UPDATE of PYMT_STATUS in the LOAN table.]
Feedback
Alternatively, the customer can submit feedback about the loan request process or current unsecured loan product offers by selecting the Feedback hyperlink from the home page (Figure 2-10).
[Figure: Feedback process flow. Index.html, Feedback.html, and Feedback.jsp, with the FEEDBACK table.]
The loan officer can start processing an unsecured loan application by selecting any application from the loan listing. When a loan application is selected, the processing page (Figure 2-13) is presented, where the loan officer can analyze the record and approve or reject the loan.
When the loan officer approves a loan request, a confirmation page will be displayed with a message. The message states that the loan has been successfully approved, and a welcome notification is sent to the customer to let him or her know of the loan approval decision. When the loan officer rejects a loan request, a loan decision notification is sent to the customer. The notification includes information explaining the decision and, optionally, proposes a new loan offer for which the customer can be approved.
approved status. Alternatively, if the Reject button is selected, the application loads the applicationRejected page and updates only the APPL_STATUS column in the LOAN_APPLICATION table. The process is illustrated in Figure 2-14.
[Figure 2-14: loan processing flow. Index.html, showAppID.jsp, processApp.jsp, applicationApproved.jsp, and applicationRejected.jsp, with SELECT (by application ID), INSERT, and UPDATE against the LOAN_APPLICATION table and SELECT against the PRODUCT table.]
When the loan officer selects a monthly report, a separate report page is displayed for viewing and analysis. For example, when clicking Customer Satisfaction, the Customer Satisfaction report is displayed (Figure 2-16).
On the Reports page, we provide a Show Queries button for each report. In a real-life application, these buttons would not be shown to the loan officer. They were added so that we can show you the XQuery constructed for each report. Figure 2-17 shows the Customer Satisfaction query.
[Figure: report query flow. Index.html, viewSelectQuery.html, showQuery1.html, query1Result.jsp, and query2Result.jsp, each issuing a query. Legend: Java, HTML, JSP, table, beans.]
Following are the installation instructions for the XMLoan application:
1. Install Apache Tomcat 4.1: Apache Tomcat can be downloaded from the following Web site:
   http://tomcat.apache.org/download-41.cgi
   Apache Tomcat 4.1 requires the J2SE Software Development Kit (SDK).
   Note: If you do not have J2SE, you can use the copy that comes with DB2 9, located under <DB2_install_directory>\SQLLIB\java\jdk. The default <DB2_install_directory> for Windows is C:\Program Files\IBM\.
2. Install DB2 9: You can run any DB2 9 edition with the XMLoan application. DB2 Express-C is available for free download at the following IBM DB2 Universal Database Web site:
   http://www-306.ibm.com/software/data/db2/udb/db2express/
3. Add the following environment variables:
   JAVA_HOME=<DB2_install_directory>\SQLLIB\java\jdk
   CATALINA_HOME=C:\Program Files\Apache Group\Tomcat 4.1
   Edit the existing environment variable PATH and add %JAVA_HOME%\bin to the path.
4. Copy db2jcc.jar and db2jcc_license_cu.jar from <DB2_install_directory>\SQLLIB\java to the %CATALINA_HOME%\common\lib directory.
5. Copy XMLoan.war to the %CATALINA_HOME%\webapps\ directory. Tomcat will deploy the application automatically.
6. Run setup.txt from the DB2 command line processor or the DB2 Command Editor to create a UTF-8 database, create the tables, and populate them with the required data, as follows:
   DB2 -tvf setup.txt
   The DDL statements for creating the database and tables are listed in A.1, Creating XMLoan database on page 362.
7. Install the partial update stored procedure DB2XMLFUNCTIONS.jar using the following steps:
   a. Start the DB2 command line processor.
   b. Set the DB2 registry variable using the following command:
      DB2SET DB2_USE_DB2JCCT2_JROUTINE=on;
   c. Update the Java heap size using the following command:
      DB2 UPDATE DBM CFG USING JAVA_HEAP_SZ 1024;
   d. Install the stored procedure jar file into DB2 using the following commands:
      DB2 -TD;
      CONNECT TO xmlrb USER db2admin USING db2admin;
      CALL SQLJ.INSTALL_JAR('file:///c:/temp/DB2XMLFUNCTIONS.jar', db2xmlfunctions, 0);
      Replace c:/temp with the directory to which the application was downloaded.
   e. Register the stored procedure: Use the command shown in Example 2-1 to create the stored procedure. You can copy and paste the command, then run it in the CLP.
Example 2-1 Create stored procedure
CREATE PROCEDURE db2xmlfunctions.XMLUPDATE(
   IN COMMANDSQL VARCHAR(32000),
   IN QUERYSQL VARCHAR(32000),
   IN UPDATESQL VARCHAR(32000),
   OUT errorCode INTEGER,
   OUT errorMsg VARCHAR(32000))
 DYNAMIC RESULT SETS 0
 LANGUAGE JAVA
 PARAMETER STYLE JAVA
 NO DBINFO
 FENCED
 NULL CALL
 MODIFIES SQL DATA
 PROGRAM TYPE SUB
 EXTERNAL NAME 'db2xmlfunctions:com.ibm.db2.xml.functions.XMLUpdate.Update';

8. Start the XMLoan application: Open your browser and enter the following URL in the address bar:
   http://localhost:8080/XMLoan
   The application should start in your browser.
Chapter 3.
[Figure 3-1: DB2 9 hybrid engine. Applications issue SQL or XQuery. The SQL parser and the XQuery parser feed a common compiler and the hybrid engine, which accesses relational tables and XML data.]
DB2 9 treats XML as a native data type. It uses pureXML storage, meaning that XML data is stored in its XML form, which is a hierarchical structure. Figure 3-1 shows how XML and relational data are stored separately.
<Customer>
   <Name>
      <FirstName>John</FirstName>
      <LastName>Smith</LastName>
   </Name>
   <DateOfBirth>1967-02-23</DateOfBirth>
   <SSN>123-45-6789</SSN>
   <Address>
      <Street>46 South Main Street</Street>
      <City>Los Gatos</City>
      <State>CA</State>
      <Zip>95030</Zip>
   </Address>
   <Employer>
      <Company>My company</Company>
      <Position>Developer</Position>
   </Employer>
</Customer>

Figure 3-2 shows an XML document in hierarchical form.
[Figure 3-2: the Customer document drawn as a tree. Customer is the root; its children are Name, Date of Birth, SSN, Address, and Employer, and the leaf nodes hold the values.]
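The hierarchical form described above can be made concrete with a short sketch. It uses Python's standard xml.etree.ElementTree parser, which is only a stand-in chosen for illustration and is not how DB2 9 stores or traverses XML internally. The helper name walk is hypothetical.

```python
import xml.etree.ElementTree as ET

# The Customer document from the text.
CUSTOMER_XML = """
<Customer>
  <Name>
    <FirstName>John</FirstName>
    <LastName>Smith</LastName>
  </Name>
  <DateOfBirth>1967-02-23</DateOfBirth>
  <SSN>123-45-6789</SSN>
  <Address>
    <Street>46 South Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95030</Zip>
  </Address>
  <Employer>
    <Company>My company</Company>
    <Position>Developer</Position>
  </Employer>
</Customer>
"""


def walk(element, depth=0):
    """Yield (depth, tag, text) for every element, parent before children."""
    text = (element.text or "").strip()
    yield depth, element.tag, text
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(CUSTOMER_XML)
nodes = list(walk(root))

# Print the tree with two spaces of indentation per level.
for depth, tag, text in nodes:
    print("  " * depth + tag + (": " + text if text else ""))
```

Each element becomes one node of the tree, and only the leaf elements carry values, which mirrors the parent and child relationships that a hierarchical store keeps intact.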
An XML document must be well-formed in order to be inserted or imported into an XML column. Any attempt to insert or import an XML document that is not well-formed will fail with an error. You can insert and import XML documents of up to two gigabytes in size into an XML column.

Similar to the long data types (LONG VARCHAR, LONG VARGRAPHIC, and LOB data), XML data is stored separately from the other contents of a table and can be stored in its own table space. DB2 9 stores XML data contained in table columns of type XML in auxiliary XML storage objects. If the XML columns are stored in system managed space (SMS), the files associated with XML storage objects have the file type extension .xda.

XML data objects are stored separately from their parent table objects. For each row of an XML type column, an XML data specifier (XDS) is stored in the table. The XDS holds the information required to access the XML data stored on disk. The XDS is also used by the IMPORT and EXPORT utilities. Figure 3-3 shows the relationship among a table, the XDS, and an XML data column.
[Figure 3-3: a table whose XML column holds one XDS per row.]
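As a rough illustration of the well-formedness requirement, the following sketch checks documents with Python's standard XML parser before they would be handed to a database. The function name is_well_formed is hypothetical, and the parser is an assumption made for illustration; DB2 9 performs its own parsing at insert and import time and reports its own errors.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


good = "<Customer><Name>John Smith</Name></Customer>"
bad = "<Customer><Name>John Smith</Customer>"   # <Name> is never closed
two_roots = "<Customer/><Customer/>"            # more than one root element

for label, text in [("good", good), ("bad", bad), ("two_roots", two_roots)]:
    print(label, is_well_formed(text))
```

Note that well-formedness is a purely syntactic property. Whether a well-formed document also conforms to a particular XML schema is a separate validation step.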
DB2 9 supports XML document validation with XML schemas. Validation usually takes place at insert and import time. XML schemas used for validation are registered in the XML Schema Repository (XSR). An XML schema is different from a schema in a relational database. A relational database schema is a collection of named database objects; it is used to group database objects logically and also serves as a name qualifier. An XML schema describes the structure of, and the constraints on, the contents of XML documents. XML schemas are discussed in more detail in 3.2.5, XML schema on page 49.
for parsing for the LOB/LONG VARCHAR type. Selecting the whole XML document is also fast because, unlike with the XML type, the data is not stored in a tree structure and no serialization is necessary. The trade-off is slow performance for searching and extracting parts of the XML document. Because there is no parsing at insert time, the XML document is not checked for well-formedness.

For the XML type, all XML documents are parsed and checked to see if they are well-formed. Selecting whole XML documents from XML type columns also takes more time than from LOB/LONG VARCHAR type columns because of serialization. Search, extract, and partial update operations on XML type columns are faster than on LOB/LONG VARCHAR type columns. XQuery is available for XML type columns.

In general, XML documents can be stored as LOB/LONG VARCHAR types if one or more of the following statements are true:
   No processing is required on the XML document. It is not necessary to search, to extract, or to partially update the document. For example, the document must be kept intact for business rules or legal reasons.
   The XML document is from a trusted source that guarantees its well-formedness, and there is no requirement for validating it.
   Well-formedness is not important for the XML document.
   The only operations on the XML document are insert and whole document select.
   The performance of insert and select is the most important consideration.
design detail in Administration Guide: Planning, SC10-4223. The following link directs you to the PDF file of this DB2 9 manual:
ftp://ftp.software.ibm.com/ps/products/db2/info/vr9/pdf/letter/en_US/db2d1e90.pdf

In general, data that has the following properties should be stored in XML:
   The data is better described in a hierarchical format. Highly complex hierarchical data might require a large number of relational tables to store it and can be difficult to map into a relational structure. XML is the most natural way to store such data.
   The schema is constantly changing and evolving. Business rules can change and affect the schema. In general, it is easier to evolve an XML schema than to change a relational schema. We discuss schema evolution in more detail in 5.2.5, Schema evolution on page 206.
   Many attributes of the data are empty or unknown. If you map the data into relational tables, there will be many null values in the tables. If the data is complicated and large, you might require many relational tables, and most of the values in them would be null. To process the data in relational tables, you usually have to join many tables with complicated SQL statements. An XML schema is more flexible. You do not have to store the null values if the XML schema is well-designed, and the XQuery that accesses such data would not be complicated.
   There is a small amount of data with a highly complex structure. If you store such data in relational tables, you will have a complicated relational schema, which means that you require many tables. Managing these tables adds overhead. The SQL query to access such data requires joining many tables. If you have to process this data together with other data, the SQL query becomes even more complicated. A small amount of data with a highly complex structure should be stored in XML.
<ORDER ORDER_ID="83492" CUST_ID="93457">
   <ITEM>
      <PROD_ID>94872</PROD_ID>
      <PROD_NAME>PEN</PROD_NAME>
      <PRICE>19.95</PRICE>
      <QUANTITY>30</QUANTITY>
   </ITEM>
   <ITEM>
      <PROD_ID>94866</PROD_ID>
      <PROD_NAME>BINDER</PROD_NAME>
      <PRICE>7.95</PRICE>
      <QUANTITY>26</QUANTITY>
   </ITEM>
   <ITEM>
      <PROD_ID>92219</PROD_ID>
      <PROD_NAME>LABELS</PROD_NAME>
      <PRICE>12.95</PRICE>
      <QUANTITY>250</QUANTITY>
   </ITEM>
</ORDER>

An order can have one or more items, so we require two relational tables to decompose the XML document. The ORDER table has two columns, and ORDER_ID is its primary key. The ITEM table has five columns, and its ORDER_ID column is a foreign key that references the ORDER table. Table 3-1 and Table 3-2 show the decomposed data in relational tables.
Table 3-1 ORDER table
ORDER_ID   CUST_ID
83492      93457

Table 3-2 ITEM table
ORDER_ID   PROD_ID   PROD_NAME   PRICE   QUANTITY
83492      94872     PEN         19.95   30
83492      94866     BINDER      7.95    26
83492      92219     LABELS      12.95   250
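The decomposition into Table 3-1 and Table 3-2 can be sketched as follows. This is a minimal illustration that uses Python's standard library with an in-memory SQLite database as a stand-in for DB2; the shred function is hypothetical, and the input assumes that ORDER_ID and CUST_ID are attributes of the ORDER element.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed well-formed version of the order document: IDs are attributes.
ORDER_XML = """
<ORDER ORDER_ID="83492" CUST_ID="93457">
  <ITEM><PROD_ID>94872</PROD_ID><PROD_NAME>PEN</PROD_NAME>
        <PRICE>19.95</PRICE><QUANTITY>30</QUANTITY></ITEM>
  <ITEM><PROD_ID>94866</PROD_ID><PROD_NAME>BINDER</PROD_NAME>
        <PRICE>7.95</PRICE><QUANTITY>26</QUANTITY></ITEM>
  <ITEM><PROD_ID>92219</PROD_ID><PROD_NAME>LABELS</PROD_NAME>
        <PRICE>12.95</PRICE><QUANTITY>250</QUANTITY></ITEM>
</ORDER>
"""

conn = sqlite3.connect(":memory:")
# ORDER is a reserved word in SQL, so the table name is quoted.
conn.executescript("""
    CREATE TABLE "ORDER" (ORDER_ID INTEGER PRIMARY KEY, CUST_ID INTEGER);
    CREATE TABLE ITEM (
        ORDER_ID  INTEGER REFERENCES "ORDER"(ORDER_ID),
        PROD_ID   INTEGER,
        PROD_NAME TEXT,
        PRICE     REAL,
        QUANTITY  INTEGER);
""")


def shred(document, connection):
    """Insert one ORDER row and one ITEM row per repeating ITEM element."""
    order = ET.fromstring(document)
    order_id = int(order.get("ORDER_ID"))
    connection.execute('INSERT INTO "ORDER" VALUES (?, ?)',
                       (order_id, int(order.get("CUST_ID"))))
    for item in order.findall("ITEM"):
        connection.execute(
            "INSERT INTO ITEM VALUES (?, ?, ?, ?, ?)",
            (order_id,
             int(item.findtext("PROD_ID")),
             item.findtext("PROD_NAME"),
             float(item.findtext("PRICE")),
             int(item.findtext("QUANTITY"))))


shred(ORDER_XML, conn)
orders = conn.execute('SELECT ORDER_ID, CUST_ID FROM "ORDER"').fetchall()
items = conn.execute(
    "SELECT PROD_NAME, QUANTITY FROM ITEM "
    "WHERE ORDER_ID = 83492 ORDER BY PROD_ID").fetchall()
print(orders, items)
```

The repeating ITEM element is what forces the second table. A document with several different repeating elements would need one more table for each, which is why the approach suits only simple structures.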
After the XML data is decomposed into relational data, you can use SQL to query the tables and add indexes to improve query performance. Decomposing XML documents can be the right approach if the XML structure is simple. When decomposing XML documents, you usually require a separate table for every element that occurs more than once. For instance, in our previous example, the item element requires a separate table. This is fine for simple XML documents with a small number of elements that occur more than once. For complex XML documents, inserting and selecting the data might involve a huge number of tables, so decomposing is not always practical.

In general, XML documents can be decomposed if the data has the following properties:
   The XML document structure can be decomposed into a reasonable number of relational tables.
   The XML document is used for data exchange, and the original document is not important after the data is exchanged. Decomposing usually does not preserve the structure of the document; for example, the document order is lost.
   Partial updates are frequent and update performance is important. Relational tables usually perform better for individual column updates.
   The XML schema does not change. An XML schema is usually more flexible than a relational schema, which means that schema evolution is easier with an XML schema. If the schema does not change, decomposing the XML document into relational tables is a reasonable choice.
   The XML document must be mapped into existing relational tables. You might already have relational tables into which you want to map your XML document.
   The XML document must be processed by an application that can only access relational tables.
Instead of providing access to the beginning of a document, entries in an index over XML data provide access to nodes within the document by creating index keys based on XML pattern expressions. Because multiple parts of an XML document can satisfy an XML pattern, multiple index keys can be inserted into the index for a single document.

If query performance is the only priority, there is no necessity to perform updates, inserts, and deletes, and you do not care about space overhead, you can index everything. In reality, storage space is limited and most data requires updates, inserts, and deletes. In general, data that is queried frequently and does not require modification is a good candidate for indexing.

What do you index? DB2 9 comes with tools to help, such as visual and text-based explain tools. You can test your queries against the data with explain. By studying the explain output, you can tell whether your index is being used and whether another index would serve better. For more information about how to index XML data, see 5.1, XML indexes on page 174.
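The statement that one document can contribute several index keys can be illustrated with a toy sketch. It is a conceptual model written with Python's standard library, not a description of how DB2 9 builds or stores its XML indexes; the documents, the pattern, and the build_index helper are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two small documents keyed by a made-up document ID.
documents = {
    1: "<ORDER><ITEM><PROD_NAME>PEN</PROD_NAME></ITEM>"
       "<ITEM><PROD_NAME>BINDER</PROD_NAME></ITEM></ORDER>",
    2: "<ORDER><ITEM><PROD_NAME>PEN</PROD_NAME></ITEM></ORDER>",
}


def build_index(docs, pattern):
    """Map each value matched by the pattern to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Every node that satisfies the pattern yields one index key.
        for node in ET.fromstring(text).findall(pattern):
            index[node.text].add(doc_id)
    return dict(index)


index = build_index(documents, "./ITEM/PROD_NAME")
print(index)
```

Document 1 matches the pattern twice and therefore produces two keys (PEN and BINDER), while a lookup of PEN finds both documents without reading either one in full.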
3.2.4 Views
You can create relational views from data in an XML column by invoking the function XMLTABLE, which is discussed in 4.3.2, SQL/XML on page 127. We show how to create a view with XMLTABLE here. In Example 3-3, a table with an XML column is created and one record inserted.
Example 3-3 Create a table with XML column
CREATE TABLE loan_application(
   appl_id bigint,
   appl_doc xml,
   appl_status integer,
   prod_id integer);

INSERT INTO loan_application VALUES(
   11111,
   XMLPARSE(DOCUMENT '<xml doc>' PRESERVE WHITESPACE),
   10,
   20);
You can create a view with relational column data and data from some elements in the XML document. Example 3-4 shows creating the view loan_application_view with three columns. The firstName and lastName are taken from an XML document.
Example 3-4 Creating the view
CREATE VIEW loan_application_view AS
   SELECT appl_status, t.lastName, t.firstName
   FROM loan_application,
        xmltable('$lo/application/customer/name'
                 passing appl_doc as "lo"
                 columns
                    lastName char(20) path 'lastName',
                    firstName char(20) path 'firstName') as t;
Example 3-5 shows selecting the view loan_application_view and the result.
Example 3-5 Selecting the view
select * from loan_application_view

APPL_STATUS LASTNAME             FIRSTNAME
----------- -------------------- --------------------
         10 John                 Smith

  1 record(s) selected.

Creating relational views for XML column data is easy, but you have to consider that DB2 does not use XML column indexes when queries are issued against such a view. For example, if you have an index on the element lastName and you issue an SQL query that restricts the lastName column of the view to Smith, the index on the element lastName is not used. DB2 reads all the XML documents and searches for Smith in the element lastName. If you have a lot of data, performance might not be what you expect. If the query also has a highly restrictive predicate on indexed traditional SQL columns, the slow performance can be mitigated. This is because DB2 uses the relational index to filter the qualifying rows down to a small number and applies any XML query predicate to these interim results before returning the final result set.
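To make the mapping performed by XMLTABLE in Example 3-4 more concrete, the following sketch mimics its two ingredients, a row-generating path and one path per column, using Python's standard library. It is an analogy only, not DB2's implementation. The xml_table helper and the sample document are hypothetical, since the text does not show the stored loan application document.

```python
import xml.etree.ElementTree as ET


def xml_table(document, row_path, columns):
    """Return one dict per node matched by row_path, one entry per column."""
    root = ET.fromstring(document)
    rows = []
    for node in root.findall(row_path):
        # Each column path is evaluated relative to the current row node.
        rows.append({name: node.findtext(path)
                     for name, path in columns.items()})
    return rows


# Hypothetical loan application document with the element names of the view.
APPL_DOC = """
<application>
  <customer>
    <name><firstName>John</firstName><lastName>Smith</lastName></name>
  </customer>
</application>
"""

# ElementTree paths start at the root element (application), so the row path
# './customer/name' corresponds to '$lo/application/customer/name'.
rows = xml_table(APPL_DOC, "./customer/name",
                 {"lastName": "lastName", "firstName": "firstName"})
print(rows)
```

Because every row is produced by walking the document, a filter applied to the resulting column, such as lastName equal to Smith, has to evaluate the path for each document first. This is the same reason the view cannot benefit from an XML column index.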
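 1  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 2    <xsd:element name="employee">
 3      <xsd:complexType>
 4        <xsd:sequence>
 5          <xsd:element name="id" type="xsd:integer"/>
 6          <xsd:element name="name" type="xsd:string"/>
 7          <xsd:element name="dateOfBirth" type="xsd:date"/>
 8        </xsd:sequence>
 9      </xsd:complexType>
10    </xsd:element>
11  </xsd:schema>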
Line 1 shows the namespace prefix xsd in the root element of the XML document. In this example the prefix is xsd, but it can be an arbitrary name. A namespace is a set of names that can be used as element and attribute names in an XML document. XML namespaces provide a mechanism to qualify attribute and element names in order to avoid naming conflicts in XML documents. For example, if a health insurance company receives insurer information from different companies as XML documents, it is quite possible that two or more companies have defined the same element name to represent different things in different formats. Qualifying the elements with a namespace resolves the naming conflict.

Line 2 declares an element named employee. In an XML schema, you must provide a name and a data type to define an element. Once a data type is defined, the corresponding element in the XML instance document can only have values of the data type defined in the XML schema. An element can be defined as either a simple type or a complex type.

A simple type element cannot contain any elements or attributes. A simple element has the format: <xsd:element name="element name" type="data type"/>. The following two elements, defined in lines 5 and 6 of Example 3-6 on page 49, are simple elements. The types xsd:integer and xsd:string are predefined XML schema data types:
   <xsd:element name="id" type="xsd:integer"/>
   <xsd:element name="name" type="xsd:string"/>

A complex type element can contain elements or attributes. The element employee, whose type is defined in line 3 of Example 3-6 on page 49, has a complex type. It contains three other elements: id, name, and dateOfBirth. In this example, the definition of the complex type is inside the element employee. This kind of type is called a local complex type, and it can only be used to define that element and its child elements.
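The way a namespace resolves a naming conflict can be shown with a brief sketch that uses Python's standard XML parser. The namespace URIs (urn:example:insurer-a and urn:example:insurer-b), the prefixes, and the policy element are hypothetical and chosen only to mirror the insurance example.

```python
import xml.etree.ElementTree as ET

# Two companies use the element name "policy" for different things.
DOC = """
<records xmlns:a="urn:example:insurer-a" xmlns:b="urn:example:insurer-b">
  <a:policy>12345</a:policy>
  <b:policy>2007-01-15</b:policy>
</records>
"""

root = ET.fromstring(DOC)

# The parser expands each prefix to its namespace URI, so the two
# "policy" elements end up with different qualified names.
tags = [child.tag for child in root]

# Prefixes are local shorthand; lookups resolve them through this mapping.
ns = {"a": "urn:example:insurer-a", "b": "urn:example:insurer-b"}
policy_a = root.findtext("a:policy", namespaces=ns)
policy_b = root.findtext("b:policy", namespaces=ns)
print(tags, policy_a, policy_b)
```

The prefix itself carries no meaning, which is why the text notes that it can be an arbitrary name. Only the namespace URI it is bound to distinguishes the two elements.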
If you want a complex data type that can be used to define other elements, you can declare the complex data type globally, as shown in Example 3-7. The complex type employeeType is declared globally; it is not inside the employee element and can be used to define other elements in the same schema document. It also can be used in other schema documents if the other schema documents include or import it.
Example 3-7 Globally declare complex type employeeType
<xsd:complexType name="employeeType">
   <xsd:sequence>
      <xsd:element name="id" type="xsd:integer"/>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="dateOfBirth" type="xsd:date"/>
   </xsd:sequence>
</xsd:complexType>
<xsd:element name="employee" type="employeeType"/>

Just as with any XML document, XML schema documents have to be well-formed. You can use an XML schema to validate XML instance documents. A good XML schema should correctly describe business concepts.
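The reuse that a globally declared complex type allows can be sketched by reading a schema document and listing which elements refer to which named types. The sketch uses Python's standard XML parser and only inspects the schema text; it does not validate anything. The xsd:schema wrapper, the second element named manager, and the qname helper are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="employeeType">
    <xsd:sequence>
      <xsd:element name="id" type="xsd:integer"/>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="dateOfBirth" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="employee" type="employeeType"/>
  <!-- Hypothetical second element, added only to show reuse. -->
  <xsd:element name="manager" type="employeeType"/>
</xsd:schema>
"""


def qname(local):
    """Build the namespace-qualified tag name used by ElementTree."""
    return "{%s}%s" % (XSD_NS, local)


schema = ET.fromstring(SCHEMA)

# Global complex types are direct children of the schema element.
global_types = {
    ct.get("name"): [el.get("name") for el in ct.iter(qname("element"))]
    for ct in schema.findall(qname("complexType"))
}

# Global elements and the type each one refers to.
global_elements = {
    el.get("name"): el.get("type")
    for el in schema.findall(qname("element"))
}
print(global_types, global_elements)
```

Both employee and manager point at the same employeeType definition, which would not be possible if the type were declared locally inside the employee element.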
UML modeling
Unified Modeling Language (UML) is an industry standard for modeling business concepts. It is an object-oriented language. UML is one of the important modeling languages that can assist you in building XML schemas. You can use UML to represent business concepts in graphical notation, and you can easily turn this notation into XML schemas. In a UML diagram, a box represents a business concept or a class, and a line represents a relationship. You can turn a sentence into a UML diagram. Figure 3-4 shows a UML diagram representing the sentence "Loan officer approves a loan." The Loan officer box identifies the pertinent information about the loan officer, and the Loan box identifies the loan-related information. This information often becomes the attributes in an XML schema. The line, Approve, identifies the relationship between Loan officer and Loan.
[Figure 3-4: UML diagram. A Loan officer box (Officer ID, Name, ...) is connected by an Approve line to a Loan box (Loan ID, Amount, ...).]
We used UML in the schema design process for our sample application, XMLoan, described in Chapter 2, Sample scenario description on page 21. We started by gathering information about the loan application business process and studying the hard copy of the loan application form. For a loan application, the customer must provide their personal and financial information. It should be possible to correlate this data with the customer's existing account in the bank, if any. The bank also wants to perform market analysis. We then produced a simple loan application UML diagram, as shown in Figure 3-5.
[Figure 3-5: UML diagram of the loan application. Application (Customer, LoanType, Campaign) consists of Customer, which consists of Name, Address, Employer, and Financial Data.]
Application consists of customer, loan type, and campaign. Customer consists of name, date of birth, social security number, address, phone, e-mail, employer, and financial data. Name consists of first name and last name. Address consists of street, city, state, and zip code. Employer consists of company and position. Financial data consists of income, debt, expenses, and assets.
From the business process point of view, a loan officer or bank manager can read the UML diagram as a loan application form separated into different sections. They can easily identify whether any information is missing. For an application developer or DBA, the UML diagram can be easily transformed into an XML schema. Example 3-8 is the schema mapped from Figure 3-5.
Example 3-8 application.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="complextype.xsd"/>
  <xsd:element name="application">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="customer" type="Customer"/>
        <xsd:element name="loanType" type="xsd:integer"/>
        <xsd:element name="campaign" type="xsd:integer"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

The Application class in Example 3-8 is represented by a complex data type, which consists of three elements: customer, loanType, and campaign. The elements loanType and campaign have the built-in data type integer. The element customer has the complex data type Customer. Notice that XML is case-sensitive; Customer (capitalized) is different from customer. In this case, Customer with a capital C is the complex data type and customer is the element. The complex type Customer is not defined in this schema document. The expression <xsd:include schemaLocation="complextype.xsd"/> includes another XML schema document, complextype.xsd. The definition of the complex data type Customer is in the schema document complextype.xsd, as shown in Example 3-9.
Example 3-9 complextype.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Customer">
    <xsd:sequence>
      <xsd:element name="name" type="Name"/>
      <xsd:element name="dateOfBirth" type="xsd:date"/>
      <xsd:element name="ssn" type="xsd:string"/>
      <xsd:element name="address" type="Address"/>
      <xsd:element name="phone" type="xsd:string"/>
      <xsd:element name="email" type="xsd:string"/>
      <xsd:element name="employer" type="Employer"/>
      <xsd:element name="financialData" type="FinancialData"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="Name">
    <xsd:sequence>
      <xsd:element name="firstName" type="xsd:string"/>
      <xsd:element name="lastName" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
      <xsd:element name="state" type="xsd:string"/>
      <xsd:element name="zip" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="Employer">
    <xsd:sequence>
      <xsd:element name="company" type="xsd:string"/>
      <xsd:element name="position" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="FinancialData">
    <xsd:sequence>
      <xsd:element name="income" type="xsd:decimal"/>
      <xsd:element name="debt" type="xsd:decimal"/>
      <xsd:element name="expenses" type="xsd:decimal"/>
      <xsd:element name="assets" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
The complex data type Customer has eight elements. The element name has the complex data type Name. The element address has the complex data type Address. The element employer has the complex data type Employer. The element financialData has the complex data type FinancialData. The remaining elements of the Customer complex data type (dateOfBirth, ssn, phone, and email) have built-in data types. The complex data types Name, Address, Employer, and FinancialData are also defined in this XML schema document. Example 3-10 shows an instance document of the application schema.
Example 3-10 Application.xml
<?xml version="1.0"?>
<application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="C:\application.xsd">
  <customer>
    <name>
      <firstName>Smith</firstName>
      <lastName>John</lastName>
    </name>
    <dateOfBirth>1967-02-23</dateOfBirth>
    <ssn>123-45-6789</ssn>
    <address>
      <street>46 East Main Street</street>
      <city>Los Gatos</city>
      <state>CA</state>
      <zip>95030</zip>
    </address>
    <phone>234-567-8901</phone>
    <email>[email protected]</email>
    <employer>
      <company>My company</company>
      <position>Developer</position>
    </employer>
    <financialData>
      <income>5000</income>
      <debt>10000</debt>
      <expenses>30000</expenses>
      <assets>200000</assets>
    </financialData>
  </customer>
  <loanType>10</loanType>
  <campaign>20</campaign>
</application>
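The correspondence between an instance document and the two schema documents can be spot-checked with a short script. The following Python sketch is not a schema validator and does not reflect how DB2 validates; it only assumes the element order declared by the xsd:sequence entries in Examples 3-8 and 3-9 and checks a trimmed-down instance against it. The function name check_application, the constants, and the sample e-mail value are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Element order taken from the xsd:sequence declarations in the two schema documents.
APPLICATION_SEQUENCE = ["customer", "loanType", "campaign"]
CUSTOMER_SEQUENCE = ["name", "dateOfBirth", "ssn", "address",
                     "phone", "email", "employer", "financialData"]

def check_application(xml_text):
    """Return a list of problems; an empty list means every check passed."""
    root = ET.fromstring(xml_text)  # raises ParseError if the text is not well-formed
    if root.tag != "application":
        return ["unexpected root element: " + root.tag]
    problems = []
    if [child.tag for child in root] != APPLICATION_SEQUENCE:
        problems.append("application children do not match the declared sequence")
    customer = root.find("customer")
    if customer is None or [child.tag for child in customer] != CUSTOMER_SEQUENCE:
        problems.append("customer children do not match the declared sequence")
    for name in ("loanType", "campaign"):
        value = (root.findtext(name) or "").strip()
        # Rough stand-in for the lexical form of xsd:integer.
        if not re.fullmatch(r"[+-]?[0-9]+", value):
            problems.append(name + " is not an integer: " + repr(value))
    return problems

# Trimmed-down instance: the complex children are left empty for brevity,
# and the e-mail address is a placeholder.
SAMPLE = (
    "<application><customer>"
    "<name/><dateOfBirth>1967-02-23</dateOfBirth><ssn>123-45-6789</ssn>"
    "<address/><phone>234-567-8901</phone><email>user@example.com</email>"
    "<employer/><financialData/>"
    "</customer><loanType>10</loanType><campaign>20</campaign></application>"
)

print(check_application(SAMPLE))                            # no problems reported
print(check_application(SAMPLE.replace(">10<", ">ten<")))   # loanType is flagged
```

A check like this is only useful as a quick sanity test during schema design; real validation should be left to a schema-aware processor such as the XMLVALIDATE function.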
If you had not started building the XML schema by drawing the UML diagram first, it would be tricky to identify all object types and the relationships among them. The UML diagram provides not only the big picture, but also the detail of the business model. The UML diagram can also be used to explain the schema to people who are not familiar with XML schemas, such as the loan department manager and the loan officer. The UML diagram can be easily mapped into an XML schema. When designing a schema, you also want to consider what business data should be kept together in a single XML document. The XML document granularity does impact performance. For more details, refer to "15 best practices for pureXML performance in DB2 9" on the IBM developerWorks Web site: http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0610nicola/
FpML: Financial products Markup Language (FpML) is the business information exchange standard for electronic dealing and processing of financial derivatives instruments. It establishes a new protocol for sharing information, and dealing in swaps, derivatives, and structured products. For more information about FpML, refer to: http://www.fpml.org/

FIXML: The Financial Information eXchange (FIXML) protocol is a messaging standard developed specifically for the real-time electronic exchange of securities transactions. For more information, refer to the Web site: http://www.fixprotocol.org/

MISMO: The Mortgage Industry Standards Maintenance Organization, Inc. (MISMO) was established by the Mortgage Bankers Association (MBA) to coordinate the development and maintenance of Internet-based Extensible Markup Language (XML) real estate finance specifications. MISMO utilizes an open and democratic vendor-neutral approach to the development and maintenance of a single real estate finance XML DTD transaction repository. MISMO has published specifications that support mortgage insurance applications, mortgage insurance loan boarding, secondary, bulk pricing, real estate services, credit reporting, and underwriting process areas. For more information, refer to the Web site: http://www.mismo.org/default.html

XBRL: eXtensible Business Reporting Language (XBRL) is a language for the electronic communication of business and financial data. It provides major benefits in the preparation, analysis, and communication of business information. For more information, refer to the Web site: http://www.xbrl.org

IFX: The Interactive Financial eXchange (IFX) is an XML-based financial messaging protocol. It is an object model represented by an XML Schema, a communications protocol with well-considered business rules, and a Targeted Financial Solution. For more information, refer to the Web site: http://www.ifxforum.org/standards/
SportsML: Sports Markup Language (SportsML) is an XML-based standard for the interchange of sports data and statistics. The current release of SportsML is Version 1.0, which was ratified by the IPTC (International Press Telecommunications Council). For more information, refer to the Web site: http://www.sportsml.com/specifications.php

XPRL: eXtensible Public Relations Language (XPRL) is an XML-based standard that is being designed for use in the public relations sector. It is an open initiative, which developers can use to create business-oriented programs for PR. XPRL defines how computer data relating to PR campaigns is stored and shared across the Internet. For more information, refer to the Web site: http://www.xprl.org/

PhotoML: Photo Markup Language (PhotoML) is a standard for describing the details of photo creation, processing, and content in a collection of photographs. It can be used for a wide variety of photographic formats, including roll film (such as 35mm and 120/220), sheet film (such as 4x5 and 8x10), and digital images. For more information, refer to the Web site: http://www.wohlberg.net/public/software/photo/photoml/

ThML: Theological Markup Language (ThML) is an XML-based standard that is being used to mark up texts for the Christian Classics Ethereal Library and other projects. For more information, refer to the Web site: http://www.ccel.org/ThML/

XBITS: XML Book Industry Transaction Standards (XBITS) is a Working Group of IDEAlliance that is designing an XML-based standard to facilitate bidirectional electronic data exchanges between publishers, printers, paper mills, and component vendors. For more information, refer to the Web site: http://www.idealliance.org/xbits/

CBML: Comic Book Markup Language (CBML) is a TEI-based XML vocabulary (with DTD and schema representations) designed to accommodate the XML encoding of comic books and graphic novels. For more information, refer to the Web site: http://www.cbml.org/

GJXDM: The Global JXDM (GJXDM) is an XML standard for criminal justice information exchanges, providing law enforcement, public safety agencies, prosecutors, public defenders, and the judicial branch with a tool to share data and information in a timely manner. For more information, refer to the Web site: http://it.ojp.gov/jxdm/3.0/index.html
IT standards
Following are some IT standards used across the industries:

Web Services provide a way of describing and publishing a general-purpose, agreed interface for accessing data and applications, through the Web Services Description Language (WSDL) notation. The Web Services approach provides loose coupling between clients and the data or applications being accessed, and is important for enabling service-oriented architecture (SOA).

Atom (and RSS) provide an agreed way of publishing summaries of changes to data, so that interested parties can easily locate these summaries. Atom also makes it possible for general-purpose software readers to offer a human or programmatic interface to subscribe to changes, to be notified when the changes happen, and to review the changes. RSS is similar to Atom, except that it has not been standardized and thus has many variants.

XForms is an agreed way to enable a Web forms interface. An XForm can load external XML documents as initial data in the browser, and can submit the results to the server as XML. Including the browser in the XML pipeline through XForms gives you end-to-end XML, right up to the user's desktop. This eliminates data conversions, thereby reducing processing overhead.
Explicit validation
Explicit validation means that the XML schema to validate against is explicitly specified in the XMLVALIDATE function. There are two ways to specify the schema: use the namespace and the schema location of the XML schema, or use an SQL identifier. Example 3-11 is a primary XML schema document that has been registered with the SQL identifier sample.pets.
Example 3-11 XML schema pets.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.itso.org/pets"
           xmlns:pe="http://www.itso.org/pets"
           xmlns:ca="http://www.itso.org/cat"
           xmlns:do="http://www.itso.org/dog">
  <xs:import namespace="http://www.itso.org/cat" schemaLocation="cat.xsd" />
  <xs:import namespace="http://www.itso.org/dog" schemaLocation="dog.xsd" />
  <xs:element name="PETS">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="DOG" type="do:DOG"/>
        <xs:element name="CAT" type="ca:CAT"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Example 3-12 shows an XML document, pets.xml, which requires validation during insert.
Example 3-12 XML document pets.xml
<?xml version="1.0"?>
<pe:PETS xmlns:pe="http://www.itso.org/pets"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.itso.org/pets pets.xsd">
  <DOG>
    <NAME>SPOT</NAME>
    <AGE>2</AGE>
  </DOG>
  <CAT>
    <NAME>TOM</NAME>
    <AGE>1</AGE>
  </CAT>
</pe:PETS>

Example 3-13 shows explicitly validating an XML document with an SQL identifier. <XML document> stands for the content of the XML document pets.xml, and sample.pets is the SQL identifier of the registered schema.
Example 3-13 Explicit validation using SQL identifier
INSERT INTO test VALUES
  XMLVALIDATE(
    XMLPARSE(DOCUMENT '<XML document>' PRESERVE WHITESPACE)
    ACCORDING TO XMLSCHEMA ID sample.pets)

An XML schema can consist of one or more XML schema documents. One of these schema documents must be the primary schema document, which is at the top of the hierarchy. In our example, pets.xsd is the primary schema document, and it imports two other schema documents, cat.xsd and dog.xsd. When you use the namespace and the schema location of the XML schema to validate the XML document, the namespace and the schema location you specify must match the namespace and the schema location of the primary XML schema document.
Example 3-14 shows explicitly validating the XML document by using the namespace and the schema location of the XML schema. Here, http://www.itso.org/pets is the namespace and http://sample is the schema location. <XML document> stands for the content of the XML document pets.xml.
Example 3-14 Explicit validation using namespace
INSERT INTO test VALUES
  XMLVALIDATE(
    XMLPARSE(DOCUMENT '<XML document>' PRESERVE WHITESPACE)
    ACCORDING TO XMLSCHEMA URI 'http://www.itso.org/pets'
    LOCATION 'http://sample')

Which one should you use, the SQL identifier or the namespace and the schema location of the XML schema? Both have the same performance and provide the same functionality. You can choose one over the other, depending on how you design the database and application. The same schema might be registered in different databases under different names. If you use the namespace and the schema location of the XML schema, you do not have to change your application to use a different SQL identifier for the same schema in each database. Sometimes the SQL identifier is more convenient, for example if your application accesses only one database and you know the exact SQL identifier associated with the schema you want to use.
Implicit validation
Implicit validation means that the schema information is not passed by the INSERT or IMPORT statement. Instead, the schema hints come from the XML document that is inserted or imported. The schema hints are used to find the specific schema to validate the XML document. The schema hints are specified in the following attributes in the XML document:

xsi:schemaLocation has two values, the namespace and the schema location for that namespace.
xsi:noNamespaceSchemaLocation is used if the XML document does not have a namespace.
Example 3-15 person.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://person"
            xmlns:per="http://person">
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="age" type="xsd:integer" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Whether explicit or implicit validation is being done, the schema used for validation must be registered in the database. Example 3-16 shows the commands to register schema person.xsd.
Example 3-16 Register schema
register xmlschema http://person from c:\person.xsd as john.person
complete xmlschema john.person

Example 3-17 shows an XML document, person.xml, to be inserted into the table TEST using implicit validation with schema person.xsd. The actual data in person.xml is irrelevant. We are interested in the attribute xsi:schemaLocation, which contains the schema hints. Its first value, http://person, is the namespace, and its second value, http://person, is the schema location. In our example the namespace and the schema location have the same value, but they do not have to be the same.
Example 3-17 Person.xml
<?xml version="1.0"?>
<per:person xmlns:per="http://person"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://person http://person">
  <name>John Doe</name>
  <age>36</age>
</per:person>

Example 3-18 shows the command to implicitly validate the XML document at insert time, using the schema hints in schemaLocation. <XML document> stands for the content of the XML document person.xml.
INSERT INTO test VALUES
  XMLVALIDATE(XMLPARSE(DOCUMENT '<XML document>' PRESERVE WHITESPACE))

If the XML document is valid, it is inserted into the TEST table. If it is not valid, the insert fails with error code SQL16206N. When you do implicit validation, you do not use the ACCORDING TO XMLSCHEMA clause. If you do, it becomes explicit validation: DB2 uses the information provided by the ACCORDING TO XMLSCHEMA clause, and the schema hints in the XML document are ignored. In implicit validation, DB2 searches the catalog tables for the pair of values provided by xsi:schemaLocation in order to find the correct schema. In our simple example, DB2 finds the schema john.person by searching the catalog tables for the pair of values http://person and http://person.
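The lookup described above can be pictured with a small Python sketch. It is only an illustration of the idea, not DB2's implementation: the dictionary REGISTERED_SCHEMAS is a hypothetical stand-in for the catalog tables, and the function names are assumptions. The sketch reads xsi:schemaLocation, splits it into (namespace, location) pairs, and looks each pair up.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical stand-in for the catalog: (namespace, location) -> SQL identifier.
REGISTERED_SCHEMAS = {("http://person", "http://person"): "john.person"}

def schema_hints(xml_text):
    """Return the (namespace, location) pairs found in xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    tokens = root.get("{" + XSI + "}schemaLocation", "").split()
    return list(zip(tokens[0::2], tokens[1::2]))

def find_registered_schema(xml_text):
    """Return the SQL identifier matching the first known hint, or None."""
    for hint in schema_hints(xml_text):
        if hint in REGISTERED_SCHEMAS:
            return REGISTERED_SCHEMAS[hint]
    return None

PERSON_XML = (
    '<per:person xmlns:per="http://person" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://person http://person">'
    "<name>John Doe</name><age>36</age></per:person>"
)

print(schema_hints(PERSON_XML))
print(find_registered_schema(PERSON_XML))
```

Because the hint is a pair, a document whose namespace is registered under a different schema location would not match in this sketch, which mirrors the requirement that both values agree with the registered schema.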
No validation
If the XML documents are from a trusted source, there is no need to validate them. Suppose that a bank develops an application for its branches. The application is guaranteed to generate valid XML documents, so no validation is required. Sometimes you do not care whether XML documents are valid or not. In this case, validation is not required either.
Note: Even if you do not validate documents, you can only insert well-formed XML documents into an XML column in DB2 9.
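The difference between well-formed and valid can be seen with any XML parser. The following Python sketch uses the standard library parser, not DB2, and the helper name is_well_formed is hypothetical. It accepts a document whose content would fail a schema check but rejects one with mismatched tags.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, regardless of any schema."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# Well-formed, although "unknown" would not satisfy an xsd:integer age.
print(is_well_formed("<person><name>John Doe</name><age>unknown</age></person>"))
# Not well-formed: the <name> element is never closed.
print(is_well_formed("<person><name>John Doe</person>"))
```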
The amount of space that an XML document occupies in a DB2 database is determined by the initial size of the document in raw form and by a number of other properties. The following list includes the most important properties:

Document structure: XML documents that contain complex markup tagging require a larger amount of storage space than documents with simple markup. For example, an XML document that has many nested elements, each containing a small amount of text or having short attribute values, occupies more storage space than an XML document composed primarily of textual content.

The number of elements: The pureXML storage model stores all XML data in hierarchical form as a tree structure, so besides the real data, it requires extra storage to describe the tree structure. For example, every node in the tree has to store the links to its child nodes and to its parent node. A more complex tree requires more extra storage for the tree structure information. For every element in an XML document, DB2 9 allocates a small structure to store the element information. There is no data stored in this structure.

Node names: The length of element names, attribute names, namespace prefixes, and similar non-content data also affects storage size. Any information unit of this type that exceeds four bytes in raw form is compressed for storage, resulting in comparatively greater storage efficiency for longer node names.

Ratio of attributes to elements: Typically, the more attributes that are used per element, the lower the amount of storage space that is required for the XML document.

Document code page: XML documents with an encoding that uses more than one byte per character occupy a larger amount of storage space than documents using a single-byte character set.

Document validation: XML documents are annotated after having been validated against an XML schema. The addition of type information after validation results in an increased storage requirement.
To calculate the size of an XML document manually, add up the sizes of every element, every attribute, and the actual data. Use this simple XML document as an example:

<Customer>
  <Name>John Smith</Name>
  <Phone>408-404-1212</Phone>
</Customer>
The size of this document is the total of the lengths of <Customer>, </Customer>, <Name>, </Name>, John Smith, <Phone>, </Phone>, and 408-404-1212. In addition, for each element, DB2 uses a few bytes to store element information. Because XML document lengths vary, it is not practical to calculate each document's size and use it as the basis for estimating the space required. A better way to estimate the space required for XML data is to take a sample of XML documents of typical size and store the sample in the database. You can then use the administrative table function admin_get_tab_info to check how much space the sample data takes. Following is the SELECT statement for the XML object size:

SELECT t.xml_object_l_size, t.xml_object_p_size,
       t.data_object_l_size, t.data_object_p_size,
       t.index_object_l_size, t.index_object_p_size
FROM TABLE(admin_get_tab_info('schemaname','tablename')) AS t

You can project the storage size of the sample to your actual number of XML documents. The accuracy of the estimate depends on how representative the sample is and on the number of samples. Note that the monitor element used by the administrative function admin_get_tab_info does not take free pages or free page management into account. It just reports the number of pages on disk.
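Both calculations are simple arithmetic and can be sketched in a few lines of Python. The sketch below makes several assumptions: it counts characters (which equals bytes only for a single-byte encoding), it ignores the per-element overhead that DB2 adds, and the sample figures passed to project_storage are invented for illustration rather than measured. The function names are hypothetical.

```python
import re

# The example document, written without whitespace between tags.
DOC = "<Customer><Name>John Smith</Name><Phone>408-404-1212</Phone></Customer>"

def raw_size(xml_text):
    """Add up the lengths of all tags and of all text between them."""
    tags = re.findall(r"<[^>]+>", xml_text)
    text = re.sub(r"<[^>]+>", "", xml_text)
    return sum(len(tag) for tag in tags) + len(text)

def project_storage(sample_size, sample_docs, total_docs):
    """Scale the space measured for a sample up to the full document count."""
    return sample_size / sample_docs * total_docs

# 10 + 11 + 6 + 7 + 10 + 7 + 8 + 12 characters
print(raw_size(DOC))

# Hypothetical figures: 1,000 sample documents occupying 400,000 units of
# storage, projected to 250,000 documents.
print(project_storage(400_000, 1_000, 250_000))
```

The projection is linear, so it is only as good as the sample: if the real documents are larger or more deeply nested than the sampled ones, the estimate will be low.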
Table spaces for XML: In general, Database Managed Space (DMS) table spaces perform better than System Managed Space (SMS) table spaces. This is because, unlike with SMS table spaces, DB2 can directly access DMS table spaces without going through the operating system. When you create a table with XML columns, you can place XML data and indexes in separate table spaces to use different page sizes and separate configuration parameters, such as prefetch size. Example 3-19 creates a table that uses three different table spaces. The non-long data types are in tablespace2. Indexes are in tablespace3. XML types are considered to be long data types, so the XML data goes into tablespace4.
Example 3-19 Create a table
CREATE TABLE xmltable(c1 char(5), c2 int, c3 char(7), c4 XML)
  IN tablespace2
  INDEX IN tablespace3
  LONG IN tablespace4

Buffer pools: A buffer pool belongs to a single database and can be used by more than one table space. When you assign a buffer pool to a table space, the buffer pool and the table space must have the same page size. If you want to assign a buffer pool to multiple table spaces, all of the table spaces must have the same page size as the buffer pool. Buffer pools reduce disk I/O, so they are crucial for database performance. The DB2 9 snapshot monitor supports monitoring of XML data in buffer pools. The buffer pool snapshot monitor has new XML data counters to help you make decisions on how to tune your buffer pools. In order to use the snapshot monitor, you have to turn on the buffer pool switch. Example 3-20 shows the commands to turn on the buffer pool monitoring switch and get the snapshot data.
Example 3-20 Turn on the buffer pool switch
UPDATE MONITOR SWITCHES USING bufferpool on
get snapshot for bufferpools on <database name>

Example 3-21 shows the counters from the buffer pool snapshot output. For more details on the counters, see the DB2 9 Information Center.
Example 3-21 The new counters for XML data
Relational Data Counters
Buffer pool data logical reads               = 246
Buffer pool data physical reads              = 68
Buffer pool temporary data logical reads     = 132
Buffer pool temporary data physical reads    = 0
Relational and XML Index Counters
Buffer pool data writes                      = 16323
Buffer pool index logical reads              = 0
Buffer pool index physical reads             = 0
Buffer pool temporary index logical reads    = 0
Buffer pool temporary index physical reads   = 0
XML Data Counters
Buffer pool xda logical reads                = 2921
Buffer pool xda physical reads               = 152
Buffer pool temporary xda logical reads      = 0
Buffer pool temporary xda physical reads     = 0
Buffer pool xda writes                       = 0
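One common way to read counters like these is as a hit ratio: the share of logical reads that did not require a physical read from disk. This formula is not given in the text above; the following Python sketch assumes the usual definition, (1 - physical reads / logical reads) x 100, and the function name hit_ratio is hypothetical. The inputs are the data and xda logical and physical read counters from Example 3-21.

```python
def hit_ratio(logical_reads, physical_reads):
    """Percentage of logical reads satisfied without a physical read.

    Returns None when there were no logical reads, because the ratio is
    undefined in that case.
    """
    if logical_reads == 0:
        return None
    return (1 - physical_reads / logical_reads) * 100

# Counter values taken from Example 3-21.
data_ratio = hit_ratio(246, 68)     # relational data pages
xda_ratio = hit_ratio(2921, 152)    # XML data (xda) pages

print(round(data_ratio, 2))
print(round(xda_ratio, 2))
```

Tracking the ratio separately for relational data and XML data is what the separate xda counters make possible; a low ratio for one kind of page but not the other suggests looking at the buffer pool that serves that table space.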
create database xmlrb using codeset UTF-8 territory US

Throughout this book, we discuss the new support for creating database objects such as tables, views, and indexes. For more information about the features and options for creating a database, refer to Administration Guide: Implementation, SC10-4221. DB2 9 has the following new system catalog views for XML indexes and XSR objects:

SYSCAT.INDEXXMLPATTERNS: Each row represents a pattern clause in an index over an XML column.
SYSCAT.XDBMAPSHREDTREES: Each row represents one shred tree for a given schema graph identifier.
SYSCAT.XDBMAPGRAPHS: Each row represents a schema graph for an XDB map (XSR object).
SYSCAT.XSROBJECTAUTH: Each row represents a user or group that has been granted the USAGE privilege on a particular XSR object.
SYSCAT.XSROBJECTCOMPONENTS: Each row represents an XSR object component.
SYSCAT.XSROBJECTDEP: Each row represents a dependency of an XSR object on some other object. The XSR object depends on the object of type BTYPE and name BNAME, so a change to that object affects the XSR object.
SYSCAT.XSROBJECTHIERARCHIES: Each row represents the hierarchical relationship between an XSR object and its components.
SYSCAT.XSROBJECTS: Each row represents an XML schema repository object.

SQL Reference Volume 1, SC10-4249, has more detailed information about system catalog views. In Chapter 5, "Managing XML data" on page 173, we discuss the system catalog views for XML indexes and XSR objects in more detail.
Chapter 4.
4.1 XPath
XPath 2.0 is an expression language for processing values that conform to the XQuery/XPath Data Model (XDM). XDM provides a tree representation of XML documents. Values in XDM are sequences containing zero or more items, each of which is either an atomic value (such as an integer, string, or Boolean) or an XML node (such as a document, element, attribute, or text node).

Example 4-1 shows an XML document, sample.xml. It contains these nodes:

Document node: This XML document contains one document node, sample.xml.
Comment node: There is only one comment node, "sample xml file".
Element node: The element nodes are Customer, Name, FirstName, LastName, Address, Street, City, State, and Zip. In addition, Phone and Email each occur twice, which makes a total of thirteen element nodes.
Attribute node: The Address element and the two Phone elements have attribute nodes associated with them. country is the attribute node for Address, and type is the attribute node for each Phone.
Text node: The ten text nodes and their parent element nodes are:
  "Steve" - FirstName
  "Ferrington" - LastName
  "46 Oak Street" - Street
  "Los Gatos" - City
  "CA" - State
  "95030" - Zip
  "123-456-7890" - Phone
  "234-567-8901" - Phone
  "[email protected]" - Email
  "[email protected]" - Email
Example 4-1 sample.xml
<!-- sample xml file -->
<Customer>
  <Name>
    <FirstName>Steve</FirstName>
    <LastName>Ferrington</LastName>
  </Name>
  <Address country="US">
    <Street>46 Oak Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95030</Zip>
  </Address>
  <Phone type="work">123-456-7890</Phone>
  <Phone type="home">234-567-8901</Phone>
  <Email>[email protected]</Email>
  <Email>[email protected]</Email>
</Customer>
Figure 4-1 shows a simplified representation of the data model for sample.xml. The diagram includes a document node (D), comment node (C), element nodes (E), attribute nodes (A), and text nodes (T). It shows that a node can have other nodes as children forming one or more node hierarchies. For example, the element Address is a child of Customer (or to say this another way, the element Customer is a parent of Address). The Address element has one attribute (country) and four child elements (Street, City, State, and Zip).
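The node counts given for sample.xml can be reproduced with any DOM-style parser. The following Python sketch uses the standard library's xml.dom.minidom rather than DB2, so it illustrates the data model only. The two e-mail addresses are placeholders, and the function name count_nodes is hypothetical. Whitespace-only text between elements is skipped so that only the ten content-bearing text nodes are counted.

```python
from collections import Counter
from xml.dom import Node, minidom

SAMPLE_XML = """<!-- sample xml file -->
<Customer>
  <Name>
    <FirstName>Steve</FirstName>
    <LastName>Ferrington</LastName>
  </Name>
  <Address country="US">
    <Street>46 Oak Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95030</Zip>
  </Address>
  <Phone type="work">123-456-7890</Phone>
  <Phone type="home">234-567-8901</Phone>
  <Email>first@example.com</Email>
  <Email>second@example.com</Email>
</Customer>"""

def count_nodes(node, counts):
    """Walk the tree depth-first and tally nodes by kind."""
    if node.nodeType == Node.DOCUMENT_NODE:
        counts["document"] += 1
    elif node.nodeType == Node.COMMENT_NODE:
        counts["comment"] += 1
    elif node.nodeType == Node.ELEMENT_NODE:
        counts["element"] += 1
        counts["attribute"] += node.attributes.length
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        counts["text"] += 1  # ignore whitespace-only text used for indentation
    for child in node.childNodes:
        count_nodes(child, counts)
    return counts

counts = count_nodes(minidom.parseString(SAMPLE_XML), Counter())
print(dict(counts))
```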
DB2 supports six types of nodes: document, element, attribute, text, processing instruction, and comment. Each type of node has its own set of properties. A node's properties can include its name, its parent node, and its attributes. Table 4-1 lists node properties.
Table 4-1 Node properties
Node property         Description
node-name             The name of the node as a QName
type-name             The dynamic (run-time) type of the node
string-value          A string value that can be extracted from the node
typed-value           A sequence of zero or more atomic values that can be extracted from the node
in-scope namespaces   The in-scope namespaces that are associated with the node
content               The content of the node
attributes            The sequence of attribute nodes that are children of the current node
parent                The node that is parent of the current node
children              The sequence of nodes that are children of the current node
target                The application name as QName
Table 4-2 lists the node types supported in DB2 and their properties.
Table 4-2 Nodes supported in DB2
Node type                     Description                               Properties
Document node                 Encapsulates XML document                 string-value, typed-value, children
Element node                  Encapsulates XML element                  node-name, type-name, string-value, typed-value, in-scope namespaces, attributes, parent, children
Attribute node                Encapsulates XML attribute                node-name, type-name, string-value, typed-value, parent
Text node                     Encapsulates XML character content        content, parent
Processing instruction node   Encapsulates XML processing instruction   content, parent, target
Comment node                  Encapsulates XML comment                  content, parent
We can illustrate node types and properties using the sample.xml document. Address and Name are element nodes. Here are some properties of the Address node:
node-name: The qualified name of the node is Address.
attributes: This node has one attribute, country.
parent: The parent of the node is the Customer node.
children: This node has four children: the Street, City, State, and Zip nodes.
nodes; and zero or more predicates that filter the sequence produced by the step. The result of an axis step is a sequence of nodes, and each node is assigned a context position that corresponds to its position in the sequence. Context positions allow every node to be accessed by its position. Table 4-3 describes the axes supported in DB2.
Table 4-3 Axes supported in DB2
Axis                 Description                                    Direction
self                 Returns the context node                       Forward
child                Returns the children of the context node       Forward
descendant           Returns the descendants of the context node    Forward
descendant-or-self   Returns the context node and its descendants   Forward
parent               Returns the parent of the context node         Reverse
attribute            Returns the attributes of the context node     Forward
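To make the axes concrete, the following Python sketch implements each one as a small function over a cut-down version of sample.xml. It is an illustration only and does not show how DB2 evaluates axis steps. It uses the standard library's ElementTree, which models element nodes only, so text and comment nodes are not returned. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

customer = ET.fromstring(
    "<Customer>"
    "<Name><FirstName>Steve</FirstName><LastName>Ferrington</LastName></Name>"
    '<Address country="US"><Street>46 Oak Street</Street><City>Los Gatos</City>'
    "<State>CA</State><Zip>95030</Zip></Address>"
    "</Customer>"
)

# ElementTree elements do not know their parent, so build a child -> parent map.
PARENT_OF = {child: parent for parent in customer.iter() for child in parent}

def axis_self(node):
    return [node]

def axis_child(node):
    return list(node)

def axis_descendant(node):
    return [n for n in node.iter() if n is not node]

def axis_descendant_or_self(node):
    return list(node.iter())

def axis_parent(node):
    # The only reverse axis here; the root element has no parent element.
    return [PARENT_OF[node]] if node in PARENT_OF else []

def axis_attribute(node):
    return sorted(node.attrib.items())

address = customer.find("Address")
print([n.tag for n in axis_child(address)])
print([n.tag for n in axis_parent(address)])
print(axis_attribute(address))
print([n.tag for n in axis_descendant(customer.find("Name"))])
```

Using Address as the context node reproduces the properties listed for it earlier: four children, one attribute, and Customer as the parent.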
A node test is a condition that must be true for each node that is selected by an axis step. The node test can be either a name test or kind test. A name test filters nodes based on their names. It consists of a QName or a wildcard and, when used, selects the nodes (elements or attributes) with matching QNames. The QNames match if the expanded QName of the node is equal to the expanded QName in the name test. Two expanded QNames are equal if they belong to the same namespace and their local names are equal. Table 4-4 describes all name tests supported in DB2.
Table 4-4 Name tests supported in DB2
Test      Description
QName     Matches all nodes whose QName is equal to the specified QName
NCName:*  Matches all nodes whose namespace URI is the same as the namespace to which the specified prefix is bound
*:NCName  Matches all nodes whose local name is equal to the specified NCName
*         Matches all nodes
A kind test filters nodes based on their kind. Table 4-5 describes all kind tests supported in DB2.
Table 4-5 Kind tests supported in DB2
Test                      Description
node()                    Matches any node
text()                    Matches any text node
comment()                 Matches any comment node
processing-instruction()  Matches any processing instruction node
element()                 Matches any element node
attribute()               Matches any attribute node
document-node()           Matches any document node
There are two syntaxes for axis steps: unabbreviated and abbreviated. The unabbreviated syntax consists of an axis name and a node test that are separated by a double colon (::). In the abbreviated syntax, the axis is omitted by using shorthand notations. Table 4-6 describes the abbreviated syntax supported in DB2.
Table 4-6 Abbreviated syntax supported in DB2
Abbreviated syntax  Description
No axis specified   child::, except when the node test is attribute(). In that case, the omitted axis is shorthand for attribute::.
@                   attribute::
//                  /descendant-or-self::node()/, except when it appears at the beginning of the path expression. In that case, the axis step selects the root of the tree plus all nodes that are its descendants.
.                   self::node()
..                  parent::node()
CONNECT TO xmlrb;
CREATE TABLE xps(id INTEGER NOT NULL, doc XML);
2. Insert our sample XML document into the DOC column of the XPS table using the command in Example 4-3.
Example 4-3 Inserting sample.xml into xps table
INSERT INTO xps (id, doc) VALUES (1, XMLPARSE (DOCUMENT
'<!-- sample xml file -->
<Customer>
  <Name>
    <FirstName>Steve</FirstName>
    <LastName>Ferrington</LastName>
  </Name>
  <Address country="US">
    <Street>46 Oak Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95030</Zip>
  </Address>
  <Phone type="work">123-456-7890</Phone>
  <Phone type="home">234-567-8901</Phone>
  <Email>[email protected]</Email>
  <Email>[email protected]</Email>
</Customer>'));
3. We use the template shown in Example 4-4 to execute our path expressions. We replace <path_expression> with the actual path expression that is to be executed. The db2-fn:xmlcolumn('XPS.DOC') function returns the sequence of all XML documents stored in the DOC column of the XPS table (this function is discussed in more detail in the next section). We have inserted only one row in the XPS table, so the result of this function call is a sequence containing only the sample.xml document, which is used for all of our examples.
Example 4-4 Template for execution of path expressions
XQUERY db2-fn:xmlcolumn('XPS.DOC')<path_expression>;
Later in this chapter, we add more data to the XPS table. If you have more data in the table and would like to get the same results as shown in our examples, you can replace db2-fn:xmlcolumn('XPS.DOC') with db2-fn:sqlquery('select DOC from XPS where id = 1'). The db2-fn:xmlcolumn and db2-fn:sqlquery functions are described later in this chapter.
Note: XPath is a case-sensitive language. /Customer and /customer are different.
Executing XQuery
You can execute XQuery using DB2 Command Line Processor (CLP) or DB2 Command Editor. Figure 4-2 shows an XQuery executed using Command Editor.
Example 4-5 shows the same XQuery executed using DB2 CLP.
Example 4-5 Executing XQuery at DB2 CLP
E:\SQLLIB\BIN>db2 xquery for $i in (1 to 3) return $i
1
----------------------------------------------------------------------
1
2
3
Because DB2 is not case-sensitive, the keyword XQUERY can be in lowercase or uppercase when you execute XQuery using the DB2 CLP or Command Editor. In this chapter, we capitalize the keyword XQUERY in our examples for stylistic reasons. You might have to adjust your DB2 CLP settings in order to display the XML output with all the indents and line spacing.
XQUERY db2-fn:xmlcolumn('XPS.DOC')/.;
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer;
Example 4-8 Retrieving the root element
XQUERY db2-fn:xmlcolumn('XPS.DOC')/node();
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Address;
If we do not know the exact location of the element in the document, instead of specifying the full path (using the child axis), we use the descendant-or-self axis to look at all nodes that are in the hierarchy under the context node. Example 4-10 demonstrates how to retrieve the Address element without specifying its location in the document.
Example 4-10 Retrieving an element anywhere in an XML document
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Address;
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Address/*;
XQUERY db2-fn:xmlcolumn('XPS.DOC')//*;
If we have to retrieve all elements in a hierarchy (instead of in the whole document), we first locate the root of the hierarchy, then retrieve the elements.
We illustrate this with the hierarchy under the Address element as shown in Example 4-13.
Example 4-13 Fragment of sample.xml containing Address element
...
</Name>
<Address country="US">
  <Street>46 Oak Street</Street>
  <City>Los Gatos</City>
  <State>CA</State>
  <Zip>95030</Zip>
</Address>
<Phone type="work">123-456-7890</Phone>
...
Example 4-14 shows how to retrieve all elements in a hierarchy including the root of the hierarchy. In Example 4-15 we retrieve all the elements without the root.
Example 4-14 Retrieving hierarchy
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Address/descendant-or-self::element();
Example 4-15 Retrieving hierarchy without root element
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Address//*;
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Phone/text();
Example 4-17 and Example 4-18 show how to retrieve text nodes containing the address information for a customer.
Example 4-17 Retrieving the address information (A)
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Address//text();
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Address/*/text();
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Address/fn:name(attribute());
Example 4-20 shows how to retrieve attribute values. It returns the values of the type attribute of the Phone elements in sample.xml. As in our previous example, we use a function call. Here the function is fn:string, which converts a value to a string.
Example 4-20 Retrieving the value of attribute node
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Phone/fn:string(@type);
4.1.4 Predicates
A predicate filters a sequence by keeping only the qualifying items. It consists of a predicate expression that is enclosed in square brackets ([]). The predicate expression is evaluated for each item in the sequence with the selected item as the context item. The result of the evaluation is an xs:boolean value called the predicate truth value. Only the items for which the predicate truth value is true are retained; all other items are filtered out. The predicate truth value is calculated based on the type of the predicate expression:
If the predicate expression returns a numeric value, the predicate truth value is true only for the item whose position in the sequence equals the value of the predicate expression.
If the predicate expression returns a nonnumeric value, the predicate truth value is the Boolean value of the predicate expression.
The most common use of predicates is to filter the result of path expression based on some criteria. Example 4-21 shows how to retrieve the work phone number. We use a predicate to select only those Phone elements that have a type attribute equal to "work".
Example 4-21 Retrieving the work phone number
XQUERY db2-fn:xmlcolumn('XPS.DOC')//Phone[@type="work"]/text();
Example 4-22 shows the use of a numeric predicate. It demonstrates how to retrieve the first Email element from sample.xml.
Example 4-22 Usage of numeric predicate
XQUERY db2-fn:xmlcolumn('XPS.DOC')/Customer/Email[1];
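The two kinds of predicate can also be tried outside DB2. The following is a minimal sketch in Python, assuming the limited XPath subset of the standard xml.etree.ElementTree module rather than the DB2 XQuery engine. The e-mail addresses are hypothetical placeholders, not the values stored in sample.xml.

```python
import xml.etree.ElementTree as ET

# Reduced copy of sample.xml; the e-mail addresses are placeholders.
SAMPLE = """<Customer>
  <Phone type="work">123-456-7890</Phone>
  <Phone type="home">234-567-8901</Phone>
  <Email>first@example.com</Email>
  <Email>second@example.com</Email>
</Customer>"""

root = ET.fromstring(SAMPLE)

# Boolean predicate: keep only Phone elements whose type attribute equals "work".
work_phones = [phone.text for phone in root.findall(".//Phone[@type='work']")]

# Numeric predicate: keep only the Email element at position 1.
first_email = root.findall("Email[1]")[0].text

print(work_phones, first_email)
```

Note that the attribute value is quoted in the predicate. Without the quotes, work would be read as the name of a child element rather than as a string literal.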
4.2 XQuery
XQuery is a functional language that extends XPath. Its basic building blocks are expressions constructed from keywords, operators (symbols), and operands (that are usually other expressions). Expressions can be nested with full generality. A query is composed of a prologue and a body. The query prologue is optional and consists of declarations that define the execution environment of the query. The query body consists of an expression that provides the result of the query. The input and the output of the query are values (instances) of the XDM. In Example 4-23 we show a typical XQuery query. It begins with the XQUERY keyword followed by a prologue (optional) and a body. The prologue in our example contains a default namespace declaration (the second line). The rest of the query is its body. It consists of one or more XQuery expressions.
Example 4-23 Sample XQuery
XQUERY
declare default element namespace "http://sample.name.space.com";
for $cust in db2-fn:xmlcolumn('XPS.DOC')
return $cust/Name/LastName;
There is a constructor function for each of the atomic types, which converts a value of one atomic type into an instance of another atomic type. All constructor functions for built-in data types share the same generic syntax:
type-name(value)
In this expression, value is the value that has to be converted into an instance of the target data type, and type-name is the target data type. Example 4-24 shows an invocation of the constructor function for the xs:string atomic data type with an argument of integer data type.
xs:string(-123)
XQuery expressions
Expressions are the basic building blocks of a query. They can be used alone or in combination to form complex queries. DB2 supports several kinds of expressions for working with XML data. In the previous section, we discussed path expressions and predicates. Here we present the remaining expression types. The FLWOR expression is presented in the following section.
Primary expressions
Primary expressions are the basic primitive XQuery expressions. They include literals, variable references, parenthesized expressions, context item expressions, constructors, and function calls. Following are some primary expressions:
Literals: 12, -134.97, "apple"
Variable references: $i, $phone, $seq
Parenthesized expressions: (27 + 16) * (43 - 18)
Context item expressions: (1, 3, 5, 7, 9)[2]
Constructors: xs:date("2006-08-26"), xs:string("a b c")
Function calls: fn:true(), fn:abs(-7)
Arithmetic expressions
Arithmetic expressions perform addition, subtraction, multiplication, division, and modulus. Table 4-7 describes the arithmetic operators and lists them in order of operator precedence.
Table 4-7 Arithmetic expressions
Operator              Purpose
- (unary), + (unary)  Negates value of operand.
*, div, idiv, mod     Multiplication, division, integer division, modulus.
+, -                  Addition, subtraction.
Comparison expressions
XQuery provides three kinds of comparison expressions: value comparisons, general comparisons, and node comparisons.
Value comparisons compare two atomic values. The operands must be of the same type, or one of them has to be a subtype of the other. The result of a value comparison is Boolean. Table 4-8 lists the value comparison operators in DB2.
Table 4-8 Value comparison operators
Operator  Description
eq        Returns true if the first value is equal to the second value.
ne        Returns true if the first value is not equal to the second value.
lt        Returns true if the first value is less than the second value.
le        Returns true if the first value is less than or equal to the second value.
gt        Returns true if the first value is greater than the second value.
ge        Returns true if the first value is greater than or equal to the second value.
/Customer/Name[FirstName eq "Steve"]/LastName/text()
Table 4-10 lists the results of comparing (1, 2) and (2, 3) sequences.
Table 4-10 Results of sample general comparisons
Expression        Result  Comments
(1, 2) = (2, 3)   true    2 = 2
(1, 2) != (2, 3)  true    1 != 2
(1, 2) < (2, 3)   true    1 < 2
(1, 2) <= (2, 3)  true    1 <= 2
(1, 2) > (2, 3)   false   neither 1 nor 2 is greater than 2 or 3
(1, 2) >= (2, 3)  true    2 >= 2
Note that if we add 4 to the first sequence, all general comparisons will return true.
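The results in Table 4-10 follow from the existential rule for general comparisons: the comparison is true if it holds for at least one pair of items, one taken from each sequence. The following is a minimal sketch of that rule in Python. It models integer sequences only and ignores the type handling that XQuery applies; the function name general_compare is illustrative, not a DB2 or XQuery function.

```python
import operator

def general_compare(left, right, op):
    """Return True if op(a, b) holds for at least one pair (a, b)."""
    return any(op(a, b) for a in left for b in right)

# XQuery general comparison operators mapped to Python operators.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# The comparisons from Table 4-10.
results = {sym: general_compare((1, 2), (2, 3), op) for sym, op in OPS.items()}

# The same comparisons after adding 4 to the first sequence.
results_with_4 = {sym: general_compare((1, 2, 4), (2, 3), op) for sym, op in OPS.items()}

print(results)
print(results_with_4)
```

Because only one qualifying pair is needed, both = and != can be true for the same pair of sequences.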
Node comparisons compare the position of two nodes in the document order. The result of comparison is Boolean. Table 4-11 lists the node comparison operators.
Table 4-11 Node comparison operators
Operator  Description
is        Returns true if the two nodes have the same identity.
<<        Returns true if the first operand node precedes the second operand node in document order.
>>        Returns true if the first operand node follows the second operand node in document order.
Logical expressions
Logical expressions use the and and or operators. Their arguments are of Boolean type and they return a Boolean value. Table 4-12 lists the logical operators. The and operator has higher precedence than the or operator.
Table 4-12 Logical expression operators
Operator  Description
and       Returns true if both arguments are true.
or        Returns true if at least one of the arguments is true.
Constructors
Constructors create XML structures inside a query. There are two types of constructors: direct and computed.
Direct constructors create XML structures within a query using XML-like notation. Example 4-26 shows a direct constructor creating an Address element containing an attribute and four child elements. The attribute is country and the child elements are Street, City, State, and Zip. Each of the child elements has a text node as its child.
Example 4-26 Using direct constructor
XQUERY
<Address country="US">
  <Street>46 Oak Street</Street>
  <City>Los Gatos</City>
  <State>CA</State>
  <Zip>95030</Zip>
</Address>;
result:
<Address country="US">
  <Street>46 Oak Street</Street>
  <City>Los Gatos</City>
  <State>CA</State>
  <Zip>95030</Zip>
</Address>
XQUERY
element Address {
  attribute country {"US"},
  element Street {"46 Oak Street"},
  element City {"Los Gatos"},
  element State {"CA"},
  element Zip {"95030"}
};
result:
<Address country="US">
  <Street>46 Oak Street</Street>
  <City>Los Gatos</City>
  <State>CA</State>
  <Zip>95030</Zip>
</Address>
Conditional expressions
Conditional expressions evaluate one of two expressions based on whether the value of a test expression is true or false. The structure of a conditional expression is shown in Table 4-13.
Table 4-13 Conditional expression
Expression                            Description
if (test_expr) then expr1 else expr2  test_expr is evaluated. If its value is true, the result is the evaluation of expr1; otherwise the result is the evaluation of expr2.
Quantified expressions
Quantified expressions return true or false depending on whether some or every item in one or more sequences satisfies a specific condition. Here are two examples:
some $i in (1 to 10) satisfies $i mod 7 eq 0
every $i in (1 to 5), $j in (6 to 10) satisfies $i < $j
The quantified expression begins with a quantifier: some or every. The quantifier is followed by one or more clauses that bind variables to sequences that are returned by expressions. In our first example, $i is the variable and (1 to 10) is the sequence. In the second example, we have two variables, $i and $j, that are bound to (1 to 5) and (6 to 10). Then we have a test expression in which the bound variables are referenced. The test expression is used to determine whether some or all of the variable bindings satisfy a specific condition. In our first example, the condition is that $i mod 7 is equal to 0. The quantifier for this expression is some, and there is a value (7) for which the test expression is true, so the result is true. In the second example, the condition is that $i is less than $j. The quantifier is every, so we check whether every $i is less than every $j. The result is true.
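The two quantifiers behave like Python's built-in any and all applied to every combination of variable bindings. The following is a minimal sketch that reproduces the two examples above under that assumption; the variable names are illustrative only.

```python
# some $i in (1 to 10) satisfies $i mod 7 eq 0
some_result = any(i % 7 == 0 for i in range(1, 11))

# every $i in (1 to 5), $j in (6 to 10) satisfies $i < $j
every_result = all(i < j for i in range(1, 6) for j in range(6, 11))

# Changing the second range so that it overlaps the first makes "every" fail,
# because the pair (5, 5) does not satisfy $i < $j.
every_overlap = all(i < j for i in range(1, 6) for j in range(5, 11))

print(some_result, every_result, every_overlap)
```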
Cast expressions
Cast expressions are used to transform a value from one type into another type, for example, from string to integer. A cast expression takes two arguments: an input expression and a target data type. It evaluates the expression and attempts to create a new value of the target data type based on the result of the evaluation. Here is an example of a cast expression converting a string to a double:
"123.456" cast as xs:double
Sequence expressions
Sequence expressions are used to construct, combine, and filter sequences of items. A sequence can be constructed using a comma operator or a range expression. Using a comma operator, we specify two or more items separated by a comma. Example 4-28 shows using the comma operator to create a sequence containing the numbers 1, 3, 5, and 7.
Example 4-28 Creating sequence using comma operator
(1, 3, 5, 7)
Using the range expression, we create sequences of consecutive integers. We specify the first and the last values, separating them with the to operator. Example 4-29 shows a sequence containing the numbers 4, 5, 6, and 7 that is created using a range expression.
(4 to 7)
Example 4-30 shows that the comma operator and the range expressions can be combined. The resulting sequence is (1, 2, 3, 1, 7, 9, 11).
Example 4-30 Combining comma operator and range expression
(1 to 3, 1, 7, 9 to 9, 11)
The result of every expression that returns a sequence can be filtered using a predicate. The combination of a primary expression and one or more predicates is called a filtering expression. Example 4-31 shows creating a sequence containing the numbers 5 and 10 using a filtering expression.
Example 4-31 Filtering expression
(4 to 11)[. mod 5 eq 0]
The information that is available at the time an expression is evaluated is called the dynamic context. The focus, which consists of the context item, context position, and context size, is an important part of the dynamic context. The focus changes as DB2 processes each item in a sequence. The focus consists of the following information:
Context item is the atomic value of the node that is currently processed. It can be retrieved using the context item expression, which consists of a single dot (.).
Context position is the position of the context item in the sequence that is processed. It can be retrieved using the fn:position() function.
Context size is the number of items in the sequence that is processed.
In Example 4-31, the sequence that is processed is (4, 5, 6, 7, 8, 9, 10, 11). The context size is eight because the sequence contains eight items. The context position of item 5 is two because it is the second item in the sequence. The context position of 10 is seven. During the processing, all the nodes in the sequence are iterated one by one and the atomic value of every one of them is processed as a context item. It is referenced in the predicate using a dot (.).
The process of converting a sequence of items into a sequence of atomic values is called atomization. Each item in a sequence is converted to an atomic value by applying the following rules:
If the item is an atomic value, then its value is returned.
If the item is a node, then its typed value is returned. The typed value of a node is a sequence of zero or more atomic values that can be extracted from the node. If the node has no typed value, then an error is returned.
Example 4-32 shows the atomization. The first two items in the sequence, <a>4</a> and <b>5</b>, are nodes. After the atomization they are converted to 4 and 5. The result of the filtering expression is the same as in Example 4-31 on page 96.
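The focus and the atomization rules can be modeled in a few lines. The following is a simplified sketch in Python that follows the description above: it atomizes every item first and then evaluates the predicate with the context item, position, and size. The helper names atomize, focus_of, and filter_sequence are hypothetical, and the sketch assumes that every node contains an integer.

```python
import xml.etree.ElementTree as ET

def atomize(item):
    """Return an atomic value unchanged; convert a node to its integer content."""
    if isinstance(item, ET.Element):
        return int(item.text)  # assumption: the node content is an integer
    return item

def focus_of(items):
    """Yield (context item, context position, context size) for each item."""
    size = len(items)
    for position, item in enumerate(items, start=1):
        yield item, position, size

def filter_sequence(items, predicate):
    """Keep the atomized items for which the predicate is true."""
    atomic = [atomize(item) for item in items]
    return [item for item, pos, size in focus_of(atomic) if predicate(item, pos, size)]

def divisible_by_5(item, position, size):
    return item % 5 == 0

# (4 to 11)[. mod 5 eq 0]
plain = list(range(4, 12))
plain_result = filter_sequence(plain, divisible_by_5)
positions = {item: pos for item, pos, size in focus_of(plain)}

# A sequence whose first two items are nodes, as described for Example 4-32.
mixed = [ET.fromstring("<a>4</a>"), ET.fromstring("<b>5</b>")] + list(range(6, 12))
mixed_result = filter_sequence(mixed, divisible_by_5)

print(plain_result, positions[5], positions[10], mixed_result)
```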
Example 4-32 Atomization
XQuery functions
DB2 supports a set of built-in functions for working with XML data. These functions include DB2-defined functions and XQuery-defined functions.
DB2-defined functions
There are two DB2-defined functions that are used to access XML data from a DB2 database. They belong to the namespace that is bound to the db2-fn prefix. The db2-fn:xmlcolumn function accepts as an argument a name of an XML column in a table or a view and returns as a result a sequence that is the concatenation of the non-null XML values in the specified column. Example 4-33 shows a call to the db2-fn:xmlcolumn function. It returns a sequence of all XML documents stored in the column DOC of XPS table.
Example 4-33 Invocation of the db2-fn:xmlcolumn function
XQUERY db2-fn:xmlcolumn('XPS.DOC');
The db2-fn:sqlquery function accepts as an argument a string containing a fullselect that specifies a single-column result set of XML data type, and returns as a result a sequence that is the concatenation of the non-null values returned by the specified fullselect. Example 4-34 shows a call to the db2-fn:sqlquery function. It returns the same result as Example 4-33.
Example 4-34 Invocation of the db2-fn:sqlquery
XQUERY db2-fn:sqlquery('SELECT doc FROM xps');
The db2-fn:sqlquery function is very important. It allows us to integrate SQL statements inside XQuery. We discuss the usage of db2-fn:sqlquery in more detail in 4.3.2, SQL/XML on page 127.
XQuery-defined functions
XQuery-defined functions are in the namespace that is bound to the fn prefix. This is the default function namespace, so XQuery-defined functions can be invoked without specifying their namespace (unless you want to override the default function namespace). XQuery-defined functions can be divided into eight different categories based on the type of the data they process, as follows:
String functions: Table 4-14 lists the XQuery-defined string functions.
Table 4-14 XQuery-defined string functions
Function                  Description
fn:codepoints-to-string   Returns the string equivalent of a sequence of Unicode code points.
fn:compare                Compares two strings. Returns -1, 0, or 1.
fn:concat                 Returns a string that is the concatenation of two or more atomic values.
fn:contains               Determines whether a string contains a given substring. Returns true or false.
fn:ends-with              Determines whether a string ends with a given substring. Returns true or false.
fn:lower-case             Converts a string to lowercase.
fn:matches                Determines whether a string matches a given pattern. Returns true or false.
fn:normalize-space        Strips leading and trailing whitespace characters from a string and replaces each internal sequence of whitespace characters with a single blank character.
fn:normalize-unicode      Performs Unicode normalization on a string.
fn:replace                Compares each set of characters within a string to a given pattern and then replaces the characters that match the pattern with another set of characters.
fn:starts-with            Determines whether a string begins with a given substring.
fn:string                 Returns the string representation of a value.
fn:string-join            Returns a string that is generated by concatenating items separated by a separator character.
fn:string-length          Returns the length of a string.
fn:string-to-codepoints   Returns a sequence of Unicode code points that correspond to a string value.
fn:substring              Returns a substring that occurs in a string.
fn:substring-after        Returns a substring that occurs in a string after the end of the first occurrence of a given search string.
fn:substring-before       Returns a substring that occurs in a string before the first occurrence of a given search string.
fn:tokenize               Breaks a string into a sequence of substrings.
fn:translate              Replaces selected characters in a string with replacement characters.
fn:upper-case             Converts a string to uppercase.
Boolean functions: Table 4-15 lists the XQuery-defined Boolean functions.
Table 4-15 XQuery-defined Boolean functions
Function        Description
fn:boolean      Returns the effective boolean value of a sequence.
fn:false        Returns the xs:boolean value false.
fn:not          Returns true if its argument is false and false if its argument is true.
fn:true         Returns the xs:boolean value true.
fn:zero-or-one  Returns its argument if it is a sequence containing zero or one elements. Otherwise, an error is returned.
Function               Description
fn:ceiling             Returns the smallest integer that is greater than or equal to the argument.
fn:floor               Returns the largest integer that is less than or equal to the argument.
fn:max                 Returns the maximum of the values in a sequence.
fn:min                 Returns the minimum of the values in a sequence.
fn:number              Converts the value of its argument to xs:double data type.
fn:round               Returns the integer that is closest to a given numeric value.
fn:round-half-to-even  Returns the integer value with a specified precision that is closest to a given numeric value.
fn:sum                 Returns the sum of the values in a sequence.
fn:namespace-uri-from-QName fn:QName
fn:resolve-QName
For and return clauses
We discuss for and return together in order to show a complete example. Without the return clause, the FLWOR expression will not be complete. A for clause iterates through the result of an expression and binds a variable to each item in the sequence. Example 4-35 shows the simplest type of for clause containing one variable $i and an expression that results in the sequence (1, 2, 3).
XQUERY for $i in (1 to 3) return $i;
result:
1
2
3
The expression in the for clause (1 to 3) is evaluated, and it generates the sequence (1, 2, 3). Each of the values in this sequence is bound to the variable $i, one at a time. The return clause is evaluated for each of the bindings. The results of these evaluations form a sequence that is returned as a result. In Example 4-35, the return clause is executed for each of the bindings of variable $i and returns a sequence of its values. A for clause can contain multiple variables bound to different expressions. Example 4-36 shows a for clause containing two variables, $i and $j, and two expressions whose results are bound to the variables.
Example 4-36 Using for clause with two variables
XQUERY for $i in (1 to 2), $j in (2 to 3) return ($i, $j);
result:
1
2
2
2
1
3
2
3
When the for clause is evaluated, a tuple of variable bindings is created for each combination of values. $i can be bound to 1 and to 2. $j can be bound to 2 and to 3. The four possible combinations result in four tuples of variable bindings:
$i = 1, $j = 2
$i = 2, $j = 2
$i = 1, $j = 3
$i = 2, $j = 3
The return clause executes once for each tuple of bindings, returning the values of $i and $j. If a binding expression evaluates to an empty sequence, no for bindings are generated, and no iterations occur. Example 4-37 shows a for clause where the expression for the variable $j is an empty sequence. As a result, there are no iterations and the result returned by the return clause is an empty sequence.
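The tuples of variable bindings correspond to the Cartesian product of the bound sequences. The following is a minimal sketch in Python, assuming itertools.product as a model; the function name for_tuples is hypothetical. The sketch shows which tuples are generated, and is not meant to model the order in which DB2 returns them.

```python
from itertools import product

def for_tuples(*sequences):
    """Return one tuple of bindings per combination of values."""
    return list(product(*sequences))

# for $i in (1 to 2), $j in (2 to 3)
tuples = for_tuples(range(1, 3), range(2, 4))

# If one binding expression is an empty sequence, no tuples are generated,
# so the return clause would never run.
no_tuples = for_tuples(range(1, 3), [])

print(sorted(tuples), no_tuples)
```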
Example 4-37 Using for clause with empty sequence
Note that even if $j is not part of the return clause, the result is an empty sequence. When a variable iterates over the items in a sequence, an index or position number is generated for each item in the list. You can declare a position variable for this index with the for clause. The positions are integers starting with 1. The positional variable is defined by the keyword at. Example 4-38 shows the usage of positional variables.
Example 4-38 Using for clause with positional variable
XQUERY for $name at $pos in ("Elena", "Maria", "Emma")
return <name pos="{$pos}">{$name}</name>;
result:
<name pos="1">Elena</name>
<name pos="2">Maria</name>
<name pos="3">Emma</name>
Note that the actual order of the elements in the output stream is not guaranteed unless the FLWOR expression contains an order by clause. Example 4-39 shows that the results produced by the FLWOR expression in Example 4-38 on page 105 could be in a different order.
Example 4-39 Possible results from FLWOR expression
result:
<name pos="2">Maria</name>
<name pos="1">Elena</name>
<name pos="3">Emma</name>
Let clause
Let clauses bind a variable to the entire result of an expression. The let clause does not iterate through a sequence of input, binding each item to a variable, as the for clause does. Instead, a let clause assigns the whole sequence (or just a value if it is a sequence of one item) to a variable. Example 4-40 shows the differences between for and let clauses.
Example 4-40 Differences between for and let clauses
XQUERY for $i in (1 to 2) return <value>{$i}</value>;
result:
<value>1</value>
<value>2</value>
XQUERY let $i := (1 to 2) return <value>{$i}</value>;
result:
<value>1 2</value>
When the let clause in Example 4-40 on page 106 is executed, a single binding is created for the entire sequence that results from (1 to 2). The return clause executes once. If the binding expression is an empty sequence, a let binding is created and it contains an empty sequence.
XQUERY for $i in (1 to 2) let $j := (2 to 3) return <value>{$i, $j}</value>;
result:
<value>1 2 3</value>
<value>2 2 3</value>
A variable that is bound in a for or let clause is in scope and can be used in all of the sub-expressions that appear after the variable binding in the FLWOR expression. In Example 4-42 we use $i to set a value for $j. Note how the results differ from Example 4-41.
XQUERY for $i in (1 to 2) let $j := ($i to 3) return <value>{$i, $j}</value>;
result:
<value>1 1 2 3</value>
<value>2 2 3</value>
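The difference between for and let maps onto the difference between looping over a sequence and assigning the whole sequence to one name. The following is a minimal sketch in Python under that analogy. The helper value is hypothetical and only mimics the serialized form of the constructed element.

```python
def value(*items):
    """Mimic <value>{...}</value>: items joined by single blanks."""
    return "<value>" + " ".join(str(item) for item in items) + "</value>"

# for $i in (1 to 2) return <value>{$i}</value>
# One binding per item, so the return clause runs twice.
for_results = [value(i) for i in range(1, 3)]

# let $i := (1 to 2) return <value>{$i}</value>
# One binding for the whole sequence, so the return clause runs once.
let_result = value(*range(1, 3))

# for $i in (1 to 2) let $j := ($i to 3) return <value>{$i, $j}</value>
# $j is recomputed for each binding of $i because $i is in scope.
combined = [value(i, *range(i, 4)) for i in range(1, 3)]

print(for_results, let_result, combined)
```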
Where clause
A where clause filters the tuples of variable bindings that are generated by for and let clauses in a FLWOR expression. The where clause specifies a condition that is applied to each tuple of variable bindings. If the condition is not true, the tuple is discarded. The return clause is executed only for the remaining tuples, so the where clause effectively filters the results. Example 4-43 shows the usage of the where clause. We keep only the tuples for which $i is equal to $j, returning only the values that are in both input sequences.
Example 4-43 Using where clause
Order by clause
An order by clause in a FLWOR expression determines the order in which the tuples of variable bindings are evaluated by the return clause. If no order by clause is specified, the results of a FLWOR expression are returned in a non-deterministic order. An order by clause contains one or more ordering specifications. Each ordering specification consists of an expression and an order modifier, which specifies the sort order (ascending or descending).
XQUERY for $name in ("Elena", "Maria", "Emma", "Antoaneta")
order by $name ascending
return $name;
result:
Antoaneta
Elena
Emma
Maria
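The where and order by clauses have close counterparts in a Python list comprehension and the built-in sorted function. The following is a minimal sketch under that analogy. The two integer ranges used for the where clause are hypothetical input sequences chosen for illustration, and sorted compares strings by code point, which is only an approximation of the collation DB2 applies.

```python
# where: keep only the tuples for which $i equals $j.
# Hypothetical input: for $i in (1 to 3), $j in (2 to 4) where $i = $j return $i
common = [i for i in range(1, 4) for j in range(2, 5) if i == j]

# order by $name ascending
names = ["Elena", "Maria", "Emma", "Antoaneta"]
ascending = sorted(names)

# order by $name descending
descending = sorted(names, reverse=True)

print(common, ascending, descending)
```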
INSERT INTO xps (id, doc) VALUES (2, XMLPARSE (DOCUMENT
'<Customer>
  <Name>
    <FirstName>Brad</FirstName>
    <LastName>Hunn</LastName>
  </Name>
  <Address country="US">
    <Street>24 Palm Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95030</Zip>
  </Address>
  <Phone type="work">123-678-9012</Phone>
  <Phone type="home">123-789-0123</Phone>
  <Phone type="cell">123-890-1234</Phone>
</Customer>'));
INSERT INTO xps (id, doc) VALUES (3, XMLPARSE (DOCUMENT
'<Customer>
  <Name>
    <FirstName>Domenico</FirstName>
    <LastName>Blefari</LastName>
  </Name>
  <Address country="US">
    <Street>68 Cherry Street</Street>
    <City>San Jose</City>
    <State>CA</State>
    <Zip>95134</Zip>
  </Address>
  <Phone type="work">234-901-2345</Phone>
  <Phone type="home">234-012-3456</Phone>
  <Email>[email protected]</Email>
</Customer>'));
We create one more table, CPL, to store the complaints received from customers. The SQL statements for the creation of this table are shown in Example 4-46.
Example 4-46 Creation of CPL table
CONNECT TO xmlrb;
CREATE TABLE cpl(case_id INTEGER NOT NULL, cust_id INTEGER NOT NULL, complain XML);
In the CASE_ID column, we store a unique ID for each complaint. In CUST_ID, we store the ID of the complaining customer. In the COMPLAIN column, we store an XML document containing the complaint. A sample complaint is shown in Example 4-47.
Example 4-47 Sample complaint XML document
<Complain status=closed> <Received>2007-07-01</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>Have not received yet my last order.</Problem> </Complain> The structure of the XML document is as follows: <Complain> is the root element of the XML document. It has an attribute status with possible values open and closed. The value of the attribute indicates if the complaint is open or closed. Each other element is a child of this element. <Received> contains a date on which the complaint was received. <ReplyTo> is an e-mail address of the customer as contact information. <Problem> contains a description of the problem submitted by a customer.
We use the code shown in Example 4-48 to insert five records into the CPL table.
Example 4-48 Inserting data into CPL table
INSERT INTO cpl (case_id, cust_id, complain) VALUES (1, 1, XMLPARSE (DOCUMENT '<Complain status="closed"> <Received>2006-07-01</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>Have not received yet my last order.</Problem> </Complain>')); INSERT INTO cpl (case_id, cust_id, complain) VALUES (2, 1, XMLPARSE (DOCUMENT '<Complain status="open"> <Received>2006-07-06</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>One of the items received is broken.</Problem> </Complain>')); INSERT INTO cpl (case_id, cust_id, complain) VALUES (3, 1, XMLPARSE (DOCUMENT '<Complain status="open"> <Received>2006-07-11</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>The replacement I received does not work.</Problem> </Complain>')); INSERT INTO cpl (case_id, cust_id, complain) VALUES (4, 3, XMLPARSE (DOCUMENT '<Complain status="open"> <Received>2006-07-3</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>Yesterday was put on hold for 4 hours.</Problem> </Complain>')); INSERT INTO cpl (case_id, cust_id, complain) VALUES (5, 3, XMLPARSE (DOCUMENT '<Complain status="closed"> <Received>2006-07-12</Received> <ReplyTo>[email protected]</ReplyTo> <Problem>Have not received my refund.</Problem> </Complain>'));
XQUERY for $doc in db2-fn:xmlcolumn('XPS.DOC') return $doc; Note that this XQuery returns the whole XML document. Example 4-50 shows a fragment of the query output. In our example, the comment line of our first XML document is included.
Example 4-50 XML document returned from XQuery
<!-- sample xml file --> <Customer> <Name> <FirstName> Steve </FirstName> <LastName> Ferrington </LastName> </Name> <Address country="US"> <Street> 46 Oak Street </Street> <City> Los Gatos </City> <State> CA </State> <Zip> 95030 </Zip> </Address> <Phone type="work"> 123-456-7890 </Phone> <Phone type="home"> 234-567-8901
</Phone> <Email> [email protected] </Email> <Email> [email protected] </Email> </Customer> <Customer> ... </Customer> <Customer> ... </Customer>
3 record(s) selected.
If you want only the document's root element, without other nodes such as comments or processing instructions, you can use the code shown in Example 4-51. Note that the document comment is not included in the result because we retrieve only the root element. The comment is a child of the document node, not of the root element.
Example 4-51 Retrieve element only
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer return $cust; Result: <Customer> <Name> <FirstName> Steve </FirstName> <LastName> Ferrington </LastName> </Name> <Address country="US"> <Street> 46 Oak Street </Street> <City> Los Gatos
</City> <State> CA </State> <Zip> 95030 </Zip> </Address> <Phone type="work"> 123-456-7890 </Phone> <Phone type="home"> 234-567-8901 </Phone> <Email> [email protected] </Email> <Email> [email protected] </Email> </Customer> <Customer> ... </Customer> <Customer> ... </Customer> 3 record(s) selected. If you do not know the name of the root element, you can use /* instead of /Customer, as shown here: XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/* return $cust
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer return $cust/Email result: <Email> [email protected] </Email> <Email> [email protected] </Email> <Email> [email protected] </Email> If we do not require the entire Email element, but only the e-mail address itself, we can use the text() function, as shown in Example 4-53.
Example 4-53 Retrieving the text in element
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer return $cust/Email/text() result: [email protected] [email protected] [email protected]
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer where $cust/Address/Zip/text()="95030" return concat($cust/Name/FirstName, " ", $cust/Name/LastName)
results:
Steve Ferrington
Brad Hunn
Very often we can filter data in the path expression instead of in the where clause. Example 4-55 produces the same results as the code in Example 4-54 on page 115. How and where you specify the filter is your choice. The rule of thumb is to use the form that makes your code more readable and easier to understand.
Example 4-55 Using predicate instead of where clause
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer[Address/Zip/text()="95030"] return concat($cust/Name/FirstName, " ", $cust/Name/LastName)
results:
Steve Ferrington
Brad Hunn
Another criterion for filtering is the existence or number of occurrences of a specific element in an XML document. In Example 4-56, we show how to select only the customers that do not have an e-mail address.
Example 4-56 Retrieving customers without e-mail address
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer where not(exists($cust/Email)) return concat($cust/Name/FirstName, " ", $cust/Name/LastName); result: Brad Hunn We use the exists and not functions. Note that not($cust/Email) returns the same result as not(exists($cust/Email)). In Example 4-57 we select only the customers with more than one e-mail address.
Example 4-57 Retrieving customers with more than one e-mail address
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) let $n := count($cust/Email) return (concat($name, " : ", $n), $cust/Email/text()); result: Steve Ferrington : 2 [email protected] [email protected] Brad Hunn : 0 Domenico Blefari : 1 [email protected] In Example 4-59, the query outputs only the first e-mail address per customer.
Example 4-59 Selecting the first e-mail address
let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) return (concat($name, " : ", $cust/Email[1]/text()));
result:
Steve Ferrington : [email protected]
Brad Hunn :
Domenico Blefari : [email protected]
Assume that we require a list of customer contact information. We prefer e-mail addresses. For customers without an e-mail address, we would like to have their phone number. The XQuery in Example 4-60 generates this information using a conditional expression. Here we select the last e-mail address or phone number (assuming that it is the most recent).
Example 4-60 Using conditional expression
XQUERY for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) let $info := if (exists($cust/Email)) then($cust/Email[last()]/text()) else($cust/Phone[last()]/text()) return (concat($name, " : ", $info))
result:
Steve Ferrington : [email protected]
Brad Hunn : 123-890-1234
Domenico Blefari : [email protected]
You can try to modify the code so that, if there is no e-mail address, it first looks for the home phone, then for the cell phone, and finally for the work phone.
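The fallback order suggested in this exercise can be sketched outside DB2. The following is a minimal Python illustration, not DB2 code: it assumes Python's standard xml.etree.ElementTree module, and the contact_info helper and the abbreviated BRAD document are hypothetical names introduced only for this sketch.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the document for customer 2 (no Email element).
BRAD = """<Customer>
  <Name><FirstName>Brad</FirstName><LastName>Hunn</LastName></Name>
  <Phone type="work">123-678-9012</Phone>
  <Phone type="home">123-789-0123</Phone>
  <Phone type="cell">123-890-1234</Phone>
</Customer>"""


def contact_info(customer_xml):
    """Return 'First Last : contact', preferring e-mail over phone."""
    cust = ET.fromstring(customer_xml)
    name = f"{cust.findtext('Name/FirstName')} {cust.findtext('Name/LastName')}"
    emails = cust.findall("Email")
    if emails:
        # Same choice as Email[last()] in the XQuery.
        return f"{name} : {emails[-1].text}"
    # Fallback order from the exercise: home, then cell, then work.
    for phone_type in ("home", "cell", "work"):
        phone = cust.find(f"Phone[@type='{phone_type}']")
        if phone is not None:
            return f"{name} : {phone.text}"
    return f"{name} : "


print(contact_info(BRAD))  # Brad Hunn : 123-789-0123
```

In XQuery the same idea would be expressed with nested if-then-else expressions that test each phone type in turn.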
<Customer> ... </Customer> ... </CustomerContacts>
The code in Example 4-62 does the transformation. Note that we include the whole FLWOR expression between the opening and closing CustomerContacts tags. We also use curly brackets to instruct DB2 to evaluate the enclosed expression rather than treating it as a literal string.
Example 4-62 Transforming an XML document
XQUERY <CustomerContacts>{ for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) let $info := if (exists($cust/Email)) then($cust/Email[last()]/text()) else($cust/Phone[last()]/text()) let $type := if (exists($cust/Email)) then("email") else("phone") return <Customer> <Name>{$name}</Name> <Contact type="{$type}">{$info}</Contact> </Customer> }</CustomerContacts> result: <CustomerContacts> <Customer> <Name> Steve Ferrington </Name> <Contact type="email"> [email protected] </Contact> </Customer> <Customer> <Name> Brad Hunn </Name> <Contact type="phone"> 123-890-1234 </Contact>
</Customer> <Customer> <Name> Domenico Blefari </Name> <Contact type="email"> [email protected] </Contact> </Customer> </CustomerContacts>
XQUERY let $e := db2-fn:xmlcolumn('CPL.COMPLAIN')/Complain/ReplyTo/text() let $de := distinct-values($e) for $cust in db2-fn:xmlcolumn('XPS.DOC')/Customer where $cust/Email/text() = $de return $cust/Name/LastName/text() results: Ferrington Blefari
WHERE clause. We continue to use the table XPS created in Example 4-2 on page 81 to show how to change an XML document. Example 4-64 shows the SQL SELECT command you can use to display the content of the XML document before and after update.
Example 4-64 Content of the XML that will be updated
SELECT doc FROM xps WHERE id = 1; Example 4-65 shows the SQL UPDATE command that updates the whole XML document.
Example 4-65 Updating an XML document
UPDATE xps SET doc = XMLPARSE ( DOCUMENT '<!-- sample xml file --> <Customer> <Name> <FirstName>Steve</FirstName> <LastName>Ferrington</LastName> </Name> <Address country="US"> <Street>46 Oak Street</Street> <City>Los Gatos</City> <State>CA</State> <Zip>95030</Zip> </Address> <Phone type="work">987-654-3210</Phone> <Phone type="home">234-567-8901</Phone> <Email>[email protected]</Email> <Email>[email protected]</Email> </Customer>') WHERE id = 1
Another approach is to create an update stored procedure that is capable of updating XML documents stored in the database. In this section, we introduce an as-is XML update stored procedure, XMLUPDATE. We show how to use this stored procedure to update a part of an XML document. For full details and more examples about this stored procedure, refer to the following Web site:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0605singh/
The XMLUPDATE stored procedure supports the following partial update functions in an XML document:
Change the value of any text or attribute node.
Insert a new element.
Replace an element node (along with all its children) with another element.
Delete a node.
Note that when updating a portion of an XML document with a stored procedure, DB2 writes the entire XML document back to the database under the covers.
IN QUERYSQL VARCHAR(32000), IN UPDATESQL VARCHAR(32000), OUT errorCode INTEGER, OUT errorMsg VARCHAR(32000)) DYNAMIC RESULT SETS 0 LANGUAGE JAVA PARAMETER STYLE JAVA NO DBINFO FENCED NULL CALL MODIFIES SQL DATA PROGRAM TYPE SUB EXTERNAL NAME 'db2xmlfunctions:com.ibm.db2.xml.functions.XMLUpdate.Update'
DB2XMLFUNCTIONS.XMLUPDATE (commandXML, querySQL, updateSQL, errorCode, errorMsg)
commandXML
This is an XML document that describes the update commands. These commands are applied to the XML document specified by querySQL. The structure of the commandXML document is:
<updates namespaces="name=prefix:namespace"> <update using="SQL" action="action" col="column_number" path="XPathExpression"> update value </update> </updates>
The essential arguments in this XML document are:
@col: This is the number of the column being modified in the querySQL.
@path: This is the XPath location of the node in the target XML document.
@action: This is the action to be performed:
replace: Replace the target node with the update value.
append: Append the update value as a child of the target node.
delete: Delete the target node.
update value: This should be a text node or an element.
querySQL
This is a valid SQL statement for retrieving the XML document to be updated.
updateSQL
This is a parameterized update SQL statement.
CALL DB2XMLFUNCTIONS.XMLUPDATE( '<updates> <update action="append" col="1" path="/Customer"> <Email>[email protected]</Email> </update> </updates>', 'select doc from xps where id=1', 'update xps set doc=? where id=1', ?, ?);
Now change the customer address. Example 4-69 shows how to replace an element in an XML document. Here are the arguments we submit: action=replace: We replace an element. path=/Customer/Address: This is the element we are replacing. The new element is listed in Example 4-70.
Example 4-69 Replacing the Address element
<Street>123 Woodstone Road</Street> <City>Clinton</City> <State>MS</State> <Zip>39056</Zip> </Address> </update> </updates>', 'select doc from xps where id=1', 'update xps set doc=? where id=1', ?, ?)
Example 4-70 Modified Address element
<Address> <Street>123 Woodstone Road</Street> <City>Clinton</City> <State>MS</State> <Zip>39056</Zip> </Address>
In Example 4-71 we demonstrate how to update a text node using the XMLUPDATE stored procedure. We replace the work phone with a new number. Here are the parameters we submit:
action=replace: We replace a text node.
path=/Customer/Phone[@type=&quot;work&quot;]/text(): This is the text we replace. Note that we replaced each double quote with &quot; because we require nested quotes.
The new phone number: 601-925-1234
Example 4-71 Changing the work phone number
CALL DB2XMLFUNCTIONS.XMLUPDATE( '<updates> <update action="replace" col="1" path="/Customer/Phone[@type=&quot;work&quot;]/text()"> 601-925-1234 </update> </updates>', 'select doc from xps where id=1', 'update xps set doc=? where id=1', ?, ?)
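The read, modify, and write-back cycle behind such a partial update can be illustrated outside DB2. The following is only a sketch in Python using the standard xml.etree.ElementTree module. It does not show how XMLUPDATE is implemented, and the replace_text helper and the shortened document are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def replace_text(doc_xml, path, new_text):
    """Parse the document, change one node's text, and return the whole document."""
    root = ET.fromstring(doc_xml)
    node = root.find(path)  # path is relative to the root element
    if node is None:
        raise ValueError(f"no node matches {path!r}")
    node.text = new_text
    # The complete document is serialized again, not just the changed node.
    return ET.tostring(root, encoding="unicode")


doc = ('<Customer><Phone type="work">987-654-3210</Phone>'
       '<Phone type="home">234-567-8901</Phone></Customer>')
updated = replace_text(doc, "Phone[@type='work']", "601-925-1234")
print(updated)
```

Although only one text node changes, the function returns the full document, which mirrors the earlier remark that the entire XML document is written back to the database.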
XQUERY for $cust in db2-fn:sqlquery('SELECT DOC FROM XPS')/Customer let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) let $info := if (exists($cust/Email)) then($cust/Email[last()]/text()) else($cust/Phone[last()]/text()) return (concat($name, " : ", $info));
result:
Steve Ferrington : [email protected]
Brad Hunn : 123-890-1234
Domenico Blefari : [email protected]
As you can see, the results are the same. This is because the SQL SELECT statement that we use in the db2-fn:sqlquery function returns the same data as the db2-fn:xmlcolumn function used in Example 4-60 on page 118. The power of the db2-fn:sqlquery function is that we can utilize all the features of the SQL SELECT statement. Let us suppose that instead of contact information for every customer, we only require contact information for a customer with a specific ID number. The ID field is not stored in the XML document, so we cannot filter on it in the FLWOR expression; we have to access it through the SQL statement. Using the db2-fn:sqlquery function allows us to put this restriction in the WHERE clause of the SQL SELECT statement. In Example 4-73, by embedding SQL in XQuery, we combine XML and relational predicates.
XQUERY for $cust in db2-fn:sqlquery('SELECT doc FROM xps WHERE id=1')/Customer let $name := concat($cust/Name/FirstName, " ", $cust/Name/LastName) let $info := if (exists($cust/Email)) then($cust/Email[last()]/text()) else($cust/Phone[last()]/text()) return (concat($name, " : ", $info)); result: Steve Ferrington : [email protected] Note that in the embedded SQL statement, we are not limited only to the table containing the XML column we are retrieving. In Example 4-74 we retrieve the names of all customers that have complaints.
Example 4-74 Retrieving customers with complaints
XQUERY for $cust in db2-fn:sqlquery('SELECT doc FROM xps WHERE EXISTS (SELECT case_id FROM cpl WHERE cust_id = id)')/Customer return $cust/Name/LastName/text();
results:
Ferrington
Blefari
4.3.2 SQL/XML
SQL/XML is part of the SQL language standard. It defines the XML data type and a set of functions for querying, constructing, validating, and transforming XML data. Some of the most often used SQL/XML functions supported in DB2 are XMLQUERY, XMLTABLE, and XMLEXISTS. These functions allow us to embed XQuery expressions in SQL. For a complete list of SQL/XML functions, refer to the DB2 manuals, SQL Reference, Volume 1, SC10-4249, and SQL Reference, Volume 2, SC10-4250.
XMLEXISTS predicate
The XMLEXISTS predicate determines whether an XQuery expression returns a sequence of one or more items. If the specified XQuery expression returns an empty sequence, XMLEXISTS returns false; otherwise, it returns true.
In Example 4-75, we demonstrate the XMLEXISTS predicate. It is commonly used in the WHERE clause to express predicates over XML data. Here we display the IDs of all customers that have an e-mail address.
Example 4-75 Using the XMLEXISTS function
SELECT id FROM xps WHERE XMLEXISTS('$d/Customer/Email' passing xps.doc as "d"); results: 1 3 In the XPS table, if you want to see the customer ID of a customer whose first name is "Brad", the query in Example 4-76 returns this information.
Example 4-76 Selecting the customer ID of a specific customer
SELECT XPS.ID FROM XPS WHERE xmlexists('$i/Customer/Name[FirstName = "Brad"]' passing XPS.DOC as "i");
Result:
ID
-------------------
2
1 record(s) selected.
Example 4-77 shows an intuitive way of coding a query to get the customer ID of customer Brad. This query returns all the customer IDs in the table, and this is not what is expected.
SELECT XPS.ID FROM XPS WHERE XMLEXISTS('$i/Customer/Name/FirstName = "Brad"' PASSING XPS.DOC AS "i");
Result:
ID
-------------------
1
2
3
3 record(s) selected.
The XMLEXISTS predicate tests for the existence of XML values and returns TRUE or FALSE. It returns FALSE only if the XQuery expression in XMLEXISTS returns an empty sequence; otherwise, it returns TRUE. As shown in Example 4-78, the expression Customer/Name/FirstName="Brad" always returns a Boolean value, true or false, which is a sequence of one item and therefore never empty. When this expression is used in XMLEXISTS, the predicate is TRUE for every row, so DB2 returns all customer IDs in the table.
Example 4-78 XQuery returns TRUE
XQUERY for $i in db2-fn:xmlcolumn('XPS.DOC') /Customer/Name/FirstName = "Brad" return $i;
Results:
1
-------------------
true
1 record(s) selected.
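The difference between the two queries comes down to what the embedded expression returns. The following Python sketch models an XQuery sequence as a list. It is an analogy under that assumption, not DB2 behavior reproduced exactly, and the function names are invented for the illustration.

```python
def xml_exists(sequence):
    """Mimic XMLEXISTS: true unless the XQuery result is the empty sequence."""
    return len(sequence) > 0


# First names per customer ID, as stored in the XPS documents.
first_names = {1: "Steve", 2: "Brad", 3: "Domenico"}


def names_matching(first_name, wanted):
    """Predicate in the path: yields the matching node or an empty sequence."""
    return [first_name] if first_name == wanted else []


def comparison(first_name, wanted):
    """Bare comparison: always yields exactly one Boolean item."""
    return [first_name == wanted]


with_predicate = [i for i, n in first_names.items()
                  if xml_exists(names_matching(n, "Brad"))]
with_comparison = [i for i, n in first_names.items()
                   if xml_exists(comparison(n, "Brad"))]
print(with_predicate)   # [2]
print(with_comparison)  # [1, 2, 3]
```

A sequence that contains the single item false is still non-empty, which is why the second form selects every row.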
XMLQUERY function
XMLQUERY is an SQL function that allows you to execute XQuery expressions within an SQL statement. The XMLQUERY function returns a sequence. In Example 4-79 we introduce the XMLQUERY function. It is typically used in a SELECT clause to extract data from an XML column. Here we look for all customers having a zip code equal to 95030. In addition to the ID, we also display the last name extracted from the XML data.
SELECT id, XMLQUERY('for $ln in $d/Customer/Name/LastName/text() return $ln' passing xps.doc as "d") FROM xps WHERE XMLEXISTS('$d/Customer/Address/Zip[text()="95030"]' passing xps.doc as "d");
results:
1 Ferrington
2 Hunn
Note that the type of the column containing the last name is XML, even though the query selects only the text nodes. In Example 4-80 we use XMLCAST to cast the XML value to VARCHAR(20).
Example 4-80 Using XMLCAST to convert results from XMLQUERY
SELECT id, XMLCAST( XMLQUERY('for $ln in $d/Customer/Name/LastName/text() return $ln' passing xps.doc as "d") AS VARCHAR(20)) FROM xps WHERE XMLEXISTS('$d/Customer/Address/Zip[text()="95030"]' passing xps.doc as "d"); results: 1 Ferrington 2 Hunn
XMLTABLE function
XMLTABLE is an SQL/XML function that returns a table from an XQuery expression. XQuery expressions return a sequence of values. However, the XMLTABLE function allows you to execute an XQuery expression and to have the values returned as a table. The table that is returned can contain columns of any SQL type, including XML. In Example 4-81 we use the XMLTABLE function to retrieve multiple XML element values: customer first name, last name, and zip code.
Example 4-81 Using XMLTABLE function
SELECT id, firstname, lastname, zipcode FROM xps, XMLTABLE( 'for $cust in $d/Customer
return $cust' passing xps.doc as "d" COLUMNS firstname VARCHAR(20) path 'Name/FirstName/text()', lastname VARCHAR(20) path 'Name/LastName/text()', zipcode VARCHAR(10) path 'Address/Zip/text()') as nameszip
results:
1 Steve Ferrington 95030
2 Brad Hunn 95030
3 Domenico Blefari 95134
XMLTABLE generates tabular output from XML data. It is very useful for providing us with a relational view of XML data. Suppose that we have to filter the results from Example 4-81 on page 130 in order to display data only for customers with the zip code 95030. We can put the restriction in two different places:
In the WHERE clause of the SQL statement
In the XQuery expression used in the XMLTABLE function call
In Example 4-82 we show how to filter results using the WHERE clause.
Example 4-82 Filtering results using WHERE clause
SELECT id, firstname, lastname, zipcode
FROM xps, XMLTABLE(
  'for $cust in $d/Customer return $cust'
  passing xps.doc as "d"
  COLUMNS
    firstname VARCHAR(20) path 'Name/FirstName/text()',
    lastname VARCHAR(20) path 'Name/LastName/text()',
    zipcode VARCHAR(10) path 'Address/Zip/text()') as nameszip
WHERE zipcode = '95030'
results:
1 Steve Ferrington 95030
2 Brad Hunn 95030
SELECT id, firstname, lastname, zipcode
FROM xps, XMLTABLE(
  'for $cust in $d/Customer[Address/Zip/text()="95030"] return $cust'
  passing xps.doc as "d"
  COLUMNS
    firstname VARCHAR(20) path 'Name/FirstName/text()',
    lastname VARCHAR(20) path 'Name/LastName/text()',
    zipcode VARCHAR(10) path 'Address/Zip/text()') as nameszip
results:
1 Steve Ferrington 95030
2 Brad Hunn 95030
CONNECT TO xmlrb; CREATE TABLE trtime (zip_code CHAR(10), duration INTEGER); INSERT INTO trtime (zip_code, duration) VALUES ('95030', 5), ('95035', 4), ('95134', 3); To produce a list of all the customers and the delivery days required, we use the zip code to join these two tables. In TRTIME, the zip code is in a relational column; and in XPS, the zip code is in an XML document.
Example 4-85 Joining relational and XML data
SELECT fname, lname, duration FROM trtime, xps, XMLTABLE ( 'for $c in $a/Customer return $c' passing xps.doc as "a" COLUMNS fname varchar (20) path 'Name/FirstName', lname varchar (20) path 'Name/LastName', zip varchar(10) path 'Address/Zip') as T WHERE trtime.zip_code = zip result: Steve Ferrington 5 Brad Hunn 5 Domenico Blefari 3
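The shred-then-join idea behind this query can be mimicked in a few lines of Python. This is a conceptual sketch only: it assumes the standard xml.etree.ElementTree module, abbreviated copies of the three customer documents, and the helper names shred and join_on_zip, none of which come from DB2.

```python
import xml.etree.ElementTree as ET

# Relational side: the rows of TRTIME as (zip_code, duration) tuples.
TRTIME = [("95030", 5), ("95035", 4), ("95134", 3)]

# XML side: abbreviated copies of the three customer documents in XPS.
XPS_DOCS = [
    "<Customer><Name><FirstName>Steve</FirstName><LastName>Ferrington</LastName></Name>"
    "<Address><Zip>95030</Zip></Address></Customer>",
    "<Customer><Name><FirstName>Brad</FirstName><LastName>Hunn</LastName></Name>"
    "<Address><Zip>95030</Zip></Address></Customer>",
    "<Customer><Name><FirstName>Domenico</FirstName><LastName>Blefari</LastName></Name>"
    "<Address><Zip>95134</Zip></Address></Customer>",
]


def shred(docs):
    """Turn each document into a (fname, lname, zip) row, like the COLUMNS clause."""
    for doc in docs:
        cust = ET.fromstring(doc)
        yield (cust.findtext("Name/FirstName"),
               cust.findtext("Name/LastName"),
               cust.findtext("Address/Zip"))


def join_on_zip(docs, trtime):
    """Join the shredded rows with TRTIME on the zip code."""
    durations = dict(trtime)  # assumes each zip code appears once in TRTIME
    return [(fname, lname, durations[zip_code])
            for fname, lname, zip_code in shred(docs)
            if zip_code in durations]


for row in join_on_zip(XPS_DOCS, TRTIME):
    print(*row)
```

The XMLTABLE call plays the role of shred here: once the XML values are exposed as columns, the join itself is ordinary relational processing.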
SELECT zip, sum(1) FROM xps, XMLTABLE ('$a/Customer/Address/Zip' passing doc as "a" COLUMNS zip varchar(10) PATH 'text()') as T GROUP BY zip result: 95030 2 95134 1
Plain SQL
Using plain SQL, you can only work with full XML documents. It can be used to insert, update, or delete XML documents. When you retrieve a full XML document using plain SQL, your selection of the document has to be based only on relational (non-XML) data.
SQL/XML
SQL/XML allows you to do the following operations: Use predicates on both relational and XML data. Access and extract fragments of XML data. Use aggregation and grouping of XML data on the SQL level. Join relational and XML data. Pass parameters to XQuery expressions.
XQuery
XQuery is a language specifically designed for querying XML data. It is very suitable if you have to work with XML data only. You can easily extract, transform, and join XML data using XQuery.
Feature                          SQL   SQL/XML   XQuery
Join XML data                    --    +         ++
Join XML with relational data    --    ++        --
Aggregate and group XML data     --    ++        +/-
Call external functions          ++    ++        --
Pass parameter markers           +     ++        --
Execute full text search         +     ++        --
The following legend is used in the table:
++  Indicates that the feature is fully supported by the language.
+   Indicates that the feature is supported. However, using another language might be easier and more efficient.
+/- Indicates that the feature can be expressed, but it is difficult and inefficient.
--  Indicates that the feature is not supported.
Following are some guidelines for choosing a language to perform the required activities:
Insert, update, and delete XML documents: INSERT, UPDATE, and DELETE statements can be used to insert, update, or delete XML documents. If we can identify the required documents without accessing XML data, SQL is enough. Otherwise, we have to use SQL/XML.
Retrieve full XML documents: All four languages support retrieval of full XML documents. If identifying the required document is based only on relational data, we can use SQL. If it is based only on the XML data, we can use XQuery. If we have to use both relational and XML data to identify the required documents, we have to use SQL/XML or XQuery with embedded SQL.
Retrieve parts of XML documents: Both SQL/XML and XQuery can be used for retrieving parts of an XML document.
Transform XML data: The easiest way is to use XQuery. It can also be done with SQL/XML, but it is usually more difficult to code.
Use relational predicates: Plain XQuery does not support relational predicates. You have to embed SQL in XQuery or just use SQL or SQL/XML.
Use XML predicates: Plain SQL does not support them. You have to use SQL/XML or XQuery.
Join XML data: The easiest way is to use XQuery. It can be done with SQL/XML, but it is usually more difficult to code.
Join XML with relational data: Both SQL/XML and XQuery with embedded SQL can be used.
Aggregate and group XML data: The easiest way is to use SQL/XML and integrated SQL functions. Aggregating and grouping XML data can be done with XQuery with embedded SQL, but it is more difficult to code.
Call external functions: XQuery is the only language that does not support external function calls.
Pass parameter markers: XQuery is the only language that does not support passing parameter markers.
Namespaces in XML
XML namespaces are defined to avoid naming conflicts. Elements from different documents can have the same name, but completely different content. In this section, the example we use is an Address element for storing customer address information. The structure of the Address element with some sample data is shown in Example 4-87.
Example 4-87 Address element containing customer address
</Address> It is possible that in another XML document, an element named Address is used to store an IP address of a system. In Example 4-88 we show such an element.
Example 4-88 Address element containing IP address
<Address>123.123.123.123</Address> There will be a naming conflict if we want to use both Address elements in a single document. This conflict can be avoided by using namespaces.
Namespace declaration
A namespace is defined by associating a prefix with a unique Uniform Resource Identifier (URI). Then this prefix can be used to define element names. The full prefixed name is known as a qualified name or QName. It consists of two parts: the prefix, known as the namespace prefix; and the element name, known as the local name. In Example 4-89, we define two namespaces. The URI for the first one is http://first.sample.name.space.com, and the URI for the second one is http://second.sample.name.space.com. Note that XML interprets URIs as strings, and the URIs do not have to point to real locations.
Example 4-89 Definition of namespaces
<sample xmlns:fns="http://first.sample.name.space.com" xmlns:sns="http://second.sample.name.space.com"> <fns:Address fns:country="US"> <fns:Street>46 Oak Street</fns:Street> <fns:City>Los Gatos</fns:City> <fns:State>CA</fns:State> <fns:Zip>95030</fns:Zip> </fns:Address> <sns:Address> 123.123.123.123 </sns:Address> </sample>
We use the xmlns attribute to define a namespace and to bind its prefix to a URI. Then we indicate that an element belongs to that namespace by prefixing it. We can define a default namespace. It will be applied to all elements that are without a prefix and appear within the element containing the declaration of the default namespace. Example 4-90 shows the use of a default namespace.
<sample xmlns="http://first.sample.name.space.com" xmlns:fns="http://first.sample.name.space.com" xmlns:sns="http://second.sample.name.space.com"> <Address fns:country="US"> <Street>46 Oak Street</Street> <City>Los Gatos</City> <State>CA</State> <Zip>95030</Zip> </Address> <sns:Address> 123.123.123.123 </sns:Address> </sample>
Here we define http://first.sample.name.space.com to be the default namespace. Note that we do this by not specifying any prefix. It is the default namespace for the sample element and all of its descendant elements. The fns prefix is also bound to the same URI, because a prefix must be declared before it can be used to qualify the country attribute. If any element is defined without a prefix, it belongs to the default namespace. In our example these elements are: sample, Address (first occurrence), Street, City, State, Zip. The second Address element is prefixed by sns and it belongs to the other namespace we defined, http://second.sample.name.space.com. We can use namespaces not only with element names, but also with attribute and function names. Note that the default namespace does not apply to attribute names. If an attribute name has no prefix, it does not belong to any namespace. In Example 4-90, if we omit the fns prefix of the country attribute of the Address element, the attribute will not belong to any namespace.
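The rule that a default namespace applies to unprefixed elements but not to unprefixed attributes can be checked with any namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in, so it illustrates the XML namespace rules rather than DB2 itself. The shortened document is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<sample xmlns="http://first.sample.name.space.com"
        xmlns:sns="http://second.sample.name.space.com">
  <Address country="US"><City>Los Gatos</City></Address>
  <sns:Address>123.123.123.123</sns:Address>
</sample>"""

root = ET.fromstring(DOC)

# ElementTree writes expanded names as {namespace-URI}local-name.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)

# The unprefixed country attribute is not placed in the default namespace.
first_address = root[0]
print(first_address.attrib)
```

The two Address elements end up with different expanded names, while the country attribute keeps a plain name without any namespace URI.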
Predeclared namespaces
There is a set of predeclared namespaces in DB2. You can use their prefixes directly in your code without any explicit declaration. Table 4-24 lists these namespaces. Note that the URI associated with the xml prefix cannot be redefined.
Table 4-24 Predeclared namespaces
Prefix  URI                                         Description
xml     http://www.w3.org/XML/1998/namespace        XML reserved namespace
xs      http://www.w3.org/2001/XMLSchema            XML schema namespace
xsi     http://www.w3.org/2001/XMLSchema-instance   XML schema instance namespace
fn      http://www.w3.org/2005/xpath-functions      Default function namespace
declare namespace fns = "http://first.sample.name.space.com"; During query processing, XQuery expands the QNames and replaces the namespace prefix with the URI that is bound to it. Two QNames are equal if both their local names and their namespace URIs are equal.
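Because names are compared after expansion, the prefix itself plays no part in the comparison. A small Python sketch with the standard xml.etree.ElementTree module shows the effect. The prefix other and the three one-element documents are made up for this illustration.

```python
import xml.etree.ElementTree as ET

URI = "http://first.sample.name.space.com"

# Two documents bind different prefixes to the same namespace URI.
a = ET.fromstring(f'<fns:Address xmlns:fns="{URI}"/>')
b = ET.fromstring(f'<other:Address xmlns:other="{URI}"/>')
# Same local name, but a different namespace URI.
c = ET.fromstring('<sns:Address xmlns:sns="http://second.sample.name.space.com"/>')

print(a.tag == b.tag)  # True: the prefixes differ, the expanded names match
print(a.tag == c.tag)  # False: the namespace URIs differ
```

The same reasoning explains why a prefix declared in an XQuery prolog does not have to match the prefix used in the stored documents, as long as both are bound to the same URI.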
CONNECT TO xmlrb; CREATE TABLE nss(id INTEGER NOT NULL, doc XML); INSERT INTO nss (id, doc) VALUES (1, XMLPARSE ( DOCUMENT '<sample xmlns="http://namespace.sample.com/one" xmlns:two="http://namespace.sample.com/two"> <eone> this is in default namespace </eone> <two:eone anoone="nons" two:atwo="nstwo"> this is in namespace two </two:eone> </sample>'));
Local-name, name, and namespace-uri functions
The fn:local-name function returns the local name of a node. The name function returns the prefix and local part of a node name. The namespace-uri function returns the namespace URI of the node QName. In Example 4-93, we use these functions to display the local names, QNames, and namespace URIs of all element and attribute nodes from the ns-sample.xml document.
Example 4-93 Using local-name, name, and namespace-uri functions
XQUERY let $d := db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 1') for $e in $d//element() return <element> {fn:local-name($e), fn:name($e), fn:namespace-uri($e)} </element>;
result:
<element> sample sample http://namespace.sample.com/one </element>
<element> eone eone http://namespace.sample.com/one </element>
<element> eone two:eone http://namespace.sample.com/two </element>
XQUERY let $d := db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 1') for $e in $d//attribute() return <attribute> {fn:local-name($e), fn:name($e), fn:namespace-uri($e)} </attribute>;
result:
<attribute> anoone anoone </attribute>
<attribute> atwo two:atwo http://namespace.sample.com/two </attribute>
Note that there is no namespace URI for the anoone attribute because the default namespace does not apply to attributes.
XQUERY declare default element namespace "http://namespace.sample.com/one"; let $d := db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 1') return fn:in-scope-prefixes ($d/sample); result: xml two The result includes three prefixes, although only two of them are visible. The xml prefix is always in scope. If there is a default namespace, it is represented by a zero-length string. In the result, it is the first record, which is empty.
QName and node-name functions
The QName function returns an expanded name (of xs:QName type) that is constructed from a namespace URI and a string that contains a lexical QName. In Example 4-95 we show how to use the QName function. The returned value is an xs:QName value with a namespace URI of "http://sample.name.space.com", a prefix of "sampleprefix", and a local name of "localname".
Example 4-95 Using QName function
XQUERY fn:QName ("http://sample.name.space.com", "sampleprefix:localname"); result: sampleprefix:localname The node-name function returns the expanded name (of xs:QName type) for the given node of an XML document.
local-name-from-QName and namespace-uri-from-QName functions
The local-name-from-QName function returns the local part of an xs:QName value. The namespace-uri-from-QName function returns the namespace URI part of an xs:QName value. In Example 4-96, we demonstrate the use of these two functions. We display the local names and the namespace URIs for all elements in the ns-sample.xml document.
Example 4-96 Using local-name-from-QName and namespace-uri-from-QName
XQUERY let $d := db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 1')
for $e in $d//element()
return <element> {fn:local-name-from-QName(fn:node-name($e)), fn:namespace-uri-from-QName(fn:node-name($e))} </element>;

result:
<element> sample http://namespace.sample.com/one </element>
<element> eone http://namespace.sample.com/one </element>
<element> eone http://namespace.sample.com/two </element>
VALUES (101, (SELECT doc FROM xps WHERE id = 1));

In Example 4-98, we use XML publishing functions to construct the XML value and then insert it into an XML column.
Example 4-98 Inserting XML data using constructed data
INSERT INTO nss (id, doc) VALUES (102, XMLDOCUMENT(XMLELEMENT(name "Test", 'Test Element')));

Another way to prepare XML data for insertion into an XML column is to use a string representation of an XML document. In any case, DB2 ensures that the provided value is a well-formed XML document.
XML parsing
XML parsing is the process of transforming XML data from the string representation to a hierarchical (tree-like) format. Parsing can be implicit or explicit. The XMLPARSE function explicitly parses a string value in an SQL statement, which also checks that the value is well formed. If XMLPARSE is not specified in an INSERT statement, DB2 parses the value implicitly to make sure that it is a well-formed XML document. In Example 4-99, we show the INSERT statement with and without an explicit call to the XMLPARSE function.
Example 4-99 Implicit and explicit parsing
INSERT INTO nss (id, doc) VALUES (103, '<sample>with no explicit parsing</sample>');
INSERT INTO nss (id, doc) VALUES (104, XMLPARSE (DOCUMENT '<sample>with explicit parsing</sample>'));

The syntax of the XMLPARSE function is:
XMLPARSE (DOCUMENT <string value> [PRESERVE|STRIP WHITESPACE])

XMLPARSE provides two options to control whitespace processing:
- STRIP WHITESPACE: Removes the extra whitespace. This is the default option.
- PRESERVE WHITESPACE: Preserves the whitespace in the string value.

In Example 4-100, we illustrate the use of PRESERVE WHITESPACE and STRIP WHITESPACE. The XML document that we insert with each of these options contains several spaces that are kept or deleted depending on the WHITESPACE option we provide.
INSERT INTO nss (id, doc) VALUES (105, XMLPARSE (DOCUMENT '<sample> </sample>' PRESERVE WHITESPACE));
INSERT INTO nss (id, doc) VALUES (106, XMLPARSE (DOCUMENT '<sample> </sample>' STRIP WHITESPACE));
XQUERY db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 105');
XQUERY db2-fn:sqlquery('SELECT doc FROM nss WHERE id = 106');

results:
<sample> </sample>
<sample/>
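To make the difference between the two options concrete, here is a minimal sketch outside DB2 that mimics whitespace stripping with the Python standard library. It assumes that stripping means dropping text between tags that consists only of whitespace; strip_boundary_whitespace is a hypothetical helper for illustration, not how DB2 implements XMLPARSE.

```python
import xml.etree.ElementTree as ET


def strip_boundary_whitespace(elem):
    """Drop text between tags that consists only of whitespace (modifies elem in place)."""
    if elem.text is not None and not elem.text.strip():
        elem.text = None
    for child in elem:
        strip_boundary_whitespace(child)
        if child.tail is not None and not child.tail.strip():
            child.tail = None
    return elem


SOURCE = "<sample>   </sample>"

# Analogous to PRESERVE WHITESPACE: the parsed tree keeps the spaces.
preserved = ET.tostring(ET.fromstring(SOURCE), encoding="unicode")

# Analogous to STRIP WHITESPACE: the whitespace-only text is removed.
stripped = ET.tostring(strip_boundary_whitespace(ET.fromstring(SOURCE)), encoding="unicode")

print(repr(preserved))
print(repr(stripped))
```

Text that contains any non-whitespace character is left untouched by the helper, so only the padding between tags is affected.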
XML validation
XML validation is the process of checking whether the structure, data types, and content of an XML document are valid. XML validation adds type annotations to XML elements, attributes, and values. It also strips off ignorable whitespace in an XML document. The ignorable whitespace is whitespace that can be eliminated from an XML document. The XML schema document determines which whitespace is ignorable whitespace. The XMLVALIDATE function is used to validate an XML document against a schema document. The schema defines the structure of the XML document, including node types, default values, and multiple occurrences. Before using a schema document, it has to be registered (see 5.2, Schema management on page 198 for more details). Example 4-101 shows the use of the XMLVALIDATE function.
Example 4-101 Validating XML document using XMLVALIDATE function
INSERT INTO nss (id, doc) VALUES (105, XMLVALIDATE (XMLPARSE (DOCUMENT '<sample>with explicit parsing</sample>') ACCORDING TO XMLSCHEMA ID schemaid))

Here we suppose that schemaid is the identifier of a registered XML schema. If we pass a value of a character data type as the argument, it is implicitly parsed before validation.
XML encoding
The encoding of XML data can be determined in two ways:
- It can be derived from the data itself, which is known as internally encoded data.
- It can be derived from external sources, which is known as externally encoded data.

The application data type that is used to exchange the XML data between the application and an XML column determines how the encoding is derived. XML data that is in character or graphic application data types is considered to be externally encoded and is encoded in the application code page. XML data that is in a binary application data type, or binary data that is in a character data type, is considered to be internally encoded. With internal encoding, the content of the data determines the encoding. DB2 derives the internal encoding from the document content according to the XML standard. It can be derived from three components:
- Unicode Byte Order Mark (BOM): The BOM is a byte sequence of Unicode character codes at the beginning of the XML data. It indicates the byte order of the data that follows. DB2 recognizes a BOM only for the XML data type. In XML data that is stored in a column of a non-XML data type, the BOM is treated as ordinary data.
- XML declaration: An XML document can contain processing instructions that provide specific information about the XML data.
- Encoding declaration: This is an optional part of an XML declaration that specifies the encoding used in the document.

Here, we list some encoding considerations for placing XML data into a database:
- If you have externally encoded XML data (data sent to the DB2 server using character data types):
  - If the internal and external encodings are not Unicode, be sure that any internal encoding declaration matches the external encoding. Otherwise, DB2 rejects the XML document.
  - If both the internal and external encodings are Unicode, but the encoding schemes do not match, DB2 ignores the internal encoding.
- If you have internally encoded XML data (data sent to the DB2 server using binary data types):
  - The sending application must ensure that the data contains the correct encoding information.
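The order in which the internal encoding is derived (BOM first, then the encoding declaration) can be sketched as follows. This is a simplified illustration in Python, not DB2's implementation: sniff_internal_encoding is a hypothetical helper, UTF-32 BOMs are deliberately ignored, a UTF-16 document without a BOM is not handled, and the fallback to UTF-8 follows the XML standard's default for documents that carry neither a BOM nor an encoding declaration.

```python
import codecs
import re

# BOMs checked by this sketch (UTF-32 is left out to keep it short).
_BOMS = (
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
)

# Encoding declaration inside an XML declaration at the very start of the data.
_DECLARATION = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def sniff_internal_encoding(data):
    """Return the encoding name suggested by the bytes of an XML document."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECLARATION.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "UTF-8"


with_bom = codecs.BOM_UTF16_LE + "<a/>".encode("utf-16-le")
with_declaration = b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'

print(sniff_internal_encoding(with_bom))
print(sniff_internal_encoding(with_declaration))
print(sniff_internal_encoding(b"<a/>"))
```

An application that sends XML through binary data types could run a check of this kind before the insert to confirm that the declared encoding matches the bytes it actually produces.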
<DeliveryTimes>
  <Entry>
    <Zip>12345</Zip>
    <Duration>2</Duration>
  </Entry>
  <Entry>
    <Zip>23456</Zip>
    <Duration>4</Duration>
  </Entry>
  <Entry>
    <Zip>34567</Zip>
    <Duration>6</Duration>
  </Entry>
</DeliveryTimes>
2. Tables and columns: We decide on the relational tables into which we will shred the XML data. In our example, we use the TRTIME table defined in Example 4-84 on page 132. This table has two columns, ZIP_CODE and DURATION, and we use them to store the information from the Zip and Duration elements of our XML document.
3. Schema: We require a schema document that describes the structure of the XML data. Example 4-103 shows a schema for the XML file given in Example 4-102 on page 146. This schema states that the root element of the document is DeliveryTimes, which can contain several Entry elements. Every Entry element contains a Zip element and a Duration element.
Example 4-103 Schema document s.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="DeliveryTimes">
    <xsd:complexType>
      <xsd:sequence maxOccurs="unbounded">
        <xsd:element name="Entry">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Zip" type="xsd:string"/>
              <xsd:element name="Duration" type="xsd:integer"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

4. Adding annotations to our schema: This step describes which XML data is stored in which column. It is done by adding annotations to our schema. In our example, the annotations are very simple. We specify that the Zip element will be stored in the ZIP_CODE column of the TRTIME table and that the Duration element will be stored in the DURATION column of the TRTIME table. Annotations can also describe much more complex shredding rules. For more information, refer to DB2 XML Guide, SC10-4254.
In Example 4-104 we show our schema with the added annotations. Note that you have to replace the value of <db2-xdb:defaultSQLSchema> with the DB2 schema you are using.
Example 4-104 Schema document with annotations sa.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:db2-xdb="http://www.ibm.com/xmlns/prod/db2/xdb1">
  <xsd:annotation>
    <xsd:appinfo>
      <db2-xdb:defaultSQLSchema>ADMIN</db2-xdb:defaultSQLSchema>
    </xsd:appinfo>
  </xsd:annotation>
  <xsd:element name="DeliveryTimes">
    <xsd:complexType>
      <xsd:sequence maxOccurs="unbounded">
        <xsd:element name="Entry">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Zip" type="xsd:string" db2-xdb:rowSet="TRTIME" db2-xdb:column="ZIP_CODE"/>
              <xsd:element name="Duration" type="xsd:integer" db2-xdb:rowSet="TRTIME" db2-xdb:column="DURATION"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

5. Registering the schema in the DB2 XML Schema Repository (XSR): The DB2 XML Schema Repository is described in more detail in 5.2, Schema management on page 198. We use the command shown in Example 4-105 to register our annotated XML schema and to specify that it will be used for decomposition.
Example 4-105 Registering sa.xsd schema
6. Shredding XML data: Now we are ready for shredding. We use the DECOMPOSE command as shown in Example 4-106.
Example 4-106 Shredding s.xml file
DECOMPOSE XML DOCUMENT d:/work/db2/s.xml XMLSCHEMA admin.test_shr

You can issue the command shown in Example 4-107 to check that the data from the s.xml file was inserted into the TRTIME table.
Example 4-107 Verifying results of decomposition
SELECT * FROM TRTIME

results:
ZIP_CODE   DURATION
95030      5
95035      4
95134      3
12345      2
23456      4
34567      6
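The effect of the decomposition steps above can be mimicked outside DB2 in a few lines. The following sketch is only an analogy: it uses Python's ElementTree and an in-memory SQLite table as stand-ins for DB2 and the TRTIME table, and it maps the Zip and Duration elements to columns directly in code rather than through an annotated schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The content of s.xml as shown in Example 4-102.
S_XML = """<DeliveryTimes>
  <Entry><Zip>12345</Zip><Duration>2</Duration></Entry>
  <Entry><Zip>23456</Zip><Duration>4</Duration></Entry>
  <Entry><Zip>34567</Zip><Duration>6</Duration></Entry>
</DeliveryTimes>"""

# Stand-in for the TRTIME table (SQLite, in memory; not a DB2 table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trtime (zip_code TEXT, duration INTEGER)")

# One row per Entry element: Zip goes to zip_code, Duration goes to duration.
rows = [
    (entry.findtext("Zip"), int(entry.findtext("Duration")))
    for entry in ET.fromstring(S_XML).iter("Entry")
]
conn.executemany("INSERT INTO trtime (zip_code, duration) VALUES (?, ?)", rows)

shredded = conn.execute(
    "SELECT zip_code, duration FROM trtime ORDER BY zip_code"
).fetchall()
print(shredded)
```

Each Entry element becomes one relational row, which is the same shape as the three new rows that appear in the TRTIME table after the DECOMPOSE command.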
The most suitable data type for converting XML data is BLOB because retrieval of binary data minimizes encoding problems.
XMLELEMENT function
This function creates an XML element node. The arguments include an element name, optional namespace declarations, optional attributes, and zero or more expressions that make up the element's content. In Example 4-108, we show how to create an XML element using the XMLELEMENT function. Note that the XMLELEMENT function can be nested.
Example 4-108 Using XMLELEMENT function
VALUES (XMLELEMENT (name "Name", XMLELEMENT (name "FirstName", 'Steve'), XMLELEMENT (name "LastName", 'Ferrington')));

Result:
<Name>
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLATTRIBUTES function
This function creates an attribute node. It can be called only as an argument of the XMLELEMENT function. In Example 4-109 we demonstrate how to use the XMLATTRIBUTES function to add the attribute gender to the Name element. You can create several attributes with one call to the XMLATTRIBUTES function.
Example 4-109 Using XMLATTRIBUTES function
VALUES (XMLELEMENT (name "Name", XMLATTRIBUTES ('MALE' as "gender"), XMLELEMENT (name "FirstName", 'Steve'), XMLELEMENT (name "LastName", 'Ferrington')))

results:
<Name gender="MALE">
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLFOREST function
This function creates a sequence (forest) of XML element nodes. In Example 4-110 here we use the XMLFOREST function to produce the same result as in Example 4-109 on page 150.
Example 4-110 Using XMLFOREST function
VALUES (XMLELEMENT (name "Name", XMLATTRIBUTES ('MALE' as "gender"), XMLFOREST ('Steve' as "FirstName", 'Ferrington' as "LastName")))

results:
<Name gender="MALE">
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLDOCUMENT function
This function creates an XML document node. Every XML document stored in an XML column must have a document node. It contains the root XML element and optional comments and processing instructions. A document node is not visible in the serialized string representation of XML. In Example 4-111, we show the usage of the XMLDOCUMENT function. We insert an XML document into the NSS table. Note that if we omit the XMLDOCUMENT function (and use only the XMLELEMENT function), there will be an error.
Example 4-111
INSERT INTO nss (id, doc) VALUES (999, XMLDOCUMENT( XMLELEMENT (name "Name", XMLATTRIBUTES ('MALE' as "gender"), XMLFOREST ('Steve' as "FirstName", 'Ferrington' as "LastName"))));
XMLNAMESPACES function
This function creates namespace declarations. It can be called only as an argument of the XMLELEMENT, XMLFOREST, and XMLTABLE functions. In Example 4-112, we use the XMLNAMESPACES function to define the default namespace while creating the Name element.
Example 4-112 Using XMLNAMESPACES function
XMLATTRIBUTES ('MALE' as "gender"), XMLFOREST ('Steve' as "FirstName", 'Ferrington' as "LastName")))))

results:
<Name xmlns="http://sample.default.nspace.com" gender="MALE">
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLCONCAT function
This function concatenates a variable number of XML input arguments into a single sequence. In Example 4-113, we use the XMLCONCAT function to produce the same result as in Example 4-112 on page 151.
Example 4-113 Using XMLCONCAT function
VALUES (XMLELEMENT (name "Name", XMLNAMESPACES (DEFAULT 'http://sample.default.nspace.com'), XMLATTRIBUTES ('MALE' as "gender"), XMLCONCAT( XMLELEMENT (name "FirstName", 'Steve'), XMLELEMENT (name "LastName", 'Ferrington'))))

Results:
<Name xmlns="http://sample.default.nspace.com" gender="MALE">
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLCOMMENT function
This function creates a comment node. We show its usage in Example 4-114.
Example 4-114 Using XMLCOMMENT function
VALUES (XMLELEMENT (name "Name", XMLATTRIBUTES ('MALE' as "gender"), XMLCOMMENT ('Comment line'), XMLELEMENT (name "FirstName", 'Steve'), XMLELEMENT (name "LastName", 'Ferrington')))

Result:
<Name gender="MALE"> <!--Comment line-->
XMLPI function
This function creates a processing instruction node. We show the usage of the XMLPI function in Example 4-115.
Example 4-115 Using XMLPI function
VALUES (XMLELEMENT (name "Name", XMLATTRIBUTES ('MALE' as "gender"), XMLPI (name "Instruction", 'Do nothing'), XMLELEMENT (name "FirstName", 'Steve'), XMLELEMENT (name "LastName", 'Ferrington')))

Results:
<Name gender="MALE">
  <?Instruction Do nothing?>
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
XMLTEXT function
This function creates a text node. We illustrate its usage in Example 4-116.
Example 4-116 Using XMLTEXT function
VALUES (XMLELEMENT (name "Name", XMLELEMENT (name "FirstName", XMLTEXT ('Steve')), XMLELEMENT (name "LastName", XMLTEXT ('Ferrington'))))

Results:
<Name>
  <FirstName>Steve</FirstName>
  <LastName>Ferrington</LastName>
</Name>
[Figure: DB2 NSE architecture. Labels recovered from the diagram: Application (Java, JDBC, C, C++; SQL integrated; stored procedure), Internet/Intranet, End User, Administration (db2text, DB2 Control Center), DB2 client, Client/Server, DB2 server, Search Engine, NSE Admin, Database, Text Indexes (file system), User tables and NSE metadata (table space).]
Portions of this chapter are excerpted from the article XML full-text search in DB2 by Holger Seubert and Sabine Perathoner, originally published in IBM developerWorks, June 2006. http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606seubert/index.html
DB2 NSE offers an efficient and intelligent method for searching full-text documents stored in a DB2 database using SQL queries. Instead of sequentially searching through the text data using string matching, which is how the XQuery contains() function searches, DB2 NSE searches textual data that is stored in a column of a DB2 database table using a text index, which typically consists of significant terms that are extracted from the text document. With XML data, the significant terms and their locations in the XML document structure are maintained in the text index. DB2 NSE is a separately installed feature that is shipped with DB2 9. Its text search capability is integrated into SQL and optimized by the DB2 optimizer at run time. The NSE administration function can be invoked from the DB2 Control Center to perform administrative tasks such as creating, updating, or deleting text indexes.
We walk you through these steps using a simple table defined in Example 4-117. The COMMENT column stores the customer feedback in XML documents.
Example 4-117 Defining FEEDBACK table
CREATE TABLE FEEDBACK ("APPL_ID" INTEGER primary key not null, "COMMENT" XML);

Example 4-118 shows the five sample customer feedback XML documents.
Example 4-118 Sample data for comment column
--- comment1.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="C:\Documents and Settings\chng1me.T40-92U-V46\IBM\rationalsdp6.0\workspace\aaa\WebContent\WEB-INF\feedback.xsd">
  <entry>
    <dateOfEntry> 2005-07-07 </dateOfEntry>
    <rating>4</rating>
    <comment>Excellent on-line application.</comment>
  </entry>
  <entry>
    <dateOfEntry> 2005-09-12 </dateOfEntry>
    <rating>2</rating>
    <comment>I still have not heard from the bank. Not sure the satus of my application. Slow response time.</comment>
  </entry>
</feedback>

--- comment2.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <entry>
    <dateOfEntry> 2005-08-09 </dateOfEntry>
    <rating>5</rating>
    <comment lan="en">Quick approve time.</comment>
  </entry>
</feedback>

--- comment3.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <entry>
    <dateOfEntry> 2005-08-09 </dateOfEntry>
    <rating>5</rating>
    <comment lan="en">Good service and quick Response time. I will recommand to others.</comment>
  </entry>
</feedback>

--- comment4.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <entry>
    <dateOfEntry> 2005-08-16 </dateOfEntry>
    <rating>3</rating>
    <comment lan="en">There should be more loan prodcuts.</comment>
  </entry>
</feedback>

--- comment5.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <entry>
    <dateOfEntry> 2005-08-21 </dateOfEntry>
    <rating>2</rating>
    <comment lan="en">The interests rate is too high.</comment>
  </entry>
</feedback>

Example 4-119 shows the INSERT statements used to insert the five customer feedback documents.
Example 4-119 SQL statement to insert data into the table FEEDBACK.
INSERT INTO feedback VALUES(100, xmlparse(document '<comment1.xml>' Preserve whitespace));
INSERT INTO feedback VALUES(200, xmlparse(document '<comment2.xml>' Preserve whitespace));
INSERT INTO feedback VALUES(300, xmlparse(document '<comment3.xml>' Preserve whitespace));
INSERT INTO feedback VALUES(400, xmlparse(document '<comment4.xml>' Preserve whitespace));
INSERT INTO feedback VALUES(500, xmlparse(document '<comment5.xml>' Preserve whitespace));
Example 4-120 shows how to use the scalar CONTAINS() function to perform a full-text search.
Example 4-120 The scalar CONTAINS() function
SELECT appl_id FROM feedback WHERE CONTAINS(comment, ' "good" ')=1

Example 4-122 shows how to use XMLQUERY in a query. The query returns the comment element of each feedback document that contains the term good.
Example 4-122 Using NSE function in combination with XMLQUERY()
SELECT XMLQUERY('$com//comment' passing comment as "com") FROM feedback WHERE CONTAINS(comment, ' "good" ')=1

The query in Example 4-122 can be written in XQuery as shown in Example 4-123.
Example 4-123 Using text search functionality in XQUERY
XQUERY for $com in db2-fn:sqlquery("select comment from feedback WHERE CONTAINS(comment, ' ""good"" ')=1") return $com//comment
DB2 NSE supports the abbreviated XPath location-step syntax with the child axis (/) and the attribute axis (@). No other XPath expressions or functions are supported. Example 4-124 shows a query that searches for the terms good and service and that is limited to the comment element of the XML document.
Example 4-124 Limit search with XMLQUERY
SELECT XMLQUERY('$com//comment' passing comment as "com") FROM feedback WHERE CONTAINS(comment, ' section("/feedback/entry/comment") ("good","service") ')=1

The SECTION clause used as part of the search argument indicates the part of the XML document in which the text search occurs. The query in Example 4-124 can be written in XQuery as shown in Example 4-125.
Example 4-125 Limit search with XQuery
XQUERY for $com in db2-fn:sqlquery("select comment from feedback WHERE CONTAINS(comment, 'section(""/feedback/entry/comment"") (""good"",""service"") ')=1") return $com//comment

In addition, text search can be limited to specific XML attributes. Example 4-126 shows how to use the attribute axis in the query.
Example 4-126 Limit search on XML attributes
SELECT XMLQUERY('$com//rating' passing comment as "com") FROM feedback WHERE CONTAINS(comment, ' section("/feedback/entry/comment/@lan") "en" ')=1;

Example 4-127 shows how searches in different sections can be combined using the AND Boolean operator (&).
Example 4-127 Searching with AND Boolean operator
SELECT XMLQUERY('$com//rating' passing comment as "com") FROM feedback WHERE CONTAINS(comment, ' section("/feedback/entry/comment/@lan") "en" & section("/feedback/entry/rating")"5"')=1;
Proximity search
The query in Example 4-128 finds the application IDs of the feedback documents in which the terms good and quick response occur not only in the comment element of the XML document, but also in the same sentence.
Example 4-128 Search for the terms good and quick response in the same sentence
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") "good" in same sentence as "quick response"')=1

There are two constraints in Example 4-128. The first one is section("/feedback/entry/comment"), which limits the search to the comment element. The second one is "good" in same sentence as "quick response", which requires the terms good and quick response to occur in the same sentence. Example 4-129 shows how to search using tokens instead of phrases.
Example 4-129 Token search
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") "good" in same sentence as ("quick", "response")')=1
Boolean operators
Using the Boolean operators AND (&), OR (|), and NOT, you can combine search terms. Example 4-130 combines several search terms by using the Boolean operators AND and OR.
Example 4-130 Using Boolean operators AND and OR
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") "good" & "quick" | "response" ')=1

The query in Example 4-130 returns the application IDs of the feedback documents that have the terms good and quick, or the term response, in the comment element.
Using the Boolean operator NOT, you can exclude particular terms from the search result. For example, the query in Example 4-131 searches for documents that contain the terms good and quick, but not the term response, in the comment element.
Example 4-131 Using Boolean operator NOT
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") "good" & "quick" & NOT "response" ')=1
Fuzzy search
This search finds documents that contain terms spelled in a similar way to the specified search term. The match level indicates the desired degree of accuracy. Fuzzy search is normally used when misspellings are possible in the document. You can specify a value from 1 to 100 for the degree of accuracy, where 100 is an exact match. Example 4-132 is a fuzzy search example. The search term responce is misspelled on purpose. The fuzzy search finds a document that contains a similarly spelled term. The degree of accuracy is 60 in this example.
Example 4-132 Query uses fuzzy search to find similarly spelled terms
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") fuzzy form of 60 "responce" & "good"')=1
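This guide does not describe how DB2 NSE scores similarity, so the following Python sketch only illustrates the general idea of a match-level threshold. It uses difflib from the standard library as a stand-in similarity measure; fuzzy_matches is a hypothetical helper, and the word list is taken from the comment3.xml sample document.

```python
from difflib import SequenceMatcher


def similarity_percent(a, b):
    """Similarity of two words on a 0 to 100 scale (difflib ratio, case-insensitive)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_matches(search_term, words, level):
    """Return the words whose similarity to search_term is at least `level`."""
    return [word for word in words if similarity_percent(search_term, word) >= level]


# Words from the comment element of comment3.xml.
WORDS = ["Good", "service", "and", "quick", "Response", "time"]

hits = fuzzy_matches("responce", WORDS, 60)
exact_only = fuzzy_matches("responce", WORDS, 100)

print(hits)
print(exact_only)
```

With a level of 60 the misspelled search term still reaches the word Response, while a level of 100 demands an exact match and finds nothing in this word list.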
Stemming search
The stemming search, which is used to search for the stemmed form of a term, causes the term to be reduced to its word stem before the search is carried out. This form of search is not case-sensitive. Currently, only English stemming is supported and the word must follow regular inflection endings. Example 4-133 shows searching for the fuzzy form of the term responce or the stemmed form of "goody". The stemmed search returns documents that have terms such as good and goods.
Example 4-133 Query uses stemmed search
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") fuzzy form of 60 "responce" | stemmed form of "goody"')=1
Wildcard search
Wildcard search is also known as character masking search. DB2 NSE uses two masking characters:
- Percent (%) masks any number of arbitrary characters.
- Underscore (_) masks any single character in a search term.

DB2 NSE uses these masking characters for wildcard search the same way the DB2 predicate LIKE uses them. A sample query using wildcard search is shown in Example 4-134.
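The two masking rules can be expressed as a small translation into a regular expression. The sketch below is an illustration in Python, not part of DB2 NSE: mask_to_regex and mask_matches are hypothetical helpers, and matching is done against single words for simplicity.

```python
import re


def mask_to_regex(mask):
    """Translate a mask with % and _ into a compiled regular expression."""
    parts = []
    for ch in mask:
        if ch == "%":
            parts.append(".*")       # any number of arbitrary characters
        elif ch == "_":
            parts.append(".")        # exactly one arbitrary character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def mask_matches(mask, word):
    """True if the whole word matches the mask."""
    return mask_to_regex(mask).fullmatch(word) is not None


print(mask_matches("respon_e", "response"))
print(mask_matches("respon_e", "respond"))
print(mask_matches("respon%", "responsibility"))
```

The underscore requires exactly one character in its position, so the mask respon_e accepts both response and responce but rejects the shorter word respond, whereas respon% accepts any word that starts with respon.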
Example 4-134 Query uses wildcard search
"respon_e" &
SELECT appl_id FROM feedback WHERE CONTAINS(comment, ' section("/feedback/entry/comment") "Slow"')=1

The query in Example 4-135 returns the application IDs of the feedback documents that contain the term Slow in the comment element. Example 4-136 shows the result of this query.
Example 4-136 Query result
Suppose we want to find the date on which the customer with application ID 100 entered the comment that contains the term Slow. Example 4-137 shows the query.
Example 4-137 Searching date that customer enters comment with the term Slow
SELECT XMLQUERY('$com//dateOfEntry' passing comment as "com") FROM feedback WHERE CONTAINS(comment, ' "Slow" ')=1 and APPL_ID=100

Example 4-138 shows the result of the query in Example 4-137. The result shows two dates, 2005-07-07 and 2005-09-12. The customer with application ID 100 has multiple comment entries. The term Slow appears only in the entry dated 2005-09-12, but both dates show up because the search always returns a complete XML document.
Example 4-138 Query result
<dateOfEntry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 2005-07-07 </dateOfEntry><dateOfEntry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 2005-09-12 </dateOfEntry>
1 record(s) selected.

Example 4-139 shows how the query in Example 4-137 can be written in XQuery. The XQuery also returns both dates, 2005-07-07 and 2005-09-12.
Example 4-139 Sample XQuery
XQUERY for $com in db2-fn:sqlquery("select comment from feedback WHERE APPL_ID=100 and CONTAINS(comment, ' section(""/feedback/entry/comment"") stemmed form of ""Slow"" ')=1") return $com//dateOfEntry

You can combine the full-text search with an XQuery predicate to keep only the entries whose comment element contains the term Slow, as shown in Example 4-140. The full-text search itself returns the complete XML documents that satisfy the search condition; the predicate then selects the matching entries within them.
Example 4-140 Filter documents
SELECT XMLQUERY('$com/feedback/entry[contains(comment,"Slow")]/dateOfEntry' passing comment as "com")FROM feedback WHERE CONTAINS(comment, 'section("/feedback/entry/comment") "Slow"')=1 and APPL_ID=100
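The difference between matching at the document level and filtering at the entry level can be reproduced outside DB2. The following Python sketch is an analogy only: it uses a shortened copy of the comment1.xml sample document and plain substring tests in place of the text index and the XQuery contains() predicate.

```python
import xml.etree.ElementTree as ET

# Shortened copy of comment1.xml (application ID 100).
FEEDBACK = """<feedback>
  <entry>
    <dateOfEntry>2005-07-07</dateOfEntry>
    <rating>4</rating>
    <comment>Excellent on-line application.</comment>
  </entry>
  <entry>
    <dateOfEntry>2005-09-12</dateOfEntry>
    <rating>2</rating>
    <comment>I still have not heard from the bank. Slow response time.</comment>
  </entry>
</feedback>"""

root = ET.fromstring(FEEDBACK)

# Document-level match: does any comment in the document contain the term?
document_matches = any("Slow" in (c.text or "") for c in root.iter("comment"))

# Without an entry-level filter, every date of a matching document is returned.
all_dates = [d.text for d in root.iter("dateOfEntry")] if document_matches else []

# With an entry-level filter, only the entry that contains the term is kept.
filtered_dates = [
    entry.findtext("dateOfEntry")
    for entry in root.iter("entry")
    if "Slow" in entry.findtext("comment", "")
]

print(all_dates)
print(filtered_dates)
```

The first list contains both dates because the whole document qualifies, while the second list contains only 2005-09-12, the date of the entry that actually mentions Slow.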
The same query as in Example 4-140 on page 165 can be expressed in XQuery context, as shown in Example 4-141.
Example 4-141 Same query expressed in XQuery
XQUERY for $com in db2-fn:sqlquery("select comment from feedback WHERE APPL_ID=100 and CONTAINS(comment, 'section(""/feedback/entry/comment"") ""Slow"" ')=1") return $com/feedback/entry[contains(comment,"Slow")]/dateOfEntry

Both queries return the same result, as shown in Example 4-142.
Example 4-142 Filtered result
<?xml version="1.0" encoding="UTF-8"?>
<itso:feedback xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:itso="http://www.itso.org">
  <itso:entry>
    <itso:dateOfEntry> 2005-08-16 </itso:dateOfEntry>
    <itso:rating>3</itso:rating>
    <itso:comment lan="en">There should be more loan prodcuts.</itso:comment>
  </itso:entry>
</itso:feedback>
Example 4-144 shows a query that returns no result because the path in the query is not fully qualified.
select appl_id from feedback where CONTAINS(comment, ' section("/feedback/entry/comment") "loan"')=1

Example 4-145 shows a query that returns an application ID because the path in the query is fully qualified.
Example 4-145 Example of a query that returns result
<XMLModel> <XMLFieldDefinition name="comment" locator="/feedback/entry/comment" /> <XMLAttributeDefinition name="rating" type="NUMBER" locator="/feedback/entry/rating"/> </XMLModel>
XMLModel
XMLModel is the top-level element. Two child elements are allowed for XMLModel: XMLFieldDefinition and XMLAttributeDefinition.
XMLFieldDefinition
This element defines the custom name for the specific part, element, or attribute of the XML document that is identified by the locator attribute.
XMLAttributeDefinition
This element defines an NSE attribute based on an XML element or attribute. This attribute can be defined as being of type NUMBER, which allows for numeric operations and ranges to be used during searches.
Locator attribute
The locator attribute is an XPath expression that defines the part of the document that should be indexed. The following subset of XPath expressions is supported for the locator attribute:
- Child axis (/)
- Descendant-or-self axis (//)
- Attribute axis (@)
- Wildcards (*)
- Comment node (comment())
- Processing-instruction node (processing-instruction())
- Union of elements (A | B)

Example 4-147 is an example of possible locator attribute values.
Name attribute
The name attribute defines the name that can be used to refer to the part of the XML document identified by the locator attribute. There are three special values that can be used to automatically generate a name value:
- name=$(NAME): Represents the qualified name of the XML element or attribute that is identified by the locator attribute. This includes any namespace associated with the data.
- name=$(LOCALNAME): Represents the local name (no namespace) of the XML element that is identified by the locator attribute.
- name=$(PATH): Represents the absolute path to the XML element or attribute identified by the locator attribute. This is equivalent to the value returned from the locator XPath expression.
Type attribute
The type attribute can only be used with the XMLAttributeDefinition element. The only value allowed is NUMBER. Using this attribute specifies that the underlying data of the XML document can be considered numeric and allows for parametric search of this data.
db2text create index ixd2 for text on feedback(comment) format xml documentmodel XMLModel in C:\model.xml connect to databaseName;

You must update the index ixd2 before you can use it. Example 4-149 shows the update command.
Example 4-149 Update index ixd2.
DB2TEXT UPDATE INDEX ixd2 FOR TEXT CONNECT TO databaseName

The document model parameter specifies the root element and the location of the file. The document model file is used only during the creation of the index, so any later changes to the file have no effect on existing indexes.
"bank"')=1
However, the next query does not return any results even though it references the same part of the document. Example 4-151 shows the query, which uses the path instead of the name. It returns no result because the document model explicitly states the only names that are valid for searching.
Example 4-151 query returns no result.
"bank"')=1
Parametric search
If you define elements using the XMLAttributeDefinition in the document model to be numeric, parametric search is possible. Example 4-152 shows a query that searches by using values specified by the name attribute rating as defined in the document model. The query searches for the application IDs that have ratings between 0 and 6.
Example 4-152 Sample parametric search query
SELECT appl_id FROM feedback WHERE CONTAINS(comment, ' ATTRIBUTE "rating" BETWEEN 0 AND 6')=1
Chapter 5.
The nodes and subtrees in an XML data page form regions in a document. The XML regions index provides a logical mapping of those regions so that the document data can be retrieved from the XML data pages. The document ID and version ID in the XML Data Descriptor are used to do an index look-up in the regions index. The regions index key entry has the record ID of the root node of the XML document in the XDA. The XML regions index is automatically created by DB2 9 when the first XML column is created or added to a table. Even if the table has multiple XML columns, only one XML regions index is created. Access to the XML documents stored in XML storage always goes through the XML regions index. Figure 5-1 shows how the XML regions index works.
[Figure 5-1: table T1 with columns ID (rows 1 to 4) and XMLDOC, and the XML data pages that hold the documents]
<?xml version="1.0"?>
<Application>
  <Customer>
    <Name>
      <FirstName>Ichiro</FirstName>
      <LastName>Ohta</LastName>
    </Name>
    <DateOfBirth>2/11/1999</DateOfBirth>
    <SSN>111-33-3627</SSN>
    <Address country="JP">
      <Street>33 AKEBONO</Street>
      <City>Takatushi-shi</City>
      <State>Osaka</State>
      <Zip>33333</Zip>
    </Address>
    <Phone type="work">201-999-9646</Phone>
    <Phone type="home">039-999-0251</Phone>
    <Email>[email protected]</Email>
    <Employer>
      <Company>My company3</Company>
      <Position>Developer</Position>
    </Employer>
    <FinancialData>
      <Income>76800.00</Income>
      <Debt>44500.00</Debt>
      <Expenses>40000.00</Expenses>
      <Assets>1400.00</Assets>
    </FinancialData>
  </Customer>
  <LoanType>0</LoanType>
  <Campaign>1</Campaign>
</Application>
CREATE [UNIQUE] INDEX index-name ON table-name (xml-column-name) GENERATE KEY USING XMLPATTERN xmlpattern AS SQL data-type
When creating an XML index, the following fields are required:
Index name: Specify the name of the XML index.
Table and column names: Specify which XML column is indexed.
XMLPATTERN: Specify the node you want to index. An XMLPATTERN is similar to an XPath expression. The difference is that an XMLPATTERN cannot contain conditional expressions (predicates). For example, the following expression is valid for XPath but not valid for XMLPATTERN:
/Application/Customer/Address[@country="JP"]
Data type: Specify the SQL data type for the XML index.
Here, we create an XML index for our example XML document. If you require a query that looks up zip codes in the XML column, the XQuery could be similar to the following:
XQUERY for $Application in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application return $Application/Customer/Address/Zip;
The XML index you might create for this specific query would look similar to the following:
CREATE INDEX zipindex ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL VARCHAR;
Figure 5-3 shows a conceptual structure of this XML index. The CREATE INDEX ZIPINDEX statement creates a B-tree index, ZIPINDEX, which contains the path and the value of each Zip node.
CREATE INDEX ZIPINDEX ON LOAN_APPLICATION(APPL_DOC) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL VARCHAR

Loan_Application table, APPL_DOC column (three documents, abbreviated):
<Application> <Customer> <Name> <Address country="JP"> <Zip>33333</Zip>
<Application> <Customer> <Name> <Address country="JP"> <Zip>90540</Zip>
<Application> <Customer> <Name> <Address country="JP"> <Zip>95030</Zip>

ZIPINDEX
Path                                 Value
/Application/Customer/Address/Zip    33333
/Application/Customer/Address/Zip    90540
/Application/Customer/Address/Zip    95030
Figure 5-3 Create XML index statement and logical structure of XML index
Data type
When you create an XML index, you must specify the SQL data type for the node value that you want to index so that DB2 9 can convert XML node values specified in the xmlpattern clause to the SQL data type. The values then can be stored in a B-tree index. There are five SQL data types you can use: DOUBLE, VARCHAR(n), VARCHAR HASHED, DATE, and TIMESTAMP.
DOUBLE
The data type DOUBLE should be used to index numeric XML node values. Note that unbounded decimal types and 64-bit integers might lose precision when they are stored as DOUBLE. Following is an example of using the DOUBLE data type:
CREATE INDEX zipindexd ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL DOUBLE;
VARCHAR(n)
This data type is used to index varying-length string node values. The maximum length n, specified in bytes, is a constraint: the index is guaranteed to store complete string values. If you try to insert an XML document in which an indexed string node value is longer than the specified maximum length, the insertion fails. Likewise, if you try to create an XML index and an existing string node value to be indexed is longer than the specified maximum length, the CREATE INDEX statement fails.
CREATE INDEX zipindexv ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL VARCHAR(5);
The maximum length that you can specify in VARCHAR(n) depends on the page size:
4K page: Maximum length is 817 bytes.
8K page: Maximum length is 1841 bytes.
16K page: Maximum length is 3889 bytes.
32K page: Maximum length is 7985 bytes.
In the case of indexing the zip code node in the LOAN_APPLICATION table, both the DOUBLE and VARCHAR data types can be used. Note that you can create two XML indexes on the same xmlpattern if they have different SQL data types.
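The page-size limits above lend themselves to a small lookup. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that string values are measured as UTF-8 bytes. It is not part of any DB2 API.

```python
# Maximum n for VARCHAR(n) in an XML index, keyed by page size in bytes.
# The four values are copied from the list above.
MAX_VARCHAR_INDEX_LENGTH = {
    4096: 817,
    8192: 1841,
    16384: 3889,
    32768: 7985,
}


def max_varchar_index_length(page_size: int) -> int:
    """Return the largest n allowed in VARCHAR(n) for the given page size."""
    try:
        return MAX_VARCHAR_INDEX_LENGTH[page_size]
    except KeyError:
        raise ValueError(f"unsupported page size: {page_size}") from None


def fits_in_index(value: str, n: int) -> bool:
    """True if the value, encoded as UTF-8 (an assumption), is at most n bytes."""
    return len(value.encode("utf-8")) <= n


print(max_varchar_index_length(4096))  # 817
print(fits_in_index("33333", 5))       # True
print(fits_in_index("333333", 5))      # False
```

As an arithmetic observation only (not a documented formula), each of the four listed limits equals the page size divided by 4, minus 207.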
VARCHAR HASHED
VARCHAR HASHED is used to handle indexing of character strings with arbitrary lengths. The VARCHAR HASHED data type can be used in the following cases:
If the length of the character string values to be indexed is unknown.
When you cannot use VARCHAR(n) because the character string to be indexed exceeds the maximum length n that can be specified for the page size on which the index is based.
In this case, the system generates an eight-byte hash code over the entire string, and there is no limit on the length of the indexed string. Note that range scans cannot be performed if you specify the VARCHAR HASHED data type, because the index contains hash codes instead of the actual character data. Indexes using hashed character strings can be used only for equality lookups. Following is an example using VARCHAR HASHED:
CREATE INDEX zipindexvh ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address' AS SQL VARCHAR HASHED;
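To see why a hashed index supports equality lookups but not range scans, consider the following Python sketch. It is a toy model under stated assumptions: BLAKE2b stands in for the hash function, which the text does not name, and the dictionary stands in for the B-tree. All function names are hypothetical.

```python
import hashlib


def hash8(value: str) -> bytes:
    """Eight-byte hash over the entire string (BLAKE2b is a stand-in, not DB2's hash)."""
    return hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()


def build_hashed_index(rows):
    """Map hash code -> row IDs. The original strings are not kept in the index."""
    index = {}
    for rid, value in rows:
        index.setdefault(hash8(value), []).append(rid)
    return index


def equality_lookup(index, value):
    """Equality works: hash the search value and probe the index with it."""
    return index.get(hash8(value), [])


rows = [(1, "33 AKEBONO"), (2, "46 East Main Street"), (3, "1-1-1 AINOKAWA")]
index = build_hashed_index(rows)
print(equality_lookup(index, "33 AKEBONO"))
```

Because only hash codes are stored, the sort order of the keys says nothing about the sort order of the original strings, so a predicate such as "greater than" cannot be answered from the index. Two different strings can also share a hash code, so a real system would have to recheck candidates against the document.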
DATE
DATE data type values are normalized to Coordinated Universal Time (UTC), also known as Zulu time, before being stored in the index. Note that the XML schema data type for DATE allows greater precision than the SQL data type. If an out-of-range value is encountered, an error is returned. Following is an example using the DATE data type:
CREATE INDEX birthdate ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/DateOfBirth' AS SQL DATE;
TIMESTAMP
TIMESTAMP data type values are normalized to UTC (Zulu time) before being stored in the index. If XML documents have node values as shown in the following example, you can use the TIMESTAMP data type to index the node.
<date>2006-08-23T07:21:00.000000Z</date>
Unique index
You can create a unique index over an XML column. Note that the node values matched by the xmlpattern in your CREATE INDEX statement must be unique, not only within one XML document, but also across all the XML documents stored in the XML column. In our example, you can create a unique index for SSN because SSN is supposed to be unique. Following is the CREATE INDEX statement:
CREATE UNIQUE INDEX ssnindex ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/SSN' AS SQL VARCHAR(11);
The following statement fails because the zip code is not unique:
CREATE UNIQUE INDEX ZIPINDEXD ON LOAN_APPLICATION(APPL_DOC) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL DOUBLE;
and has the indextype XVIP. The logical index is always created and assigned an index ID first. The physical index is created immediately afterwards and is assigned the next consecutive index ID.
SYSCAT.INDEXES
As with relational indexes, index information for the XML indexes is stored in SYSCAT.INDEXES. Even though the XML column path index and the XML regions index are system-created indexes, they are visible in SYSCAT.INDEXES. Four new index types have been added to SYSCAT.INDEXES:
XVIL: Index on an XML column (logical)
XVIP: Index on an XML column (physical)
XPTH: XML paths index
XRGN: XML regions index
Here, we create an XML index, APPLNAME, and see what information DB2 stores in SYSCAT.INDEXES.
CREATE INDEX applname ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Name' AS SQL VARCHAR(32);
After creating that index, issue this SELECT statement to retrieve XML index information.
SELECT indname, TABNAME, INDEXTYPE FROM SYSCAT.INDEXES WHERE TABNAME='LOAN_APPLICATION';
The result would be similar to Example 5-2.
Example 5-2 SELECT statement output
INDNAME              TABNAME    INDEXTYPE
-------------------- ---------- ---------
SQL060821123323580   LOAN_APPLI XRGN
SQL060821123323700   LOAN_APPLI XPTH
APPLNAME             LOAN_APPLI XVIL
SQL060821154539000   LOAN_APPLI XVIP
4 record(s) selected.
The following index types are listed in the INDEXTYPE column:
XRGN: XML regions index
XPTH: XML column path index
XVIL: Logical index for APPLNAME
XVIP: Physical index for APPLNAME
The regions index and XML column path index are created by DB2 9 automatically when the LOAN_APPLICATION table is created. The logical and physical indexes are created when the CREATE INDEX statement is executed successfully. Actual index values are stored in the physical index.
SYSCAT.INDEXXMLPATTERNS
In SYSCAT.INDEXXMLPATTERNS, you can find the name, data type, length, and xmlpattern of the XML indexes that you have created. For the following query:
SELECT indname, pindname, datatype, length, pattern FROM syscat.indexxmlpatterns
The result might be similar to this example:
INDNAME  PINDNAME           DATATYPE LENGTH PATTERN
-------- ------------------ -------- ------ --------------------------
APPLNAME SQL060821154539000 VARCHAR  32     /Application/Customer/Name
db2dart
The DB2 database analysis and reporting tool db2dart can be used to examine the architectural correctness of a database and the objects within it. You can also use this tool to see what values are stored in XML indexes. To check an XML index using db2dart, you have to provide the index object ID and the ID of the table space where the XML index is stored. Example 5-3 shows the query to find the table space ID.
Example 5-3 Get table space ID
SELECT tabname, tbspaceid FROM syscat.tables WHERE tabname = 'LOAN_APPLICATION'

TABNAME             TBSPACEID
------------------- ---------
LOAN_APPLICATION            2

The index object ID can be obtained from the INDEX_OBJECTID column of the syscat.indexes catalog table using the query shown in Example 5-4.
Example 5-4 Get index object ID
SELECT indname, index_objectid FROM syscat.indexes WHERE indname in ('APPLNAME')

INDNAME   INDEX_OBJECTID
--------- --------------
APPLNAME               4
Now that you know the table space ID and index object ID, you are ready to run db2dart. Specify the database name XMLRB, table space ID 2 for the /tsi option, and index object ID 4 for the /oi option as follows:
db2dart XMLRB /di /tsi 2 /oi 4 /ps 0 /np 10000 /v y
The options used are:
/di: dump formatted index data
/tsi: table space ID
/oi: object ID
/ps: page number to start dumping
/np: number of pages
/v: verbose option
Example 5-5 shows a clip of the output generated by db2dart. You can see the structure of the XML index APPLNAME and the value stored in APPLNAME.
Example 5-5 Output from db2dart
Key 1:
Offset Location = 3546 (xDDA)
Record Length = 39 (x27)
Key Part 1: Long Integer Value = 105
Key Part 2: Variable Length Character String
  Actual Length = 10
  49636869 726F4F68 7461    IchiroOhta
Key Part 3: Big Integer Value = 22799473113613056
Key Part 4: Variable Length Binary String
  Actual Length = 2
  2222    ""
Key Part 5: Fixed Length Character String
  00    .
Key Part 6: Fixed Length Character String
  31    1
Table RID: x(0000 0083 000B) r(00000083;000B) d(131;11) ridFlags=x2 Punc
db2cat
The system catalog analysis command db2cat can analyze the contents of packed descriptors. Given a database name, schema name, and table name, this command queries the system catalogs for table information and formats the results. When you use RUNSTATS, the same index statistics (such as nlevels and nleafs) are collected for XML indexes as for relational indexes. You can use db2cat to check the statistics collected for an XML column. db2cat has these options:
-d: database name
-s: schema name
-n: table name
-o: output file
Example 5-6 shows the db2cat command and part of the output it generates, with information about the XML column statistics of the LOAN_APPLICATION table.
Example 5-6 A part of the output file from db2cat
db2cat -d XMLRB -s db2admin -n loan_application -o db2catoutput.txt

++++++++++++++++++++++++++++++++++++++++
XML column statistics
++++++++++++++++++++++++++++++++++++++++
Column ID = 1
No. NULL XML docs = 0
No. non-NULL XML docs = 102
----------------------------------------
Catch All Pathid Bucket
----------------------------------------
Distinct Pathid count = 44
Sum Node Counts = 4794
Sum Doc Counts = 4488
----------------------------------------
Top-k Pathid node counts
----------------------------------------
Max no. of path counts = 44
Cur no. of path counts = 44
Cnt( /root()/Application/Customer/Phone ) = 204
Cnt( /root()/Application/Customer/Phone/type ) = 204
.......................................
----------------------------------------
PathID = /root()/Application/Customer/Address/Street/text()
Distinct Value Cnt = 3
2nd Highest Key = 19:46 East Main Street
2nd Lowest Key = 14:1-1-1 AINOKAWA
Sum Node Cnt = 102
Sum Doc Cnt = 102
XSCAN
DB2 uses the XSCAN operator to traverse XML document trees and, if required, to evaluate predicates and extract document fragments and values. XSCAN can appear in an execution plan after a base table scan to process each of the documents retrieved from a table. For example, if you want to search for XML documents that have Ichiro in the FirstName element, the XQuery can be similar to this:
db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer/Name[FirstName="Ichiro"]
If there is no index on the FirstName element, DB2 9 has to read all of the XML documents stored in the APPL_DOC column and check whether each one satisfies the XPath condition. Figure 5-4 shows how DB2 9 gets results for this XQuery. DB2 first accesses the LOAN_APPLICATION table to get the XML column value of the first row and then accesses the XML regions index to get the address of the first node of the first XML document. DB2 then traverses nodes from the first node to see if the document satisfies the condition /Application/Customer/Name[FirstName="Ichiro"]. This continues until DB2 finds the XML document that has a customer with the first name Ichiro. This might be the last document in the table, as in our example.
[Figure 5-4: the query db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer/Name[FirstName="Ichiro"] against the LOAN_APPLICATION table (columns APPL_ID, APPL_DOC, APPL_STATUS; APPL_ID 1 to 102), with documents DOC1, DOC2, ..., Doc101, Doc102]
Figure 5-4 Logical model for accessing XML column without an index
XISCAN
XISCAN scans an XML index. This operator is likely to be used if a suitable XML index for the XQuery has already been created. Assume that an XML index APPLFIRSTNAME for the FirstName element has been created using the following command:
CREATE INDEX applfirstname ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Name/FirstName' AS SQL VARCHAR(32)
Now that we have an XML index for the FirstName element, DB2 no longer has to read all XML documents to find the record. Instead, DB2 accesses only the XML documents that are required. Figure 5-5 shows conceptually how DB2 accesses XML documents via XML indexes. When the same query is issued, because the FirstName element has already been indexed, DB2 checks the values in the index APPLFIRSTNAME. Once DB2 finds a matching value, it uses the RID to access the base table descriptor. Through the XML regions index, DB2 then accesses the XML document to construct the result sequence.
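The difference between the two access paths described above can be modeled in a few lines of Python. This is a conceptual sketch only: the dictionary stands in for the APPL_DOC column, the documents are cut down to the FirstName element, the names other than Ichiro are made up, and none of the function names correspond to DB2 internals.

```python
import xml.etree.ElementTree as ET

TEMPLATE = ("<Application><Customer><Name><FirstName>{}</FirstName>"
            "</Name></Customer></Application>")

# Toy stand-in for the XML column: row ID -> document text (sample data).
DOCS = {
    100: TEMPLATE.format("Ippei"),
    101: TEMPLATE.format("Felicity"),
    102: TEMPLATE.format("Ichiro"),
}

# Path relative to the <Application> root element.
PATH = "./Customer/Name/FirstName"


def scan_all(docs, first_name):
    """XSCAN-style access: parse and traverse every stored document."""
    hits = []
    for rid, text in docs.items():
        root = ET.fromstring(text)
        if any(node.text == first_name for node in root.findall(PATH)):
            hits.append(rid)
    return hits


def build_value_index(docs):
    """XISCAN-style structure: FirstName value -> row IDs, built once."""
    index = {}
    for rid, text in docs.items():
        for node in ET.fromstring(text).findall(PATH):
            index.setdefault(node.text, []).append(rid)
    return index


def index_lookup(index, first_name):
    """Probe the index; only the returned row IDs need to be fetched."""
    return index.get(first_name, [])


print(scan_all(DOCS, "Ichiro"))                         # [102]
print(index_lookup(build_value_index(DOCS), "Ichiro"))  # [102]
```

Both functions return the same rows. The scan touches every document on each query, whereas the lookup touches only the index entry for the requested value.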
[Figure 5-5: the query db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer/Name[FirstName="Ichiro"] against the LOAN_APPLICATION table (columns APPL_ID, APPL_DOC, APPL_STATUS). The index can tell that Doc102 has Ichiro as its FirstName node value, so row 102 is reached through the regions index; Doc100, Doc101, and Doc102 are shown]
Figure 5-5 Logical model for accessing XML column with an index
XANDOR
The XANDOR operator merges multiple results from the XISCAN operators. In DB2 9, XANDOR supports only ANDing. Suppose that you also define an XML index for the Zip element:
CREATE INDEX applzip ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Address/Zip' AS SQL VARCHAR(5)
Now you have two XML indexes on the LOAN_APPLICATION table: APPLFIRSTNAME for the FirstName element and APPLZIP for the Zip element. DB2 probably uses the XANDOR operator for a query similar to this:
XQUERY for $i in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer[Name/FirstName="Ichiro" and Address/Zip="33333"]/Name return $i
The index APPLFIRSTNAME is used for resolving the condition /Application/Customer/Name[FirstName="Ichiro"] and the index APPLZIP is used for resolving the condition /Application/Customer/Address[Zip="33333"]. XANDOR is used to merge the input from both XML indexes.
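Conceptually, ANDing two index scans amounts to intersecting the row IDs that each scan returns. The Python sketch below illustrates only that idea. The index contents are hypothetical sample data, and the function is not a description of how DB2 implements XANDOR.

```python
def xandor(*rid_lists):
    """AND together the row-ID lists produced by several index scans."""
    if not rid_lists:
        return []
    common = set(rid_lists[0])
    for rids in rid_lists[1:]:
        common &= set(rids)
    return sorted(common)


# Hypothetical index contents: node value -> row IDs.
APPLFIRSTNAME = {"Ichiro": [7, 102], "Ippei": [55]}
APPLZIP = {"33333": [3, 102], "22222": [55, 60]}

matches = xandor(APPLFIRSTNAME.get("Ichiro", []), APPLZIP.get("33333", []))
print(matches)  # [102]
```

Only row 102 appears in both lists, so only that document has to be fetched and checked.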
db2exfmt
To get the access plan, you have to create Explain tables. DB2 provides a script, EXPLAIN.DDL, to create Explain tables. This script is in the directory %SQLLIB%\misc. To run the script, issue the following command from the DB2 CLP:
db2 -tvf EXPLAIN.DDL
If you want to follow our example using the data we provide, drop the XML indexes created in the previous section and refresh the table statistics using the following commands:
DROP INDEX APPLFIRSTNAME;
DROP INDEX APPLZIP;
RUNSTATS ON TABLE DB2ADMIN.LOAN_APPLICATION;
Example 5-7 shows the formatted output. Because the XSCAN operator is chosen, all XML documents in the XML column are read to check whether they match the XPath specified in the XQuery or SQL/XML statement.
Example 5-7 Access plan where XSCAN operator is used
Access Plan:
-----------
        Total Cost:             805.299
        Query Degree:           1

              Rows
             RETURN
             (   1)
              Cost
               I/O
               |
               1
             NLJOIN
             (   2)
             805.299
               106
            /--+--\
         102     0.00980392
       TBSCAN      XSCAN
       (   3)      (   4)
       30.8137     7.59299
          4           1
          |
         102
   TABLE: DB2ADMIN
   LOAN_APPLICATION
Access Plan:
-----------
        Total Cost:             15.3607
        Query Degree:           1

                 Rows
                RETURN
                (   1)
                 Cost
                  I/O
                  |
                  1
                NLJOIN
                (   2)
                15.3607
                  2
                /-+-\
              1       1
           FETCH    XSCAN
           (   3)   (   7)
           7.76775  7.59299
              1        1
         /----+---\
        1          102
     RIDSCN    TABLE: DB2ADMIN
     (   4)    LOAN_APPLICATION
     0.175903
        0
        |
        1
      SORT
     (   5)
     0.173137
        0
        |
        1
     XISCAN
     (   6)
     0.166633
        0
        |
       102
  XMLIN: DB2ADMIN
  APPLFIRSTNAME
db2 set current explain mode yes;
db2 XQUERY for $i in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer[Name/FirstName="Ichiro" and Address/Zip="33333"]/Name return $i;
db2 set current explain mode no;
db2exfmt -d xmlrb -o plan3.txt -1;
Example 5-10 shows the access plan of this query. The access plan shows that the two defined indexes, APPLFIRSTNAME and APPLZIP, are used. The XANDOR operator merges the output of the two XISCAN operators to produce the result set.
Example 5-10 Access plan where XANDOR operator is used
Access Plan:
-----------
        Total Cost:             8.0112
        Query Degree:           1

                        Rows
                       RETURN
                       (   1)
                        Cost
                         I/O
                         |
                     0.00970874
                       NLJOIN
                       (   2)
                       8.0112
                       1.0098
                      /--+--\
            0.00980392     0.990291
               FETCH         XSCAN
              (   3)         (   9)
              0.418209       7.59299
             0.00980392         1
            /----+----\
    0.00980392         102
      RIDSCN       TABLE: DB2ADMIN
      (   4)       LOAN_APPLICATION
      0.341843
         0
         |
    0.00980392
       SORT
      (   5)
      0.339077
         0
         |
    0.00980392
      XANDOR
      (   6)
      0.333267
         0
     /-----+-----\
    1             1
  XISCAN        XISCAN
  (   7)        (   8)
  0.166633      0.166633
    0             0
    |             |
   102           102
XMLIN: DB2ADMIN  XMLIN: DB2ADMIN
APPLFIRSTNAME    APPLZIP
Visual Explain
Visual Explain allows you to view the access plan for the explained SQL or XQuery statements as a graph. Using Visual Explain is probably the easiest way to get an access plan. If the Explain tables do not exist, Visual Explain automatically creates them for you. Visual Explain can be invoked from the Command Editor by entering the query for which you want to see an access plan and clicking the access plan icon. See Figure 5-6.
Figure 5-7 shows an access plan graph on Visual Explain. If you click any box on the pane, you can see further information.
CASE 1: Predicate
An XML index can be used if the XML index contains the query predicate, that is, if the index is equally or less restrictive than the predicate.
Consider the following index and query on the LOAN_APPLICATION table:
CREATE INDEX applname ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Name' AS SQL VARCHAR(32)
XQUERY for $i in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer/Name[FirstName="Ichiro"] return $i
In this example, the XML index is on the Name element, whose value is the concatenation of the node values of the FirstName and LastName elements, for example IchiroOhta. The XQuery cannot use the index APPLNAME because the query predicate /Application/Customer/Name[FirstName="Ichiro"] does not match the string node values in the XML index. For the index to be usable, it must include the FirstName element. Defining the XML index in either of the following ways allows DB2 to use it:
CREATE INDEX applname ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Name/FirstName' AS SQL VARCHAR(32)
CREATE INDEX applname ON loan_application(appl_doc) GENERATE KEY USING XMLPATTERN '/Application/Customer/Name/*' AS SQL VARCHAR(32)
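The "equally or less restrictive" rule can be illustrated with a deliberately simplified matcher. The Python sketch below handles only child steps and the * wildcard; it rejects // and ignores namespaces and attributes, so it is an approximation for the three patterns discussed here, not the eligibility test that DB2 performs. The function names are hypothetical.

```python
def steps(path: str):
    """Split '/a/b/c' into ['a', 'b', 'c']."""
    return [s for s in path.split("/") if s]


def pattern_covers(xmlpattern: str, node_path: str) -> bool:
    """True if each step of node_path is matched by the same step of xmlpattern."""
    if "//" in xmlpattern or "//" in node_path:
        raise ValueError("descendant steps (//) are outside this sketch")
    p, n = steps(xmlpattern), steps(node_path)
    return len(p) == len(n) and all(a == "*" or a == b for a, b in zip(p, n))


# The node that the predicate [FirstName="Ichiro"] compares.
queried = "/Application/Customer/Name/FirstName"

print(pattern_covers("/Application/Customer/Name", queried))            # False
print(pattern_covers("/Application/Customer/Name/FirstName", queried))  # True
print(pattern_covers("/Application/Customer/Name/*", queried))          # True
```

The first pattern stops at Name, so it indexes a different node than the one the predicate tests. The other two patterns reach FirstName, either exactly or through the wildcard.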
SELECT l.appl_id FROM loan_application l WHERE XMLEXISTS('$i/Application/Customer/Name[FirstName = "Ippei"]' PASSING l.appl_doc AS "i") AND XMLEXISTS('$i/Application/Customer/Address[Zip = "22222"]' PASSING l.appl_doc AS "i");
SELECT l.appl_id FROM loan_application l WHERE XMLEXISTS('$i/Application/Customer[Name/FirstName="Ippei"]/Address[Zip="22222"]' PASSING l.appl_doc AS "i");
Note that the two XQueries in Example 5-12 cost almost the same.
Example 5-12 Two XQueries
XQUERY for $i in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application where $i/Customer/Name/FirstName = "Ippei" and $i/Customer/Address/Zip = "22222" return $i/Customer/Name;
XQUERY for $i in db2-fn:xmlcolumn('LOAN_APPLICATION.APPL_DOC')/Application/Customer[Name/FirstName="Ippei" and Address/Zip="22222"]/Name return $i
CREATE INDEX APPLALL ON LOAN_APPLICATION(APPL_DOC) GENERATE KEY USING XMLPATTERN '//*' AS SQL VARCHAR HASHED
[Figure residue: index entries for the paths /Application/Customer, /Application/LoanType, /Application/Campaign, ... with the values 2, 102, 102, ...]
<?xml version="1.0"?> <Application> <Customer> <Name> <FirstName>Ichiro</FirstName> <LastName>Ohta</LastName> </Name> <DateOfBirth>2/11/1999</DateOfBirth> <SSN>111-33-3627</SSN> <Address country="JP"> <Street>33 AKEBONO</Street> <City>Takatushi-shi</City> <State>Osaka</State> <Zip>33333</Zip> </Address> <Phone type="work">201-999-9646</Phone> <Phone type="home">039-999-0251</Phone> <Email>[email protected]</Email> <Employer> <Company>My company3</Company> <Position>Developer</Position> </Employer> <FinancialData> <Income>76800.00</Income> <Debt>44500.00</Debt> <Expenses>40000.00</Expenses> <Assets>1400.00</Assets> </FinancialData> </Customer> <LoanType>0</LoanType> <Campaign>1</Campaign> </Application>
<xs:import namespace="http://www.itso.org/cat" schemaLocation="cat.xsd" />
<xs:import namespace="http://www.itso.org/dog" schemaLocation="dog.xsd" />
<xs:element name="PETS">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="DOG" type="do:DOG"/>
      <xs:element name="CAT" type="ca:CAT"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>
Example 5-14 XML schema cat.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.itso.org/cat">
  <xsd:complexType name="CAT">
    <xsd:sequence>
      <xsd:element name="NAME" type="xsd:string" />
      <xsd:element name="AGE" type="xsd:integer" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
Example 5-15 XML schema dog.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.itso.org/dog">
  <xsd:complexType name="DOG">
    <xsd:sequence>
      <xsd:element name="NAME" type="xsd:string" />
      <xsd:element name="AGE" type="xsd:integer" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
The schema pets.xsd by itself is not sufficient to validate a document containing the elements DOG and CAT. In order to validate such a document, you also have to register dog.xsd and cat.xsd. Use the following steps to register a schema:
1. Register the primary schema. The information required to register a primary schema document is:
The fully-qualified file name for the primary XML schema document: The fully-qualified name means the file name plus the path to where the XML schema document is located. The fully-qualified file name is used in the REGISTER XMLSCHEMA command.
The SQL identifier: You can choose any valid SQL two-part name to identify the XML schema. You should choose a meaningful name. The SQL identifier is used for explicitly validating XML instance documents. You also require the SQL identifier when you drop the registered XML schema.
Some additional information, such as the schema location of the primary schema document, can also be used in registration. As its name suggests, a schema location indicates where a schema document is located. The schema location can be a URL, an FTP address, or a fully-qualified file name on the local machine. The schema location is stored in the catalog table after registration. The information can be used for implicitly validating an XML instance document. Example 5-16 shows the command to register the primary schema document pets.xsd.
Example 5-16 Registering a primary schema document
register xmlschema http://sample from c:\pets.xsd as sample.pets
In this example, c:\pets.xsd is the fully-qualified file name: pets.xsd is located in the root directory of the C drive on the local machine. The schema location used is the URL http://sample. The SQL identifier is sample.pets.
2. Add schema documents. You must add all the XML schema documents that the primary XML schema document directly or indirectly imports or includes. You can skip this step if the primary XML schema document has no imports or includes. The schema location must match the schemaLocation attribute from the import or include declaration in the importing or including schema document. Example 5-17 shows the command to add the XML schema documents cat.xsd and dog.xsd.
add xmlschema document to sample.pets add cat.xsd from c:\cat.xsd
add xmlschema document to sample.pets add dog.xsd from c:\dog.xsd
In this example, sample.pets is the SQL identifier of the registered primary XML schema. cat.xsd and dog.xsd are the schema locations that match the schemaLocation attributes of the import tags in the primary XML schema document pets.xsd. c:\cat.xsd and c:\dog.xsd are the fully-qualified file names on the local machine. No specific order is required when you add XML schema documents. In the example, we could have added dog.xsd first, then cat.xsd.
3. Complete the registration. After you register the primary schema document and add all the involved schema documents, you can complete the registration process. This step checks the schemaLocation values in the import and include tags in the XML schema documents against the schema locations provided by the add xmlschema document commands. If there is any mismatch, the completion fails with an error. This step also checks whether the XML schema documents are well-formed. If one or more schema documents are not well-formed, the completion also fails with an error. Example 5-18 shows the command to complete the registration.
Example 5-18 Completing schema with success
complete xmlschema sample.pets
In the example above, sample.pets is the SQL identifier we used to register the primary XML schema document pets.xsd. Example 5-19 is an example of a completion that failed with an error.
Example 5-19 Completing schema fails with error
complete xmlschema sample.pets
SQL20329N The completion check for the XML schema failed because one or more XML schema documents is missing. One missing XML schema document is identified by "NAMESPACE" as "http://www.itso.org/cat". SQLSTATE=428GI
In this example, we did not add cat.xsd, so the registration failed with error code SQL20329N.
After an XML schema is registered in the XML schema repository (XSR), it is an XSR object. You can remove an XSR object by dropping it. The schema repository does not treat each XML schema document as a separate object. When you drop an XML schema, all XML schema documents that belong to the XML schema are dropped.
If one or more of the XML schema documents have to be changed, you must drop the XML schema and recreate the XSR object with the new XML schema documents. You can drop an XML schema whether it is completed or not. Example 5-20 is an example of dropping an XML schema.
Example 5-20 Dropping a schema
drop xsrobject sample.pets
In this example, sample.pets is the SQL identifier that we used previously to register the primary XML schema document.
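The schemaLocation matching performed in step 3 of the registration procedure can be approximated outside the database before you run the commands. The Python sketch below is an illustration only. It inspects the direct imports and includes of a single schema document, whereas the real check also follows indirect imports and verifies well-formedness. The function names are hypothetical, and the PETS_XSD text is a cut-down stand-in for pets.xsd.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def imported_locations(schema_text: str):
    """Return the schemaLocation of every xs:import and xs:include in a schema."""
    root = ET.fromstring(schema_text)
    found = []
    for tag in ("import", "include"):
        for node in root.iter(XS + tag):
            location = node.get("schemaLocation")
            if location is not None:
                found.append(location)
    return found


def missing_documents(primary_text: str, added_locations):
    """Locations that the primary schema references but that were never added."""
    added = set(added_locations)
    return [loc for loc in imported_locations(primary_text) if loc not in added]


# Cut-down stand-in for pets.xsd (only the import declarations matter here).
PETS_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.itso.org/pets">
  <xs:import namespace="http://www.itso.org/cat" schemaLocation="cat.xsd"/>
  <xs:import namespace="http://www.itso.org/dog" schemaLocation="dog.xsd"/>
</xs:schema>"""

print(missing_documents(PETS_XSD, ["dog.xsd"]))             # ['cat.xsd']
print(missing_documents(PETS_XSD, ["cat.xsd", "dog.xsd"]))  # []
```

The first call mirrors the failing case in Example 5-19, where cat.xsd was not added. The second mirrors the successful completion.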
Column name      Type schema  Type name  Length  Scale  Nulls
---------------  -----------  ---------  ------  -----  -----
OBJECTID         SYSIBM       BIGINT          8      0  No
OBJECTSCHEMA     SYSIBM       VARCHAR       128      0  No
OBJECTNAME       SYSIBM       VARCHAR       128      0  No
TARGETNAMESPACE  SYSIBM       VARCHAR      1001      0  Yes
SCHEMALOCATION   SYSIBM       VARCHAR      1001      0  Yes
OBJECTINFO       SYSIBM       XML             0      0  Yes
OBJECTTYPE       SYSIBM       CHARACTER       1      0  No
OWNER            SYSIBM       VARCHAR       128      0  No
CREATE_TIME      SYSIBM       TIMESTAMP      10      0  No
ALTER_TIME       SYSIBM       TIMESTAMP      10      0  No
STATUS           SYSIBM       CHARACTER       1      0  No
DECOMPOSITION    SYSIBM       CHARACTER       1      0  No
REMARKS          SYSIBM       VARCHAR       254      0  Yes
Suppose we require information about the target namespace, schema location, object schema, and object name. Example 5-22 shows the query and its output. Each row in the view SYSCAT.XSROBJECTS represents an XML schema.
Example 5-22 query view SYSCAT.XSROBJECTS
3 record(s) selected.
If you require information about XML schema documents, you can query the system catalog SYSCAT.XSROBJECTCOMPONENTS. Example 5-23 shows the table description of the view SYSCAT.XSROBJECTCOMPONENTS.
Example 5-23 Query view SYSCAT.XSROBJECTCOMPONENTS
Column name      Type schema  Type name  Length    Scale  Nulls
---------------  -----------  ---------  --------  -----  -----
OBJECTID         SYSIBM       BIGINT            8      0  No
OBJECTSCHEMA     SYSIBM       VARCHAR         128      0  No
OBJECTNAME       SYSIBM       VARCHAR         128      0  No
COMPONENTID      SYSIBM       BIGINT            8      0  No
TARGETNAMESPACE  SYSIBM       VARCHAR        1001      0  Yes
SCHEMALOCATION   SYSIBM       VARCHAR        1001      0  Yes
COMPONENT        SYSIBM       BLOB       31457280      0  No
CREATE_TIME      SYSIBM       TIMESTAMP        10      0  No
STATUS           SYSIBM       CHARACTER         1      0  No
Example 5-24 shows the query of SYSCAT.XSROBJECTCOMPONENTS and its output. Unlike SYSCAT.XSROBJECTS, each row in the view SYSCAT.XSROBJECTCOMPONENTS represents an XML schema document.
Example 5-24 Query view SYSCAT.XSROBJECTCOMPONENTS
select TARGETNAMESPACE, SCHEMALOCATION, OBJECTSCHEMA, OBJECTNAME FROM SYSCAT.XSROBJECTCOMPONENTS

TARGETNAMESPACE             SCHEMALOCATION  OBJECTSCHEMA  OBJECTNAME
--------------------------  --------------  ------------  ----------
http://www.itso.org/sample  http://sample   SAMPLE        ORDER
http://person               http://person   JOHN          PERSON
http://www.itso.org/dog     dog.xsd         SAMPLE        PETS
http://www.itso.org/pets    http://sample   SAMPLE        PETS

4 record(s) selected.
You can create a filter to hide the schemas that you do not want to see. Go to Selected > Filter > Create. A pop-up window opens, as shown in Figure 5-10. You can filter schemas based on the predicates that you set on XML Artifact Name, Schema Name, Target Namespace, Type, and Comment.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="item">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="quantity" type="xsd:integer"/>
      <xsd:element name="price" type="xsd:integer"/>
    </xsd:sequence>
    <xsd:attribute name="weight" type="xsd:integer"/>
    <xsd:attribute name="color" type="xsd:string"/>
  </xsd:complexType>
  <xsd:element name="order" >
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="order" type="item" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

The chain store is expanding and starting to offer free store memberships to its customers. Store members receive special discounts on some items. The chain store decides to let customers know how much they save when buying as store members: customers will see the savings on each purchased item in their receipts. The schema order.xsd must be changed to reflect the new business rules. This kind of change is called schema evolution.

Example 5-26 shows the new XML Schema Definition, order_new.xsd. The order_new.xsd is compatible with the order.xsd in Example 5-25 on page 206. The change adds a new element, discount. The element discount has the attribute minOccurs set to 0 because not every item has a discount price.
Example 5-26 order_new.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="item">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="quantity" type="xsd:integer"/>
      <xsd:element name="price" type="xsd:integer"/>
      <xsd:element name="discount" type="xsd:integer" minOccurs="0"/>
    </xsd:sequence>
    <xsd:attribute name="weight" type="xsd:integer"/>
    <xsd:attribute name="color" type="xsd:string"/>
  </xsd:complexType>
  <xsd:element name="order" >
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="order" type="item" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

You should register the new schema file order_new.xsd. The application should then validate new document instances against the newly registered schema. If you do not want to change the application and you use explicit validation, you can drop the old schema and register the new schema with information that is identical to that of the old schema.
5.3.1 IMPORT
The IMPORT utility moves data into DB2 database tables. In DB2 9, the IMPORT utility was enhanced to support the XML data type, so you can use it to insert one or more XML data files into DB2 relational tables that have XML data type columns.

You can only import well-formed XML documents, because columns defined with the XML data type can only contain complete XML documents. If a row in the data file refers to a document that is not well-formed, DB2 rejects that row. The IMPORT utility treats an XML document as Unicode unless the document contains a declaration tag that specifies the encoding attribute.

XML data can be imported into a DB2 table with or without XML schema validation. Without XML schema validation, IMPORT inserts well-formed XML documents into the database without checking whether the data in the documents is valid.

In this section, we show by example how to import data into XML columns with and without validating the XML data, and we discuss the new IMPORT utility options in detail. Our examples use a CUSTOMER table, which can be created using the following command:

    CREATE TABLE customer(id INT NOT NULL PRIMARY KEY, name VARCHAR(20), contact_info XML);
<?xml version="1.0"?>
<ContactInfo>
  <Address>
    <Street>22 Willow Street </Street>
    <City>Los Gatos </City>
    <State>CA </State>
    <Zip>95030</Zip>
  </Address>
  <Phone>
    <work>408-677-8888 </work>
    <home>408-588-9999 </home>
    <mobile>408-345-6666 </mobile>
  </Phone>
</ContactInfo>

Example 5-28 shows the import data file customer.del that we prepared. The XDS specifies the document file name.
Example 5-28 Delimited ASCII file for input to DB2 IMPORT
FIL is the name of the system file in which the XML document is stored. This attribute is required.
OFF is the byte offset of the XML data in the file specified in the FIL attribute. The offset starts from zero.
LEN is the length of the XML data in the file specified in the FIL attribute.
SCH is the fully qualified XML schema name that is used for validating the XML documents.
For each row in the delimited input data file, the number of XDSs must be equal to or less than the number of XML columns in the table. Our sample table has one XML column, so each row can have zero or one XDS. If a row in the data file has two or more XDSs, that row is rejected by the DB2 IMPORT utility. Example 5-29 shows an import data file with two XDSs in each row, for a table with two XML columns.
Example 5-29 Input file with multiple XDSs

10000,"Sarah Young","<XDS FIL='contactInfo1.xml'/>","<XDS FIL='file1.xml'/>"
10002,"Jadan Phillips","<XDS FIL='contactInfo2.xml'/>","<XDS FIL='file2.xml'/>"
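To make the XDS rule concrete, the following Python sketch reads one row of a delimited data file, collects the XDS attributes, and compares the number of XDSs with the number of XML columns. It runs outside DB2 and is only an illustration: the function name parse_xds_row and the parameter xml_column_count are hypothetical names of our own, and the IMPORT utility performs its own checks internally.

```python
import csv
import xml.etree.ElementTree as ET


def parse_xds_row(line, xml_column_count):
    """Return the XDS attribute dictionaries found in one DEL row.

    Hypothetical helper for illustration only; it is not part of DB2.
    Raises ValueError when the row has more XDSs than XML columns.
    """
    # DEL rows are comma-separated with double-quoted character fields.
    fields = next(csv.reader([line], delimiter=",", quotechar='"'))
    specifiers = []
    for field in fields:
        if field.lstrip().startswith("<XDS"):
            # An XDS is itself a small XML element, so an XML parser
            # can read its FIL, OFF, LEN, and SCH attributes.
            specifiers.append(dict(ET.fromstring(field).attrib))
    if len(specifiers) > xml_column_count:
        raise ValueError("row has more XDSs than the table has XML columns")
    return specifiers


row = ('10000,"Sarah Young","<XDS FIL=\'contactInfo1.xml\'/>",'
       '"<XDS FIL=\'file1.xml\'/>"')
print(parse_xds_row(row, xml_column_count=2))
```

Calling the same function with xml_column_count=1 raises ValueError, which mirrors the rejection of a row that carries more XDSs than the table has XML columns.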
Importing data
With the data file, XML source file, and the database table created, we can import the data using the new XML FROM option, as shown in the following IMPORT command:

    IMPORT FROM "C:\XmlRedbook\PROG\import\customer.del" OF DEL
      XML FROM "C:\XmlRedbook\PROG\import"
      INSERT INTO db2admin.customer;
If you have XML files that reside in more than one location, you can tell the IMPORT command to look in more than one path, as follows:

    IMPORT FROM "C:\XmlRedbook\PROG\import\customer.del" OF DEL
      XML FROM "C:\XmlRedbook\PROG\import", "C:\XmlRedbook\data\xml\"
      INSERT INTO db2admin.customer;

The sample output of the DB2 IMPORT command is shown in Example 5-30.
Example 5-30 IMPORT without validation sample output
IMPORT FROM "C:\XmlRedbook\PROG\import\customer.del" OF DEL XML FROM "C:\XmlRedbook\PROG\import" INSERT INTO db2admin.customer

SQL3109N The utility is beginning to load data from file "C:\XmlRedbook\PROG\import\customer.del".

SQL3110N The utility has completed processing. "1" rows were read from the input file.

SQL3221W ...Begin COMMIT WORK. Input Record Count = "1".

SQL3222W ...COMMIT of any database changes was successful.

SQL3149N "1" rows were processed from the input file. "1" rows were successfully inserted into the table. "0" rows were rejected.

Number of rows read         = 1
Number of rows skipped      = 0
Number of rows inserted     = 1
Number of rows updated      = 0
Number of rows rejected     = 0
Number of rows committed    = 1
SELECT * FROM DB2ADMIN.CUSTOMER WHERE ID=10000

ID          NAME                 CONTACT_INFO
----------- -------------------- --------------------------------------
      10000 Sarah Young          <ContactInfo><Address><Street>22 Willow Street </Street><City>Los Gatos </City><State>CA </State><Zip>95030</Zip></Address><Phone><work>408-677-8888 </work><home>408-588-9999 </home><mobile>408-345-6666 </mobile></Phone></ContactInfo>
If you have XML files that reside in a different path than the one you specified in the IMPORT command, DB2 IMPORT will raise an error SQL3229N with reason code 1 stating that the file name cannot be found.
<?xml version="1.0"?>
<ContactInfo>
  <Address>
    <Street>555 Lincoln Blvd</Street>
    <City>San Jose</City>
    <State>CA</State>
    <Zip>95136</Zip>
  </Address>
  <Phone>
    <work>408-677-8888</work>
    <home>408-588-9900</home>
    <mobile>408-345-7777</mobile>
  </Phone>
</ContactInfo>
XML schema
To validate XML documents during import, you must have an XML schema that specifies the acceptable XML elements, the order of the elements, the minimum and maximum occurrences of elements, data types, and required or optional elements. For our sample XML schema, we use IBM Rational Software Development to generate an XML schema, but you can use any text editor or tool to create an XML schema. IBM Rational Software Development was previously known as WebSphere Application Development Studio.

The schema we create checks every imported XML document for the following information:

For every group of customer information, the order is ID, NAME, then CONTACT_INFO.
For every group of contact information, the order is ADDRESS and then PHONE.
For every group of address information, the order is Street, City, State, and Zip.
For Address, there must be at least one address, and there can be up to two.
For Phone, there must be at least one phone number, and there can be up to three Phone elements per group.
For Zip information, the value must be a five-digit or nine-digit number in one of these formats: xxxxx or xxxxx-yyyy.

For the XML schema contactInfo.xsd, see A.2, contactInfo.xsd on page 366.

Before you can validate against a specific schema, you must register it with DB2. The following command registers contactInfo.xsd with DB2:

    REGISTER XMLSCHEMA 'C:/XmlRedbook/PROG/import/'
      FROM C:\XmlRedbook\PROG\import\contactInfo.xsd
      AS DB2ADMIN.CONTACTINFO;

The fully qualified SQL identifier of the XML schema is used in the XDS to instruct DB2 to validate data using the schema during import. In our example, it is DB2ADMIN.CONTACTINFO.
Following is an IMPORT command example with the new option XMLVALIDATE, which indicates that the data is validated during import:

    IMPORT FROM "C:\Import\DATA\customer2.del" OF DEL
      XML FROM "C:\Import\DATA"
      XMLVALIDATE USING XDS
      INSERT INTO db2admin.customer;

If we modify our contactInfo1.xml document and remove all three phone elements, the modified file looks as shown in Example 5-34.
Example 5-34 Modified contactInfo1.xml file
<?xml version="1.0"?>
<ContactInfo>
  <Address>
    <Street>22 Willow Street </Street>
    <City>Los Gatos </City>
    <State>CA </State>
    <Zip>95030</Zip>
  </Address>
  <Phone>
  </Phone>
</ContactInfo>

We also modified the zip code in the file contactInfo2.xml by adding a dash (-) after the fifth digit. The modified file is shown in Example 5-35.
Example 5-35 Modified contactInfo2.xml
<?xml version="1.0"?>
<ContactInfo>
  <Address>
    <Street>555 Lincoln Blvd</Street>
    <City>San Jose</City>
    <State>CA</State>
    <Zip>95136-</Zip>
  </Address>
  <Phone>
    <work>408-677-8888</work>
    <home>408-588-9900</home>
    <mobile>408-345-7777</mobile>
  </Phone>
</ContactInfo>
Validation against the registered schema DB2ADMIN.CONTACTINFO raises error SQL16123N for contactInfo1.xml, because our schema requires at least one contact phone in an input XML document. For the second file, contactInfo2.xml, we see error SQL16210N, because the input data violates the defined format for the zip code, which is xxxxx or xxxxx-xxxx, where x is a digit from 0 to 9. The output of the IMPORT command is shown in Example 5-36.
Example 5-36 IMPORT command output
------------------------------ Commands Entered ----------------------
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL XMLVALIDATE USING XDS INSERT INTO DB2ADMIN.CUSTOMER;
SELECT * FROM DB2ADMIN.CUSTOMER;
-----------------------------------------------------------------------
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL XMLVALIDATE USING XDS INSERT INTO DB2ADMIN.CUSTOMER

SQL3109N The utility is beginning to load data from file "C:\Import\DATA\customer.del".

SQL3148W A row from the input file was not inserted into the table. SQLCODE "-16123" was returned.

SQL16123N XML document contains an element "((work,home),mobile)" with empty content where the content model requires content for this element. SQLSTATE=2200M

SQL3185W The previous error occurred while processing data from row "1" of the input file.

SQL3148W A row from the input file was not inserted into the table. SQLCODE "-16210" was returned.

SQL16210N XML document contained a value "95136-" that violates a facet constraint. Reason code = "13". SQLSTATE=2200M

SQL3185W The previous error occurred while processing data from row "2" of the input file.

SQL3110N The utility has completed processing. "2" rows were read from the input file.

SQL3221W ...Begin COMMIT WORK. Input Record Count = "2".

SQL3222W ...COMMIT of any database changes was successful.

SQL3149N "2" rows were processed from the input file. "0" rows were successfully inserted into the table. "2" rows were rejected.

Number of rows read         = 2
Number of rows skipped      = 0
Number of rows inserted     = 0
Number of rows updated      = 0
Number of rows rejected     = 2
Number of rows committed    = 2

SELECT * FROM DB2ADMIN.CUSTOMER

ID          NAME                 CONTACT_INFO
----------- -------------------- --------------------------------------

  0 record(s) selected.
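The two rejections in Example 5-36 can be approximated with a small Python sketch that applies two of the schema rules listed earlier: Phone must contain at least one phone number, and Zip must have the form xxxxx or xxxxx-yyyy. This is an illustration under stated assumptions, not the contactInfo.xsd schema and not the way DB2 validates documents; the regular expression, the function name check_contact_info, and the trimming of surrounding white space are our own choices.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern for the Zip rule: five digits, optionally followed
# by a dash and four more digits.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def check_contact_info(document):
    """Return a list of rule violations for one ContactInfo document."""
    problems = []
    root = ET.fromstring(document)
    for phone in root.iter("Phone"):
        if len(phone) == 0:  # no work, home, or mobile child element
            problems.append("Phone has no phone number")
    for zip_element in root.iter("Zip"):
        # Surrounding white space is ignored here (an assumption).
        value = (zip_element.text or "").strip()
        if not ZIP_PATTERN.fullmatch(value):
            problems.append("Zip value %r has the wrong format" % value)
    return problems


no_phone = ("<ContactInfo><Address><Zip>95030</Zip></Address>"
            "<Phone> </Phone></ContactInfo>")
bad_zip = ("<ContactInfo><Address><Zip>95136-</Zip></Address>"
           "<Phone><work>408-677-8888</work></Phone></ContactInfo>")
print(check_contact_info(no_phone))
print(check_contact_info(bad_zip))
```

Running a check like this before an import can help you find documents that the IMPORT utility would reject, but only validation against the registered schema is authoritative.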
The IMPORT utility will look for the input XML files in the C:\Import\XML and C:\Import\Data paths.

MODIFIED BY: Two new MODIFIED BY file type modifiers are added for XML data, MODIFIED BY XMLCHAR and MODIFIED BY XMLGRAPHIC. These two options specify that the incoming XML data is encoded in the character code page or the graphic code page. Most commonly the code page is ASCII or UTF-8. MODIFIED BY XMLCHAR is valid for delimited and non-delimited ASCII file types. Following is an IMPORT command with the MODIFIED BY XMLCHAR option:

    IMPORT FROM "C:\XmlRedbook\PROG\sample_loan_app.del" OF DEL
      XML FROM "C:\XmlRedbook\PROG"
      MODIFIED BY XMLCHAR
      INSERT INTO db2admin.customer;

The MODIFIED BY XMLGRAPHIC option is useful when incoming XML documents are encoded in a specific graphic code page but have no encoding declaration at the beginning of the XML document. MODIFIED BY XMLGRAPHIC can be used with delimited and non-delimited ASCII data file types.

If you import an XML document that contains an encoding attribute, the encoding must match the character or graphic code page value, or the row is rejected. The code page value is the value specified by the CODEPAGE file type modifier or the graphic component of the application code page. If the modifier is not specified for the IMPORT command, the application default character code page is used.
For more information about character code pages, see the DB2 9 publication, XML Guide for DB2 Version 9, SC10-4254:

ftp://ftp.software.ibm.com/ps/products/db2/info/vr9/pdf/letter/en_US/db2xge90.pdf

If the XML file is encoded with the ASCII character code page, you can use the CODEPAGE file type modifier as shown in the following command:

    IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
      XML FROM "C:\Import\XML"
      MODIFIED BY CODEPAGE=367 XMLCHAR
      INSERT INTO db2admin.customer;

If the XML file is encoded with UTF-16, you can specify the CODEPAGE file type modifier with the matching graphic code page for the UTF-16 encoding, as shown in the following example:

    IMPORT FROM "C:\XmlRedbook\PROG\import\graphic.del" OF DEL
      XML FROM "C:\XmlRedbook\PROG\import\"
      MODIFIED BY CODEPAGE=1204 XMLGRAPHIC
      INSERT INTO db2admin.customer;
Note: If you specify the MODIFIED BY XMLGRAPHIC option with the IMPORT command, the XML document to be imported must be encoded in the UTF-16 code page, and the file type modifier CODEPAGE value must match the UTF-16 code page, or the row is rejected.

XMLPARSE STRIP/PRESERVE WHITESPACE: This option specifies whether white space is removed or preserved when XML documents are parsed. When the XMLPARSE option is omitted, the parser behavior for XML documents is determined by the value of the CURRENT XMLPARSE OPTION special register. Example 5-37 shows the IMPORT command using the XMLPARSE STRIP WHITESPACE option and the data inserted.
Example 5-37 IMPORT with XMLPARSE STRIP WHITESPACE option
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL XML FROM "C:\Import\XML" XMLPARSE STRIP WHITESPACE INSERT INTO db2admin.customer;

SELECT * FROM db2admin.customer WHERE id=10000;

ID          NAME                 CONTACT_INFO
----------- -------------------- -----------------------------------
      10000 Sarah Young          <ContactInfo><Address><Street>22 Willow Street </Street><City>Los Gatos </City><State>CA </State><Zip>95030 </Zip></Address><Phone><work>408-677-8888 </work><home>408-588-9999 </home><mobile>408-345-6666 </mobile></Phone></ContactInfo>

Example 5-38 shows the output of the CUSTOMER table where the XML file is imported with the white space preserved.
Example 5-38 IMPORT command using XMLPARSE PRESERVE WHITESPACE.
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL XML FROM "C:\Import\DATA\" XMLPARSE PRESERVE WHITESPACE INSERT INTO db2admin.customer

SELECT * FROM db2admin.customer WHERE id=10001

ID          NAME                 CONTACT_INFO
----------- -------------------- -----------------------------------
      10001 Jay Martins          <ContactInfo>
  <Address>
    <Street>555 Lincoln Blvd</Street>
    <City>San Jose</City>
    <State>CA</State>
    <Zip>95136</Zip>
  </Address>
  <Phone>
    <work>408-677-8888</work>
    <home>408-588-9900</home>
    <mobile>408-345-7777</mobile>
  </Phone>
</ContactInfo>

  1 record(s) selected.

XMLVALIDATE USING XDS: This option indicates that XML documents are validated against a schema. The schema used for validation is determined by the SCH attribute of the XML Data Specifier (XDS) for each row within the main data file. USING XDS is the default when the XMLVALIDATE option is specified. If the SCH attribute is omitted, no schema validation occurs unless you specify a default schema with the DEFAULT schema_qualifier.schema_name clause. The following example shows the XMLVALIDATE USING XDS option in the IMPORT command:

    IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
      XML FROM "C:\Import\XML"
      XMLVALIDATE USING XDS
      INSERT INTO db2admin.customer;

XMLVALIDATE USING XDS DEFAULT schema_sqlid: This option can only be used when the USING XDS parameter is specified, and it modifies the schema determination behavior. The schema specified through the DEFAULT clause is used for validation when the SCH attribute of the XDS is omitted. The DEFAULT clause takes precedence over the IGNORE and MAP clauses, which are described in the following paragraphs. The DEFAULT, IGNORE, and MAP clauses apply to the specifications of the XDS and not to each other. In the following IMPORT command, the schema used as the default for validation is DB2ADMIN.CONTACTINFO:

    IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
      XML FROM "C:\Import\XML"
      XMLVALIDATE USING XDS DEFAULT DB2ADMIN.CONTACTINFO
      INSERT INTO db2admin.customer;
XMLVALIDATE USING XDS IGNORE schema_sqlid: Like XMLVALIDATE USING XDS DEFAULT schema_sqlid, this option can only be used when the USING XDS option is specified, and it modifies the schema determination behavior. You can use this option to indicate one or more schemas to ignore if they are identified by an SCH attribute. If an SCH attribute exists in your XDS, and the schema identified by that SCH attribute is included in the list of schemas to IGNORE, no schema validation takes place for the imported XML document. The command in Example 5-39 tells the IMPORT utility to ignore schema DB2ADMIN.LOAN_APPL.
Example 5-39 IMPORT: ignore XML validation
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
  XML FROM "C:\Import\XML"
  XMLPARSE PRESERVE WHITESPACE
  XMLVALIDATE USING XDS DEFAULT DB2ADMIN.CONTACTINFO
  IGNORE ( DB2ADMIN.LOAN_APPL )
  INSERT INTO db2admin.customer;

XMLVALIDATE USING XDS MAP (schema_sqlid, schema_sqlid): This option can be used when the USING XDS parameter is specified. You can use the MAP clause to indicate alternate schemas to be used for validation in place of the schemas identified by the SCH attribute of the XML Data Specifier (XDS). The MAP clause indicates a list of one or more schema pairs, where each pair represents a mapping of the original schema to a substitute schema. The original schema is specified by the SCH attribute in an XDS, and the substitute schema is the one that should be used for schema validation. The IMPORT command in Example 5-40 tells the IMPORT utility to use DB2ADMIN.CUSTOMER for validation instead of the original schema specified by the SCH attribute in the input data file, which is DB2ADMIN.CONTACTINFO.
Example 5-40 IMPORT validation using DB2ADMIN.CUSTOMER
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
  XML FROM "C:\Import\XML"
  XMLPARSE PRESERVE WHITESPACE
  XMLVALIDATE USING XDS DEFAULT DB2ADMIN.CONTACTINFO
  IGNORE ( DB2ADMIN.LOAN_APPL )
  MAP ( ( DB2ADMIN.CONTACTINFO, DB2ADMIN.CUSTOMER ) )
  INSERT INTO db2admin.customer;
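The interaction of the DEFAULT, IGNORE, and MAP clauses can be summarized in a few lines of Python. The sketch below is one reading of the clause descriptions in this section, not DB2 code: the function resolve_schema and its parameter names are hypothetical, and it assumes that each clause is applied once to the SCH attribute of the XDS and never to the result of another clause.

```python
def resolve_schema(sch, default=None, ignore=(), mapping=None):
    """Return the schema used to validate a document, or None for no validation.

    sch      -- value of the SCH attribute in the XDS, or None if omitted
    default  -- schema named in the DEFAULT clause, if any
    ignore   -- schemas listed in the IGNORE clause
    mapping  -- dict built from the MAP clause: original -> substitute
    """
    mapping = mapping or {}
    if sch is None:
        # No SCH attribute: only the DEFAULT clause can supply a schema.
        return default
    if sch in ignore:
        # SCH names an ignored schema: the document is not validated.
        return None
    # SCH names a mapped schema: use the substitute, otherwise SCH itself.
    return mapping.get(sch, sch)


# Clause values taken from Example 5-40.
clauses = dict(
    default="DB2ADMIN.CONTACTINFO",
    ignore=("DB2ADMIN.LOAN_APPL",),
    mapping={"DB2ADMIN.CONTACTINFO": "DB2ADMIN.CUSTOMER"},
)
print(resolve_schema(None, **clauses))
print(resolve_schema("DB2ADMIN.LOAN_APPL", **clauses))
print(resolve_schema("DB2ADMIN.CONTACTINFO", **clauses))
```

Note that a row without an SCH attribute resolves to the DEFAULT schema itself and is not passed through the MAP clause afterward, which is how we understand the statement that the clauses do not apply to each other.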
XMLVALIDATE USING SCHEMA schema_sqlid: This option is used to indicate that all XML documents are to be validated against a specific XML schema. In this case, the SCH attribute of the XDS will be ignored for all XML columns. The following sample command in Example 5-41 tells the DB2 IMPORT utility to ignore whatever schema name is specified in the input data file and just use the schema DB2ADMIN.CUSTOMER for validation.
Example 5-41 Using DB2ADMIN.CUSTOMER for validation
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
  XML FROM "C:\Import\XML"
  XMLPARSE PRESERVE WHITESPACE
  XMLVALIDATE USING SCHEMA DB2ADMIN.CUSTOMER
  INSERT INTO db2admin.customer;

XMLVALIDATE USING SCHEMALOCATION HINTS: This option indicates that documents are validated against the schemas identified by XML schema location hints in the source XML documents. If a schemaLocation attribute is not found in the source XML document, no validation takes place. When this option is specified, the SCH attribute of the XDS within the main data file is ignored for all XML columns. Example 5-42 shows a sample command for XMLVALIDATE USING SCHEMALOCATION HINTS.
Example 5-42 SCHEMALOCATION HINTS
IMPORT FROM "C:\Import\DATA\customer.del" OF DEL
  XML FROM "C:\Import\XML"
  XMLPARSE PRESERVE WHITESPACE
  XMLVALIDATE USING SCHEMALOCATION HINTS
  INSERT INTO db2admin.customer;

For detailed information about these options, refer to the Data Movement Utilities Guide and Reference, SC10-4227.
5.3.2 EXPORT
The EXPORT utility extracts data from DB2 tables to one or more files on your system. The exported files can be imported into tables in a different database on the same server or on a different server. DB2 9 supports exporting XML data in the delimited (DEL) and Integration Exchange Format (IXF) file formats.
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\Export\DATA", "C:\export\XML"
  SELECT * FROM db2admin.customer;

This command generates the main data file customer.del in C:\export\DATA. The QDM instances are written to the two file locations specified in the EXPORT command. They are C:\Export\DATA\customer.del.001.xml and C:\Export\XML\customer.del.002.xml.

XMLFILE filename: This option indicates the base file names to be used for XML data. It works in the same way as the corresponding option for exporting LOB data. The following example shows how this option is used in the EXPORT command:

    EXPORT TO "C:\Export\DATA\customer.del" OF DEL
      XML TO "C:\Export\DATA", "C:\export\XML"
      XMLFILE "CUSTINFO"
      SELECT * FROM db2admin.customer;

This command generates the main data file customer.del under the C:\Export\DATA directory. The QDM instances are written to the two locations specified in the EXPORT command. They are C:\Export\DATA\CUSTINFO.001.xml and C:\export\XML\CUSTINFO.002.xml.

As we have learned from DB2 8, the LOBFILE option provides a means to specify the base name of the LOB file generated by the EXPORT utility. Similarly, in DB2 9, the XMLFILE option specifies the base name of the XML file generated by the EXPORT utility. By default, the XML file base name is the name of the exported data file. The full name of the XML file consists of the base name, followed by a number extension that is padded to three digits, and the .xml extension. For LOBs, the full name consists of the base name, followed by a number extension that is padded to three digits, and a .lob extension.

MODIFIED BY XMLNODECLARATION: This option, shown in Example 5-44, indicates that QDM instances are written without an XML declaration tag. By default, QDM instances are exported with an XML declaration tag at the beginning of the XML file that includes an encoding attribute.
Example 5-44 Modifying by XMLNODECLARATION
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\Export\DATA", "C:\export\XML"
  XMLFILE "CUSTINFO"
  MODIFIED BY XMLNODECLARATION
  SELECT * FROM db2admin.customer;

MODIFIED BY XMLCHAR: This option, shown in Example 5-45 on page 224, indicates that QDM instances are written in the character code page. The character code page can be controlled by specifying the CODEPAGE file type modifier. If CODEPAGE is not specified, the application code page is used. By default, QDM instances are written out in Unicode (UTF-8).
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\Export\DATA", "C:\export\XML"
  XMLFILE "CUSTINFO"
  MODIFIED BY XMLCHAR
  SELECT * FROM db2admin.customer;

MODIFIED BY XMLGRAPHIC: This option indicates that the exported XML documents are encoded in the graphic code page. When this modifier is used in the EXPORT command, your exported XML documents are encoded in UTF-16 regardless of the CODEPAGE file type modifier or the application code page. See Example 5-46.
Example 5-46 Modifying by XMLGRAPHIC
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\Export\DATA", "C:\export\XML"
  XMLFILE "CUSTINFO"
  MODIFIED BY XMLGRAPHIC
  SELECT * FROM db2admin.customer;

MODIFIED BY XMLINSEPFILES: When this option is specified, each QDM instance is written to a separate file. By default, all exported XML documents are concatenated together in the same file. See Example 5-47.
Example 5-47 Modifying by XMLINSEPFILES
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\Export\DATA", "C:\export\XML"
  XMLFILE "CUSTINFO"
  MODIFIED BY XMLINSEPFILES
  SELECT * FROM db2admin.customer;

MODIFIED BY LOBSINSEPFILES: Like the XMLINSEPFILES option, this option means that each LOB value is written to a separate file. By default, multiple values are concatenated together in the same exported LOB file. Example 5-48 and Example 5-49 show the EXPORT command without and with the MODIFIED BY LOBSINSEPFILES option. In Example 5-48, all LOB values are written to the file C:\Export\LOB\CUST_INFO.001.lob and all XML data is written to the file C:\Export\XML\CUST_INFO.001.xml.
Example 5-48 Using MODIFIED BY LOBSINSEPFILES option - all LOBs are in one file
EXPORT TO "a" OF DEL
  LOBS TO "C:\Export\LOB" LOBFILE "CUST_INFO"
  XML TO "C:\Export\XML" XMLFILE "CUST_INFO"
  MODIFIED BY LOBSINFILE
  SELECT * FROM db2admin.customer;

In Example 5-49, each LOB value is written to a separate file under the specified directory C:\Export\LOB. The exported LOB files are named CUST_INFO.00X.lob, where X increases with each LOB value that is exported. All QDM instances are written to C:\Export\XML\CUST_INFO.001.xml.
Example 5-49 Using MODIFIED BY LOBSINSEPFILES option - separated LOB files
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  LOBS TO "C:\Export\LOB" LOBFILE "CUST_INFO"
  XML TO "C:\Export\XML" XMLFILE "CUST_INFO"
  MODIFIED BY LOBSINFILE LOBSINSEPFILES
  SELECT * FROM db2admin.customer;

XMLSAVESCHEMA: This option indicates that XML schema information should be saved for all XML columns. For each exported XML document that was validated against an XML schema when it was inserted, the name of that XML schema is written to the corresponding XML Data Specifier (XDS) as an SCH attribute. If the exported document was not validated against an XML schema, or the XML schema no longer exists in the database, the SCH attribute is omitted from the XDS. The EXPORT command with the XMLSAVESCHEMA option is shown in this example:

    EXPORT TO "C:\Export\DATA\customer.del" OF DEL XMLSAVESCHEMA
      SELECT * FROM db2admin.customer;

The main data file is written to the file C:\Export\DATA\customer.del, and the QDM instances are written to the XML file C:\Export\DATA\customer.del.001.xml. The content of the main data file looks as follows:

10000,"Sarah Young","<XDS FIL='customer.del.001.xml' OFF='0' LEN='273' SCH='DB2ADMIN.CONTACTINFO' />"
10001,"Jay Martins","<XDS FIL='customer.del.001.xml' OFF='273' LEN='266' SCH='DB2ADMIN.CONTACTINFO' />"
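In the data file above, both rows point at the same XML file, and the second document starts at OFF='273', which is the first document's OFF plus its LEN. The following Python sketch shows how OFF and LEN locate one document inside a concatenated file. It is an illustration only: the function name extract_document is hypothetical, and the two short documents stand in for the exported customer documents.

```python
def extract_document(xml_file_bytes, off, length):
    """Return the bytes of one document, given its XDS OFF and LEN values.

    OFF is a zero-based byte offset and LEN is a length in bytes.
    """
    return xml_file_bytes[off:off + length]


# Two stand-in documents concatenated in one file, as EXPORT does
# by default when XMLINSEPFILES is not specified.
first = b"<ContactInfo>one</ContactInfo>"
second = b"<ContactInfo>two</ContactInfo>"
combined = first + second

# Offsets and lengths as they would appear in the XDS attributes.
specifiers = [
    {"OFF": 0, "LEN": len(first)},
    {"OFF": len(first), "LEN": len(second)},
]
for xds in specifiers:
    print(extract_document(combined, xds["OFF"], xds["LEN"]))
```

Because OFF and LEN count bytes rather than characters, a reader of the file should slice the raw bytes first and decode each document afterward.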
Example 5-50 shows how to specify an XML file location and file name prefix.
Example 5-50 Specify XML file location and prefix
EXPORT TO "C:\Export\DATA\customer.del" OF DEL
  XML TO "C:\export\XML" XMLFILE "CUSTINFO"
  MODIFIED BY XMLINSEPFILES XMLSAVESCHEMA
  SELECT * FROM db2admin.customer;

Each XML document is written to a separate file. The file names are C:\export\XML\CUSTINFO.001.xml and C:\export\XML\CUSTINFO.002.xml.
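The naming rule for exported XML files (base name, a number padded to three digits, and the .xml extension) can be written as a one-line format. The Python sketch below follows the rule as described for the XMLFILE option in this section; the function name exported_xml_file_name is hypothetical, and the sketch does not model how the EXPORT utility distributes files across several XML TO paths.

```python
import ntpath  # handles Windows-style paths on any platform


def exported_xml_file_name(data_file, sequence, xmlfile=None):
    """Build the name of an exported XML file.

    data_file -- path of the main data file given in EXPORT TO
    sequence  -- 1-based number of the XML file
    xmlfile   -- base name from the XMLFILE option, or None for the default
    """
    # Without XMLFILE, the base name is the name of the exported data file.
    base = xmlfile if xmlfile is not None else ntpath.basename(data_file)
    return "%s.%03d.xml" % (base, sequence)


data_file = r"C:\Export\DATA\customer.del"
print(exported_xml_file_name(data_file, 1))
print(exported_xml_file_name(data_file, 2, xmlfile="CUSTINFO"))
```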
----------------------------- Commands Entered ----------------------
CONNECT TO DEMO;
SELECT * FROM DB2ADMIN.CUSTOMER;
EXPORT TO "C:\Export\DATA\customer.del" OF DEL XML TO "C:\Export\XML" XMLFILE "CUST_INFO" MODIFIED BY XMLINSEPFILES XMLSAVESCHEMA SELECT * FROM DB2ADMIN.CUSTOMER;
-----------------------------------------------------------------------
CONNECT TO DEMO;

   Database Connection Information

 Database server        = DB2/NT 9.1.0
 SQL authorization ID   = DB2ADMIN
 Local database alias   = DEMO

SELECT * FROM DB2ADMIN.CUSTOMER

ID          NAME                 CONTACT_INFO
----------- -------------------- --------------------------------------
      10000 Sarah Young          <ContactInfo><Address><Street>22 Willow Street </Street><City>Los Gatos </City><State>CA </State><Zip>95030</Zip></Address><Phone><work>408-677-8888 </work><home>408-588-9999 </home><mobile>408-345-6666 </mobile></Phone></ContactInfo>
      10001 Jay Martins          <ContactInfo><Address><Street>555 Lincoln Blvd</Street><City>San Jose</City><State>CA</State><Zip>95136</Zip></Address><Phone><work>408-677-8888</work><home>408-588-9900</home><mobile>408-345-7777</mobile></Phone></ContactInfo>

  2 record(s) selected.

EXPORT TO "C:\Export\DATA\customer.del" OF DEL XML TO "C:\Export\XML" XMLFILE "CUST_INFO" MODIFIED BY XMLINSEPFILES XMLSAVESCHEMA SELECT * FROM DB2ADMIN.CUSTOMER

SQL3104N The Export utility is beginning to export data to file "C:\Export\DATA\customer.del".

SQL3105N The Export utility has finished exporting "2" rows.

Number of rows exported: 2

The two XML files produced by the EXPORT command are C:\Export\XML\CUST_INFO.001.xml and C:\Export\XML\CUST_INFO.002.xml. The content of the main data file, C:\Export\DATA\customer.del, is shown in Example 5-52.
Example 5-52 Main data file created by EXPORT
10000,"Sarah Young","<XDS FIL='CUST_INFO.001.xml' SCH='DB2ADMIN.CONTACTINFO' />"
10001,"Jay Martins","<XDS FIL='CUST_INFO.002.xml' SCH='DB2ADMIN.CONTACTINFO' />"

When you specify the XML TO option for the EXPORT command, be sure that the path name is correct, or the EXPORT command will fail. If you specify the XML TO option with a directory or path that does not exist, the EXPORT command raises the error SQL3235N, stating that the EXPORT utility cannot use the specified path. Example 5-53 illustrates the error received when you specify a bad path name.
------------------------------ Commands Entered ----------------------
EXPORT TO "C:\Export\DATA\customer.del" OF DEL XML TO "C:\export\BADFOLERNAME\" XMLFILE "CUSTINFO" MODIFIED BY XMLINSEPFILES SELECT * FROM DB2ADMIN.CUSTOMER;
SELECT * FROM DB2ADMIN.CUSTOMER;
-----------------------------------------------------------------------
EXPORT TO "C:\Export\DATA\customer.del" OF DEL XML TO "C:\export\BADFOLERNAME\" XMLFILE "CUSTINFO" MODIFIED BY XMLINSEPFILES SELECT * FROM DB2ADMIN.CUSTOMER

SQL3235N The utility cannot use the "XML" path "C:\export\BADFOLERNAME\" parameter as specified. Reason code: "3".

SQL3235N The utility cannot use the "XML" path "C:\export\BADFOLERNAME\" parameter as specified. Reason code: "3".

Explanation:

One of the following reason codes may apply:

1 Either the path "<path-name>" is not a valid sqlu_media_list or the values provided are not valid. The media_type must be SQLU_LOCAL_MEDIA and all path names must be terminated with a valid path separator.

2 There is not enough space on the paths provided for the EXPORT utility to hold all the data of type "<type>".

3 The path "<path-name>" cannot be accessed.

User Response:

Determine which reason code applies above, correct the problem, and resubmit your command.
5.3.3 RUNSTATS
The RUNSTATS command updates statistics about the physical characteristics of table columns and associated indexes. These statistics include information such as the number of records, the number of pages, and the average record length. The DB2 optimizer uses these statistics to determine the optimal access paths to the data. Because XML is a native data type in DB2 9, the RUNSTATS utility has been updated to collect XML column statistics. The statistics are collected at both the XML document level and the node level, and can be used for cost estimation when selecting execution plans. This information was not available in the catalog for the user to view or update in the first release of DB2 9.
Statistics collection on XML type columns is governed by two DB2 registry variables:
- DB2_XML_RUNSTATS_PATHID_K
- DB2_XML_RUNSTATS_PATHVALUE_K
These two variables are similar to the NUM_FREQVALUES parameter in that they specify the number of frequent values to collect. If they are not set, a default of 200 is used for both.
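To make the idea of a "K most frequent values" limit concrete, the following Python sketch keeps only the k most frequent items of a list. It only illustrates the concept. It does not show how RUNSTATS gathers statistics internally, and the path strings and the function name top_k_frequent are invented for the example.

```python
from collections import Counter


def top_k_frequent(values, k=200):
    """Return the k most frequent values together with their counts."""
    return Counter(values).most_common(k)


# Made-up path occurrences, as they might be seen across several documents.
paths = [
    "/customerinfo/name",
    "/customerinfo/phone",
    "/customerinfo/phone",
    "/customerinfo/addr/city",
    "/customerinfo/phone",
    "/customerinfo/name",
]

# With k=2 only the two most common paths are kept.
print(top_k_frequent(paths, k=2))
# With the default k=200 every distinct path fits within the limit.
print(len(top_k_frequent(paths)))
```

A larger K keeps more distinct values at the cost of more statistics to store, which is the trade-off the two registry variables control.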
RUNSTATS ON TABLE schemaName.tableName ON COLUMNS columnList1 WITH DISTRIBUTION ON COLUMNS columnList2

This command collects basic column statistics on all columns included in columnList1 plus distribution statistics for all columns included in columnList2. For each XML column that is included in either columnList1 or columnList2, the basic column statistics are collected, because the basic XML column statistics are the same as the distribution statistics for any XML column. Using the EMPLOYEE table, the following two commands collect XML statistics for both xcol1 and xcol2:

RUNSTATS ON TABLE db2admin.employee ON COLUMNS (c1) WITH DISTRIBUTION;
RUNSTATS ON TABLE db2admin.employee ON COLUMNS (c1) WITH DISTRIBUTION ON COLUMNS (xcol1, xcol2);

If you only want to collect XML statistics for column xcol1, the RUNSTATS command would look similar to this:

RUNSTATS ON TABLE DB2ADMIN.EMPLOYEE ON COLUMNS (xcol1) WITH DISTRIBUTION ON COLUMNS (xcol1);

RUNSTATS ON TABLE schemaName.tableName EXCLUDING XML COLUMNS

For the convenience of users, DB2 9 supports a new clause for the RUNSTATS utility called EXCLUDING XML COLUMNS. You can specify the EXCLUDING XML COLUMNS clause in the RUNSTATS command to exclude all XML columns from statistics collection if statistics for XML columns are not required or if you want to collect XML column statistics at another time. The EXCLUDING XML COLUMNS clause takes precedence over all other clauses that specify XML columns for RUNSTATS, so be aware that a command that uses it can look ambiguous. For example, consider the following RUNSTATS command:

RUNSTATS ON TABLE db2admin.employee ON COLUMNS (c1, xcol2) WITH DISTRIBUTION ON ALL COLUMNS EXCLUDING XML COLUMNS

No statistics for XML column xcol2 are collected, even though xcol2 is explicitly specified in the column list, because the EXCLUDING XML COLUMNS clause is specified. In this case, the RUNSTATS command simply omits all of the XML columns from statistics collection.
Note: In DB2 9, RUNSTATS does not support the KEY COLUMNS clause for the XML data type because an XML type column cannot be a key column.
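The precedence rule for EXCLUDING XML COLUMNS can be modeled in a few lines of Python. This is a sketch of the rule as described above, not of the RUNSTATS implementation. The function name columns_with_statistics and the column types assigned to the EMPLOYEE table are assumptions made for the illustration.

```python
def columns_with_statistics(column_types, basic, distribution, excluding_xml=False):
    """Return the columns for which statistics would be collected.

    column_types maps column name -> type name. The exclusion is applied
    last, so it overrides any explicit mention of an XML column.
    """
    requested = set(basic) | set(distribution)
    if excluding_xml:
        requested = {c for c in requested if column_types[c] != "XML"}
    return sorted(requested)


# Assumed layout of the EMPLOYEE table used in the text.
employee = {"c1": "INTEGER", "xcol1": "XML", "xcol2": "XML"}
all_columns = list(employee)

# ON COLUMNS (c1, xcol2) WITH DISTRIBUTION ON ALL COLUMNS EXCLUDING XML COLUMNS
print(columns_with_statistics(employee, ["c1", "xcol2"], all_columns, excluding_xml=True))
# The same request without the EXCLUDING XML COLUMNS clause
print(columns_with_statistics(employee, ["c1", "xcol2"], all_columns))
```

The first call returns only c1, which matches the behavior described for xcol2 in the example command.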
Row-level access control means that you can control which users are allowed to access which rows. Column-level access control means that you can control user access at the column level. To achieve these controls, we can use the new DB2 9 feature, label-based access control (LBAC). Access control at the XML node level means that you can control user access to individual XML elements or attributes inside an XML document. To achieve this, we can use a view together with the XMLTABLE function.
5.4.1 LBAC
LBAC is a new DB2 9 security feature that provides a configurable way to control access to individual rows and columns. A security administrator who has been granted SECADM authority performs the LBAC security setup. The security administrator configures the LBAC system by creating security policies. A security policy describes the criteria that are used to decide who has access to specific data. Under the security policy, the security administrator creates security labels and associates the labels with the rows and columns to be protected. The security administrator also associates labels with users. LBAC compares the data labels with the user's label to determine whether the user has access to a specific row or column.
Security labels have a new data type, DB2SECURITYLABEL. A new SECURED WITH option is added to the CREATE TABLE and ALTER TABLE statements to associate the security label with the rows or columns.
SECADM
SECADM authority is a brand new authority in DB2 9. SECADM is intended to centralize security privileges. The abilities given by SECADM are not given by any other authority, not even SYSADM. Only SECADM is allowed to perform the following functions:
- Create and drop security label components.
- Create and drop security policies.
- Create and drop security labels.
- Grant and revoke security labels.
- Grant and revoke LBAC rule exemptions.
- Grant and revoke SETSESSIONUSER privileges.
- Execute the SQL statement TRANSFER OWNERSHIP on objects that you do not own.
[Figure: Employee Database. UserA belongs to HR. Employee titles shown: CEO, Director, Manager, Engineer, Architect. Data categories shown for UserB: general information for regular employees and confidential information for regular employees.]
To adopt this security policy, we separate the employee information into two parts, general and confidential. Each part is described as an XML document, and the two documents are stored in two XML columns. Example 5-54 shows a sample of general information in XML format.
Example 5-54 General information
<Employee>
  <Name>John Smith1</Name>
  <EmpNo>001</EmpNo>
  <Title>Manager</Title>
  <Phone type="work">312-964-0001</Phone>
  <Email>[email protected]</Email>
</Employee>

Example 5-55 is a sample of employee confidential information in XML format.
Example 5-55 Confidential information
<Employee>
  <Name>John Smith1</Name>
  <EmpNo>001</EmpNo>
  <DateOfBirth>2/21/1967</DateOfBirth>
  <SSN>892-76-0001</SSN>
  <Address country="US">
    <Street>1 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
    <Phone type="home">678-181-0001</Phone>
  </Address>
  <Salary>10000</Salary>
</Employee>

The employee table has an ID column, two columns for XML documents, and a column for the LBAC security label. Figure 5-12 illustrates the data model for the employee table EMP. The general information of each employee is stored in the EMP1 column and the confidential information is stored in the EMP2 column. USERA from HR is authorized to access all the data in the EMP table. The regular employee USERB can only access the EMPID and EMP1 columns in row 2 and row 3, where row 2 is the record of an engineer and row 3 is the record of an architect. The cells that USERB can access are highlighted in gray.
[Figure: EMP table in the Employee Database with columns EMPID, EMP1 (XML), EMP2 (XML), and SEC, and one row each for the Manager, Engineer, Architect, Director, and CEO. UserA (HR) can access everything. UserB can only access the gray cells.]
Figure 5-12 Data model of EMP table and access scope of USERA and USERB
Implementing LBAC
In this section, we show the procedure for implementing data security using LBAC. The following steps are required to set up LBAC:
1. Appointing a security administrator
2. Creating a security label component
3. Creating a security policy
4. Creating a security label
5. Adding security policy to table
6. Assigning security labels to users
7. Verifying the security setting
The RESTRICT NOT AUTHORIZED WRITE SECURITY LABEL clause indicates that the insert or update operation will fail if the user is not authorized to write the explicitly specified security label that is provided in the INSERT or UPDATE statement.
GRANT SECURITY LABEL emp_policy.hr_only TO USER usera FOR ALL ACCESS
GRANT SECURITY LABEL emp_policy.public TO USER userb FOR ALL ACCESS
GRANT SECURITY LABEL emp_policy.hr_only TO USER db2admin FOR ALL ACCESS
GRANT SECURITY LABEL emp_policy.public TO USER db2admin FOR ALL ACCESS
GRANT EXEMPTION ON RULE DB2LBACWRITEARRAY WRITEDOWN FOR emp_policy TO USER db2admin

The GRANT EXEMPTION statement gives DB2ADMIN an exemption from an access rule of the security policy. With this exemption, DB2ADMIN can create a table and insert data into tables that are associated with EMP_POLICY.
CONNECT TO XMLRB USER db2admin USING db2admin
CREATE TABLE "EMP" (
  ID   CHAR(3) NOT NULL PRIMARY KEY,
  EMP1 XML,
  EMP2 XML SECURED WITH HR_ONLY,
  SEC  DB2SECURITYLABEL)
  SECURITY POLICY EMP_POLICY;

The EMP2 column must be protected, so it is defined with a SECURED WITH clause that names the security label HR_ONLY. We must also protect the rows that store the managers' information. For this, the SEC column is defined with the DB2SECURITYLABEL type to store the LBAC information. When a manager's row is inserted, the HR_ONLY label is added. The SECURITY POLICY clause associates the security policy with the table. After the table is created, we grant the SELECT, INSERT, UPDATE, and DELETE privileges on the EMP table to USERA and USERB using the following commands:

GRANT select, insert, update, delete ON db2admin.emp TO USER userb
GRANT select, insert, update, delete ON db2admin.emp TO USER usera
INSERT INTO EMP VALUES (
  '001',
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith1</Name>
      <EmpNo>001</EmpNo>
      <Title>Manager</Title>
      <Phone type="work">312-964-0001</Phone>
      <Email>[email protected]</Email>
    </Employee>'),
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith1</Name>
      <EmpNo>001</EmpNo>
      <DateOfBirth>2/21/1967</DateOfBirth>
      <SSN>892-76-0001</SSN>
      <Address country="US">
        <Street>1 East Main Street</Street>
        <City>Los Gatos</City>
        <State>CA</State>
        <Zip>95034</Zip>
        <Phone type="home">678-181-0001</Phone>
      </Address>
      <Salary>10000</Salary>
    </Employee>'),
  SECLABEL_BY_NAME('EMP_POLICY','HR_ONLY'));
The security setting can be verified by accessing the data as USERA or USERB using the following SELECT statement:

SELECT ID,
  xmlquery('$c/Employee/Name' passing emp1 as "c") as NAME,
  xmlquery('$c/Employee/Title' passing emp1 as "c") as TITLE
FROM DB2ADMIN.EMP

Example 5-59 shows the result sets that USERA and USERB receive from the same query. The result sets differ because of the security restriction. USERA, who is authorized to use the HR_ONLY security label, receives all the records in the EMP table. USERB receives only the two rows that contain the personnel records of non-management employees. The DB2 LBAC mechanism detects that USERB holds only the PUBLIC label and does not allow USERB to access data labeled with HR_ONLY, which confirms that row-level protection is successfully implemented.
Example 5-59 Query result from USERA
-- Result set from USERA
ID  NAME
--- -----------------------------
001 <Name>John Smith1</Name>
002 <Name>John Smith2</Name>
003 <Name>John Smith3</Name>
004 <Name>John Smith4</Name>
005 <Name>John Smith5</Name>

-- Result set from USERB
ID  NAME
--- -----------------------------
002 <Name>John Smith2</Name>
003 <Name>John Smith3</Name>
To check the column-level protection, the following query is run by USERA and USERB:

SELECT ID,
  xmlquery('$c/Employee/Name' passing emp1 as "c") as NAME,
  xmlquery('$c/Employee/Salary' passing emp2 as "c") as SALARY
FROM DB2ADMIN.EMP

Example 5-60 shows the result from USERA and USERB. USERA can issue the XQuery on the EMP2 column because the column is labeled as HR_ONLY and USERA is authorized on this label. USERB receives an error message indicating that USERB is not authorized to read data from the EMP2 column. This confirms that the column-level access control is also successfully implemented.
Example 5-60 Result of accessing protected data
-- Result from USERA
001 <Name>John
002 <Name>John
003 <Name>John
004 <Name>John
005 <Name>John

-- Result from USERB
SQL20264N For table "EMP", authorization ID "USERB" does not have "READ" access to the column "EMP2". SQLSTATE=42512
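The row-level and column-level outcomes above can be reproduced with a small Python model of label comparison. This is a conceptual sketch only. It assumes a single array component in which HR_ONLY ranks above PUBLIC, and the names LEVELS, can_read, and select are invented for the illustration. DB2 evaluates its own LBAC rules inside the database engine.

```python
# Assumption: one array component, ordered from most to least sensitive.
LEVELS = ["HR_ONLY", "PUBLIC"]


def can_read(user_label, data_label):
    """A user may read data whose level is at or below the user's level."""
    return LEVELS.index(user_label) <= LEVELS.index(data_label)


# (ID, title, row label) as described for the EMP table.
ROWS = [
    ("001", "Manager", "HR_ONLY"),
    ("002", "Engineer", "PUBLIC"),
    ("003", "Architect", "PUBLIC"),
    ("004", "Director", "HR_ONLY"),
    ("005", "CEO", "HR_ONLY"),
]

# Only EMP2 is a protected column; ID and EMP1 carry no label.
COLUMN_LABELS = {"EMP2": "HR_ONLY"}


def select(user_label, columns):
    """Return the IDs of the visible rows, or fail on a protected column."""
    for column in columns:
        label = COLUMN_LABELS.get(column)
        if label is not None and not can_read(user_label, label):
            raise PermissionError(f"no READ access to column {column}")
    return [emp_id for emp_id, _title, row_label in ROWS
            if can_read(user_label, row_label)]


print(select("HR_ONLY", ["ID", "EMP1", "EMP2"]))  # USERA: all five rows
print(select("PUBLIC", ["ID", "EMP1"]))           # USERB: rows 002 and 003
```

In this model, a query by a PUBLIC user that names EMP2 raises an error before any row is returned, which mirrors the SQL20264N message in Example 5-60.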
In this scenario, a table, EMPLOYEE, is created to store employee information in XML format. There are five employees in this table, John Smith1 through John Smith5. We want to define the range of data that USERA and USERB can access. Figure 5-14 illustrates the logical model and the access control required for the EMPLOYEE table. There are five XML documents stored in the EMP column of the EMPLOYEE table. USERA can access all the information in the XML documents. USERB, on the other hand, can only access the following general information:
- /Employee/Name
- /Employee/EmpNo
- /Employee/Title
- /Employee/Email
[Figure: EMPLOYEE table in the Employee Database with columns EMPID and EMP (XML), holding one XML document each for John Smith1 through John Smith5. UserA (HR) can access all of the information.]
Figure 5-14 Data model of EMPLOYEE table and access range for USERA and USERB
How do we construct a security model that meets these requirements? If you simply grant USERA and USERB the SELECT privilege on the EMPLOYEE table, USERB is able to issue an XQuery against the EMPLOYEE table and select the confidential data. The simplest way to meet this data-access security requirement is to use views in DB2. You can map element and attribute values in XML documents to relational columns and then define views for USERA and USERB. There are several ways to map XML elements and attributes to relational columns. We use the XMLTABLE function in this scenario. Figure 5-15 illustrates how data access can be restricted using views. Two views are defined, one for USERA and one for USERB. Both are based on the EMPLOYEE table and reflect the result sets of the XMLTABLE function applied to the DB2ADMIN.EMPLOYEE table. We show the scenario setup and the commands that create the views in the next section.
[Figure 5-15: Two views defined with XMLTABLE over the EMP (XML) column of the DB2ADMIN.EMPLOYEE table (EMPID 001 through 005). The view CONFIDENTIAL_EMP_DATA (alias EMPLOYEE) for UserA (HR) exposes EMPNO, EMPNAME, DATEOFBIRTH, SSN, ADDRESS, WORKPHONE, HOMEPHONE, EMAIL, and SALARY. The view GENERAL_EMP_DATA for UserB exposes EMPNO, EMPNAME, and WORKPHONE.]
<?xml version="1.0"?>
<Employee>
  <Name>John Smith1</Name>
  <EmpNo>001</EmpNo>
  <DateOfBirth>2/21/1967</DateOfBirth>
  <SSN>892-76-0001</SSN>
  <Address country="US">
    <Street>1 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
  </Address>
  <Phone type="work">312-964-0001</Phone>
  <Phone type="home">678-181-0001</Phone>
  <Email>[email protected]</Email>
  <Salary>10000</Salary>
</Employee>

The IMPORT command we used to move data into the table is:

IMPORT FROM impfile.txt OF DEL XML FROM . MODIFIED BY XMLCHAR REPLACE INTO db2admin.employee

The XDS entries that describe the XML data files are shown in Example 5-62.
Example 5-62 impfile.txt
CREATE VIEW db2admin.employee_a AS (
  SELECT x.* FROM XMLTABLE
    ('db2-fn:xmlcolumn("DB2ADMIN.EMPLOYEE.EMP")/Employee'
     COLUMNS
       "NAME"        VARCHAR(32) PATH 'Name',
       "EMPNO"       VARCHAR(3)  PATH 'EmpNo',
       "TITLE"       VARCHAR(12) PATH 'Title',
       "DATEOFBIRTH" VARCHAR(10) PATH 'DateOfBirth',
       "SSN"         VARCHAR(11) PATH 'SSN',
       "STREET"      VARCHAR(64) PATH 'Address/Street',
       "CITY"        VARCHAR(12) PATH 'Address/City',
       "STATE"       VARCHAR(2)  PATH 'Address/State',
       "ZIP"         VARCHAR(5)  PATH 'Address/Zip',
       "WORKPHONE"   VARCHAR(12) PATH 'Phone[@type="work"]',
       "HOMEPHONE"   VARCHAR(12) PATH 'Phone[@type="home"]',
       "EMAIL"       VARCHAR(32) PATH 'Email',
       "SALARY"      INTEGER     PATH 'Salary'
    ) AS "X" )

After creating the view, grant the SELECT privilege on the EMPLOYEE_A view to USERA with the following command:

GRANT SELECT ON db2admin.employee_a TO USER usera
CREATE VIEW db2admin.employee_b AS (
  SELECT X.* FROM XMLTABLE
    ('db2-fn:xmlcolumn("DB2ADMIN.EMPLOYEE.EMP")/Employee'
     COLUMNS
       "NAME"      VARCHAR(32) PATH 'Name',
       "EMPNO"     VARCHAR(3)  PATH 'EmpNo',
       "TITLE"     VARCHAR(12) PATH 'Title',
       "WORKPHONE" VARCHAR(12) PATH 'Phone[@type="work"]',
       "EMAIL"     VARCHAR(32) PATH 'Email'
    ) AS "X" )
Grant the SELECT privilege on the EMPLOYEE_B view to USERB using the following command:

GRANT SELECT ON DB2ADMIN.EMPLOYEE_b TO USER userb

To have a unified name for all the views based on the EMPLOYEE table, USERA and USERB can each create an alias for their view using the following commands:

CONNECT TO xmlrb USER usera USING usera;
CREATE ALIAS employee FOR db2admin.employee_a;
CONNECT TO XMLRB USER USERB USING USERB;
CREATE ALIAS employee FOR db2admin.employee_b;
NAME         TITLE        DATEOFBIRTH STREET             CITY         STATE ZIP
------------ ------------ ----------- ------------------ ------------ ----- -----
John Smith1  Manager      2/21/1967   1 East Main Street Los Gatos    CA    95034
John Smith2  Engineer     2/22/1967   2 East Main Street Los Gatos    CA    95034
John Smith3  Architect    2/23/1967   3 East Main Street Los Gatos    CA    95034
John Smith4  Director     2/24/1967   4 East Main Street Los Gatos    CA    95034
John Smith5  CEO          2/25/1967   5 East Main Street Los Gatos    CA    95034
Connect to the database as USERB and issue the same SELECT query:

CONNECT TO xmlrb USER userb USING userb
SELECT * FROM EMPLOYEE

Example 5-66 shows the result set. You can see that the identical query returns different result sets for USERA and USERB.
Example 5-66 Result set of selecting all the data from EMPLOYEE view by USERB
NAME
-----------
John Smith1
John Smith2
John Smith3
John Smith4
John Smith5
The test result verifies that we have successfully set up XML node-level access control using views and the XMLTABLE function. USERA can see all elements in the XML documents, and USERB can only select the elements that USERB is allowed to see. By using both features provided in DB2 9, you can easily control the scope of XML data access for each user.
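The effect of the two views can also be illustrated outside the database. The following Python sketch projects one employee document through two column-to-path mappings, in the same spirit as the COLUMNS ... PATH clauses of XMLTABLE. It is only an analogy. The names VIEW_A, VIEW_B, and project are invented, the document is abbreviated, and the standard library's limited XPath support stands in for the XQuery engine.

```python
import xml.etree.ElementTree as ET

# Abbreviated employee document based on the sample data in this section.
DOC = """<Employee>
  <Name>John Smith1</Name>
  <EmpNo>001</EmpNo>
  <Title>Manager</Title>
  <SSN>892-76-0001</SSN>
  <Phone type="work">312-964-0001</Phone>
  <Phone type="home">678-181-0001</Phone>
  <Salary>10000</Salary>
</Employee>"""

# Column name -> path relative to /Employee, one mapping per "view".
VIEW_A = {
    "NAME": "Name",
    "EMPNO": "EmpNo",
    "TITLE": "Title",
    "SSN": "SSN",
    "WORKPHONE": "Phone[@type='work']",
    "HOMEPHONE": "Phone[@type='home']",
    "SALARY": "Salary",
}
VIEW_B = {
    "NAME": "Name",
    "EMPNO": "EmpNo",
    "TITLE": "Title",
    "WORKPHONE": "Phone[@type='work']",
}


def project(document, view):
    """Return one row (column -> text) containing only the mapped paths."""
    root = ET.fromstring(document)
    return {column: root.findtext(path) for column, path in view.items()}


print(project(DOC, VIEW_A))
print(project(DOC, VIEW_B))
```

Because the second mapping never mentions SSN, the home phone, or Salary, a caller that only sees its output has no path to those nodes, which is the same idea the EMPLOYEE_B view relies on.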
Chapter 6. Application development
This chapter covers various aspects of application development using DB2. The information contained here features topics and examples that are specific to application development involving XML. The subjects covered are:
- The database application development environment
- Application development tools
- Accessing pureXML from application overview
- XML and stored procedures
- Web services

The manuals listed here are suggested references for the DB2 application development topics that are covered in this chapter:
- Call Level Interface Guide and Reference, Volume 1, SC10-4224
- Call Level Interface Guide and Reference, Volume 2, SC10-4225
- Command Reference, SC10-4226
- Developing ADO.NET and OLE DB Applications, SC10-4230
- Developing Embedded SQL Applications, SC10-4232
- Developing Java Applications, SC10-4233
- Developing Perl and PHP Applications, SC10-4234
- Developing SQL and External Routines, SC10-4373
- Getting Started with Database Application Development, SC10-4252
- XML Guide, SC10-4254
Developer Workbench
The following support is provided for XML in the Developer Workbench (DWB):
- Stored procedure support: Create and run stored procedures that contain XML data types as input or output parameters.
- Data output support: View documents contained in XML columns as a tree or text.
- SQL builder support: Build SQL expressions with XML functions and run SQL statements that contain XML host variables.
- XML schema support: Manage schema documents in the XML schema repository (XSR), including registering and dropping schemas, as well as editing schema documents.
- XML document validation support: Perform validation of XML documents against schemas registered in the XSR.
- XQuery builder features: Build XQuery statements visually by dragging and dropping nodes that represent elements in an XML schema or document. Specify predicates, expressions, clauses, and sorting preferences for each node. XQuery builder then generates the query for you. (Alternatively, you can write your own statements or modify the generated statements directly in the provided editing environment.) After the query is created, it can be tested by running it directly from the Developer Workbench.
Figure 6-1 Using the Control Center to create a table with an XML column
- Creating a database with XML support
- Creating indexes over XML columns using the new Create Index wizard
- Viewing the contents of XML documents stored in XML columns
- Working with the XML schemas, DTDs, and external entities required to validate and process XML documents
- Collecting statistics on tables containing XML columns
- Using Visual Explain
- Working with the XML schemas, DTDs, and external entities required to validate and process XML documents
- Reorganizing indexes over XML data and tables containing XML columns
- Decomposing XML documents

Figure 6-2 is an example of an XQuery issued from the Command Line Processor.
Details of the DB2 Development Add-in for Visual Studio can be found in 6.8, The DB2 .NET environment on page 321.
- Stored procedure support: Stored procedures that contain XML data type (input or output) parameters or return XML data can be created and run.
- Data output view: XML data type columns can be viewed on the results page, and the content of XML columns can be visualized as a tree or document text.
- XML Editor: With the XML Editor, you can perform the following tasks:
  - Create and edit XML documents
  - Generate XML documents from an XML schema
- Annotated schema mapping tool
- Support for XML schema: Existing XML schemas and XML schema documents can be loaded from the XML schema repository in the database, and properties such as target namespace or schema location can be viewed. New XML schemas (and corresponding XML schema documents) can be registered or dropped.
- XML document validation: XML documents can be validated against a registered XML schema.
- XQuery builder: With the XQuery builder, you can complete queries without understanding XQuery semantics. An XML query can be built visually by selecting sample resultant nodes from a tree representation of a schema or XML document and dragging the nodes onto a return grid. After a node is listed on the return grid, you can drill down into the query to add predicates and sorting preferences. You can drill down multiple levels in a query to specify nested predicates, clauses, and expressions. After the query is built, it can be run and tested directly from Developer Workbench.
2. Choose the XML query wizard: Select File > New > Other. The Select a Wizard window opens (see Figure 6-3). Choose XML Query and then click Next.
Figure 6-3 The Select a wizard dialog box showing XML Query selected
3. Create a new project, or specify an existing one: In the Specify a Project window, choose New to create a new project, or choose an existing project from the Project drop-down box. Click Next. The New Data Development Project window opens. Specify a name for the new project. In our example, a new project named xmlLUW is created. See Figure 6-4.
4. Select a database connection: After creating a new project (or choosing an existing project), the wizard proceeds to the Select Connection window. At this point, a new database connection can be created or an existing database connection can be chosen. See Figure 6-5. Click Next.
5. Specify the JDK home directory: The Specify Routine Parameters window requests that the JDK home directory be specified (see Figure 6-6). Either accept the location displayed or browse to another location. Click Finish.
6. Specify a query name: The New XML Query window opens, as shown in Figure 6-7. Specify a name for the query. In this example, the query is named xmlQuery1. Click Next.
Figure 6-7 The New XML Query window; specifying a name for the query.
7. Add representative XML documents: As shown in Figure 6-8, the New XML Query window will reopen to the Add representative XML documents window.
To add a representative document from your local workspace, or from the database to be queried, select ADD. The following example, Figure 6-9, shows the Specify document location window after ADD has been selected. In our example, Database has been chosen as the location of the representative XML document. Click Next.
Figure 6-9 Selecting Database as the location of the representative XML document
8. Choose the XML document source: The XML column or schema window opens (Figure 6-10) and the INFO column of the CUSTOMER table is chosen as the source containing the XML document to be queried. Click Next.
In Figure 6-11, the New XML Query window reopens and displays the representative document that has been chosen in the previous step. Select the document, then click Next.
9. Associate the document with XML columns: The Associate documents with XML columns window opens (Figure 6-12). Click Finish.
A new query, xmlQuery1.xqm, is added to the Queries node and DWB opens to the XQuery Builder in Design View. This is the main view from which the query will be built. See Figure 6-13.
Figure 6-13 Developer Workbench with XQuery Builder open to Design view
Building an XQuery
Drag and drop the customerinfo node from the sample XML document tree to a row in the design grid. The node name will appear in the grid and a drill in button (arrow highlighted by the red circle here) will be displayed at the end of the row (Figure 6-14).
Figure 6-14 Drag and drop the customerinfo node to the design grid
The source code generated by the GUI can be viewed by selecting the Source tab in the Design view (see Figure 6-15).
FLWOR expression
You can create a more elaborate query by adding predicates to the search and ordering the returned elements. To accomplish this, while in Design view, click the Step into button at the end of the first row in the design grid. When you step into a query, the existing grid is replaced with five new grids representing the FOR, LET, WHERE, ORDER BY, and RETURN parts of the FLWOR expression.
When these changes are made in the GUI, the results can be viewed by selecting the Source tab. Example 6-1 shows the code from the Source tab that was generated by the GUI in the previous steps.
Example 6-1 Source code for the query created by XQuery Builder
values(XMLQUERY('
  declare boundary-space strip;
  declare namespace def0="http://posample.org";
  for $customerinfo0 in db2-fn:xmlcolumn("CUSTOMER.INFO")/def0:customerinfo
  where $customerinfo0/@Cid = 1000
  return (
    $customerinfo0/def0:name,
    $customerinfo0/def0:addr,
    $customerinfo0/def0:phone
  )
' RETURNING SEQUENCE))
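For readers who are new to FLWOR, the for, where, and return clauses of this query behave much like a filtered loop. The following Python sketch mimics that logic over two in-memory documents. It is an analogy, not a way to run XQuery. The sample documents and the function name run_query are invented, and only the namespace and element names are taken from the generated query.

```python
import xml.etree.ElementTree as ET

NS = {"def0": "http://posample.org"}

# Invented sample documents standing in for the CUSTOMER.INFO column.
DOCS = [
    '<customerinfo xmlns="http://posample.org" Cid="1000">'
    '<name>Kathy Smith</name><addr country="Canada"><city>Toronto</city></addr>'
    '<phone type="work">416-555-1358</phone></customerinfo>',
    '<customerinfo xmlns="http://posample.org" Cid="1001">'
    '<name>Jim Noodle</name><addr country="Canada"><city>Markham</city></addr>'
    '<phone type="work">905-555-7258</phone></customerinfo>',
]


def run_query(documents, cid):
    """Mimic: for ... in the documents, where @Cid = cid, return (name, addr, phone)."""
    results = []
    for text in documents:                      # for $customerinfo0 in ...
        customerinfo = ET.fromstring(text)
        cid_attr = customerinfo.get("Cid")
        if cid_attr is None or float(cid_attr) != cid:
            continue                            # where $customerinfo0/@Cid = 1000
        for step in ("def0:name", "def0:addr", "def0:phone"):
            results.extend(customerinfo.findall(step, NS))   # return ( ... )
    return results


for element in run_query(DOCS, 1000):
    print(ET.tostring(element, encoding="unicode"))
```

The returned list plays the role of the XQuery result sequence: the matching customer's name, addr, and phone elements, in that order.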
In order to execute the query, select the Run... option in the XQuery menu, as shown in Figure 6-17.
Example 6-2 DB2 command to create a DB2 database with CODESET UTF-8
CREATE DATABASE MYDB USING CODESET UTF-8 TERRITORY US

Note: Refer to the manual Command Reference, SC10-4226, for the complete syntax and options available for creating a DB2 database.
XML parsing
XML parsing is the process of converting XML data from its serialized string format to its hierarchical format. Simply stated, XML parsing converts character or binary data and produces an XML value. You can let the DB2 database manager perform parsing implicitly, or you can perform XML parsing explicitly.

Implicit XML parsing occurs:
- When you pass data to the database server using a host variable of type XML, or use a parameter marker of type XML. The database server does the parsing when it binds the value for the host variable or parameter marker for use in statement processing. You must use implicit parsing in this case.
- When you assign a host variable, parameter marker, or SQL expression with a string data type (character, graphic, or binary) to an XML column in an INSERT, UPDATE, DELETE, or MERGE statement. The parsing occurs when the SQL compiler implicitly adds an XMLPARSE function to the statement.
1 See www.w3.org/TR for information about the XML 1.0 specification. Extensible Markup Language (XML) 1.0 (Fourth Edition) is the latest recommendation as of the date of this book.
Example 6-3 demonstrates a case of implicit parsing. In this example, the source is an XML document from a column of type VARCHAR.
Example 6-3 An example of implicit parsing
/* 1) Assume table TABLE1 has been created with the following definition: */
/*      CREATE TABLE table1 (id INT, description VARCHAR(200))            */
/* 2) Assume TABLE1 has been populated as follows:                        */
/*      INSERT INTO table1 VALUES (22222,                                 */
/*        '<product xmlns="http://posample.org" pid="80">                 */
/*         <description><name> Plastic Casing </name>                     */
/*         <details> Green Color </details> <price> 7.89 </price>         */
/*         <weight> 6.23 </weight> </description></product>')             */
/* 3) Assume table po has been created with the following definition:     */
/*      CREATE TABLE po (poid BIGINT, porder XML)                         */

char stmt[500];
SQLRETURN cliRC = SQL_SUCCESS;

/* the VARCHAR column is assigned to an XML column, so it is parsed implicitly */
strcpy(stmt, "INSERT INTO po (poid, porder) "
             "(SELECT id, description FROM table1 WHERE id = 22222)");

/* execute the statement */
cliRC = SQLExecDirect(hstmt, (SQLCHAR *)stmt, SQL_NTS);
STMT_HANDLE_CHECK(hstmt, hdbc, cliRC);
Explicit parsing occurs when the XMLPARSE function is invoked on input XML data. The XMLPARSE function takes a non-XML character or binary data type as input. The result of the XMLPARSE function can be used in any context that accepts an XML data type. For example, it can be assigned to an XML column or used as a stored procedure parameter of type XML. For embedded dynamic SQL applications, you must cast the parameter marker that represents the input document for XMLPARSE to the appropriate data type.
Example 6-4 illustrates casting the parameter marker to BLOB(1K) for the input Document parameter of the XMLPARSE function in a dynamic CLI application.
Example 6-4 Casting the parameter marker to BLOB using XMLPARSE function
char stmt[250];
char blobdata[500];
SQLINTEGER length;
SQLRETURN cliRC = SQL_SUCCESS;

/* Assume table po has been created with the following definition: */
/* CREATE TABLE po (poid BIGINT, porder XML) */
strcpy(blobdata, "<product xmlns = \"http://posample.org\" pid=\"10\">"
                 "<description><name> Plastic Casing </name>"
                 "<details> Blue Color </details><price> 2.89 </price>"
                 "<weight> 0.23 </weight></description></product>");

/* take the length only after the buffer has been filled */
length = strlen(blobdata);

strcpy(stmt, "INSERT INTO po (poid, porder) "
             "VALUES (323, XMLPARSE(DOCUMENT CAST(? as BLOB(1K))))");

/* prepare the statement */
cliRC = SQLPrepare(hstmt, (SQLCHAR *)stmt, SQL_NTS);

/* bind the parameter to the INSERT statement */
cliRC = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_BINARY, SQL_BLOB,
                         length, 0, blobdata, length, NULL);

/* execute the statement */
cliRC = SQLExecute(hstmt);

For embedded static SQL applications, a host variable argument of the XMLPARSE function cannot be declared as an XML type (XML AS BLOB, XML AS CLOB, or XML AS DBCLOB type). Example 6-5 illustrates a static embedded SQL application. In this example, the host variable argument of the XMLPARSE function is declared as BLOB.
EXEC SQL BEGIN DECLARE SECTION;
  char xmldata[2000];
  char parse_option[30];
  short nullind = 0;
  static SQL TYPE IS BLOB(1k) hv_blob2 = SQL_BLOB_INIT("<init> a </init>");
EXEC SQL END DECLARE SECTION;

/* Assume table PO has been created with the following definition: */
/* CREATE table PO (poid BIGINT, porder XML) */
strcpy(xmldata, "<product xmlns = \"http://posample.org\" pid=\"10\">"
                "<description><name> Plastic Casing </name>"
                "<details> Blue Color </details><price> 2.89 </price>"
                "<weight> 0.23 </weight></description></product>");

/* copy the document into the BLOB host variable and record its length */
strcpy(hv_blob2.data, xmldata);
hv_blob2.length = strlen(xmldata);

EXEC SQL UPDATE PO
  SET porder = XMLPARSE(DOCUMENT :hv_blob2 STRIP WHITESPACE)
  WHERE POID = 1612;
Boundary whitespace consists of whitespace characters that appear between elements. For example, in the following document the spaces between <customerinfo> and <name> and between </name> and </customerinfo> are considered boundary whitespace.
<customerinfo> <name> </name> </customerinfo>

With explicit invocation of XMLPARSE, you use the STRIP WHITESPACE or PRESERVE WHITESPACE option to control preservation of boundary whitespace. The default is stripping of boundary whitespace.
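To make the definition concrete, the following minimal sketch tests whether a piece of text between two tags consists only of XML whitespace characters (space, tab, carriage return, line feed). The function name is_whitespace_only is hypothetical and is not part of any DB2 API; it only illustrates which text runs qualify as boundary whitespace and would be removed under STRIP WHITESPACE.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if text[0..len) contains only XML whitespace characters
 * (space, tab, carriage return, line feed), otherwise 0.
 * Hypothetical helper for illustration only; not a DB2 function. */
int is_whitespace_only(const char *text, size_t len)
{
    size_t i;

    assert(text != NULL);
    for (i = 0; i < len; i++) {
        char c = text[i];
        if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
            return 0;   /* found real content */
        }
    }
    return 1;
}
```

For the document shown above, the single space between <customerinfo> and <name> passes this test, whereas the text " Plastic Casing " inside a <name> element does not, because it contains non-whitespace characters.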
With implicit XML parsing:
   If the input data type is not an XML type or is not cast to an XML data type, the DB2 database manager always strips whitespace.
   If the input data type is an XML data type, you can use the CURRENT IMPLICIT XMLPARSE OPTION special register to control preservation of boundary whitespace. You can set this special register to STRIP WHITESPACE or PRESERVE WHITESPACE. The default is stripping of boundary whitespace. Note that this special register applies only to nonvalidating XML parsing.
   If the input data type is non-XML, but is CAST as XML (either explicitly or as an ambiguous parameter marker), then implicit XML parsing applies and the CURRENT IMPLICIT XMLPARSE OPTION special register applies as well.

Example 6-6 illustrates setting the CURRENT IMPLICIT XMLPARSE OPTION special register in various application settings.
Example 6-6 Setting the CURRENT IMPLICIT XMLPARSE OPTION special register
CLI:
strcpy((char *)stmt,
       "SET CURRENT IMPLICIT XMLPARSE OPTION = 'PRESERVE WHITESPACE'");
rc = SQLExecDirect(hstmt, stmt, SQL_NTS);

Embedded SQL:
EXEC SQL BEGIN DECLARE SECTION;
  char parse_option[30];
EXEC SQL END DECLARE SECTION;
strcpy(parse_option, "preserve whitespace");
/* SET the register with the option PRESERVE WHITESPACE */
EXEC SQL SET CURRENT IMPLICIT XMLPARSE OPTION = :parse_option;

JAVA (SQLJ):
String parse_option = "preserve whitespace";
#sql { SET CURRENT IMPLICIT XMLPARSE OPTION = :parse_option};

Note: The CurrentImplicitXMLParseOption can also be set in the db2cli.ini initialization file. Refer to Current Implicit XML Parse Option in the Call Level Interface Guide and Reference, Volume 1, SC10-4224, for details.
XML validation
XML validation is the process of determining whether the structure, content, and data types of an XML document are valid. XML validation also adds type annotations to element nodes, attribute nodes, and atomic values, and strips off ignorable whitespace in the XML document. Validation is optional, but highly recommended.

The XMLVALIDATE function is used to validate an XML document. It is commonly used when inserting or updating an XML document in a DB2 database. XMLVALIDATE can also be invoked on an XML document that is not in a database. Before you can invoke XMLVALIDATE, all schema documents that make up an XML schema must be registered in the built-in XML schema repository (XSR). An XML schema provides the rules for a valid XML document.

If you use XML validation, the DB2 database manager ignores the CURRENT IMPLICIT XMLPARSE OPTION special register and uses only the validation rules to determine stripping or preservation of whitespace in the following cases:
   xmlvalidate(? ACCORDING TO XMLSCHEMA ID schema name)
   xmlvalidate(?)
   xmlvalidate(:hvxml ACCORDING TO XMLSCHEMA ID schema name)
   xmlvalidate(:hvxml)
   xmlvalidate(cast(? as xml) ACCORDING TO XMLSCHEMA ID schema name)
   xmlvalidate(cast(? as xml))
In these cases, the question mark (?) represents XML data, and :hvxml is an XML host variable.

Important: The insert or update operation on which the XMLVALIDATE was specified will only occur if the validation succeeds.
Externally encoded data can have internal encoding. That is, the data might be sent to the database server as character data, but the data contains encoding information. The database server handles incompatibilities between internal and external encoding as follows:
   If the database server is DB2 Database for Linux, UNIX, and Windows, the database server generates an error if the external and internal encoding are incompatible, unless the external and internal encoding are Unicode. If the external and internal encoding are Unicode, the database server ignores the internal encoding. If the internal encoding is Unicode, but the external encoding is non-Unicode, the mismatch will be flagged.
   If the database server is DB2 for z/OS, the database server ignores the internal encoding.

Data in XML columns is stored in UTF-8 encoding. The database server handles conversion of the data from its internal or external encoding to UTF-8.

When you store XML data in a DB2 table, observe the following rules:
   If the internal and external encoding are not Unicode encoding, for externally encoded XML data (data that is sent to the database server using character data types), any internally encoded declaration must match the external encoding. Otherwise, an error occurs, and the database manager rejects the document.
   If the external encoding and the internal encoding are Unicode encoding, and the encoding schemes do not match, the DB2 database server ignores the internal encoding.
   For internally encoded XML data (data that is sent to the database server using binary data types), the application must ensure that the data contains accurate encoding information.
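The rules above for DB2 Database for Linux, UNIX, and Windows can be summarized as a small decision function. The sketch below is only a model of the prose, not DB2 code: the names enc_result and check_encoding are hypothetical, and it assumes that "incompatible" can be approximated by a simple match/no-match flag supplied by the caller.

```c
#include <assert.h>

typedef enum {
    ENC_ACCEPT,           /* declaration agrees with external encoding */
    ENC_IGNORE_INTERNAL,  /* both Unicode: internal declaration ignored */
    ENC_ERROR             /* incompatible: document is rejected */
} enc_result;

/* Models how externally encoded XML data that also carries an internal
 * encoding declaration is treated on DB2 for Linux, UNIX, and Windows.
 * Each argument is a flag (0 or 1). Hypothetical helper, not a DB2 API. */
enc_result check_encoding(int external_is_unicode,
                          int internal_is_unicode,
                          int encodings_match)
{
    assert(encodings_match == 0 || encodings_match == 1);

    if (external_is_unicode && internal_is_unicode) {
        return ENC_IGNORE_INTERNAL;   /* a mismatch is tolerated */
    }
    if (encodings_match) {
        return ENC_ACCEPT;
    }
    return ENC_ERROR;                 /* includes Unicode vs non-Unicode */
}
```

For example, a document declared as UTF-16 but sent as UTF-8 character data falls into the ENC_IGNORE_INTERNAL case, while a document declared as UTF-8 but sent in a non-Unicode code page falls into ENC_ERROR.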
An application program can retrieve an entire document or a fragment of a document from an XML column. However, you can store only an entire document in an XML column. When you fetch an entire XML document, you retrieve the document into an application variable. When you retrieve an XML sequence, you have several choices:
   Execute an XQuery expression directly. To execute an XQuery expression in an application, you prefix the XQuery expression with the string XQUERY, and dynamically execute the resulting string. When you execute an XQuery expression directly, the DB2 database server returns the sequence that is the result of the XQuery statement as a result table. Each row in the result table is an item in the sequence.
   Execute an XQuery expression within an SQL FETCH or single-row SELECT INTO operation by calling the XMLQUERY or XMLTABLE built-in functions and passing an XQuery expression as an argument. XMLQUERY is a scalar function that returns the entire sequence in an application variable. XMLTABLE is a table function that returns each item in the sequence as a row of the result table. The columns in the result table are values from the retrieved sequence item. An illustration of this is shown in Example 6-7. This technique can be used with static or dynamic SQL and any application programming language.
Example 6-7 Executing an XQuery expression within an SQL FETCH
select deptID,
       xmlquery('for $d in $doc/dept where $d/@bldg = 101 return $d/name'
                passing doc as "doc")
from dept
where deptID <> 'PR27';
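The other choice described above, executing an XQuery expression directly, requires the statement text to begin with the string XQUERY. As a minimal sketch, the following helper builds such a statement string with a bounds check; the function name build_xquery_statement is hypothetical, and the resulting string would still have to be passed to a dynamic execution call such as SQLExecDirect().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "XQUERY " followed by expr into out.
 * Returns 0 on success, or -1 if the result (including the
 * terminating null) does not fit in out_size bytes.
 * Hypothetical helper for illustration; not a DB2 function. */
int build_xquery_statement(char *out, size_t out_size, const char *expr)
{
    int n;

    assert(out != NULL && expr != NULL);
    n = snprintf(out, out_size, "XQUERY %s", expr);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;   /* encoding error or truncated output */
    }
    return 0;
}
```

Called with the expression db2-fn:xmlcolumn('DEPT.DOC')/dept/name, the helper produces a statement that returns one result row per item in the sequence. Checking the return value matters because a silently truncated XQuery string would fail, or worse, run as a different query.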
In order to pass application values to XQuery expressions, use the SQL/XML functions XMLQUERY and XMLTABLE. The PASSING clause of these functions allows you to use application values during the evaluation of the XQuery expression. Example 6-8 illustrates passing an application value to an XQuery expression, in a Java application, using the SQL/XML function XMLQUERY.
Example 6-8 Passing an application value to an XQuery expression in a Java application
// The table CUSTOMER exists with the following definition:
// CREATE TABLE customer (cid BIGINT, info XML, history XML)
private static int cid=1002;
...
Statement stmt = con.createStatement();
String query="select xmlquery('declare default element namespace \"http://posample.org\";"+
             " for $customer in $cust/customerinfo"+
             " where ($customer/@Cid gt $id)"+
             " return <customer id=\"{$customer/@Cid}\">"+
             " {$customer/name} {$customer/addr} </customer>'"+
             " passing by ref customer.info as \"cust\", cast(? as integer) as \"id\")"+
             " from customer";
// Prepare the statement
PreparedStatement pstmt = con.prepareStatement(query);
// Set the value for the parameter marker
pstmt.setInt(1,cid);
ResultSet rs = pstmt.executeQuery();
Truncation can occur when conversion to the target data type results in expansion of the data. Truncation is possible because expansion can occur when UTF-8 characters are converted to UTF-16 or UCS-2 encoding. Note: Refer to the XML Guide, SC10-4254, chapter 8, XML CODING for complete details regarding XML coding considerations.
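To see why expansion leads to truncation, it helps to compute how many bytes a UTF-8 string occupies after conversion to UTF-16. The following sketch does this for well-formed UTF-8 input; the function name utf16_bytes_needed is hypothetical and is shown only to illustrate sizing an output buffer, not how DB2 performs the conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of bytes needed to hold the UTF-16 form of a
 * well-formed, null-terminated UTF-8 string (terminator not counted).
 * Hypothetical helper for illustration; not a DB2 function. */
size_t utf16_bytes_needed(const char *utf8)
{
    const unsigned char *p = (const unsigned char *)utf8;
    size_t bytes = 0;

    assert(utf8 != NULL);
    for (; *p != '\0'; p++) {
        if ((*p & 0xC0) == 0x80) {
            continue;   /* continuation byte: character already counted */
        }
        /* A 4-byte UTF-8 sequence (lead byte 0xF0 or above) becomes a
         * surrogate pair; every other character is one 16-bit unit. */
        bytes += (*p >= 0xF0) ? 4 : 2;
    }
    return bytes;
}
```

A three-byte ASCII string such as "abc" needs six bytes in UTF-16, so a target buffer sized for the UTF-8 length would hold only half of the converted data.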
Note: Before you set up your CLI environment, ensure that you have set up the application development environment. Refer to the Call Level Interface Guide and Reference, Volume 1, SC10-4224 for an overview of the CLI application development environment setup.

In order for a DB2 CLI application to successfully access a DB2 database:
1. Ensure that the DB2 CLI/ODBC driver was installed during the DB2 client install.
2. Catalog the DB2 database and node if the database is being accessed from a remote client. On the Windows platform, you can use the CLI/ODBC Settings GUI to catalog the DB2 database.
3. Optional: Explicitly bind the DB2 CLI/ODBC bind files to the database with the command:
   db2 bind ~/sqllib/bnd/@db2cli.lst blocking all sqlerror continue \
      messages cli.msg grant public
   On the Windows platform, you can use the CLI/ODBC Settings GUI to bind the DB2 CLI/ODBC bind files to the database.
4. Optional: Change the DB2 CLI/ODBC configuration keywords by editing the db2cli.ini file, located in the sqllib directory on Windows, and in the sqllib/cfg directory on UNIX platforms. On the Windows platform, you can use the CLI/ODBC Settings GUI to set the DB2 CLI/ODBC configuration keywords.

Once you have completed the foregoing steps, proceed to setting up your Windows CLI environment, or setting up your UNIX ODBC environment if you are running ODBC applications on UNIX.
On Windows:
   sqllib\samples\xml\cli
   sqllib\samples\xml\xquery\cli

The build scripts, bldapp (on UNIX) or bldapp.bat (Windows), contain the commands to build a DB2 CLI application. Each script takes up to four parameters, represented inside the UNIX script file by the variables $1, $2, $3, and $4, or inside the Windows file by the variables %1, %2, %3, and %4.
   Parameter $1 (%1): This parameter specifies the name of your source file. This is the only required parameter, and the only one needed for CLI applications that do not contain embedded SQL. Building embedded SQL programs requires a connection to the database, so three optional parameters are also provided.
   Parameter $2 (%2): This parameter specifies the name of the database to which you want to connect.
   Parameter $3 (%3): This parameter specifies the user ID for the database.
   Parameter $4 (%4): This parameter specifies the password.

If the program contains embedded SQL, indicated by a .sqc or .sqx extension, then the embprep (UNIX) or the embprep.bat (Windows) script is called to precompile the program, producing a program file with a .c or a .cxx extension.

To build the sample program tbinfo from the source file tbinfo.c, enter:
   bldapp tbinfo
The result is an executable file, tbinfo. You can run the executable file by entering the executable name:
   tbinfo

In addition to the sample build scripts supplied by DB2, it is possible to build all of the applications by executing the makefile that is found in the corresponding directories.
On UNIX, the makefile is found in these directories:
   sqllib/samples/cli
   sqllib/samples/xml/cli
   sqllib/samples/xml/xquery/cli
On Windows, the makefile is found in these directories:
   sqllib\samples\cli
   sqllib\samples\xml\cli
   sqllib\samples\xml\xquery\cli
Before running the makefile, modify the makefile to reflect your environment:
   set UID (user ID to access the sample database)
   set PWD (password to access the sample database)

To build the file or files, execute the appropriate command for your environment in your working directory, for example:
   UNIX:    make some_parameter
   Windows: nmake some_parameter
Where some_parameter corresponds to one of the parameters specified below:
Note: To ensure a successful build of the sample applications, we suggest that you:
   Read the Prerequisites section of the header in the sample file and follow the directions and suggestions before building or running the sample.
   Make sure that a compatible make, or nmake, executable program is resident on your system in a directory included in your PATH variable.
Example 6-9 shows an INSERT of XML data into an XML column. In this example, the data buffer is bound with the recommended SQL_C_BINARY type, and the ParameterType for SQLBindParameter() is SQL_XML. Because SQL_C_BINARY is used, the data must be internally encoded in order to be interpreted correctly. In this example the internal encoding is declared as ISO-8859-1.
Example 6-9 Inserting XML data with recommended SQL_C_BINARY type binding
char xmldata[500];
int length;
SQLRETURN cliRC = SQL_SUCCESS;

/* Assume the table PO has been created with the following definition: */
/* CREATE table PO (poid BIGINT, porder XML) */
strcpy(xmldata,
       "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>"
       "<product xmlns = \"http://posample.org\" pid=\"10\"><description>"
       "<name> Plastic Casing </name>"
       "<details> Blue Color </details>"
       "<price> 2.89 </price>"
       "<weight> 0.23 </weight>"
       "</description></product>");
length = strlen(xmldata);

/* inserting when source is from host variable of type XML */
strcpy(stmt, "INSERT INTO PO (poid, porder) "
             "VALUES (8956, ?)");

/* prepare the statement */
cliRC = SQLPrepare(hstmt, (SQLCHAR *)stmt, SQL_NTS);

/* bind parameter to the INSERT statement */
cliRC = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_BINARY, SQL_XML,
                         length, 0, &xmldata, length, NULL);
cliRC = SQLExecute(hstmt);

Example 6-10 demonstrates an INSERT of XML data into an XML column. In this example, the data buffer is bound with the SQL_C_CHAR type. The function XMLCAST is used to typecast the character data to an XML data type.
Example 6-10 Using XMLCAST to typecast data into an XML column
char xmldata[500];
SQLRETURN cliRC = SQL_SUCCESS;

/* Assume the table PO exists with the following definition: */
/* CREATE table po (poid BIGINT, porder XML) */
strcpy(xmldata,
       "<product xmlns = \"http://posample.org\" pid=\"10\"><description>"
       "<name> Plastic Casing </name>"
       "<details> Blue Color </details>"
       "<price> 2.89 </price>"
       "<weight> 0.23 </weight>"
       "</description></product>");

strcpy(stmt, "INSERT INTO PO (poid, porder) "
             "VALUES(125, XMLCAST(? as XML))");
cliRC = SQLPrepare(hstmt, (SQLCHAR *)stmt, SQL_NTS);

/* bind parameter to the INSERT statement */
cliRC = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                         500, 0, &xmldata, 500, NULL);
cliRC = SQLExecute(hstmt);
The code segment in Example 6-11 illustrates binding a parameter marker for an INSERT operation when the source is a variable of type XML. This example also demonstrates implicit parsing.
Example 6-11 An INSERT with implicit parsing
strcpy(xmldata,
       "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>"
       "<product xmlns = \"http://posample.org\" pid=\"10\"><description>"
       "<name> Plastic Casing </name>"
       "<details> Blue Color </details>"
       "<price> 2.89 </price>"
       "<weight> 0.23 </weight>"
       "</description></product>");
length = strlen(xmldata);

/* Assume the table PO exists with the following definition: */
/* CREATE table po (poid BIGINT, porder XML) */
strcpy(stmt, "INSERT INTO PO (poid, porder) "
             "VALUES (8956, ?)");

/* prepare the statement before binding and executing it */
cliRC = SQLPrepare(hstmt, (SQLCHAR *)stmt, SQL_NTS);
cliRC = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_XML,
                         length, 0, &xmldata, length, NULL);
cliRC = SQLExecute(hstmt);
Example 6-12 illustrates performing an INSERT when the source is an XML document from a column of type VARCHAR. In this case, the description column is explicitly parsed because an SQL expression with a string data type is assigned to an XML column.
Example 6-12 INSERT an XML document from VARCHAR column with explicit parsing
/* 1) Assume table TABLE1 has been created with the following definition: */
/*    CREATE TABLE table1 (id INT, description VARCHAR(500)) */
/* 2) Assume TABLE1 has been populated as follows: */
/*    INSERT INTO table1 VALUES (22222,
        '<product xmlns = "http://posample.org" pid="80">
         <description><name> Plastic Casing </name>
         <details> Green Color </details>
         <price> 7.89 </price>
         <weight> 6.23 </weight>
         </description></product>') */
/* 3) Assume table po has been created with the following definition: */
/*    CREATE TABLE po (poid BIGINT, porder XML) */
char stmt[500];
SQLRETURN cliRC = SQL_SUCCESS;

strcpy(stmt, "INSERT INTO po (poid, porder) "
             "(SELECT id, XMLPARSE(DOCUMENT description) "
             "FROM table1 WHERE id = 22222)");

/* execute the statement */
cliRC = SQLExecDirect(hstmt, (SQLCHAR *)stmt, SQL_NTS);
STMT_HANDLE_CHECK(hstmt, hdbc, cliRC);
When retrieving a result set from an XML column, we recommend that you bind your application variable to the SQL_C_BINARY type. Binding to character types can result in data loss from code page conversion. Data loss can occur when characters in the source code page cannot be represented in the target code page. Binding your variable to the SQL_C_BINARY C type avoids these issues.

XML data is returned to the application as internally encoded data. DB2 CLI determines the encoding of the data as follows:
   If the C type is SQL_C_BINARY, the data is returned in the UTF-8 encoding scheme.
   If the C type is SQL_C_CHAR or SQL_C_DBCHAR:
      If the C type is SQL_C_CHAR, the data is returned in the application character code page encoding scheme.
      If the C type is SQL_C_DBCHAR, the data is returned in the application graphic code page encoding scheme.
   If the C type is SQL_C_WCHAR, the data is returned in the UCS-2 encoding scheme.

When an XML value is retrieved into an application data buffer, the DB2 server performs an implicit serialization on the XML value to convert it from its stored hierarchical form to the serialized string form. For character typed buffers, the XML value is implicitly serialized to the application code page associated with the character type.

By default, an XML declaration is included in the output serialized string. This default behavior can be changed by setting the Attribute and ValuePtr arguments of SQLSetStmtAttr(), respectively, to:
   SQL_ATTR_XML_DECLARATION
   SQL_XML_DECLARATION_NONE
For further information about CLI connection attributes, refer to the manual: CLI Guide and Reference, volume 2.

The default behavior for including an XML declaration in the output serialized string can also be altered by changing the XMLDeclaration CLI/ODBC configuration keyword in the db2cli.ini file. Refer to the manual CLI Guide and Reference, volume 1, for more information.
Example 6-13 on page 288 illustrates setting the SQL_ATTR_XML_DECLARATION attribute in the SQLSetStmtAttr() function.
int rc = 0;
/* SQLSetStmtAttr() operates on a statement handle */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_XML_DECLARATION,
                    (SQLPOINTER)SQL_XML_DECLARATION_NONE, SQL_NTS);

The code segment in Example 6-14 illustrates binding the column of a result set to an application variable declared as a character data type SQL_C_CHAR. This example also shows an XQuery that is not preceded by the keyword XQUERY. As required, the SQL_ATTR_XQUERY_STATEMENT attribute of the SQLSetStmtAttr() function has been set to SQL_TRUE, indicating that the statement is an XQuery.
Example 6-14 Binding the column of a result set to a character data type
SQLRETURN cliRC = SQL_SUCCESS;
int rc = 0;
SQLHANDLE hstmt; /* statement handle */
SQLVARCHAR xmldata[3000];

/* The table Customer exists with the following definition: */
/* CREATE table CUSTOMER ( cid BIGINT, info XML, history XML) */

/* query to be executed */
SQLCHAR *stmt = (SQLCHAR *)
    "declare default element namespace \"http://posample.org\";"
    "for $custinfo in db2-fn:xmlcolumn('CUSTOMER.INFO')"
    "/customerinfo[addr/@country=\"Canada\"]"
    " order by $custinfo/name"
    " return $custinfo";

cliRC = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);

/* Set the attribute SQL_ATTR_XQUERY_STATEMENT to indicate that
   the query is an XQuery */
rc = SQLSetStmtAttr(hstmt, SQL_ATTR_XQUERY_STATEMENT,
                    (SQLPOINTER)SQL_TRUE, SQL_NTS);
if (rc != 0)
{
  return rc;
}
cliRC = SQLExecDirect(hstmt, stmt, SQL_NTS);

/* bind column 1 to variable, using the full size of the buffer */
cliRC = SQLBindCol(hstmt, 1, SQL_C_CHAR, &xmldata, sizeof(xmldata), NULL);

/* fetch each row and display */
cliRC = SQLFetch(hstmt);
...

The code segment in Example 6-15 illustrates a query that binds the result of an SQL/XML query to an application variable bound with an SQL_C_BINARY data type.
Example 6-15 An SQL/XML query with the result bound to a column of SQL_C_BINARY data type
...
char xmlBuffer[10240]; /* xmlBuffer is used to hold the retrieved XML document */
SQLINTEGER length;     /* receives the number of bytes returned */

/* Assume a table named dept has been created with the definition */
/* CREATE TABLE dept (id CHAR(8), deptdoc XML) */

/* SQLExecDirect() prepares and executes the statement text in one call */
SQLExecDirect (hStmt, (SQLCHAR *)"SELECT deptdoc FROM dept WHERE id='001'", SQL_NTS);
/* pass the buffer size by value and the length indicator by address */
SQLBindCol (hStmt, 1, SQL_C_BINARY, xmlBuffer, sizeof (xmlBuffer), &length);
SQLFetch (hStmt);
SQLCloseCursor (hStmt);
/* xmlBuffer now contains a valid XML document encoded in UTF-8 */
...
This section focuses on various aspects of embedded SQL application programming, specifically in relation to XML. The full details of developing embedded SQL applications are beyond the scope of this document.

Note: For a complete understanding of application development using embedded SQL, refer to the manual Developing Embedded SQL Applications, SC10-4232.
On Windows:
The build scripts, bldapp (on UNIX) or bldapp.bat (Windows), contain the commands necessary to build a DB2 CLI application. Each script takes up to four parameters, represented inside the UNIX script file by the variables $1, $2, $3, and $4, or inside the Windows file by the variables %1, %2, %3, and %4.
   Parameter $1 (%1): Specifies the name of your source file. This is the only required parameter, and the only one required for CLI applications that do not contain embedded SQL. Building embedded SQL programs requires a connection to the database, so three optional parameters are also provided.
   Parameter $2 (%2): Specifies the name of the database to which you want to connect.
   Parameter $3 (%3): Specifies the user ID for the database.
   Parameter $4 (%4): Specifies the password.

For embedded SQL programs, the build files, bldapp or bldapp.bat, pass the parameters to the precompile and bind script, embprep (UNIX) or embprep.bat (Windows). If no database name is supplied, the default SAMPLE database is used. The user ID and password parameters are only required if the instance where the program is built is different from the instance where the database is located.
If accessing another database on the same instance, enter the executable name and the database name:
   tbmod database
If accessing a database on another instance, enter the executable name, database name, and user ID and password of the database instance:
   tbmod database userid password
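A program that supports these invocation forms has to interpret its command-line arguments accordingly. The following is a minimal sketch of such handling, assuming that running the program with no arguments selects the sample database. The connect_args structure and the parse_connect_args function are hypothetical names introduced for this illustration and are not taken from the DB2 sample utilities.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *database;  /* defaults to "sample" (an assumption) */
    const char *userid;    /* NULL when not supplied */
    const char *password;  /* NULL when not supplied */
} connect_args;

/* Interprets: prog | prog database | prog database userid password.
 * Returns 0 on success, -1 for any other argument count.
 * Hypothetical helper for illustration; not a DB2 function. */
int parse_connect_args(int argc, char *argv[], connect_args *out)
{
    assert(argv != NULL && out != NULL);

    out->database = "sample";
    out->userid = NULL;
    out->password = NULL;

    if (argc == 1) {
        return 0;                    /* same instance, default database */
    }
    if (argc == 2) {
        out->database = argv[1];     /* same instance, named database */
        return 0;
    }
    if (argc == 4) {
        out->database = argv[1];     /* another instance: credentials */
        out->userid = argv[2];
        out->password = argv[3];
        return 0;
    }
    return -1;                       /* for example, user ID without password */
}
```

Rejecting a user ID without a password mirrors the usage shown above, in which the two values are always supplied together.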
Where some_parameter corresponds to one of the parameters specified here:
   make (or nmake) <app_name>   /* Builds the program designated by <app_name> */
   make (or nmake) all          /* Builds all supplied sample programs */
   make (or nmake) srv          /* Builds samples that can only be run on the server (stored procedures) */
   make (or nmake) all_client   /* Builds all client samples (all programs in the 'call_rtn' and 'client_run' categories) */
   make (or nmake) call_rtn     /* Builds client programs that call stored procedures */
   make (or nmake) client_run   /* Builds all programs that run completely on the client (not ones that call stored procedures) */
   make (or nmake) clean        /* Erases all intermediate files produced in the build process */
   make (or nmake) cleanall     /* Erases all files produced in the build process (all files except the original source files) */

Note: To ensure a successful build of the sample applications, we suggest that you:
   Read the Prerequisites section of the header in the sample file and follow the directions and suggestions before building or running the sample.
   Make sure that a compatible make, or nmake, executable program is resident on your system in a directory included in your PATH variable.
EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS CLOB(10K) xmlclob;
...
EXEC SQL END DECLARE SECTION;

SQL TYPE IS XML AS DBCLOB(n) <hostvar_name>
Where <hostvar_name> is a DBCLOB host variable that contains XML data encoded in the application graphic codepage. See Example 6-17 on page 295.

EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS DBCLOB(10K) xmldbclob;
...
EXEC SQL END DECLARE SECTION;

SQL TYPE IS XML AS BLOB(n) <hostvar_name>
Where <hostvar_name> is a BLOB host variable that contains XML data internally encoded. See Example 6-18.

Example 6-18 BLOB SQL type
EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS BLOB(10K) xmlblob;
...
EXEC SQL END DECLARE SECTION;

SQL TYPE IS XML AS CLOB_FILE <hostvar_name>
Where <hostvar_name> is a CLOB file that contains XML data encoded in the application mixed codepage. See Example 6-19.

Example 6-19 CLOB_FILE SQL type
EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS CLOB_FILE clob_file;
...
EXEC SQL END DECLARE SECTION;

SQL TYPE IS XML AS DBCLOB_FILE <hostvar_name>
Where <hostvar_name> is a DBCLOB file that contains XML data encoded in the application graphic codepage. See Example 6-20.

Example 6-20 DBCLOB_FILE SQL type
EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS DBCLOB_FILE dbclob_file;
...
EXEC SQL END DECLARE SECTION;

SQL TYPE IS XML AS BLOB_FILE <hostvar_name>
Where <hostvar_name> is a BLOB file that contains XML data internally encoded. Example 6-21 shows an example.

Example 6-21 BLOB_FILE SQL type
EXEC SQL BEGIN DECLARE SECTION;
...
SQL TYPE IS XML AS BLOB_FILE blob_file;
...
EXEC SQL END DECLARE SECTION;
// The table definition for the table myTable is:
//
// CREATE TABLE myTable (id varchar(5), xmlCol XML)
//
EXEC SQL BEGIN DECLARE SECTION;
  SQL TYPE IS XML AS CLOB( 10K ) xmlBuf;
  SQL TYPE IS XML AS BLOB( 10K ) xmlblob;
  SQL TYPE IS CLOB( 10K ) clobBuf;
EXEC SQL END DECLARE SECTION;

// as XML AS CLOB
EXEC SQL SELECT xmlCol INTO :xmlBuf FROM myTable WHERE id = '001';
EXEC SQL UPDATE myTable SET xmlCol = :xmlBuf WHERE id = '001';

// as XML AS BLOB
EXEC SQL SELECT xmlCol INTO :xmlblob FROM myTable WHERE id = '001';
EXEC SQL UPDATE myTable SET xmlCol = :xmlblob WHERE id = '001';

/* as CLOB using XMLSERIALIZE to return a serialized version
   of CLOB data type */
// The output will be encoded in the application character codepage,
// but will not contain an XML declaration
EXEC SQL SELECT XMLSERIALIZE (xmlCol AS CLOB(10K)) INTO :clobBuf
  FROM myTable WHERE id = '001';
// XMLPARSE requires the DOCUMENT keyword before its argument
EXEC SQL UPDATE myTable
  SET xmlCol = XMLPARSE (DOCUMENT :clobBuf PRESERVE WHITESPACE)
  WHERE id = '001';
The precompiler generates a structure tag which can be used to cast to the host variable's type. Following are generated structure tags for various data type declarations:

BLOB example:
Declaration:
   static Sql Type is Blob(2M) my_blob=SQL_BLOB_INIT("mydata");
This declaration results in the generation of the following structure:
   static struct my_blob_t {
      sqluint32 length;
      char data[2097152];
   } my_blob=SQL_BLOB_INIT("mydata");

CLOB example:
Declaration:
   volatile sql type is clob(125m) *var1, var2 = {10, "data5data5"};
This declaration results in the generation of the following structure:
   volatile struct var1_t {
      sqluint32 length;
      char data[131072000];
   } * var1, var2 = {10, "data5data5"};

DBCLOB examples:
Declaration:
   SQL TYPE IS DBCLOB(30000) my_dbclob1;
When precompiled with the WCHARTYPE NOCONVERT option, this declaration results in the generation of the following structure:
   struct my_dbclob1_t {
      sqluint32 length;
      sqldbchar data[30000];
   } my_dbclob1;
Declaration:
   SQL TYPE IS DBCLOB(30000) my_dbclob2 = SQL_DBCLOB_INIT(L"mydbdata");
When precompiled with the WCHARTYPE CONVERT option, this declaration results in the generation of the following structure:
   struct my_dbclob2_t {
      sqluint32 length;
      wchar_t data[30000];
   } my_dbclob2 = SQL_DBCLOB_INIT(L"mydbdata");
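Because the generated structures carry the data length in a separate member, an application that fills such a host variable at run time has to set both the data and the length. The sketch below illustrates this with a stand-in structure: demo_blob_t and blob_set are hypothetical names, and unsigned int substitutes for sqluint32 so that the fragment compiles without the DB2 headers.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the structure the precompiler would generate for
 * SQL TYPE IS BLOB(1K); unsigned int substitutes for sqluint32. */
struct demo_blob_t {
    unsigned int length;
    char data[1024];
};

/* Copies the null-terminated string src into the blob and records
 * its length. The terminating null is not stored.
 * Returns 0 on success, -1 if src does not fit.
 * Hypothetical helper for illustration; not a DB2 function. */
int blob_set(struct demo_blob_t *blob, const char *src)
{
    size_t n;

    assert(blob != NULL && src != NULL);
    n = strlen(src);
    if (n > sizeof(blob->data)) {
        return -1;               /* document larger than the LOB */
    }
    memcpy(blob->data, src, n);
    blob->length = (unsigned int)n;
    return 0;
}
```

Note that the data member is not necessarily null-terminated. Code that prints it should either bound the output by the length member or check that length is smaller than the array size before writing a terminator at data[length].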
EXEC SQL BEGIN DECLARE SECTION;
  short nullind = 0;
  static SQL TYPE IS XML AS CLOB(1k) xmlclob1 = SQL_CLOB_INIT("<a> a </a>");
  static SQL TYPE IS BLOB(1k) hv_blob2 = SQL_BLOB_INIT("<init> a </init>");
  static SQL TYPE IS XML AS BLOB(1k) xmlblob3 = SQL_BLOB_INIT("<init> a</init>");
EXEC SQL END DECLARE SECTION;

EXEC SQL INSERT INTO purchaseorder (poid, porder)
  VALUES (1612, :xmlclob1:nullind);

EXEC SQL INSERT INTO purchaseorder (poid, porder)
  VALUES (712, XMLPARSE(DOCUMENT :hv_blob2:nullind STRIP WHITESPACE));

EXEC SQL INSERT INTO purchaseorder (poid, porder)
  VALUES (999, :xmlblob3:nullind);
Example 6-24 shows an embedded XQuery statement. Observe that the statement is dynamically prepared, declared, opened, and fetched.
Example 6-24 An embedded XQuery statement
EXEC SQL INCLUDE SQLCA;
EXEC SQL BEGIN DECLARE SECTION;
  char stmt[16384];
  SQL TYPE IS XML AS BLOB( 10K ) xmlblob;
EXEC SQL END DECLARE SECTION;

sprintf( stmt,
         "XQUERY declare default element namespace \"http://posample.org\";"
         "db2-fn:xmlcolumn('CUSTOMER.INFO')");

EXEC SQL PREPARE s1 FROM :stmt;
EXEC SQL DECLARE c1 CURSOR FOR s1;
EXEC SQL OPEN c1;
EXEC SQL FETCH c1 INTO :xmlblob;

while( sqlca.sqlcode == SQL_RC_OK )
{
  /* Display results */
  xmlblob.data[xmlblob.length]='\0';
  printf("\n\n\n%s",xmlblob.data);
  EXEC SQL FETCH c1 INTO :xmlblob;
  EMB_SQL_CHECK("cursor -- fetch");
}
EXEC SQL CLOSE c1;

The alternative to using dynamic XQuery statements is to use the XMLQUERY function. In this way, XQuery constructs can be embedded statically in an SQL statement. The code segment in Example 6-25 shows an embedded static SQL statement in which an XQuery is called from the XMLQUERY function.
Example 6-25 XQuery called from within an XMLQUERY function.
EXEC SQL BEGIN DECLARE SECTION;
  char stmt[16384];
  SQL TYPE IS XML AS BLOB( 10K ) xmlblob;
EXEC SQL END DECLARE SECTION;

EXEC SQL DECLARE C2 CURSOR FOR
  SELECT XMLQUERY(
    'declare default element namespace "http://posample.org";
     $cust/customerinfo[addr/city="Toronto"]'
    PASSING CUSTOMER.INFO as "cust"
    RETURNING SEQUENCE BY REF)
  from customer;
EXEC SQL OPEN c2;
EXEC SQL FETCH c2 INTO :xmlblob;

while( sqlca.sqlcode == SQL_RC_OK )
{
  /* Display results */
  xmlblob.data[xmlblob.length]='\0';
  printf("\n\n\n%s",xmlblob.data);
  EXEC SQL FETCH c2 INTO :xmlblob;
  EMB_SQL_CHECK("cursor -- fetch");
}
EXEC SQL CLOSE c2;
For complete details concerning the SQLDA structure, refer to Chapter 3 of the manual: Developing Embedded SQL Applications, SC10-4232.
During the DB2 Database for Linux, UNIX, and Windows installation process, select Java support on UNIX or Linux, or JDBC support on Windows. These selections are the defaults. Selection of Java support or JDBC support causes the DB2 installation process to automatically perform the following actions:
1. Install the IBM DB2 Driver for JDBC and SQLJ class files, and modify the CLASSPATH to include them.
2. Install the IBM DB2 Driver for JDBC and SQLJ license files, and modify the CLASSPATH to include them.
3. Configure TCP/IP.

In addition to these steps, the following steps must be completed:
   On DB2 servers on which you plan to run Java stored procedures or user-defined functions, update the database manager configuration to include the path where the SDK for Java is located.
   If you plan to run Java stored procedures that work with XML data on DB2 Database for Linux, UNIX, and Windows servers, you must set the IBM DB2 Driver for JDBC and SQLJ as the default JDBC driver for running stored procedures.

Note: For complete information regarding the installation of the DB2 driver for JDBC and SQLJ, refer to: Developing Java Applications, SC10-4233.
On Windows:
   sqllib\samples\xml\java\jdbc
   sqllib\samples\xml\xquery\java\jdbc

To build and run the sample JDBC applications from the command line:
1. Compile the source_filename.java (where source_filename is the name of a source file in the samples directory) to produce the file source_filename.class with this command:
   javac source_filename.java
   For example, if the file is DbInfo.java the command would be:
   javac DbInfo.java
2. Execute the application with this command:
   java source_filename
   For example, to execute the DbInfo.class the command would be:
   java DbInfo

Note: You can also use the Java makefile command to build the sample programs provided. The makefile command can be found in the same directories as the source code for JDBC and SQLJ sample applications.
Input Data Type byte[], BLOB, CLOB, DB2Xml, InputStream, Reader, String String
Encoding considerations
XML data can be internally or externally encoded. When the encoding of XML data is derived from the data itself, it is known as internally encoded data. If the encoding is derived from external sources, it is known as externally encoded data. XML data that is sent to the database server as binary data is treated as internally encoded data. XML data that is sent to the database server as character data is treated as externally encoded data. External encoding for Java applications is always Unicode encoding.
Externally encoded data can have internal encoding. That is, the data might be sent to the database server as character data, but the data contains encoding information. The database server handles incompatibilities between internal and external encoding as follows:
- If the database server is DB2 Database for Linux, UNIX, and Windows, the database server generates an error if the external and internal encoding are incompatible, unless the external and internal encoding are Unicode. If the external and internal encoding are Unicode, the database server ignores the internal encoding.
- If the database server is DB2 for z/OS, the database server ignores the internal encoding.
Data in XML columns is stored in UTF-8 encoding. The database server handles conversion of the data from its internal or external encoding to UTF-8.
Example 6-26 illustrates a technique for inserting XML data from a file into a DB2 database using the PreparedStatement.setBinaryStream method. The data is inserted as binary data, so the database server treats it as internally encoded data.
Example 6-26 Inserting XML data from a file input as binary data
// Assume the table PO exists with the following definition:
//   CREATE TABLE PO (poid BIGINT, porder XML)
String sql = "INSERT INTO PO VALUES(?, ?)";
PreparedStatement stmt = connection.prepareStatement(sql);
stmt.setInt(1, 5000);
// Bind the file as a binary stream so that its internal encoding is used
File binFile = new File("myXmlFile.xml");
InputStream inBin = new FileInputStream(binFile);
stmt.setBinaryStream(2, inBin, (int) binFile.length());
stmt.execute();
Example 6-27 shows a technique for inserting XML data from a file into a DB2 database using the PreparedStatement.setClob() method. The data is inserted as character data (CLOB), so it is treated as externally encoded data.
Example 6-27 Inserting XML data from a file using the setClob() method
int customerid = 0;
String customerInfo = "";
// returnFileValues is a helper (not shown) that returns the file contents as a String
String Data = new String();
Data = returnFileValues("myXmlFile.xml");
// Create a CLOB object
java.sql.Clob clobData = com.ibm.db2.jcc.t2zos.DB2LobFactory.createClob(Data);
PreparedStatement pstmt = con.prepareStatement(
    "UPDATE customer " +
    "SET INFO=XMLPARSE(document cast(? as Clob) strip whitespace)" +
    " WHERE cid=1008");
System.out.println(" Set parameter value: parameter 1 = " + "clobData" );
pstmt.setClob(1, clobData);
pstmt.execute();
General recommendations for input of XML data
Here are some basic recommendations:
- If the input data is in a file, read the data in as a binary stream (setBinaryStream) so that the database manager processes it as internally encoded data.
- If the input data is in a Java application variable, your choice of application variable type determines whether the DB2 database manager uses any internal encoding. If you input the data as a character type (for example, setString), the database manager converts the data from UTF-16 (the application code page) to UTF-8 before storing it.
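The reason for preferring a binary stream for file input can be sketched without a database. In the following illustration, which is an assumption-laden sketch and not DB2 sample code, the class XmlFileInput and its methods writeSample, asBinary, and asCharacters are hypothetical names. Reading the file as bytes leaves the internally encoded data untouched, whereas reading it as characters forces the application to choose a charset, and a wrong choice damages the data before it ever reaches the server.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class XmlFileInput {
    /** Demo data only: writes the document to a temporary file in the given charset. */
    static Path writeSample(String doc, Charset charset) {
        try {
            Path file = Files.createTempFile("sample", ".xml");
            file.toFile().deleteOnExit();
            return Files.write(file, doc.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Binary input: the bytes are passed through unchanged, so the internal encoding survives. */
    static InputStream asBinary(Path file) {
        try {
            return new ByteArrayInputStream(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Character input: the application must already know which charset the file uses. */
    static String asCharacters(Path file, Charset charset) {
        try {
            return new String(Files.readAllBytes(file), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Caf\u00e9</name>";
        Path file = writeSample(doc, StandardCharsets.ISO_8859_1);
        System.out.println("bytes available as binary: " + asBinary(file).available());
        System.out.println("decoded as ISO-8859-1 matches original: "
                + asCharacters(file, StandardCharsets.ISO_8859_1).equals(doc));
        System.out.println("decoded as UTF-8 matches original: "
                + asCharacters(file, StandardCharsets.UTF_8).equals(doc));
    }
}
```

In a JDBC application, the stream returned by a method like asBinary would be the value passed to setBinaryStream, as in Example 6-26.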
String sql = "SELECT POID, PORDER FROM PO WHERE POID = ?";
PreparedStatement stmt = connection.prepareStatement(sql);
stmt.setInt(1, 5000);
ResultSet resultSet = stmt.executeQuery();
resultSet.next(); // position the cursor on the first row
String xml = resultSet.getString("PORDER");
// also possible
InputStream inputStream = resultSet.getBinaryStream("PORDER");
// also possible
Reader reader = resultSet.getCharacterStream("PORDER");
Alternatively, use the ResultSet.getObject method to retrieve the data, cast it to the DB2Xml type, and assign it to a DB2Xml object. Then use a DB2Xml.getDB2XXX or DB2Xml.getDB2XmlXXX method to retrieve the data into a compatible output data type. Example 6-29 illustrates this point.
Example 6-29 Retrieving data using getObject
ResultSet rs = stmt.executeQuery("XQUERY for $i in db2-fn:" +
    "xmlcolumn('COMPANY.DOC') /company/" +
    "emp[@id = '42366'] return $i/name ");
while (rs.next())
{
  com.ibm.db2.jcc.DB2Xml data = (com.ibm.db2.jcc.DB2Xml) rs.getObject(1);
  // Print the result as a DB2 XML string
  System.out.println();
  System.out.println(data.getDB2XmlString());
  System.out.println();
}
Table 6-2 lists the ResultSet methods and corresponding output data types for retrieving XML data.
Table 6-2 ResultSet methods and output data types for retrieving XML data
Method                          Output data type
ResultSet.getAsciiStream        InputStream
ResultSet.getBinaryStream       InputStream
ResultSet.getBytes              byte[]
ResultSet.getCharacterStream    Reader
ResultSet.getObject             DB2Xml
ResultSet.getString             String
Table 6-3 lists the methods and corresponding output data types for retrieving data from a DB2Xml object, as well as the type of encoding in the XML declaration that the driver adds to the output data. To summarize Table 6-3: DB2Xml.getDB2XmlXXX methods add XML declarations with encoding specifications to the output data. DB2Xml.getDB2XXX methods do not add XML declarations with encoding specifications to the output data.
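Table 6-3 Methods, output data types, and encoding specifications
Method                             Output data type   Type of XML internal encoding declaration added
DB2Xml.getDB2AsciiStream           InputStream        None
DB2Xml.getDB2BinaryStream          InputStream        None
DB2Xml.getDB2Bytes                 byte[]             None
DB2Xml.getDB2CharacterStream       Reader             None
DB2Xml.getDB2String                String             None
DB2Xml.getDB2XmlAsciiStream        InputStream        US-ASCII
DB2Xml.getDB2XmlBinaryStream       InputStream        Specified by getDB2XmlBinaryStream targetEncoding parameter
DB2Xml.getDB2XmlBytes              byte[]             Specified by DB2Xml.getDB2XmlBytes targetEncoding parameter
DB2Xml.getDB2XmlCharacterStream    Reader             ISO-10646-UCS-2
DB2Xml.getDB2XmlString             String             ISO-10646-UCS-2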
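The difference between the two method families can be pictured with a small stand-alone sketch. It does not call the DB2 driver: the class XmlDeclarationSketch and the methods withDeclaration and toDeclaredBytes are hypothetical, and the exact text of the declaration that the driver generates (quoting, version attribute) is an assumption. The sketch only shows the general shape of output with and without an XML declaration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class XmlDeclarationSketch {
    /** Hypothetical helper: prefixes serialized XML with a declaration naming an encoding. */
    static String withDeclaration(String xml, String encodingName) {
        return "<?xml version=\"1.0\" encoding=\"" + encodingName + "\"?>" + xml;
    }

    /** Imitates the shape of a byte[] result: a declaration plus bytes in the target encoding. */
    static byte[] toDeclaredBytes(String xml, Charset target) {
        return withDeclaration(xml, target.name()).getBytes(target);
    }

    public static void main(String[] args) {
        String xml = "<name>Kathy Smith</name>";
        // Comparable to a getDB2XXX method: no declaration is added.
        System.out.println(xml);
        // Comparable to a getDB2XmlXXX character method: declaration added.
        System.out.println(withDeclaration(xml, "ISO-10646-UCS-2"));
        // Comparable to a getDB2XmlXXX binary method with a target encoding.
        byte[] bytes = toDeclaredBytes(xml, StandardCharsets.UTF_8);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

An application that passes retrieved data on to another XML parser generally benefits from the declaration, because the parser can then determine the encoding from the data itself.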
For example, if the file is DbAuth.sqlj, the command would be:
   bldsqlj DbAuth
   or
   bldsqlj.bat DbAuth
2. Execute the application with this command:
   java DbAuth
Note: If you are running a Java application on UNIX in a 64-bit DB2 instance but the software development kit for Java is 32-bit, you have to change the DB2 library path before running the application. For example, on AIX:
- If using bash or Korn shell: export LIBPATH=$HOME/sqllib/lib32
- If using C shell: setenv LIBPATH $HOME/sqllib/lib32
Encoding considerations
As with JDBC applications, XML data in SQLJ applications can be internally or externally encoded. When the encoding of XML data is derived from the data itself, it is known as internally encoded data. If the encoding is derived from external sources, it is known as externally encoded data. XML data that is sent to the database server as binary data is treated as internally encoded data. XML data that is sent to the database server as character data is treated as externally encoded data. External encoding for Java applications is always Unicode encoding.
Externally encoded data can have internal encoding. That is, the data might be sent to the database server as character data, but the data contains encoding information. The database server handles incompatibilities between internal and external encoding as follows:
- If the database server is DB2 Database for Linux, UNIX, and Windows, the database server generates an error if the external and internal encoding are incompatible, unless the external and internal encoding are Unicode. If the external and internal encoding are Unicode, the database server ignores the internal encoding.
- If the database server is DB2 for z/OS, the database server ignores the internal encoding.
Data in XML columns is stored in UTF-8 encoding. The database server handles conversion of the data from its internal or external encoding to UTF-8.
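The distinction between character data and binary data can be made concrete with plain Java, independent of SQLJ. The following sketch (the class EncodingSketch and the method asBinary are hypothetical names, and the sample document is invented for illustration) shows that one and the same document yields different byte sequences depending on the encoding chosen, which is why binary input must carry, or otherwise imply, its own encoding, whereas a Java String is always Unicode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    // Invented sample document containing one non-ASCII character.
    static final String DOC = "<city>Montr\u00e9al</city>";

    /** Binary form of the document: the resulting bytes depend on the chosen encoding. */
    static byte[] asBinary(String xml, Charset charset) {
        return xml.getBytes(charset);
    }

    public static void main(String[] args) {
        // As character data, the document is a sequence of Unicode (UTF-16) code units.
        System.out.println("characters: " + DOC.length());
        // As binary data, the length and content vary with the encoding.
        System.out.println("ISO-8859-1 bytes: " + asBinary(DOC, StandardCharsets.ISO_8859_1).length);
        System.out.println("UTF-8 bytes:      " + asBinary(DOC, StandardCharsets.UTF_8).length);
        System.out.println("UTF-16BE bytes:   " + asBinary(DOC, StandardCharsets.UTF_16BE).length);
    }
}
```

Whichever form the application sends, the server converts the data to UTF-8 for storage, as described above.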
Examples
Example 6-30 demonstrates inserting data from a String host expression, xmlData, into an XML column. The String xmlData is a character type, so external encoding is used, whether or not internal encoding is specified.
Example 6-30 Inserting data from a String host expression
// The host variable holds the XML document itself; the server parses it implicitly
String xmlData =
    "<customerinfo cid=\"999\"><address country=\"US\">" +
    "<street>225 Brown St.</street><city>White Plains</city>" +
    "<state>NEW YORK</state></address></customerinfo>";
#sql [ctx] {INSERT INTO CUSTOMER VALUES (1, :xmlData)};
Example 6-31 demonstrates copying data from a String, xmlData, into a byte array with CP500 encoding. In this scenario, the data is expected to contain an XML declaration with an encoding declaration for CP500. The data is then inserted from the byte[] host expression into an XML column. A byte string is considered to be internally encoded data.
Example 6-31 Copying data from a String into a byte array with CP500 encoding
// The host variable holds the XML document itself; the server parses it implicitly
String xmlData =
    "<customerinfo cid=\"999\"><address country=\"US\">" +
    "<street>225 Brown St.</street><city>White Plains</city>" +
    "<state>NEW YORK</state></address></customerinfo>";
byte[] xmlBytes = xmlData.getBytes("CP500");
#sql [ctx] {INSERT INTO CUSTOMER VALUES (4, :xmlBytes)};
Example 6-32 shows copying data from a String, xmlData, into a byte array with US-ASCII encoding. An sqlj.runtime.AsciiStream host expression is then constructed, and data is inserted from the sqlj.runtime.AsciiStream host expression into an XML column. sqljXmlAsciiStream is a stream type, so its internal encoding is used. The data is converted from its internal encoding to UTF-8 encoding and stored in its hierarchical form on the database server.
Example 6-32 Inserting data from an sqlj.runtime.AsciiStream host expression
// The host variable holds the XML document itself; the server parses it implicitly
String xmlData =
    "<customerinfo cid=\"999\"><address country=\"US\">" +
    "<street>225 Brown St.</street><city>White Plains</city>" +
    "<state>NEW YORK</state></address></customerinfo>";
byte[] b = xmlData.getBytes("US-ASCII");
java.io.ByteArrayInputStream xmlAsciiInputStream =
    new java.io.ByteArrayInputStream(b);
sqlj.runtime.AsciiStream sqljXmlAsciiStream =
    new sqlj.runtime.AsciiStream(xmlAsciiInputStream, b.length);
#sql [ctx] {INSERT INTO CUSTOMER VALUES (4, :sqljXmlAsciiStream)};
Example 6-33 illustrates constructing an sqlj.runtime.CharacterStream host expression and inserting data from it into an XML column. sqljXmlCharacterStream is a character type, so its external encoding is used, whether or not it has an internal encoding specification.
Example 6-33 Inserting data from a sqljXmlCharacterStream host expression
// The host variable holds the XML document itself; the server parses it implicitly
String xmlData =
    "<customerinfo cid=\"999\"><address country=\"US\">" +
    "<street>225 Brown St.</street><city>White Plains</city>" +
    "<state>NEW YORK</state></address></customerinfo>";
java.io.StringReader xmlReader = new java.io.StringReader(xmlData);
sqlj.runtime.CharacterStream sqljXmlCharacterStream =
    new sqlj.runtime.CharacterStream(xmlReader, xmlData.length());
#sql [ctx] {INSERT INTO CUSTOMER VALUES (4, :sqljXmlCharacterStream)};
Example 6-34 demonstrates retrieving a document from an XML column into a com.ibm.db2.jcc.DB2Xml host expression. The data is then inserted into an XML column in the same table. No conversion occurs because the retrieved data is still in UTF-8 encoding.
Example 6-34 Retrieving a document into a com.ibm.db2.jcc.DB2Xml host expression
java.sql.ResultSet rs = s.executeQuery ("SELECT * FROM CUSTOMER");
rs.next();
com.ibm.db2.jcc.DB2Xml xmlObject = (com.ibm.db2.jcc.DB2Xml)rs.getObject(2);
#sql [ctx] {INSERT INTO CUSTOMER VALUES (6, :xmlObject)};
Method                             Output data type   Type of XML internal encoding declaration added
DB2Xml.getDB2String                String             None
DB2Xml.getDB2XmlAsciiStream        InputStream        US-ASCII
DB2Xml.getDB2XmlBinaryStream       InputStream        Specified by getDB2XmlBinaryStream targetEncoding parameter
DB2Xml.getDB2XmlBytes              byte[]             Specified by DB2Xml.getDB2XmlBytes targetEncoding parameter
DB2Xml.getDB2XmlCharacterStream    Reader             ISO-10646-UCS-2
DB2Xml.getDB2XmlString             String             ISO-10646-UCS-2
If the application does not call the XMLSERIALIZE function before data retrieval, the data is converted from UTF-8 to the external application encoding for the character data types, or the internal encoding for the binary data types. No XML declaration is added.
Examples
The code segment in Example 6-35 retrieves data from an XML column into a String host expression. Because the String type is a character type, the data is converted from UTF-8 to the external encoding and returned without any XML declaration.
Example 6-35 Retrieving data from an XML column into a string
#sql iterator XmlStringIter (int, String);
XmlStringIter siter = null;
#sql [ctx] siter = {SELECT poid, porder FROM po};
#sql {FETCH :siter INTO :row, :outString};
Example 6-36 demonstrates retrieving data from an XML column into a byte[] host expression. Because the byte[] data type is a binary type, the data is converted from UTF-8 to the internal encoding and returned without any XML declaration.
Example 6-36 Retrieving data from an XML column into a byte[] host expression
#sql iterator XmlByteArrayIter (int, byte[]);
XmlByteArrayIter biter = null;
#sql [ctx] biter = {SELECT poid, porder FROM po};
#sql {FETCH :biter INTO :row, :outBytes};
The code segment for Example 6-37 shows retrieving a document from an XML column into a com.ibm.db2.jcc.DB2Xml host expression. In this example, the data is in a byte string with an XML declaration that includes an internal encoding specification for UTF-8.
Example 6-37 Retrieving data from an XML column into a UTF-8 byte[] host expression
#sql iterator DB2XmlIter (int, com.ibm.db2.jcc.DB2Xml);
DB2XmlIter db2xmliter = null;
com.ibm.db2.jcc.DB2Xml outDB2Xml = null;
#sql [ctx] db2xmliter = {SELECT poid, porder FROM po};
#sql {FETCH :db2xmliter INTO :row, :outDB2Xml};
byte[] byteArray = outDB2Xml.getDB2XmlBytes("UTF-8");
The FETCH statement retrieves the data into the DB2Xml object in UTF-8 encoding. The getDB2XmlBytes method with the UTF-8 argument adds an XML declaration with a UTF-8 encoding specification and stores the data in a byte array.
Windows
There is one prerequisite for an installation of PHP on Windows: The Apache HTTP Server must be installed.
Here is a brief overview of the steps involved in installing PHP on Windows:
1. Download the latest version of the PHP zip package and the collection of PECL modules zip package from http://www.php.net
2. Extract the PHP zip package into an install directory.
3. Extract the collection of PECL modules zip package into the \ext\ subdirectory of your PHP installation directory.
4. Edit the php.ini file.
5. Enable PHP support in Apache HTTP Server 2.x.
6. Restart the Apache HTTP Server.
PDO_ODBC is a driver for the PHP Data Objects (PDO) extension that offers access to DB2 databases through the standard object-oriented database interface introduced in PHP 5.1. Despite its name, you can compile the PDO_ODBC extension directly against the DB2 libraries to avoid the communications overhead and potential interference of an ODBC driver manager.
A third extension, Unified ODBC, has historically offered access to DB2 database systems. We do not recommend that you write new applications with this extension because ibm_db2 and PDO_ODBC both offer significant performance and stability benefits over Unified ODBC. The ibm_db2 extension API makes porting an application that was previously written for Unified ODBC almost as easy as globally changing the odbc_ function name to db2_ throughout the source code of your application.
DB2_ATTR_CASE
For compatibility with database systems that do not follow the SQL standard, this option sets the case in which column names will be returned to the application. By default, the case is set to DB2_CASE_NATURAL, which returns column names as they are returned by DB2. You can set this parameter to DB2_CASE_LOWER to force column names to lower case, or to DB2_CASE_UPPER to force column names to upper case.
DB2_ATTR_CURSOR
This option sets the type of cursor that ibm_db2 returns for result sets. By default, ibm_db2 returns a forward-only cursor (DB2_FORWARD_ONLY) which returns the next row in a result set for every call to db2_fetch_array(), db2_fetch_assoc(), db2_fetch_both(), db2_fetch_object(), or db2_fetch_row(). You can set this parameter to DB2_SCROLLABLE to request a scrollable cursor so that the ibm_db2 fetch functions accept a second argument specifying the absolute position of the row that you want to access within the result set.
The value returned by db2_exec() indicates whether the SQL statement succeeded or failed:
- If the value is FALSE, the SQL statement failed. You can retrieve diagnostic information through the db2_stmt_error() and db2_stmt_errormsg() functions.
- If the value is not FALSE, the SQL statement succeeded and returned a statement resource that can be used in subsequent function calls related to this query.
Example 6-38 shows a PHP program that executes an XQuery statement and returns a result set.
Example 6-38 PHP program that executes an XQuery and returns a result set
<?php
$conn = db2_connect($database, $user, $password);
if ($conn) {
    $xml = "XQUERY db2-fn:sqlquery(\"select info from customer\")";
    $stmt = db2_exec($conn, $xml);
    while ($row = db2_fetch_array($stmt)) {
        printf ("%100s\n", $row[0]);
    }
    db2_close($conn);
}
else {
    echo "Connection failed.";
}
?>
Figure 6-20 shows the output of the execution of the preceding PHP source.
6.8.1 Building sample applications for the DB2 .NET data provider
DB2 provides a batch file, bldapp.bat, for compiling and linking DB2 Visual Basic .NET or DB2 C# .NET applications. The Visual Basic .NET samples are located in:
sqllib\samples\.NET\vb
The DB2 C# .NET samples are located in:
sqllib\samples\.NET\cs
The sample programs that can be built with the batch file are located in the same directories. The batch file (bldapp.bat) takes one parameter, %1, for the name of the source file to be compiled (without the .vb or .cs extension). Refer to XML and XQuery support in C# .NET CLR routines on page 343 in this document for information regarding XML and C# applications.
Figure 6-21 Creating an XML column using the IBM Table Designer
XML Designer
Choosing XML Designer opens the DB2 XML Designer window. The XML Designer window contains three tabs: Text View, Grid View, and Sample XML.
Text View
The editor section, the top portion of the Text View window, allows you to enter XML manually. The editor also provides IntelliSense, word completion, and syntax coloring. Alternatively, you can choose an XML file from the file system by selecting Open File from the lower portion of the window. See Figure 6-24.
Figure 6-24 The Text View from the DB2 XML Designer
Grid View
When you select the Grid View tab, the XML document is shown in grid form. From this view, you can enter values inside the XML navigation grid cell. See Figure 6-25.
If you select an element from the Grid View, it is possible to drill-down into the child elements and attributes of that element. An example of this is seen in Figure 6-26. In this example the customerinfo element was chosen. From this view, it is also possible to modify the current cell or add a new row.
If the content is changed in the Text View, the changes will be synchronized and shown in the grid view, and vice versa.
HTML Visualizer
When you choose the HTML Visualizer from the IBM Data Designer, an embedded browser is launched, as in Figure 6-27. The XML content is shown in the browser.
Clear Data
When you choose Clear Data from the IBM Data Designer, data is deleted from the XML column.
Index Designer
After you have created a table, you can add an index to an XML column. To complete this, right-click on an existing table and select Open Definition to start the IBM Table Designer. See Figure 6-28.
To launch the XML Index view, click the XML indexes toolbar button, highlighted in red in Figure 6-29.
Figure 6-29 Select the XML indexes toolbar button to launch the XML Index view
The XML Index designer consists of two panes:
- Index Properties Grid
- XML Pattern Selection
From the Property Grid on the Index Properties Grid pane, it is possible to add or remove indexes by selecting the (+) or (-) symbols. In the Index properties, you can set or unset index properties. See Figure 6-30.
When you choose Select on the Build XML Pattern source button, it launches the XML Pattern Source dialog box (Figure 6-31). The following options are available from this dialog box:
- Use XSR object as source
- Use column value as XML pattern source
- Use a document from file system
Details for the options are as follows:
- Use registered XML schema: Use this option if you want to use an XML schema from the XSR as the XML source.
- Use document from the column: Use this option if the selected table contains at least one row and the XML column is already populated with an XML document.
- Use schema/XML document on disk: Use this option if neither of the other options applies. Select a file of type xml or xsd from your file system.
Figure 6-33 shows Script Designer after it has been opened. In this figure, an example of an XQuery statement has been entered. From this tool, it is possible to enter single or multiple SQL, SQL/XML, or XQuery statements and return single or multiple result sets.
To execute the query, select Execute Script. The button for Execute Script is highlighted in red in Figure 6-34.
When the script has been executed, the results can be seen in the Result Data window. XML data will appear with an ellipsis (...). To view the data, either expand the column or click the ellipsis (...). See Figure 6-35.
Figure 6-35 To view XML data expand the column or click the ellipsis
When you view the data by clicking the ellipsis, the HTML Visualizer window opens to the selected row, as shown in Figure 6-36.
Assigned to other variables using the following statements:
- SELECT...INTO statement
- VALUES...INTO statement
- FETCH...INTO statement
- CALL statement
- EXECUTE...INTO statement
- SET statement
/* Assume table T1 exists with the following definition */
/* CREATE TABLE T1(col1 XML) */
CREATE PROCEDURE simpleProc (IN parm1 XML, IN parm2 VARCHAR(32000))
LANGUAGE SQL
BEGIN
  DECLARE var1 XML;
  /* check if the value of XML parameter parm1 contains an item
     with a value less than 200 */
  IF (XMLEXISTS('$x/ITEM[value < 200]' passing by ref parm1 as "x")) THEN
    /* if it does, insert the value of parm1 into table T1 */
    INSERT INTO T1 VALUES(parm1);
  END IF;
  /* parse the parameter parm2 and assign it to the XML variable */
  SET var1 = XMLPARSE(document parm2 preserve whitespace);
  /* insert variable var1 into table T1 */
  INSERT INTO T1 VALUES(var1);
END
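To make the XMLEXISTS test in the procedure above easier to picture, the same predicate can be evaluated on the client side with the standard Java XPath API. This is only an illustrative sketch: the class ItemCheck and the method hasCheapItem are hypothetical, the sample documents are invented, and XPath 1.0 in Java does not reproduce every typing rule of XQuery in DB2. For simple numeric content, however, the expression selects the same ITEM elements.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class ItemCheck {
    /** Returns true if the root ITEM element has a value child that is less than 200. */
    static boolean hasCheapItem(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // A non-empty node set converts to true, which mirrors XMLEXISTS.
            return (Boolean) xpath.evaluate("boolean(/ITEM[value < 200])",
                    new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("input is not a well-formed XML document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasCheapItem("<ITEM><value>150</value></ITEM>"));
        System.out.println(hasCheapItem("<ITEM><value>250</value></ITEM>"));
    }
}
```

A document for which the method returns true corresponds to a parm1 value that the procedure would insert into T1 in its IF branch.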
SET city = 'Toronto';
-- find out all the customers from Toronto
SET stmt_text = 'XQUERY declare default element namespace "http://posample.org"; '
  || 'for $cust in db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo/addr[city= "'
  || city
  || '"] return <Customer>{$cust/../@Cid}{$cust/../name}</Customer>';
PREPARE stmt FROM stmt_text;
OPEN cur1;
END
Note: When a commit or rollback is performed during the execution of an SQL procedure, the values assigned to XML parameters and XML variables are no longer available. After a commit or rollback, any attempt to reference these variables or parameters causes an error (SQL1354N, 560CE) to be raised. To successfully reference XML parameters and variables after a commit or rollback, new values must be assigned to them.
The code segment in Example 6-41 shows a CLI stored procedure that uses an XQuery to return a result set to the caller.
Example 6-41 A CLI stored procedure utilizing XQuery to return a result set
SQL_API_RC SQL_API_FN my_simple_proc (
    char sqlstate[6], char qualName[28], char specName[19], char diagMsg[71])
{
  SQLHANDLE henv;
  SQLHANDLE hdbc = 0;
  SQLHANDLE hstmt5;
  SQLRETURN cliRC;
  SQLCHAR stmt5[1024];
  SQLINTEGER custid,quantity,count;
  char city[100];

  cliRC = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);
  SRV_HANDLE_CHECK(SQL_HANDLE_ENV, henv, cliRC, henv, hdbc);
  /* allocate the database handle */
  cliRC = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);
  SRV_HANDLE_CHECK(SQL_HANDLE_ENV, henv, cliRC, henv, hdbc);
  /* set AUTOCOMMIT off */
  cliRC = SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF, SQL_NTS);
  SRV_HANDLE_CHECK(SQL_HANDLE_DBC, hdbc, cliRC, henv, hdbc);

  /* issue NULL Connect, because in CLI a statement handle is
     required and thus a connection handle and environment handle.
     A connection is not established; rather the current
     connection from the calling application is used. */

  /* connect to a data source */
  cliRC = SQLConnect(hdbc, NULL, SQL_NTS, NULL, SQL_NTS, NULL, SQL_NTS);
  SRV_HANDLE_CHECK(SQL_HANDLE_DBC, hdbc, cliRC, henv, hdbc);
  /* allocate the statement handle */
  cliRC = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt5);
  SRV_HANDLE_CHECK(SQL_HANDLE_DBC, hdbc, cliRC, henv, hdbc);

  /* The query will find customers from Toronto... */
  strcpy((char *)city, "Toronto");
  /* XQuery to find all the customers from Toronto and return to caller */
  strcpy((char *)stmt5, "XQUERY declare default element namespace "
         "\"http://posample.org\"; for $cust in db2-fn:xmlcolumn"
         "(\"CUSTOMER.INFO\")/customerinfo/addr[city=\"");
  strcat((char *)stmt5, city);
  strcat((char *)stmt5,
         "\"] return <Customer>{$cust/../@Cid}{$cust/../name}</Customer>");
  cliRC = SQLPrepare(hstmt5, stmt5, SQL_NTS);
  SRV_HANDLE_CHECK_SETTING_SQLST_AND_MSG(SQL_HANDLE_STMT, hstmt5, cliRC, henv,
         hdbc, sqlstate, diagMsg, "XQUERY statement failed.");
  cliRC = SQLExecute(hstmt5);
  SRV_HANDLE_CHECK(SQL_HANDLE_STMT, hstmt5, cliRC, henv, hdbc);
  ...
  return (0);
}
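The procedures above build the XQuery text by concatenating a city name into a double-quoted XQuery string literal. When the value comes from user input rather than a constant, it should be escaped first. The following Java sketch shows one way to do that; the class XQueryText and the methods escapeLiteral and customersInCity are hypothetical names, and the escaping relies on the XQuery rule that, inside a double-quoted literal, a quotation mark is written as two quotation marks and an ampersand as an entity reference.

```java
final class XQueryText {
    /** Escapes a value for use inside a double-quoted XQuery string literal. */
    static String escapeLiteral(String value) {
        // Escape the ampersand first so that it is not confused with an entity reference.
        return value.replace("&", "&amp;").replace("\"", "\"\"");
    }

    /** Builds the customers-by-city statement text used in the examples above. */
    static String customersInCity(String city) {
        return "XQUERY declare default element namespace \"http://posample.org\"; "
             + "for $cust in db2-fn:xmlcolumn(\"CUSTOMER.INFO\")/customerinfo/addr[city=\""
             + escapeLiteral(city)
             + "\"] return <Customer>{$cust/../@Cid}{$cust/../name}</Customer>";
    }

    public static void main(String[] args) {
        System.out.println(customersInCity("Toronto"));
    }
}
```

The resulting string can be handed to whichever interface prepares the statement; the escaping step is independent of the host language.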
CREATE PROCEDURE Simple_XML_Proc_C(
    IN inXML XML as CLOB(5000),
    OUT outXML XML as CLOB(5000))
  LANGUAGE C
  PARAMETER STYLE SQL
  FENCED
  DYNAMIC RESULT SETS 1
  PARAMETER CCSID UNICODE
  EXTERNAL NAME 'simple_xmlproc!simple_proc'
using IBM.Data.DB2Types;
namespace bizLogic
{
  class empOps
  {
    ...
    // C# procedures
    ...
  }
}
XML data type values are represented in .NET routines in the same way as in other external routines; that is, the routines must specify that the XML data type is to be stored as a CLOB data type. Example 6-44 shows the correct parameter designation for input and output parameters of type XML in a CREATE PROCEDURE statement for a C# application.
Example 6-44 A CREATE PROCEDURE statement for a C# routine
CREATE PROCEDURE xmlProc1 (
    IN inNUM INTEGER,
    IN inXML XML as CLOB (1K),
    OUT out1XML XML as CLOB (1K),
    OUT out2XML XML as CLOB (1K) )
  LANGUAGE CLR
  PARAMETER STYLE GENERAL
  DYNAMIC RESULT SETS 0
  FENCED
  THREADSAFE
  DETERMINISTIC
  NO DBINFO
  MODIFIES SQL DATA
  PROGRAM TYPE SUB
  EXTERNAL NAME 'gwenProc.dll:bizLogic.empOps!xmlProc1' ;

import java.lang.*;
import java.io.*;
import java.sql.*;
import java.util.*;
import com.ibm.db2.jcc.DB2Xml;
public class stpclass
{
  ...
  // Java procedure implementations
  ...
}
XML data type values are represented in Java routines in the same way as in other external routines. That is, the routines must specify that the XML data type is to be stored as a CLOB data type. Example 6-46 shows the correct parameter designation for input and output parameters of type XML.
Example 6-46 Using XML input and output parameters
CREATE PROCEDURE xmlProc1 (
    IN inNUM INTEGER,
    IN inXML XML as CLOB (1K),
    OUT out1XML XML as CLOB (1K),
    OUT out2XML XML as CLOB (1K) )
  DYNAMIC RESULT SETS 0
  DETERMINISTIC
  LANGUAGE JAVA
  PARAMETER STYLE JAVA
  MODIFIES SQL DATA
  FENCED
  THREADSAFE
  PROGRAM TYPE SUB
  NO DBINFO
  EXTERNAL NAME 'myJar:stpclass.xmlProc1'@

// Declare input, output, and inout parameters
com.ibm.db2.jcc.DB2Xml in_xml = xmlvar;
com.ibm.db2.jcc.DB2Xml out_xml = null;
com.ibm.db2.jcc.DB2Xml inout_xml = xmlvar;
...
Connection con;
CallableStatement cstmt;
ResultSet rs;
...
// Create a CallableStatement object
cstmt = con.prepareCall("CALL SP_xml(?,?,?)");
// Set input parameters as type com.ibm.db2.jcc.DB2Xml
cstmt.setObject (1, in_xml);
cstmt.setObject (3, inout_xml);
// Register output parms as type com.ibm.db2.jcc.DB2Types.XML
cstmt.registerOutParameter (2, com.ibm.db2.jcc.DB2Types.XML);
cstmt.registerOutParameter (3, com.ibm.db2.jcc.DB2Types.XML);
// Call the stored procedure
cstmt.executeUpdate();
// Retrieve the output values as type com.ibm.db2.jcc.DB2Xml
out_xml = (com.ibm.db2.jcc.DB2Xml) cstmt.getObject(2);
inout_xml = (com.ibm.db2.jcc.DB2Xml) cstmt.getObject(3);
System.out.println("Parameter values from SP_xml call: ");
System.out.println("Output parameter value ");
// Use the DB2-only method getDB2String to
// convert the value to a string for printing
printBytes(out_xml.getDB2String());
System.out.println("Input/output parameter value ");
printBytes(inout_xml.getDB2String());
...
When you call a stored procedure that has XML parameters, a compatible data type must be used in the invoking statement. For JDBC applications, when calling a routine with XML input parameters, use parameters of the com.ibm.db2.jcc.DB2Xml type. To register XML output parameters, register them as the com.ibm.db2.jcc.DB2Types.XML type. For additional information regarding the retrieval of output parameters, refer to Retrieving XML data in JDBC applications on page 307. For considerations of retrieving output parameters when invoking Java/SQLJ stored procedures, refer to Retrieving XML data in SQLJ applications on page 313.
Example 6-48 demonstrates the invocation of a Java stored procedure that has one XML input parameter, one XML output parameter, and one INTEGER output parameter.
Example 6-48 An invocation of a Java Stored Procedure with XML input parameters
public static void callSimple_Proc(Connection con)
{
  try
  {
    // prepare the CALL statement
    String procName = "Simple_XML_Proc_Java";
    String sql = "CALL " + procName + "( ?, ?, ?)";
    CallableStatement callStmt = con.prepareCall(sql);
    // input data
    String inXml = "<customerinfo xmlns=\"http://posample.org\" Cid=\"5002\">"
        + "<name>Kathy Smith</name><addr country=\"Canada\"><street>25 EastCreek"
        + "</street><city>Markham</city><prov-state>Ontario</prov-state><pcode-zip>"
        + "N9C-3T6</pcode-zip></addr><phone type=\"work\">905-566-7258"
        + "</phone></customerinfo>";
    callStmt.setString (1, inXml ) ;
    // register the output parameters
    // the XML output parm is registered as com.ibm.db2.jcc.DB2Types.XML type
    callStmt.registerOutParameter(2, com.ibm.db2.jcc.DB2Types.XML);
    callStmt.registerOutParameter(3, Types.INTEGER);
    // call the stored procedure
    System.out.println();
    System.out.println("Calling stored procedure " + procName);
    callStmt.execute();
    System.out.println(procName + " called successfully");
    // retrieve output parameters using type com.ibm.db2.jcc.DB2Xml
    com.ibm.db2.jcc.DB2Xml outXML = (DB2Xml) callStmt.getObject(2);
    System.out.println("\n \n Location is :\n " + outXML.getDB2String());
    ResultSet rs = callStmt.getResultSet();
    // Fetch...
To call a routine with XML parameters from an SQLJ program, use parameters of the com.ibm.db2.jcc.DB2Xml type. Example 6-49 shows an SQLJ program that calls a stored procedure that takes three XML parameters: an IN parameter, an OUT parameter, and an INOUT parameter.
Example 6-49 Call a routine from an SQLJ program
// Declare an input, output, and
// input/output XML parameter
com.ibm.db2.jcc.DB2Xml in_xml = xmlvar;
com.ibm.db2.jcc.DB2Xml out_xml = null;
com.ibm.db2.jcc.DB2Xml inout_xml = xmlvar;
...
// Call the stored procedure
#sql [myConnCtx] { CALL SP_xml(:IN in_xml, :OUT out_xml, :INOUT inout_xml) };
System.out.println("Parameter values from SP_xml call: ");
System.out.println("Output parameter value ");
// Use the DB2-only method getDB2String to
// convert the value to a string for printing
printBytes(out_xml.getDB2String());
System.out.println("Input/output parameter value ");
printBytes(inout_xml.getDB2String());
The XSR object registration steps can be performed by any of the following methods:
- Stored procedures
- Command line processor
- Java applications
Web services and SOAs are dedicated to reducing or eliminating impediments to interoperable integration of applications, regardless of their operating system platform or language of implementation. The following list summarizes and highlights the most compelling characteristics of Web services and SOA:
- Componentization: SOA encourages an approach to systems development in which software is encapsulated into components called services. Services interact through the exchange of messages that conform to published interfaces. The interface supported by a service is all that concerns any prospective consumers; implementation details of the service itself are hidden from all consumers of the service.
- Platform independence: In an SOA, the implementation details are hidden. Therefore, services can be combined and orchestrated regardless of programming language, platform, and other implementation details. Web services provide access to software components through a wide variety of transport protocols, increasing the number of channels through which software components can be accessed.
- Investment preservation: As a benefit of componentization and encapsulation, existing software assets can be exposed as services within an SOA using Web services technologies. When existing software assets are exposed in this way, they can be extended, refactored, and adapted into appropriate services to participate within an SOA. This reuse reduces costs and preserves the investment. The evolutionary approach enabled by Web services eliminates the necessity to rip and replace existing solutions.
- Loose coupling: As another benefit of componentization, the SOA approach encourages loose coupling between services, which is a reduction of the assumptions and requirements shared between services. Implementations of individual services can be replaced and evolved over time without disrupting the normal activities of the SOA system as a whole. Therefore, loosely coupled systems tend to reduce overall development and maintenance costs by isolating the impact of changes to the internal implementation of components and encouraging reuse of components.
- Distributed computing standardization: Web services are the focal point of many, if not most, of the current standardization initiatives related to advancement of distributed computing technology. Additionally, much of the computer industry's research and development effort related to distributed computing is centered on Web services.
- Broad industry support: Core Web services standards (SOAP, WSDL, XML, and XML Schema) are universally supported by all major software vendors. This universal support provides a broad choice of middleware and tooling products with which to build service-oriented applications.
- Composability: Web services technologies are planned to enable designers to mix and match different capabilities through composition. For example, systems that require message-level security can leverage the Web services Security standard. Any system that does not require message-level security is not forced to deal with the complexity and overhead of signing and encrypting its messages. This approach to composability applies to all the various qualities of service, such as reliable delivery of messages and transactions. Composability enables Web services technologies to be applied consistently in a broad range of usage scenarios, such that only the required functionality has to be implemented.
SOAP
SOAP is a simple, flexible, and extensible mechanism for exchanging structured data. Web services use SOAP as their communication protocol. Unlike a plain HTTP request, which is expressed as text strings (a GET or POST method and a URL), SOAP is an XML-based messaging protocol. SOAP encodes messages as XML documents for sending requests and receiving responses.
SOAP consists of two parts:
- Protocol binding header: SOAP can use HTTP, SMTP, or FTP as the underlying protocol. The SOAP library generates the protocol binding header based on the protocol specified. When a Web server reads a protocol binding header, it understands that the message that follows is a SOAP message.
- SOAP envelope: A SOAP envelope contains a header and a body. The SOAP header is optional and contains information such as security information and routing information. The SOAP body contains the call and response information. For a Remote Procedure Call (RPC), it includes the method name and arguments.
Example 6-50 shows a sample SOAP message.
Example 6-50 A SOAP message
(1) Protocol Binding Header
POST /services/weather/QueryWeather.dadx/SOAP HTTP/1.0
Host: localhost
Content-Type: text/xml; charset=utf-8
SOAPAction: http://tempuri.org/weather/QueryWeather.dadx

(2) SOAP Envelope
<?xml version="1.0" encoding="UTF-8" ?>
<soap:Envelope
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  (a) SOAP Header
  <soap:Header>
    <t:Transaction xmlns:t="http://tempuri.org/transaction" soap:mustUnderstand="1">
      5
    </t:Transaction>
  </soap:Header>
  (b) SOAP Body
  <soap:Body>
    <m:getWeather xmlns:m="http://tempuri.org/weather/QueryWeather.dadx">
      <wDate xsi:type="xsd:date">2003-02-25</wDate>
      <prefName xsi:type="xsd:string">TOKYO</prefName>
    </m:getWeather>
  </soap:Body>
</soap:Envelope>
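The envelope part of a message like the one in Example 6-50 can be assembled programmatically. The following Python sketch is only an illustration that uses the standard library; the function name build_envelope and its argument layout are our own assumptions and are not part of DB2 or WORF. It builds the SOAP body for an RPC-style call and leaves out the optional SOAP header.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def build_envelope(method, method_ns, args):
    """Return a SOAP envelope (as text) for an RPC-style call.

    build_envelope is a hypothetical helper. args maps argument names
    to (xsd_type, value) pairs, for example {"wDate": ("date", "2003-02-25")}.
    """
    # Ask ElementTree to use the same prefixes as Example 6-50.
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("xsi", XSI_NS)
    ET.register_namespace("m", method_ns)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    # The xsd prefix only appears inside attribute values, so it has to
    # be declared explicitly on the root element.
    envelope.set("xmlns:xsd", XSD_NS)

    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{method_ns}}}{method}")
    for name, (xsd_type, value) in args.items():
        arg = ET.SubElement(call, name)
        arg.set(f"{{{XSI_NS}}}type", f"xsd:{xsd_type}")
        arg.text = value
    return ET.tostring(envelope, encoding="unicode")
```

Calling build_envelope("getWeather", "http://tempuri.org/weather/QueryWeather.dadx", {"wDate": ("date", "2003-02-25"), "prefName": ("string", "TOKYO")}) produces an envelope whose body corresponds to part (b) of Example 6-50. The protocol binding header is not produced here, because it is normally generated by the SOAP or HTTP library that sends the message.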
WSDL
WSDL is a standardized XML interface description used to define a Web service interface. The interface includes information about how to structure request messages, how to interpret response messages, and which transport protocol to use to invoke the Web service.
The Web services provider publishes this interface description. Based on the description, Web services requesters create applications that request the Web service.
UDDI
UDDI is an open framework for describing, publishing, and finding Web services on the Internet. UDDI is similar to a phone book in which companies can list the Web services they provide. Web services requesters can search UDDI to locate the Web service they require.
[Figure: DB2 9 as a Web services consumer and as a Web services provider. Labels: DB2 client, Web service UDFs, SOAP, Internet, SOAP client, SOAP router, WORF, Web application server or Tomcat, DB2 9.]
db2xml.soaphttpv (
    endpoint_url VARCHAR(256),
    soap_action  VARCHAR(256),
    soap_body    VARCHAR(3072) | CLOB(1M))
  RETURNS VARCHAR(3072)

db2xml.soaphttpc (
    endpoint_url VARCHAR(256),
    soapaction   VARCHAR(256),
    soap_body    VARCHAR(3072) | CLOB(1M))
  RETURNS CLOB(1M)

db2xml.soaphttpcl (
    endpoint_url VARCHAR(256),
    soapaction   VARCHAR(256),
    soap_body    VARCHAR(3072))
  RETURNS CLOB(1M) AS LOCATOR
Example 6-52 shows how to invoke a SOAP UDF. Note that the third argument, the SOAP body, specifies the method name getLastName and the argument wEmpno for that method.
Example 6-52 An example of a SOAP UDF
values db2xml.soaphttpv(
  -- (1) ENDPOINT
  'http://localhost:8080/services/emp/getLastName.dadx/SOAP',
  -- (2) ACTION
  'http://tempuri.org/emp/getLastName.dadx',
  -- (3) SOAP BODY
  '<m:getLastName xmlns:m="http://tempuri.org/emp/getLastName.dadx">
     <wEmpno xsi:type="xsd:string">000100</wEmpno>
   </m:getLastName>'
);
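When the SOAP UDF is called from an application, the three arguments usually have to be assembled at run time. The following Python sketch shows one way to generate the statement text of Example 6-52. It is a minimal illustration: the helper names sql_literal and build_soap_udf_call are our own, and we assume, as in the example, that the method namespace is the same URI as the SOAP action and that every argument is passed as xsd:string.

```python
from xml.sax.saxutils import escape


def sql_literal(text):
    """Quote text as an SQL string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_soap_udf_call(endpoint, action, method, args):
    """Return a VALUES statement that calls db2xml.soaphttpv.

    Hypothetical helper. args maps argument names to string values.
    """
    # One child element per argument; escape() protects &, < and >.
    arg_xml = "".join(
        f'<{name} xsi:type="xsd:string">{escape(value)}</{name}>'
        for name, value in args.items()
    )
    # Assumption: the method namespace is the SOAP action URI.
    body = f'<m:{method} xmlns:m="{action}">{arg_xml}</m:{method}>'
    literals = ", ".join(sql_literal(part) for part in (endpoint, action, body))
    return f"VALUES db2xml.soaphttpv({literals})"
```

Generating the opening and closing tags from the same method name avoids mismatched tags in the SOAP body, which are easy to introduce when the body is typed by hand. In a real application, parameter markers are generally preferable to building literals, so treat sql_literal as a convenience for ad hoc statements only.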
[Figure: DB2 9 as a Web services provider. A SOAP client sends a SOAP request. WORF reads the DADX file, whose getName operation contains the query "select name from emp", and runs the query against DB2 9. The result (George) is returned to the SOAP client in the SOAP response.]
WORF
The Web services Object Runtime Framework (WORF) provides a user-friendly environment for creating simple XML-based Web services that access DB2 data and stored procedures. WORF runs on WebSphere Application Server and Apache Tomcat. By using the framework, application developers can avoid writing programs to handle the details of creating Web services. The programming environment provided by WORF includes functions for the application to connect to the database, execute SQL statements, and call stored procedures. You can also use WORF to generate WSDL, test pages, and XML Schema for Web services. In addition, WORF provides an automatic documentation feature and resource-based deployment. To construct a simple WORF Web service, you only have to edit the DADX file and the property file group.properties.
DADX file
DADX is an XML file that describes Web service definitions. When the Web application server receives a SOAP request, WORF reads the DADX file to determine which method is being called, then executes the SQL statements or stored procedures that correspond to that method. Example 6-53 shows a DADX file in which a Web service method name, an argument, and a SELECT statement are defined. The getLastName method takes an employee number as its argument and returns the last name of the employee whose employee number matches the argument.
Example 6-53 getLastName.dadx
<?xml version="1.0"?>
<DADX xmlns="http://schemas.ibm.com/db2/dxx/dadx"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:dtd1="http://schemas.myco.com/sales/getstart.dtd"
      xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <operation name="getLastName">
    <wsdl:documentation>uranai</wsdl:documentation>
    <query>
      <SQL_query>
        select lastname from employee where empno=:wEmpno
      </SQL_query>
      <parameter name="wEmpno" type="xsd:string"/>
    </query>
  </operation>
</DADX>
Example 6-54 shows another use of DADX. The doXQuery method returns the result set of an SQL/XML query.
Example 6-54 doXQuery.dadx
<?xml version="1.0"?>
<DADX xmlns="http://schemas.ibm.com/db2/dxx/dadx"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:dtd1="http://schemas.myco.com/sales/getstart.dtd"
      xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <operation name="doXQuery">
    <wsdl:documentation>uranai</wsdl:documentation>
    <query>
      <SQL_query>
        SELECT xmlserialize(
                 xmlquery('$c/Application/Customer/Name' passing APPL_DOC as "c")
                 as varchar(128))
        FROM DB2ADMIN.LOAN_APPLICATION
        WHERE xmlexists(
                '$i/Application/Customer/Name[FirstName = "Ippei" or FirstName="Ichiro" ]'
                passing APPL_DOC as "i")
      </SQL_query>
    </query>
  </operation>
</DADX>
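The lookup that WORF performs on a DADX file, from method name to SQL statement and parameters, can be pictured with a few lines of Python. This sketch is only an approximation for illustration and is not how WORF itself is implemented; load_operations is a hypothetical name, and only query operations such as those in Examples 6-53 and 6-54 are handled.

```python
import xml.etree.ElementTree as ET

# Default namespace used by the DADX files in Examples 6-53 and 6-54.
DADX_NS = {"dadx": "http://schemas.ibm.com/db2/dxx/dadx"}


def load_operations(dadx_text):
    """Map each operation name to its SQL text and declared parameters."""
    root = ET.fromstring(dadx_text)
    operations = {}
    for op in root.findall("dadx:operation", DADX_NS):
        query = op.find("dadx:query", DADX_NS)
        if query is None:
            continue  # other operation types are outside this sketch
        raw_sql = query.findtext("dadx:SQL_query", "", DADX_NS)
        sql = " ".join(raw_sql.split())  # collapse layout whitespace
        params = [
            (p.get("name"), p.get("type"))
            for p in query.findall("dadx:parameter", DADX_NS)
        ]
        operations[op.get("name")] = {"sql": sql, "parameters": params}
    return operations
```

For getLastName.dadx, the result contains one entry, getLastName, with the SELECT statement and the parameter wEmpno of type xsd:string. A request for a method name that is not a key in the returned dictionary would correspond to an operation that the DADX file does not define.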
group.properties file
The group.properties file contains the information that WORF must have to access DB2, such as the JDBC driver, database name, user ID, and password. Example 6-55 shows a group.properties file for getLastName.dadx.
Example 6-55 group.properties
# /dadx group properties
dbDriver=COM.ibm.db2.jdbc.app.DB2Driver
dbURL=jdbc:db2:sample
userID=xxxxx
password=xxxxxx
parserClass=org.apache.xerces.parsers.SAXParser
autoReload=true
reloadIntervalSeconds=5
initialContextFactory=com.ibm.websphere.naming.WsnInitialContextFactory
datasourceJNDI=jdbc/sample
groupNamespaceUri=http://schemas.ibm.com/employee
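Before deploying a DADX group, it can be useful to check that the property file contains the connection settings mentioned above. The following Python sketch reads a file in the simple key=value layout of Example 6-55 and reports missing entries. The function names and the list of required keys are our own assumptions based on the description above; the full Java properties syntax (escapes, line continuations, other separators) is deliberately not handled.

```python
# Assumed minimum: JDBC driver, database URL, user ID, and password.
REQUIRED_KEYS = ("dbDriver", "dbURL", "userID", "password")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" or ":".
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not props.get(key)]
```

For the file in Example 6-55, missing_keys returns an empty list. If the password line were removed, it would return a list containing only password.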
Figure 6-40 shows the page that results from executing the doXQuery method. When you click the Invoke button in the right pane, the doXQuery method is executed and the result is displayed in the bottom pane. Note that the result set from the SELECT statement in doXQuery.dadx is wrapped in a SOAP envelope.
Appendix A.
Sample data
This appendix provides the following sample materials:
- DDL statements for creating the database and tables used in the sample application XMLoan
- The sample XML data used in Chapter 5, "Managing XML data" on page 173
CREATE TABLE "DB2ADMIN"."APPLICATION_STATUS" (
    "STATUS_ID" INTEGER NOT NULL ,
    "STATUS_DESC" CHAR(50) NOT NULL );

ALTER TABLE "DB2ADMIN"."APPLICATION_STATUS"
    ADD CONSTRAINT "CC1155357143051" PRIMARY KEY ("STATUS_ID");
Example A-3 creates table LOAN_APPLICATION, whose primary key is APPL_ID. The column APPL_DOC has the XML type.
Example: A-3 Create table LOAN_APPLICATION
CREATE TABLE "DB2ADMIN"."LOAN_APPLICATION" (
    "APPL_ID" BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY (
        START WITH +0
        INCREMENT BY +1
        MINVALUE +0
        MAXVALUE +9223372036854775807
        NO CYCLE
        NO CACHE
        NO ORDER ) ,
    "APPL_DOC" XML ,
    "APPL_STATUS" INTEGER NOT NULL WITH DEFAULT 1 ,
    "PROD_ID" INTEGER WITH DEFAULT NULL );

ALTER TABLE "DB2ADMIN"."LOAN_APPLICATION"
    ADD CONSTRAINT "CC1155685528218" PRIMARY KEY ("APPL_ID");
Example A-4 creates table CAMPAIGN, whose primary key is CAMP_ID.
Example: A-4 Create table CAMPAIGN
CREATE TABLE "DB2ADMIN"."CAMPAIGN" (
    "CAMP_ID" INTEGER NOT NULL ,
    "CAMP_DESC" CHAR(30) NOT NULL );

ALTER TABLE "DB2ADMIN"."CAMPAIGN"
    ADD CONSTRAINT "CC1155685220956" PRIMARY KEY ("CAMP_ID");
Example A-5 creates table LOAN, whose primary key is LOAN_ID.
Example: A-5 Create table LOAN
CREATE TABLE "DB2ADMIN"."LOAN" (
    "START_DATE" DATE NOT NULL ,
    "LOAN_ID" INTEGER NOT NULL ,
    "PYMT_STATUS" CHAR(10) ,
    "PYMT_COUNT" INTEGER );

ALTER TABLE "DB2ADMIN"."LOAN"
    ADD CONSTRAINT "CC1155686490131" PRIMARY KEY ("LOAN_ID");
Example A-6 creates table PAYMENT.
Example: A-6 Create table command for table PAYMENT
CREATE TABLE "DB2ADMIN"."PAYMENT" (
    "APPL_ID" INTEGER NOT NULL ,
    "PYMT_DATE" DATE NOT NULL );
Example A-7 creates table PRODUCT, whose primary key is PROD_ID.
Example: A-7 Create table PRODUCT
CREATE TABLE "DB2ADMIN"."PRODUCT" (
    "PROD_ID" INTEGER NOT NULL ,
    "PROD_DESC" CHAR(50) NOT NULL ,
    "RATE" DECIMAL(6,3) ,
    "AMOUNT" DECIMAL(14,2) ,
    "TERM" INTEGER );

ALTER TABLE "DB2ADMIN"."PRODUCT"
    ADD CONSTRAINT "CC1155685582066" PRIMARY KEY ("PROD_ID");
Example A-8 creates table FEEDBACK. The column COMMENT has the XML type.
Example: A-8 Create table command for FEEDBACK
ALTER TABLE "DB2ADMIN"."LOAN_APPLICATION"
    ADD CONSTRAINT "CC1155687456882" FOREIGN KEY ("APPL_STATUS")
    REFERENCES "DB2ADMIN"."APPLICATION_STATUS" ("STATUS_ID")
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
    ENFORCED
    ENABLE QUERY OPTIMIZATION;

ALTER TABLE "DB2ADMIN"."LOAN_APPLICATION"
    ADD CONSTRAINT "CC1155687530557" FOREIGN KEY ("PROD_ID")
    REFERENCES "DB2ADMIN"."PRODUCT" ("PROD_ID")
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
    ENFORCED
    ENABLE QUERY OPTIMIZATION;
Example A-10 shows that column LOAN_ID in table LOAN is a foreign key that refers to column APPL_ID in table LOAN_APPLICATION.
Example: A-10 Define foreign key for table LOAN
ALTER TABLE "DB2ADMIN"."LOAN"
    ADD CONSTRAINT "CC1155686626047" FOREIGN KEY ("LOAN_ID")
    REFERENCES "DB2ADMIN"."LOAN_APPLICATION" ("APPL_ID")
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
    ENFORCED
    ENABLE QUERY OPTIMIZATION;
Example A-11 shows that column APPL_ID in table PAYMENT is a foreign key that refers to column LOAN_ID in table LOAN.
Example: A-11 Define foreign key for table PAYMENT
ALTER TABLE "DB2ADMIN"."PAYMENT"
    ADD CONSTRAINT "CC1155686683650" FOREIGN KEY ("APPL_ID")
    REFERENCES "DB2ADMIN"."LOAN" ("LOAN_ID")
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
    ENFORCED
    ENABLE QUERY OPTIMIZATION;
Example A-12 shows that column APPL_ID in table FEEDBACK is a foreign key that refers to column APPL_ID in table LOAN_APPLICATION.
Example: A-12 Define foreign key for table FEEDBACK
ALTER TABLE "DB2ADMIN"."FEEDBACK"
    ADD CONSTRAINT "CC1155687388774" FOREIGN KEY ("APPL_ID")
    REFERENCES "DB2ADMIN"."LOAN_APPLICATION" ("APPL_ID")
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
    ENFORCED
    ENABLE QUERY OPTIMIZATION;
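The effect of the ON DELETE NO ACTION rules on the chain LOAN_APPLICATION, LOAN, and PAYMENT can be tried out without a DB2 installation. The following Python sketch uses SQLite from the standard library as a stand-in, so it is not DB2 syntax: the tables are reduced to their key columns, the DB2-specific clauses ENFORCED and ENABLE QUERY OPTIMIZATION are omitted, and the helper names are our own.

```python
import sqlite3


def build_schema():
    """Create simplified copies of three XMLoan tables in SQLite."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE loan_application (appl_id INTEGER PRIMARY KEY);
        CREATE TABLE loan (
            loan_id INTEGER PRIMARY KEY
                REFERENCES loan_application (appl_id)
                ON DELETE NO ACTION ON UPDATE NO ACTION);
        CREATE TABLE payment (
            appl_id INTEGER NOT NULL
                REFERENCES loan (loan_id)
                ON DELETE NO ACTION ON UPDATE NO ACTION,
            pymt_date TEXT NOT NULL);
        """
    )
    return conn


def is_rejected(conn, sql):
    """Run one statement; return True if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

With this schema, inserting a LOAN row whose LOAN_ID has no matching LOAN_APPLICATION row is rejected, and so is deleting a LOAN row that still has PAYMENT rows. In other words, NO ACTION means that a parent row cannot disappear while child rows refer to it.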
A.2 contactInfo.xsd
Example A-13 shows the XML schema used in 5.3.1, "IMPORT" on page 208.
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="work" type="xsd:string"/>
  <xsd:element name="mobile" type="xsd:string"/>
  <xsd:element name="State" type="xsd:string"/>
  <xsd:element name="ContactInfo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Address" minOccurs="1" maxOccurs="2"/>
        <xsd:element ref="Phone" minOccurs="1" maxOccurs="3"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Zip" type="zipType"/>
  <xsd:element name="Address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Street"/>
        <xsd:element ref="City"/>
        <xsd:element ref="State"/>
        <xsd:element ref="Zip"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="City" type="xsd:string"/>
  <xsd:element name="Street" type="xsd:string"/>
  <xsd:element name="Phone">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="work"/>
        <xsd:element ref="home"/>
        <xsd:element ref="mobile"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="home" type="xsd:string"/>
  <xsd:simpleType name="zipType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[0-9]{5}(-[0-9]{4})?"></xsd:pattern>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
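Two of the rules in this schema are easy to check by hand: the occurrence limits on Address and Phone, and the zipType pattern. The Python sketch below illustrates them with the standard library. It is not a schema validator and does not replace validation in DB2; the function names are hypothetical, and element order and the other type rules are not checked. Note that an XML Schema pattern must match the whole value, which is why the sketch uses fullmatch.

```python
import re
import xml.etree.ElementTree as ET

# Same regular expression as the xsd:pattern of zipType.
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_valid_zip(value):
    """Return True if the whole value matches the zipType pattern."""
    return ZIP_PATTERN.fullmatch(value) is not None


def contact_info_problems(xml_text):
    """Return a list of rule violations found in a ContactInfo document."""
    root = ET.fromstring(xml_text)
    problems = []
    addresses = root.findall("Address")
    phones = root.findall("Phone")
    if not 1 <= len(addresses) <= 2:
        problems.append("Address must occur 1 to 2 times")
    if not 1 <= len(phones) <= 3:
        problems.append("Phone must occur 1 to 3 times")
    for zip_element in root.findall("Address/Zip"):
        if not is_valid_zip(zip_element.text or ""):
            problems.append(f"invalid Zip: {zip_element.text}")
    return problems
```

For example, 95034 and 95034-1234 are valid ZIP values, while 9503 and 95034-12 are not.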
<?xml version="1.0"?>
<Employee>
  <Name>John Smith2</Name>
  <EmpNo>002</EmpNo>
  <Title>Engineer</Title>
  <DateOfBirth>2/22/1967</DateOfBirth>
  <SSN>892-76-0002</SSN>
  <Address country="US">
    <Street>2 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
  </Address>
  <Phone type="work">312-964-0002</Phone>
  <Phone type="home">678-181-0002</Phone>
  <Email>[email protected]</Email>
  <Salary>20000</Salary>
</Employee>
Example A-15 is an XML file used in 5.4.3, "Node-level access control" on page 241.
Example: A-15 employee003.xml
<?xml version="1.0"?>
<Employee>
  <Name>John Smith3</Name>
  <EmpNo>003</EmpNo>
  <Title>Architect</Title>
  <DateOfBirth>2/23/1967</DateOfBirth>
  <SSN>892-76-0003</SSN>
  <Address country="US">
    <Street>3 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
  </Address>
  <Phone type="work">312-964-0003</Phone>
  <Phone type="home">678-181-0003</Phone>
  <Email>[email protected]</Email>
  <Salary>30000</Salary>
</Employee>
Example A-16 is an XML file used in 5.4.3, "Node-level access control" on page 241.
Example: A-16 employee004.xml
<?xml version="1.0"?>
<Employee>
  <Name>John Smith4</Name>
  <EmpNo>004</EmpNo>
  <Title>Director</Title>
  <DateOfBirth>2/24/1967</DateOfBirth>
  <SSN>892-76-0004</SSN>
  <Address country="US">
    <Street>4 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
  </Address>
  <Phone type="work">312-964-0004</Phone>
  <Phone type="home">678-181-0004</Phone>
  <Email>[email protected]</Email>
  <Salary>40000</Salary>
</Employee>
Example A-17 is an XML file used in 5.4.3, "Node-level access control" on page 241.
Example: A-17 employee005.xml
<?xml version="1.0"?>
<Employee>
  <Name>John Smith5</Name>
  <EmpNo>005</EmpNo>
  <Title>CEO</Title>
  <DateOfBirth>2/25/1967</DateOfBirth>
  <SSN>892-76-0005</SSN>
  <Address country="US">
    <Street>5 East Main Street</Street>
    <City>Los Gatos</City>
    <State>CA</State>
    <Zip>95034</Zip>
  </Address>
  <Phone type="work">312-964-0005</Phone>
  <Phone type="home">678-181-0005</Phone>
  <Email>[email protected]</Email>
  <Salary>50000</Salary>
</Employee>
Example A-18 shows the INSERT statements for the EMP table, which is used in 5.4.3, "Node-level access control" on page 241.
Example: A-18 Insert statement for John Smith2 - John Smith5
INSERT INTO EMP VALUES ('002',
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith2</Name>
      <EmpNo>002</EmpNo>
      <Title>Engineer</Title>
      <Phone type="work">312-964-0002</Phone>
      <Email>[email protected]</Email>
    </Employee>'),
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith2</Name>
      <EmpNo>002</EmpNo>
      <DateOfBirth>2/22/1967</DateOfBirth>
      <SSN>892-76-0002</SSN>
      <Address country="US">
        <Street>2 East Main Street</Street>
        <City>Los Gatos</City>
        <State>CA</State>
        <Zip>95034</Zip>
        <Phone type="home">678-181-0002</Phone>
      </Address>
      <Salary>20000</Salary>
    </Employee>'),
  SECLABEL_BY_NAME('EMP_POLICY', 'PUBLIC'));

INSERT INTO EMP VALUES ('003',
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith3</Name>
      <EmpNo>003</EmpNo>
      <Title>Architect</Title>
      <Phone type="work">312-964-0003</Phone>
      <Email>[email protected]</Email>
    </Employee>'),
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith3</Name>
      <EmpNo>003</EmpNo>
      <DateOfBirth>2/23/1967</DateOfBirth>
      <SSN>892-76-0003</SSN>
      <Address country="US">
        <Street>3 East Main Street</Street>
        <City>Los Gatos</City>
        <State>CA</State>
        <Zip>95034</Zip>
        <Phone type="home">678-181-0003</Phone>
      </Address>
      <Salary>30000</Salary>
    </Employee>'),
  SECLABEL_BY_NAME('EMP_POLICY', 'PUBLIC'));

INSERT INTO EMP VALUES ('004',
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith4</Name>
      <EmpNo>004</EmpNo>
      <Title>Director</Title>
      <Phone type="work">312-964-0004</Phone>
      <Email>[email protected]</Email>
    </Employee>'),
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith4</Name>
      <EmpNo>004</EmpNo>
      <DateOfBirth>2/24/1967</DateOfBirth>
      <SSN>892-76-0004</SSN>
      <Address country="US">
        <Street>4 East Main Street</Street>
        <City>Los Gatos</City>
        <State>CA</State>
        <Zip>95034</Zip>
        <Phone type="home">678-181-0004</Phone>
      </Address>
      <Salary>40000</Salary>
    </Employee>'),
  SECLABEL_BY_NAME('EMP_POLICY', 'HR_ONLY'));

INSERT INTO EMP VALUES ('005',
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith5</Name>
      <EmpNo>005</EmpNo>
      <Title>Manager</Title>
      <Phone type="work">312-964-0005</Phone>
      <Email>[email protected]</Email>
    </Employee>'),
  XMLPARSE( DOCUMENT '<?xml version="1.0"?>
    <Employee>
      <Name>John Smith5</Name>
      <EmpNo>005</EmpNo>
      <DateOfBirth>2/25/1967</DateOfBirth>
      <SSN>892-76-0005</SSN>
      <Address country="US">
        <Street>5 East Main Street</Street>
        <City>Los Gatos</City>
        <State>CA</State>
        <Zip>95034</Zip>
        <Phone type="home">678-181-0005</Phone>
      </Address>
      <Salary>50000</Salary>
    </Employee>'),
  SECLABEL_BY_NAME('EMP_POLICY', 'HR_ONLY'));
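Each INSERT statement in Example A-18 stores two documents for one employee: one with the general fields and one with the sensitive fields. The following Python sketch shows how a complete employee file, such as employee003.xml, could be split into those two documents. It is a hypothetical helper that we wrote to mirror the layout of Example A-18; it is not a DB2 function and is not taken from the sample application.

```python
import copy
import xml.etree.ElementTree as ET


def split_employee(xml_text):
    """Return (public_doc, private_doc) as XML text for one Employee."""
    source = ET.fromstring(xml_text)
    public = ET.Element("Employee")
    private = ET.Element("Employee")

    def add(target, element):
        # Copy the element so that the source document stays unchanged.
        if element is not None:
            target.append(copy.deepcopy(element))

    # Name and EmpNo identify the employee in both documents.
    for tag in ("Name", "EmpNo"):
        add(public, source.find(tag))
        add(private, source.find(tag))

    # General information: title, work phone, and e-mail address.
    add(public, source.find("Title"))
    add(public, source.find("Phone[@type='work']"))
    add(public, source.find("Email"))

    # Sensitive information: birth date, SSN, address, and salary.
    for tag in ("DateOfBirth", "SSN", "Address"):
        add(private, source.find(tag))
    address = private.find("Address")
    if address is not None:
        # Example A-18 keeps the home phone inside the Address element.
        add(address, source.find("Phone[@type='home']"))
    add(private, source.find("Salary"))

    return (ET.tostring(public, encoding="unicode"),
            ET.tostring(private, encoding="unicode"))
```

The two strings returned by split_employee could then be supplied as the second and third values of an INSERT statement like those in Example A-18, preferably through parameter markers rather than by pasting the text into the statement.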
Appendix B.
Additional material
This redbook refers to additional material that can be downloaded from the Internet as described below.
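Abbreviations and acronyms
ANSI    American National Standards Institute
API     application programming interface
BLOB    binary large object
CLI     call level interface
CLOB    character large object
CLP     command line processor
DADX    Document Access Definition Extension
DBMS    database management system
DDL     Data Definition Language
DMS     database managed space
DTD     Document Type Definition
DWB     Developer Workbench
FLWOR   for, let, where, order by, return
HADR    High Availability Disaster Recovery
HTML    Hypertext Markup Language
HTTP    Hypertext Transfer Protocol
IXF     Integration Exchange Format
JDK     Java Development Kit
LBAC    label-based access control
LOB     large object
ODBC    Open Database Connectivity
RSS     Really Simple Syndication
SMS     system managed space
SMTP    Simple Mail Transfer Protocol
SOA     Service Oriented Architecture
SOAP    Simple Object Access Protocol
UDDI    Universal Description, Discovery, and Integration
UDF     user-defined function
UML     Unified Modeling Language
URI     Uniform Resource Identifier
URL     Uniform Resource Locator
UTC     Coordinated Universal Time
WORF    Web services Object Runtime Framework
WSDL    Web Services Description Language
XDA     XML Data Area
XDM     XQuery/XPath Data Model
XDS     XML Data Specifier
XML     eXtensible Markup Language
XRS     XML Schema Repository
XSD     XML Schema Definition
XSR     XML Schema Repository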
Index
Symbols
!= 91 >= 91 character type 282, 287, 306, 311312, 314 class 4, 51, 53, 68, 222, 303304, 319, 344 CLI 11, 268, 271, 273, 278280, 282, 286287, 290291, 341343 CLI driver 278 CLOB 5, 43, 45, 149, 174, 271, 294299, 304306, 343345, 354 COBOL 268, 289, 338, 343 code page 67, 70, 145, 217, 223, 275, 282, 287, 306, 362 colon 80 column path Index 182 column path index 183 Command Line Processor 253 comment() 80 communication 351 complex data type 51, 53, 55 complex type element 50 components 71, 145, 233, 349350 constraint 180 constructor 8990, 93, 142, 150 CONTAIN() 159
A
access control 248 access plan 10, 186, 189190, 192, 194195 Add-In 250, 253, 321322 ADO.NET 249, 321322 API 250, 290, 318319, 322 APIs 6, 317 application 6, 11, 15, 2129, 3138, 40, 4748, 5253, 55, 57, 61, 63, 66, 77, 145, 156, 160, 162, 164, 167, 171, 176, 207, 217, 223, 249250, 268269, 271, 273, 275280, 286291, 294296, 299, 301302, 304, 306, 309310, 314319, 321, 342, 344, 349, 354356, 361362 application programming interfaces 268 argument 8889, 97, 99101, 126, 144, 150151, 159, 161, 271, 276, 282, 315, 319, 354, 357 array 237, 311312, 315, 319 attribute 45, 50, 64, 67, 74, 7680, 86, 9394, 110, 122, 137138, 140, 150, 158, 161, 167168, 170171, 201, 206208, 210, 213, 217, 219220, 223, 225, 243, 266, 274, 287 axis 80
D
data access 10, 243, 248, 322 data model 1314, 19, 23, 76, 149, 222, 236, 362 data security 234 data source 342 data type 67, 12, 21, 40, 4344, 50, 53, 55, 66, 6869, 8889, 95, 97, 100, 127, 132, 142, 144145, 149150, 178181, 196, 198199, 208, 212, 216, 222, 226, 229, 233, 251255, 269270, 273, 278, 282, 284, 288289, 294, 297299, 302, 307308, 313314, 322, 324, 338339, 343346 database objects 42, 70, 198, 253254 DB2 Developer Workbench 12, 254 DB2 Express-C 37 db2cli.ini 273, 279 db2cli.lst 279 DB2Connection 349 db2-fn sqlcolumn 103 sqlquery 103
B
bind-in 43 bind-out 43 BLOB 5, 149150, 204, 271272, 294297, 299300, 304305 Boolean operators 162 boundary whitespace 272 B-Tree index 176 buffer 69, 282, 284, 287 business logic 349
C
Call Level Interface 249, 273, 278279 carriage returns 272 character string 180, 297
383
decimal 179 decomposition 11, 146, 148 default namespace 98, 137, 139, 141, 151, 166 delimiters 297 descendant 79 distribution 229, 232 document model 167, 169, 171 document-node() 80 dynamic 17, 77, 96, 270, 276, 278, 299300, 315, 340
E
e-business 21 element 1011, 47, 49, 51, 5354, 61, 64, 67, 74, 7678, 80, 8385, 87, 9394, 101102, 110, 113114, 116117, 120, 122, 124, 136138, 140, 142, 147148, 150151, 154, 159162, 164167, 169170, 186188, 190, 192, 196, 199200, 206207, 215, 229, 237238, 243, 274, 277, 288, 300301, 328, 340, 342, 367 embedded SQL 73, 126127, 133134, 136, 271272, 278, 280, 289291, 299 environment 5, 14, 21, 3637, 87, 155, 158, 189, 250251, 278279, 281, 292, 302, 316, 321, 342, 355356 eq 91 explicit validation 65 export 222224, 226227, 252253, 310, 322, 325 expression 10, 47, 74, 78, 8081, 84, 8687, 9092, 9496, 103, 105108, 112, 115, 118119, 126127, 129130, 161, 168169, 178, 197, 229, 266, 269, 276, 297, 299, 302, 310315, 319, 340 extensions 6, 278, 315, 318
implicit-timezone 100 last 101 local-name 102 lower-case 98 matches 98 max 100 min 100 node-name 102 normalize-space 98 number 100 remove 101 round 100 starts-with 98 string-join 98 string-length 99 substring-before 99 tokenize 99 translate 99 upper-case 99 forward-only 319 function 43, 48, 61, 68, 78, 81, 86, 8890, 9698, 115, 122, 126, 128130, 133, 136, 138, 140141, 143144, 149153, 155, 159160, 170, 197, 233, 243, 245, 248, 269271, 274, 277278, 282, 284, 286287, 294, 297, 299300, 302, 314, 319320, 339
G
ge 91 global catalog path table 176 GRANT 237239, 246
H
handle 2, 17, 40, 180, 288, 341342, 356 host language 289 host variable 269, 271, 274, 276, 283, 294295, 298
F
fetch 276, 289, 300301, 319 FLWOR 90, 103, 106108, 112, 119, 126, 266 fn abs 99 ceiling 100 codepoints-to-string 98 compare 98 concat 98 contains 98 empty 101 ends-with 98 exist 101 floor 100
I
IBM.Data.DB2 322, 343 ibm_db2 316, 318319 implicit parsing 269, 285 implicit validation 61, 63, 65 import 4243, 51, 6061, 164, 200202, 208213, 216217, 221, 322, 325, 344 index access 176 installation 37, 250, 302303, 316317, 321
384
J
Java 11, 38, 122, 249, 254, 268, 274, 277, 302306, 310, 338, 343347, 349 JDBC 1112, 268, 302304, 307, 310, 345346, 349, 358
O
ODBC 11, 278279, 318319 OLE DB 249, 321 optimizer 40, 155, 186, 229 options 10, 39, 70, 123, 133134, 143, 185, 208, 216, 221, 223224, 226, 231, 268269, 297, 319, 333 order by 103 overhead 5, 7, 45, 4748, 60, 199, 282, 318, 351
K
keyword 94, 105, 252, 288, 299
L
Label-based access control 234 LBAC 234 le 91 let 103 LOB 4244, 149, 222224, 294, 297, 299, 301 local complex type 50 local name 75, 79, 102, 137, 140141 location 84 logical index 181 lt 91
P
package 316317 parameter 123, 135136, 170, 209, 212213, 219220, 228, 268270, 276, 280, 282, 285, 291, 306, 308, 314, 319, 322, 324, 338340, 343346, 348, 357 parameter marker 269270, 277 parser 218 path ID 176 PDO_ODBC 318319 performance 5, 7, 10, 17, 19, 40, 44, 4749, 63, 65, 69, 173, 176, 197, 199, 230, 319 Perl 249, 316 PHP 11, 249, 268, 315, 317321 physical index 181 precompile 280, 291, 297 precompiler 278, 297 predicate 86, 186 predicate expression 86 prefix 75, 79, 88, 97, 102, 123, 137138, 140141, 166 PREPARE 299300, 340 primary schema 62 privilege 71, 239, 246 procedures 251252, 254255, 268, 303, 338340, 343344, 346, 348349, 356 processing-instruction() 80 programming interface 250, 278, 318 programming language 250, 268, 276, 315, 349350 protocol 43, 57, 349, 351352 pureXML 1, 4, 6, 1113, 1719, 21, 40, 43, 66, 68, 154, 173174, 253, 268, 353 pureXML storage 67
M
metadata 302, 318 method 12, 15, 18, 155, 159, 282, 305307, 315, 346, 348349, 352, 354355, 357, 359 modular 315, 349 monitor 6869
N
namespace 50, 6162, 64, 67, 73, 75, 79, 8788, 9798, 102, 123, 137141, 150151, 166, 169, 200, 203, 255, 267, 277, 288, 300301, 322, 340, 342, 344 namespace prefix 102, 137, 139 native data type 40 NCName 79 ne 91 Net Search Extender 11, 73, 154155, 157, 159, 162, 167 node 7, 67, 7480, 8384, 86, 90, 9293, 96, 102, 113, 122123, 140141, 144, 150, 152, 154, 168, 174, 178, 181, 185, 196, 229, 244, 247248, 251, 255, 264265, 279 node value 180 NSE 154155, 157, 159160, 164, 167, 169 numeric attribute 159
Q
QName 75, 77, 79, 102, 137, 139, 141 qualified name 75, 78, 102, 137, 166, 169
Index
385
query 176, 178179, 183, 185187, 189190, 192, 194196, 203204, 240, 248
R
Redbooks Web site 380 Contact us xv registry 351 repository 10, 18, 57, 71, 198199, 202, 251, 255, 348 result set 49, 97, 192, 240, 247248, 286, 288, 319320, 341, 357, 359 ResultSet 277, 307308, 313, 345, 347 return 103 runtime 10, 229, 253, 278, 301, 310, 312313, 355
S
schema 10-11, 13, 15-16, 19, 42, 44-45, 47, 49-54, 56, 59-64, 66-67, 70, 75, 138, 144-145, 147-148, 173, 180, 185, 198-208, 210, 212-213, 215, 219-220, 222, 225, 251, 253, 255, 262, 319, 333, 348-349, 366-367
schema document 51, 53, 55, 61, 63, 144, 146, 199, 201-202, 204, 348
schema validation 208, 212-213, 219-220
scripts 253-254, 279-280, 290, 292, 319
SECADM authority 233
security label 237-238
security label component 237
security policy 237
security structure 237
serialized string 149, 151, 269, 275, 282, 286-287, 304, 307, 310, 313
setup 21, 36-37, 122, 233, 237, 243-244, 247-248, 278-279, 289, 321
shredding 5, 7, 15, 146-147, 149
simple type element 50
source file 210, 280, 291, 304, 309, 322, 344
SQL 1, 6, 8, 10-11, 38, 40, 43, 45, 47, 49, 61-63, 71, 73, 97, 103, 110, 115, 117, 120-121, 123-124, 126, 129-130, 132-135, 142-143, 155, 159, 173, 176, 178-181, 186-190, 192, 194, 196, 201-203, 205, 213, 226, 234, 249, 251, 253-254, 269-271, 273, 276-278, 280, 289-291, 294-302, 319-320, 334-335, 338-341, 343-345, 353-357
SQL statement 127, 300, 320
SQL_C_BINARY 282-283, 285, 287, 289
SQL_HANDLE_DBC 341-342
SQL_HANDLE_ENV 341
SQL_HANDLE_STMT 288, 342
SQLCA 289, 300
SQLCODE 215
SQLJ 11, 38, 122, 254, 268, 273, 302-304, 309-310, 313, 346, 348-349
sqlj 309-310, 312-313
sqlname.data 301
SQLSTATE 202, 215, 241
SQLVAR 301
square bracket 86
statement 8-10, 43, 45, 61, 63, 65, 68, 124, 126-127, 129, 142-143, 157, 177, 179, 181, 183, 197, 234, 238-240, 246-247, 269, 272, 276-277, 283-284, 288-289, 299-301, 315, 319-320, 335, 339-340, 342-347, 354, 357, 359, 370
static 271, 276-277, 298-300, 347, 362
static embedded SQL 272
statistics 12, 34, 59, 185-186, 189, 229-231, 252
stemming search 159, 163
storage model 66
stored procedure 37-38, 122, 124-125, 270, 281, 293, 324, 341, 345-348
string value 77, 99, 143
SYSCAT.INDEXES 182
T
tables 251-254, 268, 299, 323
text node 80, 124, 153
text() 80
thesaurus search 159
toolbar 331
tree structure 40, 43, 67
U
Unicode 7, 43, 98
Uniform Resource Identifier 137
URI 75, 79, 102, 137-138, 140-141
UTF-8 217, 223
V
VARCHAR 5, 38, 42-45, 122, 130, 159, 179-181, 183, 187-188, 190, 192, 196, 203-204, 208, 246, 286, 294, 339-340, 354
variable 37, 43, 78, 90, 95, 103-108, 152, 158, 269, 271, 274, 276, 282-283, 285, 287-288, 293-295, 298, 302, 306, 338-341
varying-length string 180
views 43, 48-49, 70-71, 199, 243-244, 247, 253, 323
W
well-formed XML 10, 43, 66, 143, 208, 269, 282
whitespace 48, 62, 65, 98, 143, 157, 218, 272, 274, 294, 306, 311-312, 340
wildcard search 164
X
XDA 174
XML 1-2, 4-8, 10-13, 15-21, 24, 28, 31, 38, 40-51, 53, 55-60, 62-64, 66-67, 69-70, 73-75, 77-78, 80-81, 83-85, 88, 90, 93, 97, 103, 109-110, 112, 114-118, 120-124, 126, 128-130, 132-136, 138-139, 141-145, 147-149, 151-152, 154-155, 158-161, 164-167, 169, 173-179, 181, 184-187, 189-190, 195-199, 201-206, 208-210, 212-213, 215-225, 227, 229-231, 233, 235-236, 239, 242, 244, 247-252, 254-255, 257, 260-264, 266, 268-271, 273-275, 277-279, 282-285, 287-290, 294-296, 299-308, 310, 312-315, 322-323, 325-329, 331, 333, 337-341, 343-349, 351-353, 356-358, 361-364, 366, 368-369
XML column path index 176
XML Data Area 174
XML data descriptor 174
XML index 174, 176, 178-179, 181, 183, 187-188, 192, 195-197
XML schema 10, 13, 42, 44-45, 47, 49-51, 56, 60, 62-63, 66-67, 71, 75, 138, 144, 146, 148, 180, 198-199, 201-204, 206, 208, 210, 212-213, 221-222, 225, 251, 255, 348-349
XML schema repository 10, 198-199, 251, 255, 348
XML value index 195
XMLATTRIBUTES 150
XMLCONCAT 152
XMLELEMENT 150
XMLEXISTS 127
XMLFOREST 151
xmlns 137, 139, 147-148, 152, 155-156, 165-166
XMLPARSE 43, 81, 109, 111, 121, 139, 143-144, 218, 220-221, 239, 269-273, 282, 294, 297, 299, 306, 311-312, 339-340, 370-372
xmlpattern 10, 174, 177, 179, 181, 183, 197
XMLQUERY 127
XMLSERIALIZE 43, 149, 297, 314
XMLTABLE 127, 151
XMLUPDATE 122
XPath 10, 74, 78, 86-87, 123, 134, 149, 160, 167-169, 176, 178, 190, 229, 353
XQUERY 81, 84-87, 93, 104-108, 112-117, 119-121, 126-127, 129, 140-141, 144, 160, 165, 178, 188, 192, 196, 252, 276, 288, 299-300, 302, 307, 320, 340, 342, 344
XSCAN 186
XSR 60, 70, 148, 199, 202-203, 205, 251, 333, 348-349
Back cover
Learning SQL/XML, XQuery, XPath with working examples
Developing XML applications with DB2 pureXML
Managing XML for maximum return
IBM DB2 9 for Linux, UNIX, and Windows marks a new stage in the evolution of data servers. IBM has continually led the data management industry with the release of innovative technology. DB2 9 is a new generation data server with revolutionary pureXML technology. This technology in DB2 9 fundamentally transforms the way XML information is managed for maximum return while seamlessly integrating XML with relational data. In this IBM Redbook we discuss the pureXML data store, hybrid database design, and administration. We describe XML schemas, industry standards, and how to manage schemas. We also cover SQL/XML, XQuery, and XPath using easy-to-understand examples. Lastly, we show how to use XML technology efficiently in business applications.
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE
IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.