adv_xml_and_web_srv
Document Type Definition (DTD)
validation/courses-dtd.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE courses [
<!ELEMENT courses (course+)>
<!ELEMENT course (title, description, credits, lastmodified)>
<!ATTLIST course cid ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT credits (#PCDATA)>
<!ELEMENT lastmodified (#PCDATA)>
]>
<courses>
<course cid="c1">
<title>Basic Languages</title>
<description>Introduction to Languages</description>
<credits>1.5</credits>
<lastmodified>2004-09-01T11:13:01</lastmodified>
</course>
<course cid="c2">
...
</course>
</courses>
DTD and IDs
validation/course-id.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE courses [
<!ATTLIST course cid ID #REQUIRED>
]>
<courses>
<course cid="c1">
<title xml:id="t1">Basic Languages</title>
<description>Introduction to Languages</description>
</course>
<course cid="c2">
<title xml:id="t3">French I</title>
<description>Introduction to French</description>
</course>
<course cid="c3">
<title xml:id="t3">French II</title>
<description>Intermediate French</description>
</course>
</courses>
XML Schema
validation/course.xsd
<?xml version="1.0"?>
<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="courses">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="course" minOccurs="1"
maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="title" type="xsd:string"/>
<xsd:element name="description"
type="xsd:string"/>
<xsd:element name="credits"
type="xsd:decimal"/>
<xsd:element name="lastmodified"
type="xsd:dateTime"/>
</xsd:sequence>
<xsd:attribute name="cid" type="xsd:ID"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
RelaxNG
validation/course.rng
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<start>
<element name="courses">
<zeroOrMore>
<element name="course">
<attribute name="cid"><data type="ID"/></attribute>
<element name="title"><text/></element>
<element name="description"><text/></element>
<element name="credits"><data type="decimal"/></element>
<element name="lastmodified"><data type="dateTime"/></element>
</element>
</zeroOrMore>
</element>
</start>
</grammar>
XPath
Language to locate and retrieve
information from an XML document
A foundation for XSLT
An XML document is a tree
containing nodes
The XML document is the root node
Locations are addressable similar to
the syntax for a filesystem
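A minimal sketch of these points in PHP, assuming the DOM extension is available. The inline document is a trimmed, hypothetical stand-in for the courses example, not one of the workshop files.

```php
<?php
// Hypothetical, trimmed-down courses document used only for illustration.
$xml = <<<XML
<courses>
  <course cid="c1">
    <title>Basic Languages</title>
    <description>Introduction to Languages</description>
  </course>
  <course cid="c2">
    <title>French I</title>
    <description>Introduction to French</description>
  </course>
</courses>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// An absolute location path reads much like a filesystem path.
$descriptions = $xpath->query('/courses/course/description');
foreach ($descriptions as $node) {
    print $node->textContent . "\n";
}

// A predicate narrows the step to the course whose cid attribute is "c2".
$title = $xpath->evaluate('string(/courses/course[@cid="c2"]/title)');
print $title . "\n";
```

query() returns a node list, while evaluate() can also return typed results such as strings and numbers.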
XPath Reference
Document
xpath/courses.xml
<courses xmlns:t="http://www.example.com/title">
<course xml:id="c1">
<t:title>Basic Languages</t:title>
<description>Introduction to Languages</description>
</course>
<course xml:id="c2">
<t:title>French I</t:title>
<description>Introduction to French</description>
</course>
<course xml:id="c3">
<t:title>French II</t:title>
<description>Intermediate French</description>
<pre-requisite cref="c2" />
<?phpx A PI Node ?>
<defns xmlns="urn:default">content</defns>
</course>
</courses>
XPath Location Example
xpath/location.php
Expression:
/courses/course/description
//description
/courses/*/description
//description[ancestor::course]
Resulting Nodeset:
<description>Introduction to Languages</description>
<description>Introduction to French</description>
<description>Intermediate French</description>
XPath Function Example
xpath/function.php
string(/courses/course/pre-requisite[@cref="c2"]/..)
French II
Intermediate French
content
XPath and Namespaces
xpath/namespaces.php
title
Empty NodeSet
t:title
<t:title>Basic Languages</t:title>
<t:title>French I</t:title>
<t:title>French II</t:title>
defns
Empty NodeSet
*[local-name()="defns"]
<defns xmlns="urn:default">content</defns>
PHP and XML
PHP 5 introduced numerous
interfaces for working with XML
The libxml2 library
(http://www.xmlsoft.org/) was
chosen to provide XML support
The sister library libxslt provides
XSLT support
I/O is handled via PHP streams
XML Extensions for PHP 5
ext/libxml
ext/xml (SAX push parser)
ext/dom
ext/simplexml
ext/xmlreader (pull parser)
ext/xmlwriter
ext/xsl
ext/wddx
ext/soap
Libxml
Contains common functionality
shared across extensions.
Defines constants to modify parse
time behavior.
Provides access to streams context.
Allows modification of error
handling behavior for XML based
extensions.
Libxml: Parser Options
LIBXML_NOENT      Substitute entities with replacement content
LIBXML_DTDLOAD    Load subsets but do not perform validation
LIBXML_DTDATTR    Create defaulted attributes defined in DTD
LIBXML_DTDVALID   Load subsets and perform validation
LIBXML_NOERROR    Suppress parsing errors from libxml2
LIBXML_NOWARNING  Suppress parser warnings from libxml2
LIBXML_NOBLANKS   Remove insignificant whitespace
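As a rough illustration of how these constants are used, the sketch below passes options as the second argument to DOMDocument::loadXML(). The sample document is made up for this example.

```php
<?php
// Hypothetical input: a root element with whitespace around its only child.
$xml = "<root>\n  <child>text</child>\n</root>";

// Default parse: the whitespace around <child> is kept as text nodes.
$plain = new DOMDocument();
$plain->loadXML($xml);

// Options are combined with bitwise OR and passed as the second argument.
$trimmed = new DOMDocument();
$trimmed->loadXML($xml, LIBXML_NOBLANKS | LIBXML_NOWARNING);

print "Default child nodes: " . $plain->documentElement->childNodes->length . "\n";
print "NOBLANKS child nodes: " . $trimmed->documentElement->childNodes->length . "\n";
```

The same option values are accepted by the other libxml-based loaders, such as simplexml_load_string().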
Libxml: Error Handling
LibXMLError::level Values:
LIBXML_ERR_NONE
LIBXML_ERR_WARNING
LIBXML_ERR_ERROR
LIBXML_ERR_FATAL
LibXMLError Example
libxml/error.php
<?php
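/* Regular Error Handling */
$dom = new DOMDocument();
$dom->loadXML('<root>');
/* New Error Handling: collect errors internally instead of emitting warnings */
libxml_use_internal_errors(TRUE);
if (! $dom->loadXML('root')) {
    $arrError = libxml_get_errors();
    foreach ($arrError AS $xmlError) {
        var_dump($xmlError);
    }
    libxml_clear_errors();
} else {
    print "Document Loaded";
}
?>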
LibXMLError Result
PHP Warning: DOMDocument::loadXML(): Premature end of data in tag
root line 1 in Entity, line: 1 in
/home/rrichards/workshop/libxml/error.php on line 4
$arNodeSet = array();
if ($root->hasChildNodes()) { locateDescription($root->childNodes); }
$nodelist = $dom->getElementsByTagName('description');
locateDescription($root->firstChild);
$root = $doc->createElement("tree");
$doc->appendChild($root);
$attr2 = $doc->createAttribute("att2");
$attr2->appendChild($doc->createTextNode("att2 value"));
$root->setAttributeNode($attr2);
$child = $root->appendChild($doc->createElement("child"));
$doc->formatOutput = TRUE;
print $doc->saveXML();
DOM: Creating an Atom Feed
Result (initial structure)
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example Atom Feed</title>
<subtitle>Example Atom Feed</subtitle>
<id>http://www.example.org/</id>
<updated>2006-03-23T01:39:40-05:00</updated>
</feed>
DOM: Creating an Atom
Feed
dom/atom_feed_creation.php
$entry = create_append_Atom_elements($doc, 'entry', NULL, $feed);
$doc->formatOutput = TRUE;
print $doc->saveXML();
DOM: Creating an Atom
Feed
Result
dom/atomoutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example Atom Feed</title>
<subtitle>Example Atom Feed</subtitle>
<id>http://www.example.org/</id>
<updated>2006-03-23T01:53:59-05:00</updated>
<entry>
<title type="text">My first entry</title>
<link type="text/html" rel="alternate"
href="http://www.example.org/entry-url" title="My first entry"/>
<author>
<name>Rob</name>
</author>
<id>http://www.example.org/entry-guid</id>
<updated>2006-03-23T01:53:59-05:00</updated>
<published>2006-03-23T01:53:59-05:00</published>
<content><![CDATA[This is my first Atom entry!<br />More to follow]]></content>
</entry>
</feed>
DOM: Document Editing
dom/editing.php
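$dom = new DOMDocument();
$dom->load('atomoutput.xml');
$child = $dom->documentElement->firstChild;
while($child && $child->nodeName != "entry") { $child = $child->nextSibling; }
if ($child && ($child = $child->firstChild)) {
    while($child && $child->nodeName != "title") { $child = $child->nextSibling; }
    if ($child) {
        /* Mark the title as HTML and replace its text */
        $child->setAttribute('type', 'html');
        $text = $child->firstChild;
        $text->nodeValue = "<em>My first entry</em>";
        while($child) {
            if ($child->nodeName == "updated") {
                $text = $child->firstChild;
                $text->nodeValue = date('c');
                break;
            }
            $child = $child->nextSibling;
        }
    }
}
print $dom->saveXML();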
DOM: Editing
dom/new_atomoutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example Atom Feed</title>
<subtitle>Example Atom Feed</subtitle>
<id>http://www.example.org/</id>
<updated>2006-03-23T01:53:59-05:00</updated>
<entry>
<title type="html">&lt;em&gt;My first entry&lt;/em&gt;</title>
<link type="text/html" rel="alternate"
href="http://www.example.org/entry-url" title="My first entry"/>
<author>
<name>Rob</name>
</author>
<id>http://www.example.org/entry-guid</id>
<updated>2006-03-23T02:29:22-05:00</updated>
<published>2006-03-23T01:53:59-05:00</published>
<content><![CDATA[This is my first Atom entry!<br />More to follow]]></content>
</entry>
</feed>
DOM: Document
Modification
dom/modify.php
/* Assume $entry refers to the first entry element within the Atom document */

/* These will work */
while ($entry->hasChildNodes()) {
    $entry->removeChild($entry->firstChild);
}
OR
$node = $entry->lastChild;
while($node) {
    $prev = $node->previousSibling;
    $entry->removeChild($node);
    $node = $prev;
}
OR
$children = $entry->childNodes;
$length = $children->length - 1;
for ($x=$length; $x >=0; $x--) {
    $entry->removeChild($children->item($x));
}
OR
$elem = $entry->cloneNode(FALSE);
$entry->parentNode->replaceChild($elem, $entry);

/* This Will Not Work! */
foreach($entry->childNodes AS $node) {
    $entry->removeChild($node);
}
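A self-contained sketch of the first working approach, using a small made-up entry element in place of the Atom document:

```php
<?php
// Hypothetical stand-in for the Atom entry element.
$dom = new DOMDocument();
$dom->loadXML('<entry><title>t</title><id>i</id><updated>u</updated></entry>');
$entry = $dom->documentElement;

// Always remove the current first child, so it does not matter that
// the live child node list shrinks while the loop runs.
while ($entry->hasChildNodes()) {
    $entry->removeChild($entry->firstChild);
}

$serialized = $dom->saveXML($entry);
print $serialized . "\n";
```

The loop never holds a position inside the node list, which is what trips up the foreach variant.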
DOM and Namespaces
<xsd:complexType
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
name="ArrayOfint">
<xsd:complexContent>
<xsd:restriction base="soapenc:Array">
<xsd:attribute ref="soapenc:arrayType"
wsdl:arrayType="xsd:int[ ]"/>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
DOM and Namespaces
dom/namespace.php
define("SCHEMA_NS", "http://www.w3.org/2001/XMLSchema");
define("WSDL_NS", "http://schemas.xmlsoap.org/wsdl/");
$dom = new DOMDocument();
/* Create the namespaced document element shown in the target XML */
$root = $dom->createElementNS(SCHEMA_NS, "xsd:complexType");
$dom->appendChild($root);
$root->setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:wsdl", WSDL_NS);
$root->setAttribute("name", "ArrayOfint");
$nodelist = $xpath->query("//name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
$nodelist = $xpath->query("//name[ancestor::rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";
$inventory = $xpath->evaluate("sum(//book/@qty)");
print "Total Books: ".$inventory."\n";
$inventory = $xpath->evaluate("sum(//classics/book/@qty)");
print "Total Classic Books: ".$inventory."\n";
$inventory = $xpath->evaluate("count(//book[parent::classics])");
print "Distinct Classic Book Titles: ".$inventory."\n";
DOM and Xpath Results
/* $nodelist = $xpath->query("//name")
$nodelist->item($nodelist->length - 1)->textContent */
Last Book Title: Of Mice and Men
/* $xpath->query("//name[ancestor::rare]");
$nodelist->item($nodelist->length - 1)->nodeValue */
Last Rare Book Title: Cannery Row
/* $xpath->evaluate("sum(//book/@qty)") */
Total Books: 54
/* $xpath->evaluate("sum(//classics/book/@qty)") */
Total Classic Books: 50
/* $xpath->evaluate("count(//book[parent::classics])") */
Distinct Classic Book Titles: 2
DOM and Xpath w/Namespaces
dom/xpath/dom-xpathns.xml
<store xmlns="http://www.example.com/store"
xmlns:bk="http://www.example.com/book">
<books>
<rare>
<bk:book qty="4">
<bk:name>Cannery Row</bk:name>
<bk:price>400.00</bk:price>
<bk:edition>1</bk:edition>
</bk:book>
</rare>
<classics>
<bk:book qty="25">
<bk:name>Grapes of Wrath</bk:name>
<bk:price>12.99</bk:price>
</bk:book>
<bk:book qty="25" xmlns:bk="http://www.example.com/classicbook">
<bk:name>Of Mice and Men</bk:name>
<bk:price>9.99</bk:price>
</bk:book>
</classics>
<classics xmlns="http://www.example.com/ExteralClassics">
<book qty="33">
<name>To Kill a Mockingbird</name>
<price>10.99</price>
</book>
</classics>
</books>
</store>
DOM and Xpath
w/Namespaces
dom/xpath/dom-xpathns.php
$nodelist = $xpath->query("//name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title: /* Why empty? */
$nodelist = $xpath->query("//bk:name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title: Grapes of Wrath /* Why not "Of Mice and Men" */
$nodelist = $xpath->query("//bk:name[ancestor::rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";
// Last Rare Book Title: /* Why empty? */
$xpath->registerNamespace("rt", "http://www.example.com/store");
$nodelist = $xpath->query("//bk:name[ancestor::rt:rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";
// Last Rare Book Title: Cannery Row
$xpath->registerNamespace("ext", "http://www.example.com/ExteralClassics");
$nodelist = $xpath->query("(//bk:name) | (//ext:name)");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title: To Kill a Mockingbird
DOM and Xpath
w/Namespaces
dom/xpath/dom-xpathns.php
$xpath->registerNamespace("bk2", "http://www.example.com/classicbook");
$nodelist = $xpath->query("//bk2:name");
print "Last Book Title (bk2): ";
print $nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title (bk2): Of Mice and Men
Complete Results:
Last Book Title:
Last Book Title: Grapes of Wrath
Last Rare Book Title:
Last Rare Book Title: Cannery Row
Last Book Title: To Kill a Mockingbird
Last Book Title (bk2): Of Mice and Men
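The empty results above come from XPath 1.0 treating unprefixed names as belonging to no namespace. A reduced sketch of the fix follows; the inline document and the prefixes rt and b are chosen only for this illustration.

```php
<?php
// Hypothetical, reduced version of the store document.
$xml = <<<XML
<store xmlns="http://www.example.com/store"
       xmlns:bk="http://www.example.com/book">
  <rare>
    <bk:book><bk:name>Cannery Row</bk:name></bk:book>
  </rare>
  <classics>
    <bk:book><bk:name>Grapes of Wrath</bk:name></bk:book>
  </classics>
</store>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Unprefixed names match only elements in no namespace, so this finds nothing.
$unprefixed = $xpath->query('//rare');

// Bind prefixes explicitly; they are local to this XPath context and
// need not match the prefixes used in the document.
$xpath->registerNamespace('rt', 'http://www.example.com/store');
$xpath->registerNamespace('b', 'http://www.example.com/book');
$rare = $xpath->query('//b:name[ancestor::rt:rare]');

print "Unprefixed matches: " . $unprefixed->length . "\n";
print "Rare book: " . $rare->item(0)->textContent . "\n";
```

Matching is done on the namespace URI, which is why a prefix that is rebound inside the document needs its own registration.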
Performing Validation
dom/validation/validate.php
$doc->load('course.xml');
print "\nXML Schema Validation:\n";
if ($doc->schemaValidate('course.xsd') ) { print " Document is valid\n"; }
$doc->load('course.xml');
print "\nRelaxNG Validation:\n";
if ($doc->relaxNGValidate('course.rng') ) { print " Document is valid\n"; }
Performing Validation
Results
DTD Validation:
Document Is Valid
RelaxNG Validation:
Document is valid
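The DTD case can be sketched with DOMDocument::validate(), which checks the loaded tree against its DTD. The document below is a shortened, hypothetical version of the courses example with an internal subset.

```php
<?php
// Hypothetical, shortened courses document with an internal DTD subset.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE courses [
  <!ELEMENT courses (course+)>
  <!ELEMENT course (title)>
  <!ATTLIST course cid ID #REQUIRED>
  <!ELEMENT title (#PCDATA)>
]>
<courses>
  <course cid="c1"><title>Basic Languages</title></course>
</courses>
XML;

// Collect validation messages instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadXML($xml);

$isValid = $doc->validate();
print "DTD Validation: " . ($isValid ? "valid" : "not valid") . "\n";

// Break the content model: course now lacks its required title.
$course = $doc->getElementsByTagName('course')->item(0);
$course->removeChild($course->firstChild);
$stillValid = $doc->validate();
print "After edit: " . ($stillValid ? "valid" : "not valid") . "\n";
foreach (libxml_get_errors() as $error) {
    print trim($error->message) . "\n";
}
libxml_clear_errors();
```

Validation runs against the tree as it currently stands, so edits made after loading are checked too.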
Extending DOM Classes
Overriding the constructor requires
the parent constructor to be called.
Properties built into the DOM
classes cannot be overridden.
Methods built into the DOM classes
can be overridden.
The lifespan of an extended object is
that of the object itself.
Extending DOM Classes
dom/extending/extending.php
class customElement extends DOMElement { }
class customDoc extends DOMDocument {
    public $nodeName = "customDoc";
    function __construct($rootName) {
        parent::__construct();
        if (! empty($rootName)) {
            $element = $this->appendChild(new DOMElement($rootName));
        }
    }
}
function changeit($doc) {
    $myelement = new customElement("custom", "element2");
    $doc->replaceChild($myelement, $doc->documentElement);
    print "Within changeit function: ".get_class($doc->documentElement)."\n";
}
unset($myelement);
print "After unset: ".get_class($doc->documentElement)."\n";
changeit($doc);
print "Outside changeit(): ".get_class($doc->documentElement)."\n";
/* Complete the URL adding App ID, limit to 5 results and only English results */
$url .= "&appid=zzz&results=5&language=en";
$sxe = simplexml_load_file($url);
/* Check for number of results returned */
if ((int)$sxe['totalResultsReturned'] > 0) {
    /* Loop through each result and output title, url and modification date */
    foreach ($sxe->Result AS $result) {
        print 'Title: '.$result->Title."\n";
        print 'Url: '.$result->Url."\n";
        print 'Mod Date: '.date('M d Y', (int)$result->ModificationDate)."\n\n";
}
}
SimpleXML: Consuming Yahoo
WebSearch
RESULTS
Title: Zend Technologies - PHP 5 In Depth - XML in PHP 5 - What's New?
Url: http://www.zend.com/php5/articles/php5-xmlphp.php
Mod Date: Mar 22 2006
$x = 0;
foreach ($books->classics AS $classics) {
    if ($x++ == 0) {
        $children = $classics->children("http://www.example.com/classicbook");
        /* Print name for the books where book element resides in a prefixed namespace */
        print $classics->children("http://www.example.com/book")->book->name."\n";
        print $children->book->name."\n";
    } else {
        print $classics->book->name."\n";
    }
}
SimpleXML: Namespaces
Results
To Kill a Mockingbird
Grapes of Wrath
Of Mice and Men
To Kill a Mockingbird
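A compact sketch of the same idea, assuming a made-up one-book document: plain property access sees only children in no namespace, while children() with a namespace URI exposes the prefixed ones.

```php
<?php
// Hypothetical, reduced books document.
$xml = <<<XML
<books xmlns:bk="http://www.example.com/book">
  <classics>
    <bk:book qty="25"><bk:name>Grapes of Wrath</bk:name></bk:book>
  </classics>
</books>
XML;

$books = simplexml_load_string($xml);

// Plain property access does not see the prefixed book element.
$plainCount = count($books->classics->book);

// children() takes the namespace URI, not the prefix.
$bk = $books->classics->children('http://www.example.com/book');
$name = (string) $bk->book->name;

print "Without namespace: $plainCount\n";
print "With namespace: $name\n";
```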
SimpleXML: Xpath
simplexml/simplexml-xpathns.php
$sxe = simplexml_load_file('simplexml-xpathns.xml');
$nodelist = $sxe->xpath("//bk:name");
print "Last Book Title: ".$nodelist[count($nodelist) - 1]."\n";
$sxe->registerXPathNamespace("rt", "http://www.example.com/store");
$nodelist = $sxe->xpath("//bk:name[ancestor::rt:rare]");
print "Last Rare Book Title: ".$nodelist[count($nodelist) - 1]."\n";
$sxe->registerXPathNamespace("ext", "http://www.example.com/ExteralClassics");
$nodelist = $sxe->xpath("(//bk:name) | (//ext:name)");
print "Last Book Title: ".$nodelist[count($nodelist) - 1]."\n";
$sxe->registerXPathNamespace("bk2", "http://www.example.com/classicbook");
$nodelist = $sxe->xpath("//bk2:name");
print "Last Book Title (bk2): ".$nodelist[count($nodelist) - 1]."\n";
SimpleXML: XPath
Results
Last Book Title: Grapes of Wrath
Last Rare Book Title: Cannery Row
Last Book Title: To Kill a Mockingbird
Last Book Title (bk2): Of Mice and Men
SimpleXML: Advanced
Editing
simplexml/editing.php
$data = array(array('title'=>'Result 1', 'descript'=>'Res1 description'),
              array('title'=>'Result 2', 'descript'=>'description of Res2'),
              array('title'=>'Result 3', 'descript'=>'This is result 3'));
class webservice extends SimpleXMLElement {
    public function appendElement($name, $value=NULL) {
        $node = dom_import_simplexml($this);
        $newnode = $value ? new DOMElement($name, $value) : new DOMElement($name);
        $node->appendChild($newnode);
        return simplexml_import_dom($newnode, 'webservice');
    }
}
print $results->asXML();
?>
SimpleXML: Removing
data
RESULTS
<?xml version="1.0"?>
<results num="3">
<result>
<description>Res1 description</description>
</result>
<result>
<title>Result 3</title>
<description>This is result 3</description>
</result>
</results>
Simple API for XML
(SAX)
Event based push parser
Low memory usage
Works using function callbacks
Almost completely compatible with
ext/xml from PHP 4
Default encoding is UTF-8 rather
than ISO-8859-1 as it was in
PHP 4
SAX: Source Document
xml/xml_simple.xml
<?xml version='1.0'?>
<chapter xmlns:a="http://www.example.com/namespace-a"
xmlns="http://www.example.com/default">
<a:title>ext/xml</a:title>
<para>
First Paragraph
</para>
<a:section a:id="about">
<title>About this Document</title>
<para>
<!-- this is a comment -->
<?php echo 'Hi! This is PHP version ' . phpversion(); ?>
</para>
</a:section>
</chapter>
SAX: Simple Example
xml/xml_simple.php
<?php
function startElement($parser, $elementname, $attributes) {
print "* Start Element: $elementname \n";
foreach ($attributes as $attname => $attvalue) {
print " $attname => $attvalue \n";
}
}
function charDataHandler($parser,$data) {
if (trim($data) != "") print $data."\n";
}
$parser = xml_parser_create();
/* Disable as case is significant in XML */
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING,
false);
xml_set_element_handler($parser,"startElement","endElement");
xml_set_character_data_handler($parser, "charDataHandler");
xml_set_processing_instruction_handler ($parser,
"PIhandler");
xml_set_default_handler ($parser, "DefaultHandler");
First Paragraph
* End Element: para
* Start Element: a:section
a:id => about
* Start Element: title
About this Document
* End Element: title
* Start Element: para
Default: <!-- this is a comment -->
PI: php -> echo 'Hi! This is PHP version ' . phpversion();
* End Element: para
* End Element: a:section
* End Element: chapter
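The slide code omits the end-element handler and the parse call. A complete but minimal run might look like the sketch below; the handlers record events in a hypothetical $events array, and the input string is made up for the example.

```php
<?php
// Events are collected here so the order of callbacks is easy to inspect.
$events = [];

function startElement($parser, $name, $attributes) {
    global $events;
    $events[] = "start:$name";
}
function endElement($parser, $name) {
    global $events;
    $events[] = "end:$name";
}
function charDataHandler($parser, $data) {
    global $events;
    if (trim($data) !== '') {
        $events[] = "text:" . trim($data);
    }
}

$parser = xml_parser_create();
/* Disable case folding, as case is significant in XML */
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'charDataHandler');

// Passing true as the third argument marks this as the final chunk of data.
if (!xml_parse($parser, '<chapter><para>First Paragraph</para></chapter>', true)) {
    print xml_error_string(xml_get_error_code($parser)) . "\n";
}

print implode("\n", $events) . "\n";
```

Because the parser pushes events as it reads, larger inputs can be fed in several xml_parse() calls, with only the last one flagged as final.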
SAX: Error Handling
xml/xml_error.php
<?php
/* Malformed document */
$data = "<root>";
$parser = xml_parser_create();
object(LibXMLError)#1 (6) {
["level"]=>
int(3)
["code"]=>
int(5)
["column"]=>
int(7)
["message"]=>
string(41) "Extra content at the end of the document
"
["file"]=>
string(0) ""
["line"]=>
int(1)
}
SAX: Advanced Example
xml/xml_advanced.php
class cSax {
    function startElement($parser, $elementname, $attributes) {
        /* split() was removed in PHP 7; explode() is used instead */
        list($namespaceURI, $localName) = array_pad(explode("@", $elementname), 2, NULL);
        if (! $localName) {
            $localName = $namespaceURI;
            $namespaceURI = "";
        }
        print "* Start Element: $localName".
            ($namespaceURI ? " in $namespaceURI" : "")."\n";
        foreach ($attributes as $attname => $attvalue) {
            print " $attname => $attvalue \n";
        }
    }
}
$objcSax = new cSax();
$parser = xml_parser_create_ns("ISO-8859-1","@");
xml_set_object($parser, $objcSax);
xml_set_element_handler($parser,"startElement","endElement");
while($reader->read()) {
switch ($reader->nodeType) {
case XMLReader::ELEMENT:
print "Element: ".$reader->name."\n";
if ($reader->hasAttributes && $reader->moveToFirstAttribute()) {
do {
print " ".$reader->name."=".$reader->value."\n";
} while($reader->moveToNextAttribute());
}
break;
case XMLReader::PI:
print "PI Target: ".$reader->name."\n PI Data: ".$reader->value."\n";
}
}
XMLReader: Simple
Example
RESULTS
Local Name for Element: title
Namespace URI for Element: http://www.example.com/namespace-a
Element: para
Element: a:section
a:id=about
Element: title
Element: para
PI Target: php
PI Data: echo 'Hi! This is PHP version ' . phpversion();
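A self-contained variant of the reading loop, assuming an in-memory document loaded with XMLReader::XML() rather than a file. The sample XML and the $seen array are illustrative only.

```php
<?php
// Hypothetical input with one attribute and one processing instruction.
$xml = '<chapter><para id="p1">First</para><?note keep ?></chapter>';

$reader = new XMLReader();
$reader->XML($xml);

$seen = [];
while ($reader->read()) {
    switch ($reader->nodeType) {
        case XMLReader::ELEMENT:
            $entry = "Element: " . $reader->name;
            // Attributes are visited by moving the cursor onto them.
            if ($reader->hasAttributes && $reader->moveToFirstAttribute()) {
                do {
                    $entry .= " [" . $reader->name . "=" . $reader->value . "]";
                } while ($reader->moveToNextAttribute());
                // Put the cursor back on the owning element.
                $reader->moveToElement();
            }
            $seen[] = $entry;
            break;
        case XMLReader::PI:
            $seen[] = "PI Target: " . $reader->name;
            break;
    }
}
$reader->close();

print implode("\n", $seen) . "\n";
```

Unlike SAX, the calling code pulls each node with read(), so it can stop or skip ahead whenever it likes.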
XMLReader: Consuming Yahoo
Shopping
<?xml version="1.0" encoding="ISO-8859-1"?>
<ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:yahoo:prods"
xsi:schemaLocation="urn:yahoo:prods
http://api.shopping.yahoo.com/shoppingservice/v1/productsearch.xsd"
totalResultsAvailable="8850" firstResultPosition="2"
totalResultsReturned="2">
<Result>
<Catalog ID="1991433722">
<Url><![CDATA[http://shopping.yahoo.com/p:Linksys. . .2]]></Url>
<ProductName><![CDATA[Linksys WRT5. . .r Broadband Router]]></ProductName>
<PriceFrom>59.99</PriceFrom>
<PriceTo>100.00</PriceTo>
<Thumbnail /><!-- child elements Url (CDATA), Height, Width -->
<Description><![CDATA[The Wireless-G . . .ces.]]></Description>
<Summary><![CDATA[IEEE 802.3, ...]]></Summary>
<UserRating /><!-- Rating sub elements -->
<SpecificationList /><!-- 0+ Specification child elements -->
</Catalog>
</Result>
</ResultSet>
XMLReader: Consuming Yahoo
Shopping
xmlreader/rest_yahoo_shopping.php
function getTextValue($reader) { ... }
function processCatalog($reader) { ... }
function processResult($reader) { ... }
/* Complete the URL with App ID, limit to 2 results and start at second record */
$url .= "&appid=zzz&results=2&start=2";
function processResult($reader) {
$depth = $reader->depth;
if ($reader->isEmptyElement || ($reader->read() &&
$reader->nodeType == XMLReader::END_ELEMENT))
return;
libxml_use_internal_errors(TRUE);
while ($objReader->read()) {
if (! $objReader->isValid()) {
print "NOT VALID\n";
break;
}
}
$arErrors = libxml_get_errors();
foreach ($arErrors AS $xmlError) {
print $xmlError->message;
}
XMLReader: DTD
Validation
RESULTS
NOT VALID
Element section was declared #PCDATA but contains non text nodes
XMLReader: Relax NG
Validation
xmlreader/validation/reader.rng
<?xml version="1.0" encoding="utf-8" ?>
<element name="chapter"
xmlns="http://relaxng.org/ns/structure/1.0">
<element name="title">
<text/>
</element>
<element name="para">
<text/>
</element>
<element name="section">
<attribute name="id" />
<text/>
</element>
</element>
XMLReader: Relax NG
Validation
xmlreader/validation/reader-rng.php
$objReader = XMLReader::open('reader.xml');
$objReader->setRelaxNGSchema('reader.rng');
libxml_use_internal_errors(TRUE);
while ($objReader->read()) {
if (! $objReader->isValid()) {
print "NOT VALID\n";
break;
}
}
$arErrors = libxml_get_errors();
foreach ($arErrors AS $xmlError) {
print $xmlError->message;
}
XMLReader: Relax NG
Validation RESULTS
NOT VALID
Did not expect element title there
XSL
Used to transform XML data
XSLT based on XPath
Works with DOM and SimpleXML, although
the DOM extension is required.
Provides the capability of calling PHP
functions during a transformation
DOM nodes may be returned from PHP
functions
The LIBXML_NOCDATA and LIBXML_NOENT
constants are your friends.
libxslt 1.1.5+ is recommended to avoid
problems when using xsl:key
XSL: XML Input Data
xsl/sites.xml
<?xml version="1.0"?>
<sites>
<site xml:id="php-gen">
<name>PHP General</name>
<url>http://news.php.net/group.php?group=php.general&amp;format=rss</url>
</site>
<site xml:id="php-pear">
<name>PHP Pear Dev</name>
<url>http://news.php.net/group.php?group=php.pear.dev&amp;format=rss</url>
</site>
<site xml:id="php-planet">
<name>Planet PHP</name>
<url>http://www.planet-php.org/rss/</url>
</site>
</sites>
XSL: Simple
Transformation
xsl/simple_stylesheet.xsl
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="html"/>
<xsl:template match="/">
<html>
<body>
<xsl:apply-templates select="/sites/site"/>
</body>
</html>
</xsl:template>
<xsl:template match="/sites/site">
<p><xsl:value-of select="./name"/> : <xsl:value-of select="./url"
disable-output-escaping="yes"/></p>
</xsl:template>
</xsl:stylesheet>
XSL: Simple
Transformation
xsl/simple_stylesheet.php
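/* Load Stylesheet */
$stylesheet = new DOMDocument();
$stylesheet->load('simple_stylesheet.xsl');
$proc = new XSLTProcessor();
$proc->importStylesheet($stylesheet);
$dom = new DOMDocument();
$dom->load('sites.xml');
print $proc->transformToXML($dom);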
XSL: Simple
Transformation
RESULTS
<html>
<body>
<p>PHP General : http://news.php.net/group.php?group=php.general&format=rss</p>
<p>PHP Pear Dev : http://news.php.net/group.php?group=php.pear.dev&format=rss</p>
<p>Planet PHP : http://www.planet-php.org/rss/</p>
</body>
</html>
XSL: Advanced
Transformation
xsl/advanced_stylesheet.php
function initReader($url) {
$GLOBALS['reader'] = new XMLReader();
if ($GLOBALS['reader']->open($url)) {
while ($GLOBALS['reader']->read() && $GLOBALS['reader']->name != 'item')
{}
if ($GLOBALS['reader']->name == 'item')
return 1;
}
$GLOBALS['reader'] = NULL;
return 0;
}
function readNextItem() {
    if ($GLOBALS['reader'] == NULL)
        return NULL;
    if ($GLOBALS['beingProc'])
        $GLOBALS['reader']->next('item');
    else
        $GLOBALS['beingProc'] = TRUE;
    if ($GLOBALS['reader']->name == 'item')
        return $GLOBALS['reader']->expand();
    return NULL;
}
XSL: Advanced
Transformation
xsl/advanced_stylesheet.php
$beingProc = FALSE;
$reader = NULL;
/* Load Stylesheet */
$stylesheet = new DOMDocument();
$stylesheet->load('advanced_stylesheet.xsl');
<xsl:template match="/">
<html><body>
<xsl:apply-templates select="id($siteid)"/>
</body></html>
</xsl:template>
<xsl:template match="/sites/site">
<xsl:variable name="itemnum"
select="php:functionString('initReader', ./url)" />
<xsl:if test="number($itemnum) > 0">
<xsl:call-template name="itemproc" />
</xsl:if>
</xsl:template>
XSL: Advanced
Transformation
xsl/advanced_stylesheet.xsl
<xsl:template match="item">
<p>
Title: <b><xsl:value-of select="./title" /></b><br/><br/>
URL: <xsl:value-of select="./link" /><br/>
Published: <xsl:value-of select="./pubDate" /><br/>
</p>
</xsl:template>
<xsl:template name="itemproc">
<xsl:variable name="nodeset"
select="php:functionString('readNextItem')" />
<xsl:if test="boolean($nodeset)">
<xsl:apply-templates select="$nodeset"/>
<xsl:call-template name="itemproc" />
</xsl:if>
</xsl:template>
</xsl:stylesheet>
XSL: Advanced
Transformation
Results viewed through a browser
xsl/advanced_stylesheet.html
Title: Re: How to ping a webserver with php?
URL: http://news.php.net/php.general/232552
Published: Fri, 24 Mar 2006 11:42:04 -0500
URL: http://news.php.net/php.general/232553
Published: Fri, 24 Mar 2006 11:52:14 -0500
URL: http://news.php.net/php.general/232554
Published: Fri, 24 Mar 2006 12:58:51 -0500
XMLWriter
Lightweight and forward-only API for
generating well formed XML
Automatically escapes data
Works with PHP 4.3+ available at
http://pecl.php.net/package/xmlwriter
Object Oriented API available for PHP
5+
Part of core PHP distribution since PHP
5.1.2
XMLWriter: Simple
Example
xmlwriter/simple.php
<?php
$xw = new XMLWriter();
$xw->openMemory();
$xw->writeComment("this is a comment");
$xw->text(" ");
$xw->writePi("php", "echo 'Hi! This is PHP version ' . phpversion(); ");
$xw->text("\n ");
$xw->endDocument();
$xw->startDocument('1.0', 'UTF-8');
$xw->startElement('Results');
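Since only fragments of the XMLWriter examples survive here, the following is a small complete sketch rather than the workshop code. The element names and values are made up; it shows the forward-only calls and the automatic escaping.

```php
<?php
$xw = new XMLWriter();
$xw->openMemory();
$xw->startDocument('1.0', 'UTF-8');
$xw->startElement('Results');
$xw->writeAttribute('num', '1');
// Text is escaped automatically, so markup characters are safe to pass in.
$xw->writeElement('title', 'Tom & Jerry <uncut>');
$xw->endElement();
$xw->endDocument();

// outputMemory() returns the buffered XML as a string.
$output = $xw->outputMemory();
print $output;
```

Elements must be closed in the order they were opened; there is no way to go back and edit earlier output.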
Memory Usage:
DOM      SimpleXML  ext/xml  XMLReader
85.6MB   85.6MB     26KB     177KB
<portType>
<!-- a set of abstract operations referring to input and output messages -->
</portType>
</definitions>
SOAP: Basic Message
Structure
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Header>
<!-- Information to extend message -->
<!-- For example WS-Security or transaction information -->
</soap:Header>
<soap:Body>
<!-- Either the message contents or soap:Fault -->
<soap:Fault>
<!-- SOAP Fault structure and data -->
</soap:Fault>
</soap:Body>
</soap:Envelope>
SOAP: The SoapClient
SoapClient::__construct ( mixed wsdl [, array options] )
try {
    /* Create the Soap Client */
    $client = new SoapClient('http://api.google.com/GoogleSearch.wsdl');
    $cached = $client->doGetCachedPage($key, 'http://www.google.com/');
} catch (SoapFault $e) {
    var_dump($e);
}
/* The trace option must be enabled for the __getLast*() methods to return data */
$client_options = array('trace' => 1);
try {
    /* Create the Soap Client with debug option */
    $client = new SoapClient('http://api.google.com/GoogleSearch.wsdl', $client_options);
    $cached = $client->doGetCachedPage($key, 'http://www.google.com/');
} catch (SoapFault $e) {
    print "Last Request Headers: \n".$client->__getLastRequestHeaders()."\n\n";
    print "Last Request: \n".$client->__getLastRequest()."\n\n";
    print "Last Response Headers: \n".$client->__getLastResponseHeaders()."\n\n";
    print "Last Response: \n".$client->__getLastResponse()."\n";
}
SOAP: Debugging Client
Request
RESULT
Last Request Headers:
POST /search/beta2 HTTP/1.1
Host: api.google.com
Connection: Keep-Alive
User-Agent: PHP SOAP 0.1
Content-Type: text/xml; charset=utf-8
SOAPAction: "urn:GoogleSearchAction"
Content-Length: 554
Last Request:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:ns1="urn:GoogleSearch"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body><ns1:doGetCachedPage>
<key xsi:type="xsd:string"></key>
<url xsi:type="xsd:string">http://www.google.com/</url>
</ns1:doGetCachedPage></SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP: Debugging Client
Requests
RESULT Continued
Last Response Headers:
POST /search/beta2 HTTP/1.1
Host: api.google.com
Connection: Keep-Alive
User-Agent: PHP SOAP 0.1
Content-Type: text/xml; charset=utf-8
SOAPAction: "urn:GoogleSearchAction"
Content-Length: 554
Last Response:
<?xml version='1.0' encoding='UTF-8'?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<SOAP-ENV:Fault>
<faultcode>SOAP-ENV:Server</faultcode>
<faultstring>Exception from service object: Invalid authorization key: </faultstring>
<faultactor>/search/beta2</faultactor>
<detail>
<stackTrace>com.google.soap.search.GoogleSearchFault: Invalid authorization key:
at com.google.soap.search.QueryLimits.lookUpAndLoadFromINSIfNeedBe(QueryLimits.java:213)
. . .</stackTrace>
</detail>
</SOAP-ENV:Fault>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
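The fault above reaches PHP code as a SoapFault exception. The following is a minimal sketch of catching it and inspecting the request with the trace option, assuming the soap extension is loaded. CannedSoapClient, its cannedResponse property, and the localhost location are hypothetical: the subclass overrides __doRequest() so that a shortened copy of the fault is returned without any network access.

```php
<?php
/* Sketch only: CannedSoapClient and $cannedResponse are hypothetical names */
class CannedSoapClient extends SoapClient
{
    /* Raw SOAP response handed back instead of performing an HTTP request */
    public $cannedResponse = '';

    /* SoapClient calls this method to transport the request */
    public function __doRequest($request, $location, $action, $version, $oneWay = false): ?string
    {
        return $this->cannedResponse;
    }
}

/* Shortened copy of the fault shown in the RESULT output */
$fault = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<SOAP-ENV:Fault>
<faultcode>SOAP-ENV:Server</faultcode>
<faultstring>Exception from service object: Invalid authorization key: </faultstring>
<faultactor>/search/beta2</faultactor>
</SOAP-ENV:Fault>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
XML;

/* Non-WSDL mode; 'trace' keeps the last request available for debugging */
$client = new CannedSoapClient(null, array(
    'location' => 'http://localhost/search/beta2', /* never contacted */
    'uri'      => 'urn:GoogleSearch',
    'trace'    => 1,
));
$client->cannedResponse = $fault;

$faultString = null;
try {
    $client->__soapCall('doGetCachedPage', array('', 'http://www.google.com/'));
} catch (SoapFault $e) {
    $faultString = $e->faultstring;
    print "Fault code: ".$e->faultcode."\n";
    print "Fault string: ".$faultString."\n";
    print "Last Request: \n".$client->__getLastRequest()."\n";
}
```

Against a live service the plain SoapClient class would be used instead; the try/catch and the trace calls stay the same.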
SOAP: Client and
Document/Literal
soap/docliteral/compsearch_funcs.php
<?php
$wsdl = "http://ws.invesbot.com/companysearch.asmx?WSDL";
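try {
/* Create the Soap Client to a Company Search */
$client = new SoapClient($wsdl);
/* Assumed completion, the slide cuts off here: print the types declared in the WSDL */
print "Types: \n";
foreach ($client->__getTypes() as $type) { print $type."\n"; }
} catch (SoapFault $e) { var_dump($e); }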
Types:
struct Search {
string Keyword;
string Field;
string CurrentPage;
string PageSize;
}
struct SearchResponse {
SearchResult SearchResult;
}
struct SearchResult {
<anyXML> any;
}
...
SOAP: Client and
Document/Literal
soap/docliteral/compsearch.php
class Search {
public $Keyword;
public $CurrentPage = 1;
public $PageSize = 10;
}
$wsdl = "http://ws.invesbot.com/companysearch.asmx?WSDL";
/* Create the Soap Client to a Company Search */
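$client = new SoapClient($wsdl);
/* Assumed call, not shown on the slide: the keyword is inferred from the highlighted results,
   and the service may also require the Field property listed in the WSDL types */
$search = new Search();
$search->Keyword = 'Microsoft';
$results = $client->Search($search);
var_dump($results);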
SOAP: Client and
Document/Literal
RESULTS (soap/docliteral/compsearch.php)
object(stdClass)#2 (1) {
["SearchResult"]=>
object(stdClass)#3 (1) {
["any"]=>
string(6392) "<SearchResults xmlns=""><ResultsInfo><TotalResults>777</TotalResults>
<CurrentPage>1</CurrentPage><PageSize>10</PageSize></ResultsInfo>
<SearchResult><cik>1002607</cik><symbol/><comname>ATARI INC</comname>
<accessnumber>0000950123-05-013898</accessnumber><formtype>10-K/A</formtype>
<filingdate>20051121</filingdate><result>, 2005 (the "Amendment Effective Date") by and between
<B>Microsoft</B> Licensing, GP, a Nevada general partnership ("<B>Microsoft</B>"), and Atari, Inc. ("Licensee"
or "Publisher"), and supplements that certain ... (the "PLA). RECITALS A.
<B>Microsoft</B> and Publisher entered into the PLA to establish the terms under which
</result><time/><price/><change/><marketcap/></SearchResult>
<SearchResult><cik>865570</cik><symbol/><comname>THQ . . . .
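Because the anyXML payload arrives as a plain string, it still has to be parsed before the individual results can be used. A minimal sketch with SimpleXML follows; the $any string is a hypothetical, shortened stand-in for $results->SearchResult->any, built from the fields visible in the output above.

```php
<?php
/* Hypothetical, shortened stand-in for $results->SearchResult->any */
$any = '<SearchResults xmlns=""><ResultsInfo><TotalResults>777</TotalResults>'
     . '<CurrentPage>1</CurrentPage><PageSize>10</PageSize></ResultsInfo>'
     . '<SearchResult><cik>1002607</cik><symbol/><comname>ATARI INC</comname>'
     . '<formtype>10-K/A</formtype><filingdate>20051121</filingdate></SearchResult>'
     . '</SearchResults>';

/* Parse the string into a SimpleXMLElement tree */
$sxe = simplexml_load_string($any);
if ($sxe === false) {
    exit("Unable to parse search results\n");
}

/* Read the summary information and walk the individual results */
$total = (int) $sxe->ResultsInfo->TotalResults;
$names = array();
foreach ($sxe->SearchResult as $result) {
    $names[] = (string) $result->comname;
    print $result->cik.': '.$result->comname.' ('.$result->formtype.")\n";
}
print "Total results: ".$total."\n";
```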
SOAP: Server WSDL (using
Document/Literal)
soap/server/exampleapi.wsdl
<xsd:element name="getPeopleByFirstLastName">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="first" type="xsd:string"/>
<xsd:element name="last" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="Person">
<xsd:all>
<xsd:element name="id" type="xsd:int"/>
<xsd:element name="lastName" type="xsd:string"/>
<xsd:element name="firstName" type="xsd:string"/>
</xsd:all>
</xsd:complexType>
<xsd:element name="getPeopleByFirstLastNameResponse"
type="tns:ArrayOfPerson"/>
<message name="getPeopleByFirstLastName">
<part name="parameters" element="tns:getPeopleByFirstLastName"/>
</message>
<message name="getPeopleByFirstLastNameResponse">
<part name="result" element="tns:getPeopleByFirstLastNameResponse"/>
</message>
SOAP: Server
soap/server/soap_server.php
<?php
/* System status - TRUE indicates normal operation /
FALSE indicates down for maintenance */
$SYS_STATUS = TRUE;
/* Create the server using WSDL and specify the actor URI */
$sServer = new SoapServer("exampleapi.wsdl",
array('actor'=>'urn:ExampleAPI'));
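The server still needs a PHP function behind the getPeopleByFirstLastName operation. The following is a rough sketch of such a handler, tested here by calling it directly. It assumes the usual document/literal wrapped behaviour in which the handler receives a single object whose properties mirror the request element (first, last); examplePeople() and its data are hypothetical.

```php
<?php
/* Hypothetical in-memory data; a real service would query a data store */
function examplePeople()
{
    return array(
        array('id' => 1, 'lastName' => 'Smith', 'firstName' => 'John'),
        array('id' => 2, 'lastName' => 'Jones', 'firstName' => 'Mary'),
    );
}

/* Handler for the operation: $params->first and $params->last carry the
   values of the <first> and <last> request elements */
function getPeopleByFirstLastName($params)
{
    $matches = array();
    foreach (examplePeople() as $row) {
        if ($row['firstName'] === $params->first && $row['lastName'] === $params->last) {
            /* Each match becomes an object with id, lastName and firstName */
            $matches[] = (object) $row;
        }
    }
    return $matches;
}

/* Direct call, without going through SOAP */
$request = new stdClass();
$request->first = 'John';
$request->last = 'Smith';
$found = getPeopleByFirstLastName($request);
var_dump($found);
```

On the server such a function would typically be registered with $sServer->addFunction('getPeopleByFirstLastName') and the request processed with $sServer->handle(). Whether the returned array maps cleanly onto ArrayOfPerson depends on the parts of the WSDL not shown here.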
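<?php
try {
$sClient = new SoapClient('exampleapi.wsdl');
/* Assumed call, elided on the slide; argument names follow the WSDL request element */
$response = $sClient->getPeopleByFirstLastName(array('first' => 'John', 'last' => 'Smith'));
var_dump($response);
} catch (SoapFault $e) { var_dump($e); }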
array(1) {
[0]=>
object(stdClass)#2 (3) {
["id"]=>
int(1)
["lastName"]=>
string(5) "Smith"
["firstName"]=>
string(4) "John"
}
}
Questions?
XML Security
Makes it possible to detect altered documents
Can be used to prevent attacks that alter a document while making it appear intact
Depending on the cryptographic algorithms used, can also establish the authenticity of the document's author
Protects data from unauthorized access through encryption
XML Security: Basic
Integrity
xmlsecurity/basic_message_integrity.php
/* Generate SHA1 and MD5 hash */
$sha1hash = sha1_file('xmlsec.xml');
$md5hash = md5_file('xmlsec.xml');
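/* Print resulting hashes */
print $sha1hash."\n";
print $md5hash."\n";
/* Assumed check, elided on the slide: $stored_hash is a hypothetical variable
   holding the SHA1 hash saved on the previous access */
if ($sha1hash === $stored_hash) {
/* Create and store a new hash for the next time document is accessed */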
$sha1hash = sha1_file('xmlsec.xml');
print 'New Hash: '.$sha1hash."\n";
} else {
print 'File has been altered!';
}
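The compare-against-stored-hash idea can be tried end to end with a temporary file. This sketch swaps in SHA-256 via hash_file() in place of the SHA1 and MD5 calls above and compares with hash_equals(); fileMatchesHash() and the sample XML are hypothetical.

```php
<?php
/* Hypothetical helper: true when the file still matches the stored digest */
function fileMatchesHash($path, $storedHash)
{
    /* hash_equals() compares the two digests in constant time */
    return hash_equals($storedHash, hash_file('sha256', $path));
}

/* Create a sample document and record its hash */
$path = tempnam(sys_get_temp_dir(), 'xmlsec');
file_put_contents($path, '<order><item>1</item></order>');
$storedHash = hash_file('sha256', $path);

/* Check once before and once after the document is changed */
$before = fileMatchesHash($path, $storedHash);
file_put_contents($path, '<order><item>2</item></order>');
$after = fileMatchesHash($path, $storedHash);

print $before ? "Unchanged\n" : "File has been altered!\n";
print $after ? "Unchanged\n" : "File has been altered!\n";
unlink($path);
```

A plain hash only detects accidental or unsophisticated changes: anyone who can replace the document can usually replace the stored hash as well.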
XML Security: Basic
Integrity HMAC
xmlsecurity/basic_message_integrity_hmac.php
$secret_key = 'secret';
/* Encrypt Data */
$orderxml = file_get_contents('order.xml');
$enc_document = $dom->saveXML();
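For the HMAC variant of the integrity check, a keyed hash replaces the plain one, so that only holders of the secret key can produce a matching value. A minimal sketch with hash_hmac() follows; verifyHmac() and the inline order XML (standing in for order.xml) are hypothetical, and SHA-256 is an assumed choice of algorithm.

```php
<?php
$secret_key = 'secret'; /* same placeholder key as on the slide */
$orderxml = '<order><item>Widget</item></order>'; /* hypothetical stand-in for order.xml */

/* Sender side: compute the keyed hash that travels with the document */
$mac = hash_hmac('sha256', $orderxml, $secret_key);

/* Receiver side: recompute the HMAC and compare in constant time */
function verifyHmac($message, $mac, $key)
{
    return hash_equals(hash_hmac('sha256', $message, $key), $mac);
}

$okOriginal = verifyHmac($orderxml, $mac, $secret_key);
$okTampered = verifyHmac(str_replace('Widget', 'Gadget', $orderxml), $mac, $secret_key);

print 'Original document: '.($okOriginal ? 'valid' : 'altered')."\n";
print 'Tampered document: '.($okTampered ? 'valid' : 'altered')."\n";
```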
XML Security: Basic
Decryption
xmlsecurity/basic_encryption.php
/* Decrypt Data */
$dom = new DOMDocument();
$dom->loadXML($enc_document);
$order = $dom->documentElement;
foreach ($order->childNodes as $node) {
if ($node->nodeName == 'encrypted') {
/* Get Initialization Vector */
$iv = base64_decode($node->getAttribute('iv'));
/* Get data, and decode it */
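$data = base64_decode($node->nodeValue);
/* Decryption step elided on the slide: decrypt $data with $secret_key and $iv
   and store the plaintext in $decrypted_data */
$frag = $dom->createDocumentFragment();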
/* Functionality available in PHP 5.1 */
$frag->appendXML($decrypted_data);
/* Replacement node */
$order->replaceChild($frag, $node);
break;
}
}
print $dom->saveXML();
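The whole encrypt/decrypt round trip can be sketched in one runnable script. The cipher used on the slides is not shown, so this version assumes AES-256-CBC through the openssl extension; the order document, the creditcard element, and the key derivation are hypothetical.

```php
<?php
/* Assumption: AES-256-CBC via openssl; the slides' own cipher is not shown */
$cipher = 'aes-256-cbc';
$key = hash('sha256', 'secret', true); /* 32-byte key from a placeholder passphrase */

$dom = new DOMDocument();
$dom->loadXML('<order><item>Widget</item><creditcard><number>4111</number></creditcard></order>');
$original = $dom->saveXML();
$order = $dom->documentElement;

/* Encrypt: replace <creditcard> with <encrypted iv="...">base64 data</encrypted> */
$card = $order->getElementsByTagName('creditcard')->item(0);
$iv = random_bytes(openssl_cipher_iv_length($cipher));
$ciphertext = openssl_encrypt($dom->saveXML($card), $cipher, $key, OPENSSL_RAW_DATA, $iv);
$encrypted = $dom->createElement('encrypted', base64_encode($ciphertext));
$encrypted->setAttribute('iv', base64_encode($iv));
$order->replaceChild($encrypted, $card);
$enc_document = $dom->saveXML();

/* Decrypt: same loop as on the slide, with the decryption call filled in */
$dom2 = new DOMDocument();
$dom2->loadXML($enc_document);
$order2 = $dom2->documentElement;
foreach ($order2->childNodes as $node) {
    if ($node->nodeName == 'encrypted') {
        $iv2 = base64_decode($node->getAttribute('iv'));
        $data = base64_decode($node->nodeValue);
        $decrypted_data = openssl_decrypt($data, $cipher, $key, OPENSSL_RAW_DATA, $iv2);
        /* Turn the decrypted markup back into nodes and swap it in */
        $frag = $dom2->createDocumentFragment();
        $frag->appendXML($decrypted_data);
        $order2->replaceChild($frag, $node);
        break;
    }
}
$restored = $dom2->saveXML();
print $enc_document;
print $restored;
```

Encryption alone does not detect tampering with the ciphertext; combining it with an HMAC over the encrypted document covers both confidentiality and integrity.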