Simple API for XML (SAX)
XML
http://yht4ever.blogspot.com
[email_address]
B070066 - NIIT Quang Trung
08/2007
Contents
- Introduction
- DOM vs. SAX
- SAX-based Parsers
- Events
Introduction
- SAX: Simple API for XML
- Another method for accessing an XML document's contents
- Developed by members of the XML-DEV mailing list
- Uses an event-based model
  - Notifications (events) are raised as the document is parsed
- Originally designed as a Java API
  - Other languages (C++, Python, Perl) are now supported
DOM vs. SAX
- DOM
  - Tree-based model
  - Stores document data in a node hierarchy
  - Data is accessed quickly once the tree is in memory
  - Provides facilities for adding and removing nodes
- SAX
  - Invokes methods when markup (for example, a specific tag) is encountered
  - Typically faster than DOM for a single pass, because no tree is built
  - Less memory overhead than DOM
  - Typically used for reading documents (not modifying them)
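As a rough illustration of the DOM side of this comparison, the sketch below loads a small document into a tree, then adds and removes nodes in memory. The class name DomEditSketch, the method name editAndListChildren, and the sample XML are illustrative assumptions and do not come from the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the original slides.
public class DomEditSketch {

    // Parses the XML into a DOM tree, appends an <object> element to the
    // root, removes the first <example> descendant (if any), and returns
    // the names of the root's remaining child elements, comma-separated.
    static String editAndListChildren(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // The whole tree is in memory, so nodes can be added ...
            root.appendChild(doc.createElement("object"));

            // ... and removed after parsing has finished.
            Node first = root.getElementsByTagName("example").item(0);
            if (first != null) {
                first.getParentNode().removeChild(first);
            }

            StringBuilder names = new StringBuilder();
            for (Node n = root.getFirstChild(); n != null;
                 n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    if (names.length() > 0) {
                        names.append(",");
                    }
                    names.append(n.getNodeName());
                }
            }
            return names.toString();
        } catch (Exception e) {
            throw new IllegalStateException("DOM processing failed", e);
        }
    }

    public static void main(String[] args) {
        // prints "item,object"
        System.out.println(
            editAndListChildren("<test><example/><item/></test>"));
    }
}
```

A SAX handler cannot make this kind of edit directly, because it only sees a stream of events and never holds the document as a structure.").equals("item,object")) throw new AssertionError("unexpected child list");
if (!DomEditSketch.editAndListChildren("<test/>").equals("object")) throw new AssertionError("unexpected child list for empty root");
</test>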
SAX-based Parsers
- Available for a variety of programming languages, e.g., Java, Python, etc.
- Some SAX-based parsers.
SAX-based Parsers
- SAX parser
  - Invokes certain methods when events occur
  - Programmers override these methods to process data
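The override pattern described here can be sketched in a few lines. This is a minimal illustration that uses the SAX2 DefaultHandler class from the JDK; the class name SaxEventLog, the method name events, and the sample document are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original slides.
public class SaxEventLog {

    // Parses the XML and returns the element events in the order
    // in which the parser raised them.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();

        // Override only the callbacks of interest; DefaultHandler
        // supplies empty implementations for all the others.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName,
                                   String qName) {
                log.add("end:" + qName);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [start:test, start:example, end:example, end:test]
        System.out.println(events("<test><example/></test>"));
    }
}
```").toString().equals("[start:test, start:example, end:example, end:test]")) throw new AssertionError("unexpected event sequence");
</test>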
Example: Tree Diagram

    // Fig. 9.3 : Tree.java
    // Using the SAX Parser to generate a tree diagram.

    import java.io.*;
    import org.xml.sax.*;  // for HandlerBase class (SAX1)
    import javax.xml.parsers.SAXParserFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;

    public class Tree extends HandlerBase {
       private int indent = 0;  // indentation counter

       // returns the spaces needed for indenting
       private String spacer( int count )
       {
          String temp = "";

          for ( int i = 0; i < count; i++ )
             temp += "  ";

          return temp;
       }

       // method called before parsing
       // it provides the document location
       public void setDocumentLocator( Locator loc )
       {
          System.out.println( "URL: " + loc.getSystemId() );
       }

- import specifies the location of classes needed by the application
- spacer assists in formatting
- setDocumentLocator is overridden to output the parsed document's URL
       // method called at the beginning of a document
       public void startDocument() throws SAXException
       {
          System.out.println( "[ document root ]" );
       }

       // method called at the end of the document
       public void endDocument() throws SAXException
       {
          System.out.println( "[ document end ]" );
       }

       // method called at the start tag of an element
       public void startElement( String name,
          AttributeList attributes ) throws SAXException
       {
          System.out.println( spacer( indent++ ) +
             "+-[ element : " + name + " ]" );

          if ( attributes != null )

             for ( int i = 0; i < attributes.getLength(); i++ )
                System.out.println( spacer( indent ) +
                   "+-[ attribute : " + attributes.getName( i ) +
                   " ] \"" + attributes.getValue( i ) + "\"" );
       }

- startDocument: overridden method called when the parser begins the document, before the root element
- endDocument: overridden method called when the end of the document is encountered
- startElement: overridden method called when a start tag is encountered
- The loop outputs each attribute's name and value (if any)
       // method called at the end tag of an element
       public void endElement( String name ) throws SAXException
       {
          indent--;
       }

       // method called when a processing instruction is found
       public void processingInstruction( String target,
          String value ) throws SAXException
       {
          System.out.println( spacer( indent ) +
             "+-[ proc-inst : " + target + " ] \"" + value + "\"" );
       }

       // method called when characters are found
       public void characters( char buffer[], int offset,
          int length ) throws SAXException
       {
          if ( length > 0 ) {
             String temp = new String( buffer, offset, length );

             System.out.println( spacer( indent ) +
                "+-[ text ] \"" + temp + "\"" );
          }
       }

- endElement: overridden method called when the end of an element is encountered
- processingInstruction: overridden method called when a processing instruction is encountered
- characters: overridden method called when character data is encountered
       // method called when ignorable whitespace is found
       public void ignorableWhitespace( char buffer[],
          int offset, int length )
       {
          if ( length > 0 ) {
             System.out.println( spacer( indent ) + "+-[ ignorable ]" );
          }
       }

       // method called on a non-fatal (validation) error
       public void error( SAXParseException spe )
          throws SAXParseException
       {
          // treat non-fatal errors as fatal errors
          throw spe;
       }

       // method called on a parsing warning
       public void warning( SAXParseException spe )
          throws SAXParseException
       {
          System.err.println( "Warning: " + spe.getMessage() );
       }

       // main method
       public static void main( String args[] )
       {
          boolean validate = false;

- ignorableWhitespace: overridden method called when ignorable whitespace is encountered
- error: overridden method called when a recoverable error (usually a validation error) occurs
- warning: overridden method called when a problem is detected that is not considered an error
- Method main starts the application
          if ( args.length != 2 ) {
             System.err.println( "Usage: java Tree [validate] " +
                "[filename]\n" );
             System.err.println( "Options:" );
             System.err.println( "   validate [yes|no] : " +
                "DTD validation" );
             System.exit( 1 );
          }

          if ( args[ 0 ].equals( "yes" ) )
             validate = true;

          SAXParserFactory saxFactory =
             SAXParserFactory.newInstance();

          saxFactory.setValidating( validate );

- The command-line arguments select whether the document is validated and which file is parsed
- SAXParserFactory can instantiate a SAX-based parser
          try {
             SAXParser saxParser = saxFactory.newSAXParser();
             saxParser.parse( new File( args[ 1 ] ), new Tree() );
          }
          catch ( SAXParseException spe ) {
             System.err.println( "Parse Error: " + spe.getMessage() );
          }
          catch ( SAXException se ) {
             se.printStackTrace();
          }
          catch ( ParserConfigurationException pce ) {
             pce.printStackTrace();
          }
          catch ( IOException ioe ) {
             ioe.printStackTrace();
          }

          System.exit( 0 );
       }
    }

- newSAXParser instantiates the SAX-based parser
- The catch blocks handle errors (if any)
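The listing above extends HandlerBase, the SAX1 handler class, which current JDKs mark as deprecated in favor of the SAX2 class org.xml.sax.helpers.DefaultHandler. The following is a rough sketch of how the element and attribute part of the tree diagram might look with DefaultHandler. The class name TreeSax2 and the method name diagram are assumptions for illustration, and the sketch parses from a string and returns the diagram instead of reading a file and printing.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 variant of the slides' Tree class.
public class TreeSax2 extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int indent = 0;  // indentation counter

    // returns the spaces needed for indenting (two per level)
    private String spacer(int count) {
        StringBuilder spaces = new StringBuilder();
        for (int i = 0; i < count; i++) {
            spaces.append("  ");
        }
        return spaces.toString();
    }

    // SAX2 passes namespace URI, local name and qualified name,
    // and an Attributes object instead of SAX1's AttributeList.
    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        out.append(spacer(indent++)).append("+-[ element : ")
           .append(qName).append(" ]\n");
        for (int i = 0; i < attributes.getLength(); i++) {
            out.append(spacer(indent)).append("+-[ attribute : ")
               .append(attributes.getQName(i)).append(" ] \"")
               .append(attributes.getValue(i)).append("\"\n");
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        indent--;
    }

    // Parses the XML string and returns the tree diagram as text.
    static String diagram(String xml) {
        TreeSax2 handler = new TreeSax2();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.out.toString();
    }

    public static void main(String[] args) {
        System.out.print(diagram("<test name=\"a\"><example/></test>"));
    }
}
```

Returning the diagram as a string keeps the sketch easy to check; the original program's text, processing-instruction and error callbacks could be ported in the same way.").equals("+-[ element : test ]\n  +-[ attribute : name ] \"a\"\n  +-[ element : example ]\n")) throw new AssertionError("unexpected diagram");
</test>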
Output:

    URL: file:C:/Tree/spacing1.xml
    [ document root ]
    +-[ element : test ]
      +-[ attribute : name ] "  spacing 1  "
      +-[ text ] " "
      +-[ text ] "  "
      +-[ element : example ]
        +-[ element : object ]
          +-[ text ] "World"
      +-[ text ] " "
    [ document end ]

Input:

    1  <?xml version = "1.0"?>
    2
    3  <!-- Fig. 9.4 : spacing1.xml -->
    4  <!-- Whitespaces in nonvalidating parsing -->
    5  <!-- XML document without DTD -->
    6
    7  <test name = "  spacing 1  ">
    8    <example><object>World</object></example>
    9  </test>

- Root element test contains attribute name with value "  spacing 1  "
- XML document with elements test, example and object
- The XML document does not reference a DTD
- Note that whitespace is preserved: attribute value (line 7), line feed (end of line 7), indentation (line 8) and line feed (end of line 8)
Output:

    URL: file:C:/Tree/spacing2.xml
    [ document root ]
    +-[ element : test ]
      +-[ attribute : name ] "  spacing 2  "
      +-[ ignorable ]
      +-[ ignorable ]
      +-[ element : example ]
        +-[ element : object ]
          +-[ text ] "World"
      +-[ ignorable ]
    [ document end ]

Input:

     1  <?xml version = "1.0"?>
     2
     3  <!-- Fig. 9.5 : spacing2.xml -->
     4  <!-- Whitespace and nonvalidated parsing -->
     5  <!-- XML document with DTD -->
     6
     7  <!DOCTYPE test [
     8  <!ELEMENT test (example)>
     9  <!ATTLIST test name CDATA #IMPLIED>
    10  <!ELEMENT example (object*)>
    11  <!ELEMENT object (#PCDATA)>
    12  ]>
    13
    14  <test name = "  spacing 2  ">
    15    <example><object>World</object></example>
    16  </test>

- The DTD declares that test contains only elements, so the parser can tell that whitespace between its children is not data and reports it as ignorable
- The line feed at the end of line 14, the spaces at the beginning of line 15 and the line feed at the end of line 15 are reported through ignorableWhitespace rather than characters
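The difference between the two spacing examples can be probed with a small counter. The sketch below is an illustration under assumptions: the class name WhitespaceCount, the method count, and the two string constants are invented for the example, and the documents are shortened versions of the slides' files. It tallies whitespace characters delivered through ignorableWhitespace separately from those delivered through characters. How a nonvalidating parser reports whitespace when a DTD is present depends on the parser, so the sketch prints the with-DTD counts without claiming a fixed result.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original slides.
public class WhitespaceCount {

    // Same element structure as the slides' spacing examples.
    static final String BODY =
        "<test>\n  <example><object>World</object></example>\n</test>";

    // Internal DTD subset giving test and example element-only content.
    static final String DTD =
        "<!DOCTYPE test [\n"
        + "<!ELEMENT test (example)>\n"
        + "<!ELEMENT example (object*)>\n"
        + "<!ELEMENT object (#PCDATA)>\n"
        + "]>\n";

    // Returns { whitespace chars reported as ignorable,
    //           whitespace chars reported as character data }.
    static int[] count(String xml) {
        int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start,
                                            int length) {
                counts[0] += length;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                for (int i = start; i < start + length; i++) {
                    if (Character.isWhitespace(ch[i])) {
                        counts[1]++;
                    }
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);  // nonvalidating, as in the slides
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] plain = count(BODY);
        int[] withDtd = count(DTD + BODY);
        System.out.println("no DTD:   ignorable=" + plain[0]
            + " text=" + plain[1]);
        System.out.println("with DTD: ignorable=" + withDtd[0]
            + " text=" + withDtd[1]);
    }
}
```

Without a DTD the parser has no content model to consult, so all four whitespace characters (two line feeds and two spaces) arrive as ordinary text.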
Output (validation disabled):

    URL: file:C:/Tree/notvalid.xml
    [ document root ]
    +-[ element : test ]
      +-[ ignorable ]
      +-[ ignorable ]
      +-[ proc-inst : test ] "message"
      +-[ ignorable ]
      +-[ ignorable ]
      +-[ element : example ]
        +-[ element : item ]
          +-[ text ] "Hello & Welcome!"
      +-[ ignorable ]
    [ document end ]

Input:

    <?xml version = "1.0"?>

    <!-- Fig. 9.6 : notvalid.xml -->
    <!-- Validation and non-validation -->

    <!DOCTYPE test [
    <!ELEMENT test (example)>
    <!ELEMENT example (#PCDATA)>
    ]>

    <test>
      <?test message?>
      <example><item><![CDATA[Hello & Welcome!]]></item></example>
    </test>

- The document is invalid because element example cannot contain element item
- Validation is disabled, so the document parses successfully
- The parser does not interpret markup inside the CDATA section; it passes the section's content through as character data
Output (validation enabled):

    URL: file:C:/Tree/notvalid.xml
    [ document root ]
    +-[ element : test ]
      +-[ ignorable ]
      +-[ ignorable ]
      +-[ proc-inst : test ] "message"
      +-[ ignorable ]
      +-[ ignorable ]
      +-[ element : example ]
    Parse Error: Element "example" does not allow "item"

- Validation is enabled
- Parsing terminates at element item: the validation error is reported to error, which rethrows it and so treats it as fatal
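The effect of the validation switch can be reproduced with a short sketch. The class name ValidationSketch, the method name parse, and the NOT_VALID constant (a shortened stand-in for notvalid.xml) are assumptions for illustration. As in the slides' program, the error callback rethrows the exception so that a validation error stops the parse; the exact error message text depends on the parser in use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original slides.
public class ValidationSketch {

    // Well-formed but invalid: the DTD says example holds only text.
    static final String NOT_VALID =
        "<!DOCTYPE test [\n"
        + "<!ELEMENT test (example)>\n"
        + "<!ELEMENT example (#PCDATA)>\n"
        + "]>\n"
        + "<test><example><item>Hello</item></example></test>";

    // Returns "ok" if parsing completes, else "error: <parser message>".
    static String parse(String xml, boolean validate) {
        DefaultHandler handler = new DefaultHandler() {
            // DefaultHandler ignores recoverable errors by default;
            // rethrowing makes validation errors stop the parse.
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validate);
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
            return "ok";
        } catch (SAXException e) {
            return "error: " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(NOT_VALID, false));
        System.out.println(parse(NOT_VALID, true));
    }
}
```

Leaving error un-overridden would let a validating parse run to completion, because recoverable errors do not stop the parser on their own.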
Output (validation disabled):

    URL: file:C:/Tree/valid.xml
    [ document root ]
    +-[ element : test ]
      +-[ text ] " "
      +-[ text ] "  "
      +-[ element : example ]
        +-[ text ] "Hello "
        +-[ text ] "&"
        +-[ text ] " Welcome!"
      +-[ text ] " "
    [ document end ]

Output (validation enabled):

    URL: file:C:/Tree/valid.xml
    [ document root ]
    Warning: Valid documents must have a <!DOCTYPE declaration.
    Parse Error: Element type "test" is not declared.

Input:

    <?xml version = "1.0"?>

    <!-- Fig. 9.7 : valid.xml -->
    <!-- DTD-less document -->

    <test>
      <example>Hello &amp; Welcome!</example>
    </test>

- Validation is disabled in the first output, so the document parses successfully
- Validation is enabled in the second output, and parsing fails because the document has no DTD
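In the first output above, the content of example arrives as three separate text events ("Hello ", "&" and " Welcome!"). SAX allows a parser to deliver contiguous character data in any number of chunks, so handlers that need the complete text usually buffer it. A minimal sketch of that idea follows; the class name TextCollector and the method name text are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original slides.
public class TextCollector {

    // Returns all character data in the document, concatenated in
    // document order, however the parser chose to split it.
    static String text(String xml) {
        StringBuilder buffer = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Append each chunk; never assume one call per text node.
                buffer.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // prints "Hello & Welcome!"
        System.out.println(text("<example>Hello &amp; Welcome!</example>"));
    }
}
```

A handler that tracks individual elements would typically reset the buffer in startElement and read it in endElement.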
To be continued…
Reference
- XML How to Program
- Sang Sin presentation (sang.sin@sun.com)
Q&A
Feel free to post questions at http://yht4ever.blogspot.com or email to: [email_address] or [email_address]
http://yht4ever.blogspot.com
Thank You!
