Four Ways to Parse XML and a Comparison of Their Pros and Cons
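DOM and SAX are the low-level interfaces for parsing XML.
JDOM and DOM4J are higher-level wrappers built on top of those low-level APIs.
DOM is language-neutral, whereas JDOM and DOM4J are specific to Java.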
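(Method 1) DOM
Description: DOM defines a set of interfaces for the parsed form of an XML document. The parser reads the entire document and builds a tree that lives in memory; your code then works on that tree through the DOM interfaces.
Advantages: the whole document tree is in memory, so it is easy to work with; nodes can be deleted, modified, and rearranged.
Disadvantages: the entire document is loaded into memory, including nodes you never use, which costs time and space.
When to use: the data will be accessed many times after parsing, and hardware resources (memory, CPU) are plentiful.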
import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomDemo {

    private final static String xmlPath = "D:\\WHS01.xml";

    /**
     * Recursively prints an element, its attributes, and its children.
     * Special characters are not re-escaped, so the output is for inspection only.
     *
     * @param element the element to print
     */
    public static void parseElement(Element element) {

        System.out.print("<" + element.getTagName());
        NamedNodeMap map = element.getAttributes();
        if (null != map) {
            for (int i = 0; i < map.getLength(); i++) {
                Attr attr = (Attr) map.item(i);
                System.out.print(" " + attr.getName() + "=\"" + attr.getValue() + "\"");
            }
        }
        System.out.print(">");

        NodeList childList = element.getChildNodes();

        for (int i = 0; i < childList.getLength(); i++) {
            Node node = childList.item(i);

            switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                parseElement((Element) node);
                break;
            case Node.TEXT_NODE:
                System.out.print(node.getNodeValue());
                break;
            default:
                // Comments, processing instructions, etc. are skipped
                break;
            }
        }
        System.out.print("</" + element.getTagName() + ">");
    }

    /**
     * Loads the Document at the given path.
     *
     * @param xmlPath path of the XML file
     * @return the parsed Document, or null if the path is blank or the file cannot be read
     * @throws Exception if the parser cannot be created or the file is not well-formed
     */
    public static Document getDocument(String xmlPath) throws Exception {

        Document document = null;
        if (null == xmlPath || "".equals(xmlPath.trim()))
            return document;

        File file = new File(xmlPath);
        if (file.exists() && file.canRead()) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            document = builder.parse(file);
        }
        return document;
    }

    public static void main(String[] args) {
        try {
            Document document = getDocument(xmlPath);
            // getDocument returns null for an unreadable file; check before dereferencing
            if (document == null) {
                System.err.println("Cannot read " + xmlPath);
                return;
            }
            // Start from the root element
            parseElement(document.getDocumentElement());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
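The listing above only reads the tree. The advantages listed for DOM also mention deleting, modifying, and rearranging nodes, which could look roughly like the sketch below. It uses only the JDK's built-in DOM parser; the class name `DomEditSketch`, its helper methods, and the inline sample (shaped like bom.xml but written without whitespace) are assumptions made for illustration, not part of the original article.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomEditSketch {

    // Hypothetical inline sample; no whitespace, so the tree holds no text-only nodes
    static final String SAMPLE =
            "<BOM Code=\"LM4029\">"
            + "<Child Code=\"LM4029MC\"><Quantity>2.000000</Quantity></Child>"
            + "<Child Code=\"LM4029D\"><Quantity>1.000000</Quantity></Child>"
            + "<Child Code=\"LM4029PH\"><Quantity>1.000000</Quantity></Child>"
            + "</BOM>";

    /** Parses an XML string into an in-memory DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Sketch-level handling: wrap parser setup, I/O, and syntax errors alike
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Modifies, deletes, and reorders Child elements directly on the tree. */
    static void edit(Document doc) {
        Element root = doc.getDocumentElement();
        // The NodeList is live, so grab all references before changing the tree
        NodeList children = root.getElementsByTagName("Child");
        Element first = (Element) children.item(0);
        Element second = (Element) children.item(1);
        Element third = (Element) children.item(2);

        // Modify: change the text of the first child's Quantity
        first.getElementsByTagName("Quantity").item(0).setTextContent("5.000000");
        // Delete: drop the second child
        root.removeChild(second);
        // Reorder: insertBefore moves an already-attached node to the new position
        root.insertBefore(third, first);
    }

    /** Lists Code=Quantity pairs in document order. */
    static String summarize(Document doc) {
        StringBuilder sb = new StringBuilder();
        NodeList children = doc.getDocumentElement().getElementsByTagName("Child");
        for (int i = 0; i < children.getLength(); i++) {
            Element child = (Element) children.item(i);
            String quantity = child.getElementsByTagName("Quantity").item(0).getTextContent();
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(child.getAttribute("Code")).append('=').append(quantity);
        }
        return sb.toString();
    }

    static String editedSummary() {
        Document doc = parse(SAMPLE);
        edit(doc);
        return summarize(doc);
    }

    public static void main(String[] args) {
        // prints: LM4029PH=1.000000, LM4029MC=5.000000
        System.out.println(editedSummary());
    }
}
```

All three edits work on the same in-memory tree without re-reading the input, which is the trade DOM makes for its memory cost.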
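(Method 2) SAX
Description: SAX was created to address DOM's problems. It is event-driven: when the parser encounters the start of an element, the end of an element, text, or the start or end of the document, it fires an event, and the programmer writes handlers that respond to those events and save whatever data is needed.
Advantages: the document does not have to be loaded up front, so resource use is low; a SAX parser is also smaller than a DOM parser, which suited applets and downloaded code.
Disadvantages: nothing is persistent, so data that is not saved when its event fires is lost; the parser is stateless, so a text event carries only the text and does not say which element it belongs to, and the handler has to track that itself.
When to use: applets; cases where only a small part of the XML document is needed and it is rarely revisited; machines with little memory.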
<?xml version="1.0" encoding="UTF-8"?>
<BOM Code="LM4029">
    <Child Code="LM4029MC">
        <Quantity>2.000000</Quantity>
    </Child>
    <Child Code="LM4029D">
        <Quantity>1.000000</Quantity>
    </Child>
    <Child Code="LM4029PH">
        <Quantity>1.000000</Quantity>
    </Child>
    <Child Code="LM4029PS">
        <Quantity>1.000000</Quantity>
    </Child>
    <Child Code="LM4029SB">
        <Quantity>1.000000</Quantity>
    </Child>
</BOM>
import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXDemo {

    private final static String xmlPath = "D:\\bom.xml";

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser parser = factory.newSAXParser();
            // Pass a File: the String overload of parse() expects a URI, not a file path
            parser.parse(new File(xmlPath), new MyHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class Child {
    private String code;
    private double quantity;

    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

    public double getQuantity() {
        return quantity;
    }

    public void setQuantity(double quantity) {
        this.quantity = quantity;
    }
}

class MyHandler extends DefaultHandler {

    // Names of the currently open elements, innermost on top
    private Deque<String> stack = null;
    // Text collected for the current element; characters() may fire more than once per text node
    private final StringBuilder text = new StringBuilder();
    private Child child = null;

    @Override
    public void startDocument() throws SAXException {
        System.out.println("start document");
        stack = new ArrayDeque<String>();
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("end document");
        stack = null;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        stack.push(qName);
        text.setLength(0);
        if ("Child".equals(qName)) {
            child = new Child();
            child.setCode(attributes.getValue("Code"));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("Quantity".equals(qName) && child != null) {
            // Parse only once the complete text of the element has been collected
            child.setQuantity(Double.parseDouble(text.toString().trim()));
        } else if ("Child".equals(qName)) {
            System.out.println("Code -> " + child.getCode() + ", Quantity -> "
                    + child.getQuantity());
            child = null;
        }
        stack.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // The event itself does not name its element, so consult the stack
        if ("Quantity".equals(stack.peek())) {
            text.append(ch, start, length);
        }
    }
}
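The "only a small part of the document is needed" use case can be pushed one step further: a handler can abort parsing as soon as it has its answer, so the rest of the input is never read. The sketch below is one common way to do that, by throwing a marker exception from the handler. The names `SaxLookupSketch`, `Found`, `LookupHandler`, and `findQuantity`, and the inline sample shaped like bom.xml, are assumptions made for illustration; only JDK classes are used.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SaxLookupSketch {

    // Hypothetical inline sample shaped like bom.xml
    static final String SAMPLE =
            "<BOM Code=\"LM4029\">"
            + "<Child Code=\"LM4029MC\"><Quantity>2.000000</Quantity></Child>"
            + "<Child Code=\"LM4029D\"><Quantity>1.000000</Quantity></Child>"
            + "<Child Code=\"LM4029PH\"><Quantity>1.000000</Quantity></Child>"
            + "</BOM>";

    /** Marker exception (hypothetical) thrown only to stop the parser early. */
    static final class Found extends SAXException {
        Found() {
            super("target found");
        }
    }

    static final class LookupHandler extends DefaultHandler {
        private final String targetCode;
        private final StringBuilder text = new StringBuilder();
        private boolean inTargetChild;
        private boolean inQuantity;
        String quantity; // stays null if the code never appears

        LookupHandler(String targetCode) {
            this.targetCode = targetCode;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("Child".equals(qName)) {
                inTargetChild = targetCode.equals(attrs.getValue("Code"));
            } else if (inTargetChild && "Quantity".equals(qName)) {
                inQuantity = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so accumulate it
            if (inQuantity) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (inQuantity && "Quantity".equals(qName)) {
                quantity = text.toString().trim();
                throw new Found(); // nothing after this point needs to be parsed
            }
        }
    }

    /** Returns the Quantity text of the Child with the given Code, or null if absent. */
    static String findQuantity(String xml, String code) {
        LookupHandler handler = new LookupHandler(code);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Found stop) {
            // Expected: the handler aborted the parse on purpose
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.quantity;
    }

    public static void main(String[] args) {
        // prints: 1.000000
        System.out.println(findQuantity(SAMPLE, "LM4029D"));
    }
}
```

The two boolean flags play the same role as the element stack in the listing above: they are the state that SAX itself does not keep.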
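(Method 3) JDOM
Description: JDOM appeared in order to reduce the amount of code that DOM and SAX require.
Advantages: it follows the 80/20 rule and greatly cuts down the amount of code.
When to use: the task is simple, such as parsing or creating documents. Under the hood, JDOM still uses SAX (the most common choice) or DOM.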
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

public class JdomDemo {

    private final static String xmlPath = "D:\\bom.xml";

    public static void readXml() throws JDOMException, IOException {
        // SAXBuilder parses with SAX and builds a JDOM tree from the events
        SAXBuilder builder = new SAXBuilder();
        Document document = builder.build(new File(xmlPath));

        Element root = document.getRootElement();
        List<Element> childList = root.getChildren("Child");

        for (Element child : childList) {
            String code = child.getAttributeValue("Code");
            double quantity = Double.parseDouble(child.getChildText("Quantity"));
            System.out.println("Code -> " + code + ", Quantity -> " + quantity);
        }
    }

    public static void main(String[] args) throws JDOMException, IOException {
        readXml();
    }
}
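(Method 4) DOM4J
Description: DOM4J is an excellent Java XML API that combines strong performance, rich features, and ease of use, and it is open-source software. A growing number of Java projects use DOM4J to read and write XML.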
import java.io.File;
import java.util.Iterator;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Dom4jDemo {

    private final static String xmlPath = "D:\\bom.xml";

    public static void parseElement() throws DocumentException {
        // Option 1: convert a W3C DOM tree
        // (also needs javax.xml.parsers.DocumentBuilder, DocumentBuilderFactory
        // and org.dom4j.io.DOMReader, plus their checked exceptions)
        // DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // DocumentBuilder builder = factory.newDocumentBuilder();
        // org.w3c.dom.Document domDoc = builder.parse(new File(xmlPath));
        // DOMReader domReader = new DOMReader();
        // Document document = domReader.read(domDoc);

        // Option 2: SAX (the usual choice)
        SAXReader saxReader = new SAXReader();
        Document document = saxReader.read(new File(xmlPath));
        Element rootEl = document.getRootElement();

        // Iterator<?> plus a cast works whether elementIterator is declared raw or generic
        for (Iterator<?> iterator = rootEl.elementIterator("Child"); iterator.hasNext();) {
            Element e = (Element) iterator.next();
            System.out.print("Code -> " + e.attributeValue("Code"));
            System.out.println(" Quantity -> " + e.elementText("Quantity"));
        }
    }

    public static void main(String[] args) throws DocumentException {
        parseElement();
    }
}
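Summary:
1. DOM and JDOM perform worse than SAX and DOM4J, but both are still worth considering for small documents.
2. DOM implementations are widely used across many programming languages. DOM is also the basis for many other XML-related standards, and because it is an official W3C Recommendation (unlike the non-standard, Java-specific models), some kinds of projects may require it, for example when working with the DOM in JavaScript.
3. SAX's efficiency comes from the way it parses: it does not load the whole document up front, so it uses few resources.
4. If portability is not a concern, consider DOM4J first.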
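To check points 1 and 3 on your own machine, a rough comparison along the following lines may help. It counts `Child` elements in a generated document, once by building a DOM tree and once by listening to SAX events. The class name `ParseComparisonSketch`, its methods, and the generated document are assumptions made for illustration. A single timed run like this is not a proper benchmark (JIT warm-up and garbage collection distort it), so treat the numbers as indicative only.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ParseComparisonSketch {

    /** Generates a hypothetical BOM document with the given number of Child elements. */
    static String buildSample(int children) {
        StringBuilder sb = new StringBuilder("<BOM Code=\"LM4029\">");
        for (int i = 0; i < children; i++) {
            sb.append("<Child Code=\"C").append(i)
              .append("\"><Quantity>1.000000</Quantity></Child>");
        }
        return sb.append("</BOM>").toString();
    }

    /** DOM: the whole tree is built in memory before anything can be counted. */
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("Child").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    /** SAX: only a counter is kept; no tree is ever built. */
    static int countWithSax(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes attributes) {
                            if ("Child".equals(qName)) {
                                count[0]++;
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = buildSample(100_000);

        long t0 = System.nanoTime();
        int dom = countWithDom(xml);
        long t1 = System.nanoTime();
        int sax = countWithSax(xml);
        long t2 = System.nanoTime();

        System.out.printf("DOM: %d children in %d ms%n", dom, (t1 - t0) / 1_000_000);
        System.out.printf("SAX: %d children in %d ms%n", sax, (t2 - t1) / 1_000_000);
    }
}
```

Both methods should report the same count; what differs is how much has to be held in memory to get there.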