What is it?
XPath (XML Path Language) is a query language for selecting nodes from an XML document. It navigates through the elements and attributes of a document, and it is the expression language that XSLT uses to select values or nodes from an XML document.
Web.xPath()
XPath is used to fetch information from XML documents (hence the 'x' in xPath) or web pages. The code below is an example of how to fetch all links on the http://www.w3schools.com home page. Each link is then visited, and xPath is used again to retrieve the page title, which is printed out. The query "//a/@href" selects the href attribute of every a element on the page.
use String;
use System;
use Web;

// Load the start page and select the href attribute of every link
var page = Web.loadPage("http://www.w3schools.com/");
var links = Web.xPath(page, "//a/@href");

foreach(link in links) {
    // Skip JavaScript pseudo-links, which cannot be loaded as pages
    if (!String.matches(link, ".*javascript\\:void.*")) {
        var newpage = Web.loadPage(link);
        var title = Web.xPath(newpage, "//title/text()");
        System.print(title);
    }
}
XML.xPath()
To demonstrate xPath on XML data, we use a small bookstore example. To keep the example concise, the XML is assigned inline to a variable instead of being loaded from a file.
use System, XML;

var xml = "
<bookstore>
    <book category=\"cooking\">
        <title lang=\"en\">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category=\"children\">
        <title lang=\"en\">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category=\"web\">
        <title lang=\"en\">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    <book category=\"web\">
        <title lang=\"en\">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>
";
Suppose we want to buy every book in the store and need to know how much money that takes. To find out, we extend the example with an xPath query that retrieves the price of each book, and then add the prices together.
// Parse the string into an XML document
xml = XML.fromString(xml);

// Retrieve the price of every book
var prices = XML.xPath(xml, "/bookstore//book/price/text()");
System.print(prices);

// Add the prices together
var totalprice = 0;
foreach(price in prices) {
    totalprice += price;
}
System.print(totalprice);
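For readers who want to try the same query outside Xill, here is a minimal sketch in Python using only the standard library. It is an illustration, not part of the Xill API: the names BOOKSTORE and total_price are made up for this example, the XML is a trimmed copy of the bookstore above, and ElementTree paths are relative to the root element, so "/bookstore//book/price" becomes ".//book/price".

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Trimmed copy of the bookstore example (hypothetical constant name)
BOOKSTORE = """
<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="web"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="web"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""


def total_price(xml_text):
    """Sum the text content of every book/price element."""
    root = ET.fromstring(xml_text)  # root is the <bookstore> element
    # The query returns elements; their text is a string, so convert it
    # explicitly. Decimal avoids floating-point rounding on money values.
    return sum(Decimal(price.text) for price in root.findall(".//book/price"))


print(total_price(BOOKSTORE))
```

The explicit conversion is the point worth noting: an XPath text result is a string, so it has to become a number before it can be summed.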
We extend the example further by displaying the title of the last book. To retrieve it, we use the last() function as a predicate: a condition between square brackets [], comparable to a where clause, that restricts which nodes the query selects. The XPath query for extracting this information is: "/bookstore/book[last()]/title/text()".
// Retrieve the title of the last book
var titleOfLastBook = XML.xPath(xml, "/bookstore/book[last()]/title/text()");
System.print("Title of the last book is '" :: titleOfLastBook :: "'");
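The last() predicate can be tried the same way in Python's standard library, whose ElementTree module supports positional predicates such as [1] and [last()]. This is only a sketch under assumptions: the function name last_book_title and the trimmed XML are hypothetical, and the path is written relative to the bookstore root element.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bookstore example (hypothetical constant name)
BOOKSTORE = """
<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title></book>
  <book category="children"><title lang="en">Harry Potter</title></book>
  <book category="web"><title lang="en">XQuery Kick Start</title></book>
  <book category="web"><title lang="en">Learning XML</title></book>
</bookstore>
"""


def last_book_title(xml_text):
    """Return the title of the last book child of the root element."""
    root = ET.fromstring(xml_text)
    # book[last()] keeps only the final <book> among its siblings
    return root.find("book[last()]/title").text


print("Title of the last book is '" + last_book_title(BOOKSTORE) + "'")
```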
To narrow down the data set that an xPath query returns, we can add calculations and filters to the query itself. The less information the query returns, the less processing time is needed. The same selection could be handled afterwards in Xill IDE, but it is quicker to prevent unimportant information from being loaded into memory in the first place.
// Retrieve the authors of all books written before 2004
var authors = XML.xPath(xml, "/bookstore/book[year < 2004]//author/text()");
foreach(author in authors) {
    System.print(author :: " wrote a book before 2004");
}
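For contrast, the sketch below shows the alternative that the text advises against: selecting every book first and filtering in the host language. It uses Python's standard library, whose ElementTree module implements only a subset of XPath and has no numeric comparison predicates such as [year < 2004], so the comparison has to be done in code. The names authors_before and BOOKSTORE are hypothetical, and the XML is a trimmed copy of the bookstore.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bookstore example (hypothetical constant name)
BOOKSTORE = """
<bookstore>
  <book><author>Giada De Laurentiis</author><year>2005</year></book>
  <book><author>J K. Rowling</author><year>2005</year></book>
  <book>
    <author>James McGovern</author><author>Per Bothner</author>
    <author>Kurt Cagle</author><author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author><year>2003</year>
  </book>
  <book><author>Erik T. Ray</author><year>2003</year></book>
</bookstore>
"""


def authors_before(xml_text, year):
    """Collect the authors of all books published before the given year."""
    root = ET.fromstring(xml_text)
    authors = []
    # Every book is visited here, including the ones that are discarded;
    # a predicate in the query would have skipped them up front.
    for book in root.findall("book"):
        if int(book.findtext("year")) < year:
            authors.extend(a.text for a in book.findall("author"))
    return authors


for author in authors_before(BOOKSTORE, 2004):
    print(author + " wrote a book before 2004")
```

With an engine that supports the full predicate syntax, as in the Xill example above, the loop and the comparison collapse into the query itself.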
For more information about XPath, see the resources below.
Resources
- W3C's definition of XPath: W3C is an established international community that works on Web standards.
- W3Schools tutorial on XPath: an introductory XPath tutorial by W3Schools.