Publishing with XProc

Nic Gibson

doi:10.14337/XMLLondon15.Gibson01

Transforming documents through progressive refinement

Nic Gibson (Corbas Consulting and LexisNexis)

Abstract

Over the last few years, we, as a community, have spent a great deal of time writing code to convert Microsoft Word documents into XML. This is a common task with fairly predictable stages. We need to read the .docx or WordML file and transform the flat, formatting-rich XML into a well-structured XML document.

One approach to this problem is to create a pipeline that uses a progressive refinement technique to achieve a simple sequence of transformations from one format to another. Given that this approach requires the ability to chain multiple transformations together, we decided to build a framework to enable that.

This paper explores the implementation of this kind of pipelining through XProc and examines the pipeline processing used. We discuss the use of progressive enhancement to convert Microsoft Word files to an intermediate format, considering the challenges involved in converting Word in context. We look at the features of XProc which enable this sort of processing.
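As a rough illustration of the progressive refinement idea described above, the sketch below chains three small transformations over a flat, formatting-oriented document. It is written in Python rather than XProc and is not the framework from the paper: the input markup, the `style` attribute values, and the step names (`mark_headings`, `strip_styles`, `group_sections`) are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical flat input: every block is a <p>, structure lives in "style".
FLAT = """<body>
  <p style="Heading1">Introduction</p>
  <p style="Normal">First paragraph.</p>
  <p style="Heading1">Method</p>
  <p style="Normal">Second paragraph.</p>
</body>"""


def mark_headings(root):
    """Step 1: turn heading-styled paragraphs into <title> elements."""
    for p in list(root.iter("p")):
        if p.get("style") == "Heading1":
            p.tag = "title"
    return root


def strip_styles(root):
    """Step 2: drop the presentational style attributes that remain."""
    for el in root.iter():
        el.attrib.pop("style", None)
    return root


def group_sections(root):
    """Step 3: wrap each title and the blocks that follow it in a <section>."""
    out = ET.Element(root.tag)
    section = None
    for child in list(root):
        if child.tag == "title" or section is None:
            section = ET.SubElement(out, "section")
        section.append(child)
    return out


def run_pipeline(root, steps):
    """Feed the output of each step into the next, in order."""
    return reduce(lambda doc, step: step(doc), steps, root)


STEPS = [mark_headings, strip_styles, group_sections]
result = run_pipeline(ET.fromstring(FLAT), STEPS)
print(ET.tostring(result, encoding="unicode"))
```

Each step does one small, easily checked job, so steps can be reordered, removed, or added without touching the others. In an XProc pipeline the same role would typically be played by a sequence of transformation steps.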

How to cite this

Nic Gibson. "Publishing with XProc." Presented at XML London 2015, June 6-7th, 2015. doi:10.14337/XMLLondon15.Gibson01.

SPARQL

## Example SPARQL Query (Thanks to William Holmes)
## -- Find me all People that XML London knows about
## -- who are a member of the XML Guild

PREFIX org: <http://www.w3.org/ns/org#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?xml_london_id ?person_name
WHERE {
  ?xml_london_id org:memberOf <http://xmlguild.org> .
  ?xml_london_id foaf:name ?person_name .
  ?xml_london_id a foaf:Person
}

About

XML London - RDF triple store

All information about the XML London conference is open and available in Linked RDF format.

SPARQL Endpoint: http://xmllondon.com/sparql
Graph Store Protocol: http://xmllondon.com/data
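
As a minimal sketch of how the endpoint above might be used, the Python below builds a "query via GET" URL in the style of the SPARQL 1.1 Protocol, using the example query from this page. It only constructs the URL and sends nothing; whether the endpoint is currently reachable, and which result formats it returns, are not stated here and should be treated as unknown. The helper name `build_query_url` is an assumption for illustration.

```python
from urllib.parse import urlencode

# Endpoint as listed on this page.
ENDPOINT = "http://xmllondon.com/sparql"

# The example query from this page.
QUERY = """PREFIX org: <http://www.w3.org/ns/org#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?xml_london_id ?person_name
WHERE {
  ?xml_london_id org:memberOf <http://xmlguild.org> .
  ?xml_london_id foaf:name ?person_name .
  ?xml_london_id a foaf:Person
}
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be passed to any HTTP client, usually with an `Accept` header naming the desired result format.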

Data Contributions and Thanks

Thanks go to Charles Foster and William Holmes for their contributions to the XML London dataset.

If you would like to contribute to the XML London dataset, please submit a Git Pull Request to https://github.com/cfoster/xmllondon-rdf

Please contact us if you find a bug or think something could be improved.

Contact Details

Address: XML London, 103 High Street, Evesham, WR11 4DN, UK
Phone: +44 (0) 1386 871 904
E-mail: info@xmllondon.com