This edition, like many modern digital editions, is based on XML (eXtensible Markup Language) documents. XML is a standard for the creation of documents in which the document text is interspersed with ‘tags’, brief labels that describe the nature and properties of the text fragments that they surround. The Text Encoding Initiative (TEI) has proposed guidelines for the names and types of the tags to be employed in humanities texts.49
Out of the 400+ existing tags, a so-called ‘schema’ can be created that contains exactly those tags that are applicable to a certain type of document (such as a letter that is prepared for a scholarly edition). New tags can be defined when the existing tagset is insufficient. The schema describes the required and permitted tags in a class of XML documents. It can be used to check the correctness of these documents. A dedicated schema was created for the Van Gogh edition. A number of non-standard tags were used, some of which were ‘borrowed’ from the DALF (Digital Archive of Letters in Flanders) project.50
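To make the idea of checking documents against a schema concrete, the following is a minimal sketch in Python. It is not the schema or the validation tooling used for the edition: the tag names and the two sets `PERMITTED` and `REQUIRED` are invented for illustration, and a real schema additionally constrains nesting, order, attributes and content.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced "schema": which tags may occur, and which must.
PERMITTED = {"letter", "title", "date", "p"}
REQUIRED = {"title", "date"}


def check_document(xml_text):
    """Return a list of problems; an empty list means the document passes."""
    root = ET.fromstring(xml_text)
    seen = {element.tag for element in root.iter()}
    problems = [f"tag not permitted: {tag}" for tag in sorted(seen - PERMITTED)]
    problems += [f"required tag missing: {tag}" for tag in sorted(REQUIRED - seen)]
    return problems


# Two invented documents: the second uses an unknown tag and lacks a date.
good = "<letter><title>Sample title</title><date>sample date</date><p>Text</p></letter>"
bad = "<letter><title>Sample title</title><blink>Text</blink></letter>"

print(check_document(good))
print(check_document(bad))
```

The first document yields no problems; the second is reported twice, once for the tag that is not permitted and once for the required tag that is missing.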
One XML document was created for each of Van Gogh’s letters and each related document. It holds letter-level metadata (title, number, date, correspondents, etc.), the full transcription, the translation, the notes, the textual notes, and the information that connects transcribed pages with images of those pages (facsimile elements). The XML files were created by automatic conversion from word-processor documents; the result of the conversion was checked and extensively corrected. The XML files were manually indexed to facilitate searching and cross-referencing.
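As an illustration of how such a per-letter document can be read by a program, here is a small Python sketch. The element and attribute names in `SAMPLE` are simplified stand-ins invented for this example, not the edition's actual TEI markup, and the image file name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, simplified stand-in for one letter document (not real TEI).
SAMPLE = """
<letter id="let001">
  <meta>
    <title>Sample title</title>
    <date>sample date</date>
    <correspondent role="sender">Sender name</correspondent>
    <correspondent role="recipient">Recipient name</correspondent>
  </meta>
  <transcription><p>Original-language text</p></transcription>
  <translation><p>Translated text</p></translation>
  <facsimile><page n="1" image="sample-page-1.jpg"/></facsimile>
</letter>
"""


def read_letter(xml_text):
    """Collect letter-level metadata and the page-to-image links."""
    root = ET.fromstring(xml_text.strip())
    meta = root.find("meta")
    return {
        "id": root.get("id"),
        "title": meta.findtext("title"),
        "date": meta.findtext("date"),
        "correspondents": {
            c.get("role"): c.text for c in meta.findall("correspondent")
        },
        # Each facsimile page links a transcribed page to an image file.
        "pages": [(p.get("n"), p.get("image")) for p in root.iter("page")],
    }


letter = read_letter(SAMPLE)
print(letter["id"], letter["title"], letter["pages"])
```

Keeping metadata, text and facsimile links in one file per letter means that a single parse gives a program everything it needs to render or index that letter.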
For those interested in technical matters, we provide some sample XML files. In the zip file we also include the so-called ‘ODD’ file, which is used to customise the TEI schema, and the schema files generated from that ODD file.51 We use W3C XML Schema rather than RELAX NG because the contractors who performed the conversion to XML were more familiar with that format.
Other data sources for the edition include a number of databases. The editors created databases containing physical descriptions of the letters, data about the illustrations, and information about the people mentioned in the correspondence. Those databases were used for various purposes, such as to support searching and to provide the illustrations with captions.
The facsimile images of the letters were created by the Metamorfoze programme, the Netherlands’ national programme for the preservation of paper heritage.
The site that houses the edition was built using the Ruby programming language. Two major Ruby programs were developed: (1) a program that generates static HTML pages and a search index from the databases and from the XML files describing the letters, and (2) a web server that searches the index in response to requests from the search forms in the static site. The second program also provides autocompletion for some of the search fields in the advanced search screen. The search index was built using the Lucene search engine. For zooming, software was developed based on the GSV image viewer.
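The division of labour between a search index and a server that queries it can be sketched as follows. This is a toy stand-in written in Python, not the edition's Ruby code and not Lucene: it builds a plain inverted index in memory, answers queries that require all terms to be present, and offers prefix-based autocompletion. The document texts and all function names are invented for the example.

```python
import re
from bisect import bisect_left
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


def autocomplete(index, prefix, limit=5):
    """Suggest up to `limit` indexed terms that start with `prefix`."""
    vocabulary = sorted(index)
    prefix = prefix.lower()
    matches = []
    # Binary search finds the first candidate; matches are contiguous after it.
    for term in vocabulary[bisect_left(vocabulary, prefix):]:
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches


# Invented sample texts keyed by document id.
documents = {
    "let001": "sample text about painting and drawing",
    "let002": "sample text about drawing paper",
    "RM01": "a related manuscript about paint",
}
index = build_index(documents)
print(search(index, "drawing sample"))
print(autocomplete(index, "pa"))
```

Building the index once, at the time the static pages are generated, keeps the running server simple: it only has to look up terms, not parse the XML again.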
We used the ImageMagick suite of tools for scaling the facsimile images and for cutting them up into tiles at various magnifications (for zooming).
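The geometry behind tiled zooming can be sketched in a few lines of Python. The sketch only computes which scaled versions and crop rectangles would be needed; it does not call ImageMagick. The tile size of 256 pixels and the halving of the image between zoom levels are assumptions made for illustration, since the settings used for the edition are not stated here.

```python
import math


def tile_pyramid(width, height, tile_size=256):
    """List (level, width, height, columns, rows), least magnified first.

    Each level is half the size of the next, down to a single tile.
    """
    levels = []
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((w, h, cols, rows))
        if cols == 1 and rows == 1:
            break
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    levels.reverse()
    return [(level, *dims) for level, dims in enumerate(levels)]


def tile_boxes(width, height, tile_size=256):
    """Yield (left, top, right, bottom) crop rectangles covering the image."""
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            yield (left, top,
                   min(left + tile_size, width), min(top + tile_size, height))


# Hypothetical image of 1000 x 600 pixels.
for level in tile_pyramid(1000, 600):
    print(level)
```

A viewer built on such a pyramid only needs to fetch the tiles that are visible at the current magnification, which keeps zooming responsive even for large scans.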
6.3. Referring to the edition
Please refer to this edition as follows: Leo Jansen, Hans Luijten, Nienke Bakker (eds.) (2009), Vincent van Gogh - The Letters. Version: December 2010. Amsterdam & The Hague: Van Gogh Museum & Huygens ING. http://vangoghletters.org. Consult the homepage for the current version.
To hyperlink to individual letters and related documents, use links of the following form: http://vangoghletters.org/orig/let001 (to refer to the original-language version) or http://vangoghletters.org/en/let001 (to refer to the translation). Replace 'let001' with, for example, 'RM01' to create a link to the first related manuscript.
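For those who generate such links programmatically, the pattern can be captured in a small helper. The Python function below is a sketch with an invented name, `document_url`; it simply assembles the two URL forms given above and does not check that the identifier exists.

```python
BASE_URL = "http://vangoghletters.org"


def document_url(doc_id, translation=False):
    """Build a link to a letter or related document, e.g. 'let001' or 'RM01'."""
    version = "en" if translation else "orig"
    return f"{BASE_URL}/{version}/{doc_id}"


print(document_url("let001"))
print(document_url("RM01", translation=True))
```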