CENDARI Summer School Day One: XML, TEI, T-Pen and Tradamus.

CENDARI Summer School: Day One

The CENDARI summer school commenced with an introduction to the CENDARI project and a general discussion of the burgeoning trend in digitisation by Jakub Benes. The morning’s discussion highlighted important concerns that would resurface throughout the week: namely, who decides what gets digitised and, more worryingly, how our decision to digitise certain materials creates “blind spots” or casts shadows over non-digitised items, causing them to fall further from notice.

The first workshop of the day was led by Roman Bleier from TCD, who demonstrated how using XML (Extensible Markup Language) and adhering to the TEI (Text Encoding Initiative) guidelines could make medieval material relevant for today’s digital audience by transcribing medieval documents into a machine-readable format.

Roman Bleier discussing XML and TEI

In his workshop Roman explained that this is currently being achieved through XML, a descriptive markup language that adds semantic value to the data by allowing the user to describe the content.

XML adds semantic value to the content

The advantage of creating XML documents rather than basic text files is reach. Encoding medieval materials in XML extends their audience considerably, as the content preserved in these items can be understood and searched by computers. Incorporating this digital language into our research not only reinvigorates our methodologies but also increases accessibility to the resources themselves.
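To make the contrast with a basic text file concrete, here is a minimal Python sketch. The sentence, the names in it and the choice of the TEI-style elements persName and placeName are my own invented illustration, not material from Roman's workshop.

```python
import xml.etree.ElementTree as ET

# Invented sample sentence, marked up with TEI-style descriptive elements.
SAMPLE = """<p>
  <persName>Eadwine</persName> wrote this book at
  <placeName>Canterbury</placeName>.
</p>"""

root = ET.fromstring(SAMPLE)

# What a basic text file would hold: the words, with no description of them.
plain_text = " ".join("".join(root.itertext()).split())

# With descriptive markup, a program can ask what each piece of text *is*.
people = [el.text for el in root.iter("persName")]
places = [el.text for el in root.iter("placeName")]

print(plain_text)
print("People:", people)
print("Places:", places)
```

A search over the plain text can only match strings of letters, whereas a search over the XML can be limited to people or to places.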

The theme of embracing digital technology continued with Kathleen Walker-Meikle from CERL (the Consortium of European Research Libraries). Kathleen’s workshop introduced two free online transcription tools, T-Pen (Transcription for Palaeographical and Editorial Notation) and Tradamus (the software which supports T-Pen). T-Pen allows researchers to upload images of the manuscripts that they are studying and to add a transcription to each line of the manuscript. The website provides a video tutorial on how to use T-Pen, which can be viewed by clicking on the following link: An Introduction to T-Pen.

I found T-Pen incredibly useful. The software allowed me to rearrange the columns on the folio I was transcribing and to readjust the lines so that I could transcribe each one accurately. I was also impressed with the flexibility of the software: one of its functions enables researchers to add special characters by inputting the Unicode number. As an Anglo-Saxonist I found this aspect particularly appealing, since I will need to be able to transcribe Latin, Old English and runic characters for my own research.
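For readers unfamiliar with Unicode numbers, the short Python sketch below shows how a code point maps to a character. The selection of characters is my own, and the script says nothing about how T-Pen itself accepts the number; it only illustrates the relationship between the number and the glyph.

```python
import unicodedata

# A few characters an Anglo-Saxonist is likely to need, keyed by informal label.
CODE_POINTS = {
    "thorn": 0x00FE,
    "eth": 0x00F0,
    "ash": 0x00E6,
    "wynn": 0x01BF,
    "rune feoh": 0x16A0,
}


def describe(code_point: int) -> str:
    """Return the U+ notation, the character and its official Unicode name."""
    char = chr(code_point)
    return f"U+{code_point:04X} {char} {unicodedata.name(char)}"


for label, code_point in CODE_POINTS.items():
    print(f"{label:10} {describe(code_point)}")
```

The four-digit hexadecimal value after "U+" is the number one would look up in the Unicode charts for a given character.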

Kathleen Walker-Meikle guides the CENDARI summer school participants through T-Pen.

It was an auspicious start for me at the CENDARI summer school: the first day alone introduced me to two tools that would benefit my research enormously. Conforming to the TEI standard for XML would make my research machine-readable and accessible to a wider audience, while using T-Pen would facilitate my research by allowing me to transcribe online and to export my transcriptions in PDF or XML format. The remainder of the summer school promised to be just as beneficial.

Digital Skills for Research Postgraduates

The “Digital Skills for Research Postgraduates in the Humanities and Social Sciences” Digital Arts and Humanities module consisted of a one-day intensive workshop that highlighted the advantages to be gained by applying digital skills to humanities research. The workshop was predominantly theoretical and commenced with a presentation from module coordinator Paul O’Shea, which introduced the following four questions facing students of the burgeoning discipline of Digital Humanities:

  1. How does the ‘digital’ reshape traditional research skills in the Humanities?
  2. How will the digital age shape the contours of cultural and historical memory?
  3. Will digital storytelling coincide with or diverge from oral and print-based storytelling?
  4. In the networked world we live in, what is the place of humanitas?

After the presentation these questions were addressed at length in a group discussion. I was placed in group two and we recorded our answers in a Google document, which can be viewed here: PG6011 In-class Discussion.

The practical element of the workshop involved participating in the Letters of 1916 crowd-sourcing project. The Letters of 1916 project is the first humanities project open to the public in Ireland, and seeks to create an online collection of letters written around 1916, specifically from 1st November 1915 until 31st October 1916 (Letters of 1916). This online archival project is created by the public, in that it allows interested parties to register as transcribers and to encode its extensive epistolary evidence in Text Encoding Initiative (TEI) compliant XML (Extensible Markup Language), which is the accepted standard for encoding documents in the humanities. This aspect of the workshop held the greatest appeal for me, as I was eager to add encoding, which renders content in a machine-readable format, to the traditional transcription skills that I had acquired and refined throughout my MA. More importantly, however, becoming familiar with these encoding standards is integral to my own research because of their potential to facilitate future research by allowing more advanced searching of texts, one that encompasses not only individual words or phrases but codicological information as well.

The assignment for the workshop required that I attempt to transcribe two separate letters from the Letters of 1916 project. I chose two letters from the Official Documents category and quickly came to appreciate the fact that the Letters of 1916 project actively assists the transcription process by providing a transcription toolbar that is clearly explained in the Instructions. The presence of the toolbar and accompanying instructions was immensely reassuring for me as a first-time digital transcriber. The toolbar was user-friendly and easy to get familiar with because its tabs already contained the markup for frequently occurring features. This not only greatly sped up the transcription process but also gently introduced novice transcribers such as myself to the language of XML.

Despite these user-friendly measures, I quickly encountered problems completing my first letter: A Letter from William J. Thompson to Robert Chalmers, 6 June 1916. This particular document runs to three pages and includes not only a letter but also two comprehensive tables that contain numerical data referring to native Irish emigrants. My experience in transcribing this document revealed that the transcription toolbar was equipped to encode the features found in the opening letter, such as the address, date, salute, line breaks and paragraphs, as well as the additions, marginalia and handwritten signature. The transcription toolbar was incredibly useful in this instance because, although the letter was predominantly typed, it contained enough handwritten or stamped additions and marginalia to render the transcription process quite challenging. However, as I progressed through the document I quickly realised that neither the toolbar nor the instructions specified how to encode the tabular material within it. As I was encoding a document from the Official Documents category, I assumed that there was a strong possibility that other letters would contain tables as well, so I emailed the editors of the Letters of 1916 project to bring this to their attention. Fortunately, as this was part of an assignment, I could contact the module coordinator for assistance, and he drew my attention to the TEI Guidelines. It became apparent that successful completion of this assignment would depend upon my own initiative in learning how to mark up tabular material in XML according to the TEI Guidelines. Thankfully these guidelines were easy to understand and I had little difficulty in applying the XML markup to the tables contained within this letter.
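For anyone facing the same problem, the following Python sketch shows the general shape of table markup in TEI, which uses the elements table, row and cell, with role="label" marking headings. The headings and numbers are placeholders that I have invented for illustration; they are not the figures from the Thompson letter, and this is not necessarily the exact encoding that the Letters of 1916 editors would expect.

```python
import xml.etree.ElementTree as ET

# Placeholder table: headings and numbers are invented, not taken from the letter.
# A full TEI file would also declare the TEI namespace, omitted here for brevity.
TEI_TABLE = """<table rows="3" cols="2">
  <head>Sample table (placeholder values)</head>
  <row role="label">
    <cell>Category</cell>
    <cell>Number</cell>
  </row>
  <row>
    <cell role="label">A</cell>
    <cell>10</cell>
  </row>
  <row>
    <cell role="label">B</cell>
    <cell>20</cell>
  </row>
</table>"""

table = ET.fromstring(TEI_TABLE)

# Read the table back row by row, as a program searching the archive might.
rows = [[cell.text for cell in row.findall("cell")] for row in table.findall("row")]
header, body = rows[0], rows[1:]

# Because the numbers sit in their own cells, they can be totalled directly.
total = sum(int(number) for _, number in body)

print(header)
print(body)
print("Total:", total)
```

Once the tabular material is encoded cell by cell, the numerical data can be extracted and compared rather than merely read.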

The content of the second letter that I had selected, A Letter from Henry Arthur Wynne to Philip C. MacDermot, 27 July 1916, was more in keeping with the markup that is generously provided by the Letters of 1916 project. In this letter Henry Arthur Wynne advises Philip C. MacDermot to “proceed with cases against as many of the persons charged as have been arrested” (Letters of 1916). In comparison with the first letter, this document was simpler to encode as the toolbar was equipped to mark up its content. The letter itself was typed except for Henry Arthur Wynne’s handwritten signature; it included a date and a salute and concluded with the recipient’s address, all of which I could mark up easily using the transcription toolbar tabs.
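The features listed above can be sketched in TEI-style markup as follows. This is a hedged illustration rather than my actual submission: the element names (opener, dateline, salute, closer, signed, address) come from the TEI Guidelines, but the exact tags offered by the Letters of 1916 toolbar may differ, and the bracketed text stands in for parts of the letter that I am not reproducing here.

```python
import xml.etree.ElementTree as ET

# Skeleton of a typed letter with a handwritten signature.
# Bracketed text is a placeholder, not a transcription.
LETTER = """<div type="letter">
  <opener>
    <dateline><date when="1916-07-27">27 July 1916</date></dateline>
    <salute>[salutation]</salute>
  </opener>
  <p>... proceed with cases against as many of the persons charged<lb/>
  as have been arrested ...</p>
  <closer>
    <signed rend="handwritten">Henry Arthur Wynne</signed>
    <address><addrLine>[recipient's address]</addrLine></address>
  </closer>
</div>"""

letter = ET.fromstring(LETTER)

# The machine-readable date lives in the "when" attribute.
date_written = letter.find("./opener/dateline/date").get("when")

# Physical features can be searched too, e.g. everything written by hand.
handwritten = [el.text for el in letter.iter() if el.get("rend") == "handwritten"]

print(date_written)
print(handwritten)
```

This is the kind of search that goes beyond words and phrases: the query for handwritten passages depends on information about the document as an object, not on its wording.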

In conclusion, the Digital Skills for Research Postgraduate Students workshop was intensive but incredibly beneficial to any student interested in integrating digital skills with traditional humanities research. Personally, I enjoyed participating in the Letters of 1916 crowd-sourcing project and found the experience incredibly exciting and rewarding. I appreciated being given the opportunity to acquire invaluable practical experience in encoding documents for future humanities research, especially when one considers that, had the Letters of 1916 project adhered to traditional practices, only candidates with an extensive knowledge of this period would have been considered eligible to assist. Crowd-sourcing, however, democratises the project by encouraging participation from all levels of society, thereby fostering a wider research network.

For a more comprehensive overview of the topics covered in the workshop, check out #TEACHTEI or this Storify from workshop coordinator and media mogul Donna Alexander.
