These are conference notes on Digital Humanities 2012 in Hamburg, Germany.

Note that these are being written during the conference so they will be spotty.

Opening by Claudine Moulin

Dr. Moulin gave an opening on research infrastructure from the European perspective. She authored the European Science Foundation report on Research Infrastructure in the Humanities and gave an overview of it which helped me better understand it. Among other things she called for a cultural history of infrastructure. She also emphasized that research infrastructure for humanists had to be designed by humanists.

Wednesday, July 18

Craig Belany: Opportunity and accountability in the ‘eResearch push’

Craig gave a great paper with some of the most elegant slides that started with Lang's dystopia "Metropolis" and the infrastructure it imagined. He commented that instead of physical infrastructure we now have cyberinfrastructure. He talked about e-research infrastructure investments in Australia. He talked about the modernist agenda of bigger is better. In an environment that thinks excellence is big science the humanities may be always disadvantaged.

We need cyberinfrastructure, but we need to be cautious. Despite the sums spent there has been almost no investment in the human - scholarships, postdocs, and so on.

He also commented on how cyberinfrastructure is usually funded only in the short term, which is paradoxical.

"It is not that e-research doesn't do things well, it is the promise of research that isn't doing well."


  • Do you think it is possible and desirable for the humanities to have its own "conceptual cyberinfrastructure"?
  • If so, how can dh help?

Rockwell and Sinclair: The Swallow Flies Swiftly Through

Stéfan and I gave a paper about analyzing the HUMANIST archives. You can see the abstract here. The links for those who want to try our corpus are:

Lynne Siemens: Notes from the Collaboratory: An Informal Study of an Academic DH Lab in Transition

Lynne and Ray gave a paper on the U Victoria Collaboratory. She started by talking about collaboration and the need to develop new work patterns, new social skills and so on. These are not typical skills developed in graduate programmes - we should look to science and applied sciences for models.

Some of the organizational structures in the sciences are:

  • The faculty-directed lab - this has clear reporting lines, but can be alienating for people in
  • The collaboratory - this is a second model that is not directed top-down

The collaboratory can be of two types. The first is the co-laboratory, where two or more research teams share infrastructure. The second is the colab-atory, where researchers bring their own funding but do research together. This type of lab needs more deliberate management. Except at the project-leader level (where people can be eccentric) the rest of a collaboratory has to be fairly well defined - roles and expectations need to be defined or a collaboratory can fail.

Lynne then talked about the collaboratory at Victoria. In the first phase it was small; in the second phase it got larger and staff wanted more autonomy and consensus management. This seems to have led to not much getting done. The collaboration diffused accountability and responsibility and important things didn't get done. The third phase is thus a structured collaboratory with a lead researcher.

The lesson (for a lab leader) is not to give up direction and hierarchy, especially if you are accountable financially and politically. The director must have the last word. One still wants an engaged lab as they are more productive and satisfied.

Digital Humanities as a university degree: The status quo and beyond

I was on a panel on the digital humanities degree that included: Thaller, Manfred; Sahle, Patrick; Clavaud, Florence; Clement, Tanya; Fiormonte, Domenico; Pierazzo, Elena; Rehbein, Malte; Rockwell, Geoffrey; Schreibman, Susan; and Sinclair, Stéfan.

It struck me that in other countries they have the same problem we have in Canada, that students don't know what the humanities are so the digital humanities doesn't draw students. Most of our programmes are called things like interactive arts, media studies, computing and cognition and so on.

Another interesting issue is the link with library schools. At U of A we have a joint MA/MLIS that is very popular - this link I think is working for both sides. Tanya Clement is developing something like this in Texas. See DH Answers thread.

Digital arts/media seems to be a very popular type of program. Universities like Concordia and SFU have very strong BFAs in computation arts, new media design and so on. We could learn from the way the arts have embraced digital practices and media.

Manfred Thaller had an interesting abstract definition of what makes a digital humanities degree. Someone prepared in DH should be able to:

  • Understand a humanities problem
  • Identify digital technologies of use in addressing the problem
  • Manage the application of digital methods/technologies to the problem
  • Talk about the process and results

I argued that too often we just map old graduate models (courses, comps, and thesis) onto the digital as if swapping reading lists would work. We design graduate degrees to reproduce ourselves but students want options.

Poster Session

Susan Brown and I had a poster on our new in-browser XML editor, CWRC-Writer. You can see the abstract here. I spent the whole time showing CWRC-Writer and explaining the tradeoffs of using an in-browser XML editor so I didn't see other posters. You can see the list of posters here.

Thursday, July 19

I went to the second short paper session. These short papers are a great idea - often academics give better papers when given less time.

Uncertain Date, Uncertain Place: Interpreting the History of Jewish Communities in the Byzantine Empire using GIS

Gethin Powell Rees talked about problems with spatial uncertainty. What do you do with a letter referring to Rhodes? Is it the city or the island? They have developed a system of symbols for communicating uncertainty so people don't overinterpret evidence.

Uncovering lost histories through GeoStoryteller: A digital GeoHumanities project

Debbie L. Rabina and Anthony Cocciolo talked about an ARG they have developed to show people the German history of New York. GeoStoryteller is a platform for writing these stories. They have short video podcasts that were developed from original sources. They used white and grey literature. They used stuff in the public domain and put stuff back into the public domain.

They did user research on what youth (the intended audience) thought about the mobile experience. They wanted to see if the augmented reality made a difference. 90% thought that location was important. 64% experienced significant usability issues (the video didn't stream.)

People who understood augmented reality loved it, those who didn't found it frustrating. One has to train people.

I really appreciated how this short paper didn't just talk about what they had done (or wanted to do) but also talked about the usability study they ran and the frustrations users experienced.

Words made Image. Towards a Language-Based Segmentation of Digitized Art Collections

Florentina Armaselu talked about semi-automatic segmentation. They are digitizing Byzantine and Orthodox icons. They want to identify regions of interest and connect them to labels. They used GemIdent for image segmentation and the Stanford parser for extracting candidate labels. GemIdent comes from cancer image segmentation. It was impressive how GemIdent separated out the regions of an icon.

This was a fascinating mashup of tools from very different worlds. I think this approach could be used in other fields like game studies.

The Digital Mellini Project: Exploring New Tools & Methods for Art-historical Research & Publication

Nuria Rodríguez talked about a project to a) explore new methods and tools with which to reinvent the concept of scholarly work and b) explore the real behaviour and practices of art historians. Mellini's text describes the paintings of a wealthy Roman family. "Pietro Mellini's 1681 rhyming inventory of paintings and drawings from his family's collection in Rome..." The Digital Mellini project has developed an interesting annotation tool that allows a critical dialogue to take place. They have tools for comparing images and a "concordance" that brings images together with Mellini's poetry on the painting and modern descriptions.

A plural multifaceted collection of information better represents the multifaceted understanding of a phenomenon by a community of scholars. The challenge is how to represent open and dynamic knowledge. Other challenges are how to deal with multilinguality and how to cite portions.

centerNet and ACH AGM

The Association for Computers and the Humanities and centerNet had a joint Annual General Meeting. Neil Fraistat and Kay Walter talked about new initiatives from centerNet:

  • will become the "journal" of centerNet
  • centerNet is sponsoring DHCommons which, for example, runs workshops on digital careers, and they have a matchup database.
  • centerNet is taking over the Day of Digital Humanities
  • centerNet hopes to offer support for early career centre staff

They then had a presentation of center screencasts that are now on .

Bethany Nowviskie talked about the ACH. She talked about advocacy efforts among other things.

Julia Flanders talked about the FairCite initiative.

Short Paper Session

Bridging Multicultural Communities: Developing a Framework for a European Network of Museum, Libraries and Public Cultural Institutions

Perla Innocenti presented about the MeLa project which is looking at how cultural institutes handle the migration to digital. Libraries and museums have tended to evolve separately - but they should work together.

Experiments in Digital Philosophy – Putting new paradigms to the test in the Agora project

Lou Burnard presented about the Agora Project where they are running a number of digital philosophical projects. They are experimenting with TEI, semantic and contextual linking, Linked Open Data, Open Peer Review and so on. The TEI component is looking at "round tripping" between Microsoft DOCX and TEI P5 using OxGarage.

They have a number of partners including a Gramsci project developed by Net7 based in Pisa.

The project is using Value Sensitive Design which Lou claimed to know nothing about.

Retrieving Writing Patterns From Historical Manuscripts Using Local Descriptors

Rainer Herzog presented on recognizing (hand) writing of Chinese characters and then using that for search and retrieval.

Social Curation of large multimedia collections on the cloud

Kurt Maly presented on a faceted classification system for images and the issues around scaling this. The idea is that a user can create a facet for a collection - they can identify a bunch of items as being of sort X. Others can then use a facet for their searches. Facets take a while to emerge and get community buy-in.
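To make the facet idea concrete for myself, here is a minimal sketch in Python. It is not the system Kurt presented; the class name `FacetIndex`, its methods and the item identifiers are all hypothetical, and it ignores the scaling issues that were the point of the talk. It only shows the basic move: one user labels a set of items as being of sort X, and others can then combine such facets in a search.

```python
from collections import defaultdict


class FacetIndex:
    """Toy facet store: maps a facet name to the set of item ids tagged with it."""

    def __init__(self):
        self._facets = defaultdict(set)

    def tag(self, facet, item_ids):
        """A user declares that these items belong to the given facet."""
        self._facets[facet].update(item_ids)

    def search(self, *facets):
        """Return the items that carry every one of the requested facets."""
        sets = [self._facets.get(name, set()) for name in facets]
        return set.intersection(*sets) if sets else set()


# Hypothetical usage: two users each contribute one facet.
index = FacetIndex()
index.tag("portrait", ["img1", "img2"])
index.tag("19th century", ["img2", "img3"])
matches = index.search("portrait", "19th century")
```

In this toy version a facet exists as soon as one person creates it; the community buy-in described in the talk would need something more, such as tracking who created and who reuses each facet.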

Academic Research in the Blogosphere: Adapting to New Opportunities and Risks on the Internet

Michael Pleyer talked about blogging as academic work. His example was Replicated Typo, a blog that has a number of authors. They publish code and they share ideas to get comments. He discussed the devaluing of research appearing in open forums.

Luciano Frizzera: Workflow Interface for Editorial Process

Luciano presented our work on prototyping workflow software. This is an INKE project. Structured surfaces is our name for visual metaphors that present a user with a surface that looks like it has regions (or structure.) A chessboard has structure, for example. He showed images of the prototype and designs for future versions.

Friday, July 20

Topic Modeling the Past

I went to a panel chaired by Mits Inaba on topic modeling. Travis Brown started the panel giving us a background on topic modeling and some examples. He showed some topics from an experiment with Amanda Visconti who got an ACH microgrant. He discussed the types of topics one gets, from thematic to stylistic to artefacts (OCR errors).

Topic modeling is good for getting a sense of the shape of large collections that you don't have the time to read. A lot of methods in the humanities are "supervised" and therefore take more time to train. Topic modeling is unsupervised. It often produces interesting output even when you don't have logical documents.

Interpretatively it sidesteps preconceived categories.

Then he talked about generative modeling where you pick topics and have a system generate phrases. He then went very quickly through a number of ideas that went by too quickly for me to get.

The final message was that there are all sorts of uses of topic modeling in the humanities. He has a workshop at MITH coming up, see .

David Mimno talked about the Open Encyclopedia of Classical Sites: Non-expressive Analysis of 20th Century Books. He talked about how to do analysis on copyrighted material. He used a search engine interface that created an intermediate layer of information that he could use despite copyright. He talked about the problem of determining the copyright of orphan works. If you can't figure out copyright then you really can't use the materials. This leads to ideas about "non-consumptive" use or "non-expressive" use. This means that the particular sequence of words isn't expressed, but an intermediate layer that is a statistical description (without the expression) can be used. JSTOR, for example, gives you word counts.
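A small Python sketch of what such an intermediate layer might look like, assuming the simplest case of plain word counts. This is my own illustration, not Mimno's code or JSTOR's actual service; the function name `word_counts` and the sample sentences are made up. The point is that once a text is reduced to counts the original sequence of words is gone: two differently ordered texts give the same description.

```python
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Reduce a text to a bag of lowercase word counts, discarding word order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Two made-up sentences with the same words in a different order.
first = word_counts("The swallow flies swiftly through the hall")
second = word_counts("Through the hall the swallow swiftly flies")

# The counts are identical, so the particular expression cannot be recovered.
same_description = first == second
```

Counts like these are enough for many statistical methods, including topic modeling, which is why this kind of layer is useful even when the full text can't be shared.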

Working with Google Mimno got snippets out. He fed 23,000 books in and searched for a large number of sites and got 4.4 million snippets. He then talked about how he used topic modeling on these snippets and the topics that emerged and how he used them. He built an interface that lets you follow a topic over time and see what volumes in the Google corpus are relevant.

Robert Nelson from U of Richmond talked about "Means and Ends in Civil War Nationalism and the Digital Humanities." He mentioned how we sometimes treat tools and methods as ends, not means. He compared that to how historians treat nationalism as an end. Robert uses topic modeling as a means to look at nationalism as a means. He calls the outcomes of topic modeling "analyses" in the strict sense that they are formed by breaking apart and processing a text.

Nelson then looked at newspaper patriotic texts through topic modeling. He illustrated the importance of also dropping into the full text nicely by talking through two stanzas of a patriotic poem about how those who die will not be nameless. (If you die, your name will live on.)

He talked about a distribution graph of patriotic rhetoric in the South as a cardiogram. The peaks of rhetoric are not an end, but a means. They are to get men into the army. There was also a topic on killing and rhetoric on how it is OK to kill Northerners.

He then turned to "distant reading" and how we don't need distant reading of the Civil War - we actually need to get closer. He called "distant reading" a spectacular phrase - I don't think he meant that in the sense of fabulous so much as a phrase about spectacle. I take him to be thinking about how we make spectacles for reading. This raises questions about whether topic modeling is a spectacle in the sense of a lens through which we look or that at which we look.

In the discussion, some of the points that came up were:

  • How the models that come from topic modeling are models like a ship in a bottle - something you can look at, but which won't float. How are these models both
  • How words like model and reading have an ambiguity in that they can refer to an activity/means/method and the outcome/end/result. We model models, read readings, analyze analysis and we use spectacles to watch spectacles.
  • There is an anxiety about whether these methods distance us from the text and therefore the truth.

Mimno talked about how these methods remind him that meaning is real.

Designing Interactive Reading Environments for the Online Scholarly Edition

I was on a panel on the interface to the scholarly edition. Dan Sondheim talked about two examples of scholarly editions that have been designed in both print and digital. These let us ask about how editors are using both print and digital. We don't believe the digital replaces the print, but the two are evolving to complement each other.

Jennifer Windsor talked about "Implementing Text Analysis E-reader Tools." She asked how humanists read and therefore how we can support them with e-readers. She argued that the purpose of interface design is to make the interface transparent. She quoted Tognazzini's principles of interface design.

Mihaela Ilovan talked about "Visualizing Citation Patterns in Humanist Monographs." She started by talking about theories of citation. One end of theory argues that citations are normative and can be measured as evidence of productivity. The other end sees them as rhetorical. The author uses citation as part of her argument and to "manipulate" the reader. She showed the Citelens prototype.

Geoff Roeder talked about "The Dynamic Table of Contexts: User Experience and Future Directions." The Dynamic Table of Contexts is a tool that uses XML to let users read, navigate and analyze a text. Geoff talked about a user study he ran on the DTOC. The participants were interested in the XML and challenged it. The categories marked up were, for the scholar users, something they wanted to question. We need to show the markup so we don't pretend it is objective.

Stéfan Sinclair talked about the Bubblelines visualization tool. He mentioned how it is nice to work with a team that has a full ecology from designers, usability specialists, programmers and so on. Bubblelines shows distribution for comparison and exploration. Stéfan talked about the usability test that was run on the tool.

We had some interesting questions about the technologies used and the applications of our prototypes. I realized that we often present prototypes as if they are almost there and therefore almost available for use. The assumption is that we will produce and run production versions. In interface research one doesn't necessarily create finished works, but instead we focus on prototyping quickly and then testing the prototypes. The research needs to be documented, but not necessarily produced. For me this raises the question of what is the most useful way to return information to others? Who are the others to whom we want to return results? I think of as a great example of how to return interface design ideas.

Masahiro Shimoda: Closing Keynote Talk

Harold Short introduced Dr. Shimoda and his pioneering work on digitizing Buddhist texts. He also has been a lead on the development of the Japanese Association of Digital Humanities.

Dr. Shimoda spoke about "Embracing a Distant View of the Digital Humanities." He presented an Eastern perspective on the relationship between DH and the humanities.

The humanities are about transmitting cultural heritage. We need a clear view of

He quoted Romila Thapar about how to transmit culture in a period of decline to other cultures on the rise? How can the humanities transmit its culture to the digital? How can the digital humanities listen to what is transmitted?

He sees DH as taking a role similar to mathematics in the sciences. Digital humanists have to be prepared to deal with the constant change of technology if they are to support the humanities.

He talked about the encounter between East and West. The West set the stage for this encounter. Shimoda talked about the discovery of Buddhism by the West and what it meant to have a very different culture as the object of study. It was at the level of the text that this happened. Buddhism had to be transformed into a textual tradition to be studied. Then, not much later, it now has to be transformed into the digital. From movable type to digital type.

What is interesting is this idea that we could learn from how the East dealt with the encounter where their traditions were transformed in the encounter into text. In a similar way the modern textual humanities are being transformed by an encounter with the digital and have to deal with someone else setting the stage. Can we be sympathetic and listening as the humanities are changed (sometimes unwillingly)?

And that was the end!



