DH 2018

These are my notes on DH 2018 in Mexico City. As always they hardly scratch the surface of what was really said. Forgive me for all that is missing. Tell me what to fix.

Day 0: Monday, June 25, 2018

On Monday we had workshops. I helped run the New Scholars Seminar. Running it I couldn't take notes, but here are, in retrospect, some of the issues that the new scholars discussed:

  • Quick and dirty geo-spatial hacking. A large group of us wanted to see how to build an annotated map from data so together we hacked the conference program to get a list of institutions people are from and then used Leaflet to geocode that and map it. It was impressive to see all the skills we had across the group. It was also an example of a project going from scratch to basic web output for the scholars new to DH. We forget how useful it is to show the arc of a project.
  • Preparing a DH CV. We had discussions about how to figure out what people want in a scholar with DH training and how to get the skills needed. Many of the new scholars are entering the field and moving from working on one project to trying to position themselves as being capable of doing DH.
  • Resources for learning and teaching DH. We talked about the types of online resources there are for learning DH and also about how one might prepare learning materials so that they can be open to others. We had a great conversation about getting credit for pedagogical scholarship. We talked about how to design digital instructional projects and how to document them so that you can get credit.
  • DH theory for digital literary and cultural study. We talked about what theoretical approaches that support digital work might look like.
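The mapping exercise in the first item above can be sketched roughly as follows. This is not the code we wrote at the seminar. It is a minimal Python illustration that assumes the institutions have already been geocoded: the `affiliations` sample, the `COORDS` lookup, and its approximate coordinates are hypothetical placeholders. It produces GeoJSON, a format that Leaflet can display with `L.geoJSON`.

```python
import json
from collections import Counter

# Hypothetical sample of affiliations pulled from a conference program.
affiliations = [
    "UNAM", "University of Alberta", "UNAM", "El Colegio de México",
]

# Hypothetical lookup of approximate (latitude, longitude) pairs; a real
# project would fill this in with a geocoding service or by hand.
COORDS = {
    "UNAM": (19.33, -99.19),
    "University of Alberta": (53.52, -113.53),
}

def to_geojson(names, coords):
    """Count attendees per institution and build a GeoJSON FeatureCollection."""
    counts = Counter(names)
    features = []
    for name, count in sorted(counts.items()):
        if name not in coords:
            continue  # skip institutions we could not geocode
        lat, lon = coords[name]
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name, "attendees": count},
        })
    return {"type": "FeatureCollection", "features": features}

collection = to_geojson(affiliations, COORDS)
print(json.dumps(collection, indent=2))
```

Institutions missing from the lookup are skipped rather than guessed at, which keeps the gaps in the data visible.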

We also organized a mentorship program for the new scholars.

Day 1: Tuesday, June 26, 2018

Welcome words

A librarian spoke first and reminded us that this was the first DH conference in the Global South. The printed word is no longer the main form of distribution of information. Libraries enable open access to cultural patrimony. The National Library is working with digital humanities groups to achieve their goals of access to information.

Opening Plenary

The Dean (or his representative) of La Universidad Nacional Autónoma de México (UNAM) shared some thoughts about the hosting university. UNAM is the largest university in Mexico with 315,000 students and with over 1000 researchers and 600 academic technicians. He commented on the nearby statue of independence and the political demonstrations taking place there.

Then Laura Flamand, the General Academic Coordinator of the Colegio de México, talked. She talked about their library, which may be one of the best in Central America. It is a tiny institution compared to UNAM, but is important and is mostly state funded. They are committed to open access. It was founded by refugees fleeing the Spanish Civil War. It emphasizes social justice and inclusion. She invited us to think about collaborations.

Isabel Galina talked about creating new spaces and making visible the challenges faced by many academics. The theme was the building of bridges. The conference of bridges takes place at a time in which others put children in cages. Families belong together. Furthermore the humanities have always had a role in times of conflict. Digital humanities should keep being humanities. We have the disciplinary training to be critical and we have the ability to build and have an impact. We will keep moving this field in this spirit.

We then heard from the head of ADHO, Karina van Dalen-Oskam, who talked about all the work that goes into a conference.

The Co-Chairs of the Programme Committee, Élika Ortega and Glen Worthey then talked about the conference and how we can help make it accessible in different languages.

Janet Chávez Santiago: Tramando la palabra

Elisabeth Burr introduced the keynote speaker by talking about the study of dialects and how many thought they were disappearing. Information was collected about dialects and indigenous languages without giving the people any voice. With the Internet it seemed that English was the only language to be used. Things started to change as communities began to organize themselves to preserve their languages.

Janet began by speaking a few words in Zapotec from Oaxaca. She wants to talk about word wefting. She showed a quote from Nancy Dorian from "Surviving the Broadcast Media in Small Language Communities." She then talked about textiles and how they are stories of families that transcend tools and processes. The traditional communities are also always changing. She is the fourth generation in a family of weavers. Janet has been documenting her language and weaving. In her home town almost every family has a loom. The looms they use now were introduced by the Spanish.

Wefting! Why wefting? In Spanish wefting also means plotting. To weft words is to plan a discourse. In her day to day life she wefts her life and words.

She then showed the warp and weft of a carpet. The weft passes between the warp threads. Together they tell the story of their community. With their hands a weaver can tell a story that can be understood anywhere. Weft and warp are not repetitive. Different weaving traditions vary just like different languages.

She then talked about patterns and frets. They interpret the patterns of the past. Similar patterns might have different interpretations in different cultures. Weavings are connected to nature. Digital media could be a warp with which speakers of endangered languages can weave their stories. They need to have their language and culture be present. She talked about some of the ways to make present an indigenous language including:

  • She has been creating an oral dictionary to help teach Zapotec
  • Using social media in Zapotec
  • Online newspapers in Zapotec
  • Basic school books in the language (and they shouldn't talk about Zapotec in the past tense)

She called us to bridge learning and coexistence; to use the digital to make indigenous languages present.

She talked about how we often think of indigenous people as primitive and then took our picture with an iPhone to emphasize how they are just like everyone else.

Day 2: Wednesday, June 27, 2018

DH in 3D: Multidimensional Research and Education in the Digital Humanities

Micki Kaufman pulled together a panel on 3D that I was on, so these notes are after the fact. She gave the first short paper and talked about her Kissinger project and how visualizing in 3D has made a difference. She talked about fake 3D and moving (using Unity) to real 3D. She pointed out how important typeface is in a 3D world - it is visual symbology.

Network graphs can be hairballs, but 3D can help disentangle them. In 3D you have to engage with the data - manipulate it as you are in it. It cognitively disrupts the god's-eye view of 2D.

In questions people asked about how to assess such interfaces.

Rachel Hendery talked about 3 fascinating projects that deal with aboriginal spaces. She showed the projects and talked about how the elders were involved from the beginning. She talked about tensions between stakeholders. In one project funded by the Rugby League there was tension between the League that wanted publicity and elders that wanted context.

She showed a video of a 3D first person space in which students can learn about what the land was like before settlers. Some of the 3D assets were built by students and not quite accurate. Despite this, it trains students in First Nations culture.

The third project was about cultures around Alice Springs. The community wants it to be for the community - not to be shown to others. Their philosophy is that everything is left behind.

There were questions about the impact of the projects and issues with uncertainty - not knowing what it was originally like when recreating a space. The elders are detached from history and not always sure.

Steve Jones then talked about the RECAAL project that I am connected with. The project is recreating Busa's lab in Gallarate in 3D along with machines. He pointed out that the humanities is really n-dimensional.

They have pictures of a lost building, but they are modelling gaps. Creating 3D models reveals what we don’t know. He talked about Benjamin and the card index - arcades project. Benjamin moved to talking about picture writing - statistical writing?

We are trying to understand what Busa's team actually did. The humans are the black boxes. Decisions are forced by 3D speculation - it is a feature of method - modelling means paying attention to the mundane. Where was the bathroom?

Lastly I presented a short paper on our AARG projects (Augmented and Alternate Reality Games.) I pointed out that with Pokémon Go there is now some understanding of what a locative game might do. I concluded with some of the failings of the project:

  • The technology is finicky. Authoring environments that are easy to use aren't there.
  • The interfaces are an issue if you want people wandering around to interact with information, especially the small smartphone interface.
  • We (academics) don't know how to write compelling games and experiences. We want to lecture and put up lots of text (pedagogic blah).
  • Potential participants don't want to download an application - they rightly don't trust unknown apps.
  • It is hard to imagine modes of play that don't default to the treasure hunt or walking tour.
  • Users (who haven't played Pokémon Go) don't understand the genre.

My hope is that better support in iOS/Android and HTML 5 libraries will allow us to build web based toys that people can access easily without downloading anything. Will we get the access to the GPS and other system features we need to make compelling games?

Critical Theory + Empirical Practice: “The Archive” as Bridge

I came in late to this panel and only caught the last two presentations, both of which were excellent (and fast.)

Sharon Webb: Community Archives: Preservation & Practice

Webb talked about her project to preserve an LGBTQ archive in Brighton. This archive, Bright On Our Story, was started in response to a law that prohibited the promotion of homosexuality. BOOS was a volunteer-led group that gathered materials but eventually couldn't sustain itself. The collection was dispersed. It is a good case study of the risks of voluntary archives and the politics of archives.

Content created by and for communities is "critically endangered" as many large archives with preservation capacity are interested in "high culture." Community is the key concept for the next archival turn. What are the requirements for the digital age?

Webb got a British Academy Rising Star Engagement Award to look at the issues around community archives. She has brought people together. Many say "I'm not an archivist but ...". Voluntary archivists tend to not have any training or budget. Another key issue is control and erasure - communities are aware

Ben Jackson: Old Bailey Voices

Jackson is creating tools for studying the Old Bailey dataset. He is part of the Sussex Humanities Lab. The dataset is verbose with metadata. He has built an interface that builds on Data Mining With Criminal Intent. He has created virtual reality puppet shows of the actual trials for close reading as the trial transcripts are, we think, based on what was actually said. He is also drawing timelines and other graphs of the conversations. He can zoom out and see patterns about who is talking in many trials.

Now he is using the ideas to motivate volunteers to help with the transcription of information about the Poor Law. He has created a selfie stick that can let people use their cell phone to scan books. It was a great example of adapting existing technology to give people a scanning station.

There was an interesting question about the relationships of power between archivist and users. We have access to huge datasets but don't know how to use them. Humanists are badly prepared to use them which opens a window for others. There are disparities in access to artificial intelligence and archives. Some people have more "intelligence."


Here are some of the posters I remember.

Rafael Alvarado: Introducing Polo: Exploring Topic Models as Database and Hypertext. Alvarado was showing a nice environment for exploring the results of topic modelling.

Day 3: Thursday, June 28, 2018

Cultural and Institutional Infrastructure

Kim Martin, Diane Jakacki: REED London and the Promise of Critical Infrastructure

Jakacki began by talking about Alan Liu's call for thinking critically about infrastructure. Liu describes infrastructure as a social-cum-technological milieu.

REED (Records of Early English Drama) is a long-standing project that is now looking at going digital. They are doing this within CWRC - a virtual research environment. They showed CWRC and CWRC-Writer. In CWRC-Writer one can create linked data. They also showed HumViz, a visualization environment for exploring linked data.

Sara Sikes: A Design Process Model for Inquiry-driven, Collaboration-first Scholarly Communications

Sikes talked about the Greenhouse Studios which has a nice looking lab to support projects across the humanities and arts. They look at both analogue and digital. They haven't pre-determined the outcome - the collaboration comes first. They look for "radical" collaboration.

They have a designed design process. They form a multi-disciplinary team that figures out what they are doing. They talk about collaboration. They survey what they have. This process is borrowed from design thinking. They look at feasibility, viability, and accessibility. The design process is documented online; its stages are:

  • Assemble (gather a team)
  • Understand (understand the challenge)
  • Identify (relevant sources)
  • Build (build it)
  • Review (what has been built)
  • Release (different outcomes)

I asked about push-back on the process which is very deliberate. I think it is important to talk about collaboration and process, but one often gets push-back from people who just want to get on and do it.

Anna Neovesky, Frederic von Vlahovits: IncipitSearch - Interlinking Musicological Repositories

They have created a library service for searching through linked repositories. It was developed at Mainz. What is an incipit? It is the beginning or characteristic sequence of a musical piece. Incipits are often used to disambiguate music with the same name. They encode it in an old typewriter-friendly standard that can be searched on.

Their project allows search across 15 musicological projects. They also want a generic musicological search tool. They showed the search interface and the API.
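One common way to make a melodic search tolerant of transposition is to compare the intervals between notes rather than the absolute pitches. The Python sketch below illustrates that general idea only. It is not the project's implementation or its encoding: the catalogue entries, the use of MIDI pitch numbers, and the function names are all hypothetical.

```python
def intervals(pitches):
    """Turn a sequence of MIDI pitch numbers into successive semitone steps."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contains(haystack, needle):
    """Return True if the list `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))

def search(catalogue, query):
    """Return titles whose incipit contains the query melody in any key."""
    wanted = intervals(query)
    return [title for title, pitches in catalogue.items()
            if contains(intervals(pitches), wanted)]

# Hypothetical catalogue: title -> incipit as MIDI pitch numbers.
catalogue = {
    "Piece A": [60, 60, 67, 67, 69, 69, 67],
    "Piece B": [64, 62, 60, 62, 64, 64, 64],
}

# The query is the opening of "Piece A" transposed up a whole tone.
matches = search(catalogue, [62, 62, 69, 69, 71])
print(matches)
```

Because only the steps between notes are compared, the same query finds a piece whatever key it was catalogued in. Rhythm is ignored here, which a real search tool would probably not want.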

William Dudley Pascoe: Rapid Bricolage

Pascoe talked about building something up from nothing. He showed a project they have developed on Colonial Frontier Massacres in Central and Eastern Australia.

They follow a rapid or agile process. I wonder how this compares to the Greenhouse Studios process. Pascoe is a software developer and finds rapid prototyping helps humanists imagine what they want. Research also involves lots of speculation. Bricolage is an approach that has a history in the arts and matches agile development. He talked about the need to always tweak things in the humanities. Exceptions are important.

There was a good question about not documenting in agile methodology.

Gregory Zinman: Media Preservation between the Analog and Digital: Recovering and Recreating the Rio Videowall

The first multi-screen public installation, the Rio Videowall, was proposed by Dara Birnbaum and implemented in 1989. This project is trying to document this first videowall in Atlanta. It was in a predominantly middle-class black space. This project raises issues about preserving public art. How can one preserve a space that has since been torn down?

Videowalls were a thing in the 1980s. Birnbaum got the idea from going dancing in New York where videowalls were showing video art. This is a history that hasn't been told. Public art was also changing at the time.

The artist's critique was short circuited by the way people used videowalls. Videowalls got used commercially. How can one go back to an art work to preserve it? What is one preserving? To reconstruct one needs to look at the effect on people - the project is gathering oral histories to preserve the impact.


It wasn't always clear what was critical about the infrastructure projects presented. What makes infrastructure critical?

  • There is ongoing reflection built into the design and build process
  • The need for the infrastructure is not taken for granted - it is questioned
  • The nature of infrastructure is challenged or reviewed
  • Historical reflection
  • Documentation
  • Preservation

Performing Arts, Sound, Media

Erin Rose Glass: Designing writing: Educational technology as a site for fostering participatory, techno-rhetorical consciousness

Glass talked about the politics of writing in the university. She started with a certain understanding of the politics of digital technology in the university. Since the fake news scandal and so on there is now an understanding that digital technologies have politics. We are seeing discussion of this in the public sphere. Shoshana Zuboff, "The Age of Surveillance Capitalism". Digital technology seems incentivized to extract as much data as possible and then monetize it, whatever the ethics. Data is the new oil.

Lots of organizations are doing activism around these issues. Glass has found "oppression" a useful way to think about the issues. Freire says "oppression is denying a people's right to critically understand and transform their reality." How can software now do that? Proprietary code is secret.

What does this have to do with academic writing? All the papers produced digitally (student and faculty) are data that companies want access to including Turnitin, Google Docs, Elsevier (now an information and analytics company, not a publisher), and so on. Should we think of the university as the data intensive university? What is the heart now? The servers? The corporate hosts?

Is one of the things sustaining the corporate data model our complicity? She proposed three projects: 1) the Social Paper (a student environment for writing), 2) Socializing dissertations, and 3) ... she ran out of time.

Clarisse Bardiot: Measuring Merce Cunningham: a theatre analytics research

She started by talking about cultural analytics and changed a few words to talk about theatre analytics. What can analytics bring to performance study? There is a lot of literature and history, but how does one study performance, which vanishes? How does one turn performance into data? She has developed tools to preserve traces.

She then talked about a case study on Merce Cunningham. There are some online datasets.

She then showed some analytical results like a graph of number of dancers by number of creations. She showed a network graph of collaborations. She drew some reflections from these about different periods of Cunningham's work.

She closed with ideas for next steps.

Elliott Hall: The Digital Ghost Hunt: A New Approach to Coding Education Through Immersive Theatre

Hall talked about an augmented reality and coding education project. The project is funded by the AHRC and led by Mary Krell, and they work with KIT Theatre. The Digital Ghost Hunt begins with a tedious programming task. When the kids are bored their screens go haywire and a ghost asks them to build a ghost detector (using a Raspberry Pi and other things.) Various characters help them develop and then they go on a ghost hunt. They try to figure out who the ghost is, why they haunt, and how they might be freed.

The goal is to replicate the play that motivates most programmers, to make coding more attractive. They want to get students to value means rather than ends.

The project tech is there, but they have to choreograph the flow of the event.

Aurelio Meza: Peer Learning and Collaborative Networks: On the Use of Loop Pedals by Women Vocal Artists in Mexico

Meza is working on a collaboration with UNAM. Poetica Sonora is around voice, legibility and sound practices. They are trying to gather sound files of oral poetry, sound art, soundscape, spoken word and so on. They have a prototype with 369 files that can track different versions.

He talked about loop pedals that poets use to alter their voices.

Amy DeRogatis: Listening for Religion on a Digital Platform

DeRogatis started by pointing out that if you pay attention to sound you learn new things about religion in the US. The project is the American Religious Sound Project.

This project worked with students who were trained on recording and Omeka. They send the students out to record sounds, upload them and add metadata. Eventually they had faculty helping. They now have a project that is scaling up. An important part of the project is that students are involved in managing it.

They didn't just record canonical religious sounds. They wanted non-canonical sounds like the washing of dishes or religious events in public spaces. They want to think about what is considered a religious space and religious sound. They find there are politics of representation and what communities want recorded.

They have a visualization tool. They have a gallery.

Day 4: Friday, June 29, 2018

Art History, Archives, Media

Benjamin Zweig: The (Digital) Space Between: Notes on Art History and Machine Vision Learning

There are assumptions made by machine vision systems (and by art historians) as to what needs to be tagged. Machine learning can be based on a narrow and outmoded conception of art history. It can remove historical context. Focus on similarity can lead to false positive links between works.

Everyone thinks the art historians and the computer scientists need to collaborate, but it isn't that simple. CS people are looking for, art historians are looking at. Both sides need some humility. A first step is to find alternatives to Wölfflin and Panofsky. Alois Riegl wrote a book about the long history of ornament. Perhaps ornament would provide a between point. It is modular, formal, complex, and repetitive. There is too much for one person to handle so a distant looking would help. Ornament shows up in interesting places.

Matthew Lincoln: Modeling the Fragmented Archive: A Missing Data Case Study from Provenance Research

Lincoln talked about a study on the art market, specifically an influential dealer, Knoedler, and his stock books. They are using predictive modelling to figure out what kinds of work would have made money and what would have been risky. They are trying to figure out what combinations of features would be a safe bet for making a profit. They have a problem with incomplete data - Knoedler's clerks would leave out information. They are trying to fill in the gaps using statistical techniques. They want to not only use the methods, but communicate the methods. Quantifying uncertainty can be confusing. Historians deal with holes in data all the time.

As they put guesses for missing data in, they then run their modelling over again. This has changed their predictions. For example, history paintings, which seemed risky in later years, turn out to be possibly less risky. He then showed how they use animation as an alternative to a box plot.
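Filling gaps with guesses and then re-running the model resembles what statisticians call multiple imputation. As a rough illustration only, the Python sketch below fills missing genres by sampling from the observed ones many times and reports how much a simple estimate moves. It is not the project's model: the records, the genre field, and the profit-rate estimate are all invented.

```python
import random
import statistics

# Invented records: (genre, or None where the clerk left it blank; sold at a profit?)
records = [
    ("history", True), ("history", False), ("landscape", True),
    ("landscape", True), (None, False), (None, True), ("history", False),
    (None, False),
]

def profit_rate(rows, genre):
    """Share of works of the given genre that sold at a profit."""
    outcomes = [profit for g, profit in rows if g == genre]
    return sum(outcomes) / len(outcomes)

def impute_once(rows, rng):
    """Fill each missing genre with a random draw from the observed genres."""
    observed = [g for g, _ in rows if g is not None]
    return [(g if g is not None else rng.choice(observed), p) for g, p in rows]

def simulate(rows, genre, runs=1000, seed=0):
    """Repeat the imputation and collect the estimate from every run."""
    rng = random.Random(seed)
    return [profit_rate(impute_once(rows, rng), genre) for _ in range(runs)]

estimates = simulate(records, "history")
print("observed only:", round(profit_rate(records, "history"), 2))
print("mean over imputations:", round(statistics.mean(estimates), 2))
print("range:", round(min(estimates), 2), "to", round(max(estimates), 2))
```

The spread of the estimates, rather than any single filled-in value, is what shows how much the conclusion depends on the guesses.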

Their future work is to give users the ability to tweak the assumptions as a way of testing them and trying alternatives.

  • Ground simulations in common sense
  • Make process visual - show simulations
  • Demonstrate that assumptions can be modified

He closed by mentioning that they are trying to figure out how to publish their results with the interactive components.

Benoit Seguin: Extracting and Aligning Artist Names in Digitized Art Historical Archives

Seguin talked about a project with a Venetian foundation, the Cini Foundation, which has a million images in an analogue catalogue. They worked with a company, Factum Arte, that built a cool custom scanner that lets them scan the two-sided image catalogue. They are scanning about 1,500 images a day. The variation in images led them to build a machine learning framework, dhSegment, that they can use on all sorts of messy datasets. They have a system that now does OCR and extracts metadata. The problem they now have is that the information they have is flat. They have artists' names that are inconsistent and they want to link them to information about the artists. Normalizing the names is the first step to linking. All in all they were able to automatically match 73%. The failures were cases like "Tiziano o Giorgione" or cases of people not in the master database.
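To give a feel for what normalizing and linking names involves, here is a minimal Python sketch using only the standard library. It is not the team's pipeline: the authority list, the sample card spellings, and the 0.85 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
import unicodedata

def normalize(name):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(cleaned.lower().split())

def link(raw_name, authority, cutoff=0.85):
    """Return the best-matching authority name, or None if nothing is close."""
    index = {normalize(entry): entry for entry in authority}
    hits = difflib.get_close_matches(
        normalize(raw_name), list(index), n=1, cutoff=cutoff
    )
    return index[hits[0]] if hits else None

# Hypothetical authority list and catalogue-card spellings.
authority = ["Tiziano Vecellio", "Giorgione", "Jacopo Tintoretto"]
for card in ["TIZIANO VECELLIO.", "Iacopo Tintoretto", "Tiziano o Giorgione"]:
    print(card, "->", link(card, authority))
```

An ambiguous attribution such as "Tiziano o Giorgione" is not close enough to any single entry, so it is left unlinked rather than forced onto one artist.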

He then talked about the analysis that they are doing. They found a lot of duplicates. They also found a skew to the more important artists - a power law of lots of images for few important artists and fewer for many unimportant.

Gabriela Elisa Sued: Métodos digitales para el estudio de la fotografía compartida. Una aproximación distante a tres ciudades iberoamericanas en Instagram

Sued had a nice solution to the bilingualism. She had a QR tag and link to the English version of her talk.

Sued is interested in born-digital images or hybrid images, in particular Instagram. In particular she is interested in urban shared photography. She wants to develop a method that is compatible with her own Latin-American communication and media schools. She looked at a corpus of pictures with hashtags for cities like #madrid, #cdmx ... She wanted to understand the aesthetic and thematic representation patterns and if the shared photography was a community practice. She used a distant approach and mixed methods.

She then talked about her findings. She looked at other words added. High frequency words included "loves" and "love." There were differences between the hashtags for different cities. She also showed an image analysis that showed different colour tones. Here is her analysis:

Through visual analytics (exemplified with #cdmx at the slide) we can relate the tones with themes. The red and orange are related to heterogeneous themes, generally related to the representation of people, food and objects. The chromatic pattern reveals that food and people identify more with cities than green public spaces. The blue color is recurrent in the three visualizations, but only in #cdmx the sky is a photographic subject in itself. In #madrid and #buenosaires it appears as a backdrop of emblematic buildings or monuments, that is, as a complement to the photographic themes that seek contemporary aesthetics in architecture .

She then showed a network analysis and then a content analysis and engagement analysis using hashtags. She had a fascinating visualization that she interpreted as a shift from "taking a picture" to a more modern "looking to a picture".

She concluded with a methodological reflection. The distant approach is effective for opening the black box of social media. But we are limited to what the social media companies give us access to. The meaning attributed to each city is collectively built, but over a specific time and by those who use this media. She concluded with:

Globalization puts into circulation a dynamic based on the productive, where what is not is excluded. Tradition struggles through communication to overcome exclusion and re-elaborate its identity to build its future. These three cultural forms can be seen in the representations studied: network connections are made visible among cities of the world in #madrid, services related to consumption and cultural industries such as fashion, design and art are represented in #buenosaires, and reinvent the forms of representation between traditional and modern architecture in #cdmx. That is why we can say that photographs labeled with the names of cities add new ways of perceiving, narrating and describing Ibero-American cities.

Sabine Lang: Urban Art in a Digital Context: A Computer-Based Evaluation of Street Art and Graffiti Writing

We tried repeatedly to bring in Lang by Skype. Her prepared presentation was on urban art vs traditional art. Alas, I couldn't hear so well so this may have inaccuracies.

Urban art is very context dependent. Traditional art will be shown in different configurations. How does digitization affect these different types of art? Their digitization of urban art is aiming to preserve street context.

Can computers propel the evaluation of street art? She is looking at the relation between the art and context. She showed an interface where you can select an image and the system then shows similar ones. She looked for images like Banksy's rat. She showed how her algorithm can also distinguish regions in a work. She can look for other works with a similar arrangement of regions.

She concluded by talking about the importance of digitization of urban art and the dynamic relationship between different street works. Street works can be changed by others in response to other art.



There was an interesting question about whether the critique of the first talk was valid (that DH and computer vision are following an outdated version of art history). The problem is that machine learning, by definition, is going to be a formalist approach. Part of the answer was that we can look at the social side of the art, we can deal with large sets, and we can look at alternative collections.

One of the speakers suggested that there is nothing wrong with formalism, the challenge is types of formalisms.

I couldn't help wondering what they would think about - this is a now dated art history humanities computing project. It must date back to the late 1980s or 1990s.

I also wondered about the history of data in art history. How have logs, stock books, catalogues and databases been used over time by dealers, museums, and other arts organizations?

Mid-Range Reading: Manifesto Edition

Grant Wythoff: The Gadget

Wythoff is writing a book on the gadget! Lots of things have been called gadgets, but over time the types of things called gadgets change. He created a gadget to look at the term over time. The word was originally used a lot for ship gear. He showed queries about what "gadget" meant in turn of the century newspapers. By 1918 the word was a term of art for sailors.

He showed a graph of the word for an entire object vs the word for a part and the two lines crossed around 1955. Gadget now refers to wholes more than parts.
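Finding where two such lines cross is simple once the uses have been counted. The Python sketch below shows the idea with invented counts. They are not Wythoff's data, and the function name and the part/whole labels are hypothetical.

```python
def crossover_year(counts):
    """Return the first year in which the 'whole' sense outnumbers the 'part' sense.

    `counts` maps year -> (uses meaning a part, uses meaning a whole object).
    """
    for year in sorted(counts):
        part, whole = counts[year]
        if whole > part:
            return year
    return None

# Invented counts, for illustration only.
counts = {
    1925: (40, 5),
    1935: (38, 12),
    1945: (30, 25),
    1955: (22, 31),
    1965: (15, 44),
}

print(crossover_year(counts))
```

The hard part in practice is the step this sketch assumes away, namely deciding for each occurrence whether "gadget" names a part or a whole.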

The availability of large datasets allows the history of ideas to be not about great minds but about everyday uses. We can do a history from below - the middle range scale. We in the humanities work in the middle range every day - we search for a word which is "datum mining". Searching is not data mining, but trying to find examples.

We need to think of the different scales at which we do research. We like to look at platforms and genres and shiny stuff. Perhaps what we should do is look at what people used with tacit knowledge. He called it a science fictional experience. In writing a cultural history of the gadget he wants to look at folk knowledge and patterns of everyday usage - another sense of middle range. Between high theory and low theory (the practitioners and tinkerers.) The gadget seems perfectly suited for

Alison Booth: Mid-Range Reading

She briefly talked about the Collective Biography of Women project. These collective biographies often were national or about groups like noted negro women. Their method draws on bibliography and predates Google Books. The project went through a bibliographical phase, and then an interpretative phase. Now they are studying narrative and typologies. They look at how women are clustered. They look at biographical elements and structure (BESS).

The mid-range reading works at a scale of cohorts like a set of Latina women. The texts narrate lives briefly, but they can tag biographical elements in these narratives that can then be used to compare biographies. They

The manifesto is that mid-range reading offers what's missing from data mining of large corpora. They offer:

  • the reader or authorial reader
  • rhetorical communication
  • book and publishing history
  • biographical/social context of production and
  • reception

She complained about graphs, like those produced about plots - like Matt Jockers' graphs using sentiment analysis to represent plot. Why are so many fascinated by graphs of plot? What sorts of plots do biographies follow? How do you deal with the way someone gets memorialized after their death?

She showed how they are marking up the narratives with what looked like rich interpretative formalist markup. Reminded me of Propp's functions. See BESS:

Biography is a terrible model of a real life. Biographies smooth the data.

Sarah Allison: Harnessing Pegasus: On Setting Reasonable Limits

Allison wants us to think about how we feel about the constraints of reductive reading. Reductive reading reduces textual complexity but is still useful. A literary scholar is always looking to open inquiry. To do that one can start with constraints.

She talked about how fictionality shouldn't be associated with the rise of the British novel. She looked at a number of different limits to fictionality set by people doing text mining. It was a subtle argument that I'm not sure I followed. She seemed to want to justify different limits on fictionality.

What stood out in her talk was that she used a handout rather than slides. She brought us back to a text in our hand that she would read parts of.

Daniel Shore: Other Than Scale: Abstract Signs in the Digital Archive

Shore started by commenting on how scale has been discussed in the field and in popular press. Analysts can tell you the number of texts or words. Shore wants to displace scale in its role distinguishing types of study. Scale has often been a matter of sacrificing some of the richness. We need to know what we have sacrificed. Shore is focusing on abstraction in the construction of corpus linguistics.

He illustrated his point with the term "thought leader", which has become popular. We actually know about the compound NOUN leader (squad leader, student leader ...). A further abstraction is NOUN Verb-er (as in wastebasket emptier ...). We know that the compound has to end in -er. We then get Noun Noun and then X Noun and so on. More and more abstract signs.

Shore then returned to talking about how poets do this compounding. They do it in different ways. Abstract nouns like this get left out when we do bag-of-words text mining. Mining splits them up if we don't build in linguistic tools.
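A minimal sketch of the bag-of-words point, assuming a made-up toy text and a naive regex in place of real part-of-speech tagging (neither comes from Shore's talk). Counting single words loses the compound, while a pattern for the "X leader" construction keeps it.

```python
import re
from collections import Counter

# Toy text, invented for illustration; not from Shore's corpus.
TEXT = (
    "The thought leader met the squad leader. "
    "A student leader thanked the thought leader."
)

def bag_of_words(text):
    """Count single lowercase word tokens, discarding word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def x_leader(text):
    """Count the word that precedes 'leader'.

    A naive stand-in for the NOUN leader construction: it accepts any
    word in the first slot, not only nouns.
    """
    return Counter(re.findall(r"\b([a-z]+) leader\b", text.lower()))

bag = bag_of_words(TEXT)   # "thought" and "leader" are counted separately
pairs = x_leader(TEXT)     # the first slot of the construction is kept

print(bag["thought"], bag["leader"])
print(pairs.most_common())
```

In the bag, "thought" and "leader" are just two unrelated counts. The second counter shows which words fill the open slot, which is the kind of abstraction Shore was describing.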

This sort of study of constructions is a form of middle-range study. He tells stories about these constructions in his book Cyberformalism. Understanding these forms is something we all do and is important to understanding language. Flattening out "thought leader" hides culture. We now have the tools to look at these constructions.


I think Shore is right that scale is not really that important. So much "big data" is really a small surrogate extracted from a surrogate and so on. Does bigness really help us distinguish types of study in the humanities?

We talked about distinguishing fiction and biography AND fictional writing in biography and vice versa.

I wondered how this was a manifesto and an edition.

Scale etymologically comes from "ladder" or steps. While scale may not distinguish types of study, all of the speakers seemed to move up and down ladders.

Wythoff made an interesting point about the types of truth claims being made. Fiction doesn't make historical claims, but moral or philosophical ones. Distant reading makes different claims about different types of things than close or mid-range reading.

Approaches to Corpora

Alpo Honkapohja: A Corpus Approach to Manuscript Abbreviations (CAMA)

Honkapohja started by talking about how difficult medieval manuscripts are, with lots of abbreviations. The challenge is how to expand the abbreviations. Many editions get rid of abbreviations, but there is interesting data in the abbreviations. Could the abbreviations show scriptorial house-styles or change over time? His project looks at them from 1150 to 1350 in English. After 1066 English became a colonized language as the rulers were Normans. He is looking at the Linguistic Atlas of Early Middle English which has lots of metadata.

He then walked us through the results. He looked at distribution of common abbreviations and geographic abbreviations.

Ulrike Edith Gerda Henny-Krahmer: Exploration of Sentiments and Genre in Spanish American Novels

Henny-Krahmer talked first about sentiment analysis. It has been used for genre analysis. She is applying it to Spanish texts. She showed how she did sentiment analysis using SentiWordNet and another system, the NRC Emotion Lexicon. She then used a decision tree on the sentiment results to guess the subgenre.
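A rough sketch of the pipeline as I understood it: score each text against a sentiment lexicon, then let a tree split on the scores to guess a subgenre. Everything here is an assumption for illustration: the four-word lexicon stands in for SentiWordNet and the NRC Emotion Lexicon, the one-line "novels" and their labels are made up, and a one-split decision stump stands in for a full decision tree.

```python
# Hypothetical mini-lexicon; the real resources are far larger.
LEXICON = {"amor": 1.0, "feliz": 1.0, "muerte": -1.0, "miedo": -1.0}

def sentiment_score(text):
    """Mean lexicon score over all tokens; unknown words count as 0."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(LEXICON.get(t, 0.0) for t in tokens) / len(tokens)

def fit_stump(scores, labels):
    """Find the single threshold that misclassifies the fewest items.

    Returns (threshold, label_below, label_above). Needs exactly two
    labels and at least two distinct scores.
    """
    classes = sorted(set(labels))
    ordered = sorted(set(scores))
    if len(classes) != 2 or len(ordered) < 2:
        raise ValueError("need two labels and two distinct scores")
    best = None
    # Candidate thresholds are midpoints between neighbouring scores.
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        for below, above in (classes, classes[::-1]):
            errors = sum(
                (below if s <= threshold else above) != y
                for s, y in zip(scores, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    return best[1:]

# Made-up training texts with made-up subgenre labels.
TRAIN = [
    ("amor feliz en la casa", "sentimental"),
    ("el amor es feliz", "sentimental"),
    ("muerte y miedo en la noche", "gothic"),
    ("el miedo a la muerte", "gothic"),
]
scores = [sentiment_score(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
threshold, below, above = fit_stump(scores, labels)

def predict(text):
    """Guess the subgenre of a new text from its sentiment score."""
    return below if sentiment_score(text) <= threshold else above

print(predict("miedo en la casa"), predict("un amor feliz"))
```

A real tree would split on many features (one per emotion category, say) and more than once, but the idea of turning sentiment scores into classification features is the same.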

Alas, at this point my power ran out. What follows is from memory.

Maciej Eder: Words that Have Made History, or Modeling the Dynamics of Linguistic Changes

Eder is interested in dynamic change of language. He started with some assumptions like change is not linear. He didn't make assumptions as to what would be the features that track change and used a corpus of historical American English. He then used supervised binary classification to identify the similarity of language before and after each year.

He picks a date at which there may have been a change and then tests whether the language differs between a segment of 20 years before and a segment of 20 years after (with a margin immediately on either side). This led to a diachronic graph of change over time.
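Since I was writing from memory, here is a small sketch of how I understood the windowing, not Eder's actual code. The 20-year span is from the talk. The margin of 2 years, the nearest-centroid classifier, the leave-one-out accuracy score, and the toy data are all my assumptions.

```python
def windows(year, span=20, margin=2):
    """Return (before, after) ranges of years around a candidate year.

    The candidate year and `margin` years on each side are skipped.
    """
    before = range(year - margin - span, year - margin)
    after = range(year + margin + 1, year + margin + span + 1)
    return before, after

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def change_score(features, year, span=20, margin=2):
    """Leave-one-out accuracy of telling 'before' years from 'after' years.

    `features` maps a year to a vector (for example word frequencies).
    Accuracy near 0.5 suggests no detectable change; near 1.0, a shift.
    Nearest-centroid is a stand-in for whatever classifier was used.
    """
    before, after = windows(year, span, margin)
    samples = [(features[y], 0) for y in before if y in features]
    samples += [(features[y], 1) for y in after if y in features]
    for cls in (0, 1):
        if sum(1 for _, label in samples if label == cls) < 2:
            raise ValueError("need at least two years on each side")
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out one year
        centroids = {
            cls: centroid([v for v, l in rest if l == cls]) for cls in (0, 1)
        }
        guess = min((0, 1), key=lambda cls: sq_dist(vector, centroids[cls]))
        correct += guess == label
    return correct / len(samples)

# Toy data: a single made-up feature that jumps in 1950.
FEATURES = {y: [0.25] if y < 1950 else [0.75] for y in range(1900, 2000)}

for candidate in (1925, 1950):
    print(candidate, change_score(FEATURES, candidate))
```

Sliding the candidate year across the whole corpus and plotting the score for each year would give the kind of diachronic graph he showed.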

Then he tried to guess if there was a historical event around major shifts. Civil war? Stock market crash? He then tried to figure out what features were responsible for change - he showed some examples like the word “week” which changed dramatically. He looked at function words and specifically pronouns which explode in the 1970s.

Jan Rybicki: Polysystem Theory and Macroanalysis. A Case Study of Sienkiewicz in Italian.

Rybicki started with polysystem theory, which proposes different levels of systems of literature.

Fatiha Idmhand: Spotting the Character: How to Collect Elements of Characterisation in Literary Texts?

Justice-Based DH, Practice, and Communities

The group talked about the project Di Scontent?. Vika Zafrin talked about a conference where they heard about what digital scholars wanted around advocacy and activist DH. How can DH practitioners and infrastructure builders learn about repairing and supporting each other? They created .

Vika talked about what she is doing in Boston. She started with a tweet from John Unsworth - are we all permeated with bad habits and infrastructure that has bad assumptions? At Boston U they are trying to help communities develop their own archives using their status as an institution. They also look for historically significant materials that risk disappearing. In other words they are trying to undo the assumptions built into the infrastructure. She ended with principles and lessons (only some of which I had time to note):


  • They are not here to empower others
  • They should not endanger others
  • They should encourage rehistoricization


  • Nothing wrong with slow DH - build on existing stuff
  • Vulnerability is ok
  • Bring in larger context
  • Part of doing the work is saying no
  • Colonial infrastructure is just like decolonial infrastructure

Purdom Lindblad talked about what they are doing at MITH (Maryland Institute for Technology in the Humanities) around developing values. She didn't want us to go public with their discussion about values yet.

Roopika Risam talked about the Torn Apart project. This project moved quickly to document and visualize separations. See She talked about creating a project quickly. She was very interesting on the ethics of openness. Her conclusions sounded similar to what Berendt and I came up with in our essay Information Wants To Be Free?. See the Wired story on this:

Carolina Villarroel and Gabriela Baeza Ventura then talked about the Recovering the US Hispanic Literary Heritage project. (Thanks to the project for correcting my notes.) Their slides are at #DH2018

This is a project that is locating, preserving and disseminating Hispanic culture of the US in written form. They have done bibliographic work. They gather archives. They publish. They organize conferences. They support grad students. This work decolonizes the imaginary of the US. It challenges the sense of what is American literature.

They talked about the USLDH (US Latino DH) - The Digital Humanities and Social Justice speaker series explored the ethics of DH. They are asking whose stories are told, archived, and studied. Can DH decode privilege and subject positions? They are working on a local Houston history and oral histories.

The discussion started with an excellent question as to what we are anxious about. I'm anxious about silence - the silence that you realize you need to keep in order to deal ethically with a situation. Not just shutting up, but having to delete data for ethical reasons that you know could be useful.

They talked about how to deal with questions of social justice. There are different ways to deal with our times.

Closing Plenary: Schuyler Esprit: Digital Experimentation, Courageous Citizenship and Caribbean Futurism

Alex Gil introduced the keynote speaker.

Schuyler Esprit thanked the organizers. The significance of this moment and the choice of two women keynotes is important. She then moved to a history lesson about Dominica and the coastal communities like where she grew up. Mahaut is typical of planning issues after a divisive plantation culture and slavery. The community is aspirational and upwardly mobile but is still constrained by history. In 2017 coastal cities were damaged by the hurricane. People forced to live on the coast (inland plantations were owned) are now being dislocated. Carisealand is a project in response, dealing with the futures in the face of climate change. How can literature and the arts contribute to the change? They have projects around:

  • Agriculture and food/water security
  • Public health and pollution
  • Arts and activism
  • The foreign aid or "disaster porn" complex
  • Environmental law

She turns to the imaginative, the science fiction imaginary. The project will imagine Mahaut and alternative ways of living. She proposes to experiment with the digital to imagine the future.

They are working from Brathwaite's principles of Caribbean cosmology. She then worked through how they are doing the building. They are building on Google Maps. This rebuilding is based on Love, Live, and Work.

She gave an example of the house of her grandparents. How can you do more than just provide a building? She talked about work and love.

She approaches this challenging project as narrative - an exercise in speculative fiction. They are not just documenting, but imagining. They resist the idea of "resilience" but want to fight hard to rebuild. They are fighting for the sacredness of their space, for peace.

What does it mean to be Caribbean in the age of climate change? She is asking again about freedom and what it means to be free when survival is paramount (after a hurricane). She asked what the digital humanities can do to help.



