These are my notes on CSDH-SCHN 2016, which is part of Congress 2016 in Calgary. Here is the conference schedule. Below you will also find notes on DHSI at Congress and CGSA.

They are being written live so they are full of problems and strange constructions when my mind wanders or I compress things I've heard. In particular I have strange moments where I'm trying to both listen to what is being said and write what has just been said and the two collide. Write me with corrections and I'll fix things.

The CSDH twitter tag is CSDH-SCHN 16.

Ian Milligan: Archives Unleashed: Unlocking Born-Digital Sources through Interdisciplinary Collaboration

Ian Milligan gave the opening keynote as he was awarded the Outstanding Early Career Award. He started by talking about the value of teamwork and collaboration.

The Problem

The research question that has animated his research recently can be summed up by a Geocities page or two. Until Yahoo shut it down, Geocities gave people a place to write whatever they wanted for 15 years. When you try to study Geocities you see the problem of scale. By Oct. 1997 they had 1 million users. By 2009 about 7 million. This is a scale that boggles the mind - we have an amazing amount of data about non-elite people. The Old Bailey is one of the biggest collections from before the web, but it only has about 190K trials.

Historians have typically dealt with a scarcity economy. Now we have too much. If you want to study the 1990s and beyond you have to deal with these sorts of digital resources.

This doesn't mean that we have a perfect record. There is uneven representation of people on the web. There is also a question of when the 1990s becomes history. Things become history roughly 20 years after the event. We are entering the period when the 1990s can be treated as history.

The Fears

Ian talked about his fears for how the history will be done. The decisions we make today will lay the foundations for how we work with born-digital cultural heritage! At the moment you have to use the Wayback Machine to get at web archives from the Internet Archive. This is not a serious tool for historical research. You can't distantly read using the Wayback Machine. We know we will have to develop new search engines, but what will these look like?

A fear is that these search engines will be black boxes and we won't know what happens in the engines. This is particularly true when there are too many hits and engines use relevance ranking, which then influences what you read. Historians may depend on

His deepest fear is that it won't be historians that build the tools. The Google Ngram viewer is an example - it and the underlying theory weren't built by historians.


His big take-away is that we can't do it alone. We have to bring web archives into conversation. He has been part of several neat teams with librarians, historians, and others.

This means that we have to understand how to make and contribute to teams. We need to understand:

  • How to bring students on and give them credit.
  • How to work with computer scientists.
  • How to build tools along with research and manage them.

In the last part of the paper he talked about two

Case One: Warcbase

Warcbase is being developed and run on the Geocities materials. It is called "Warc"-base because Warcs are the files of the Internet Archive. See it at and

It provides scriptable analytics and data processing. It is scalable so it can run on different types of machines. Warcbase outputs text files that can go to Voyant and they have lots of their own cool tools. They are moving away from content to working with structured metadata. He talked about how much you can learn from metadata.

Case Two: Datathon

He then talked about the problem of how to get people to use web archives. They have funding to run Datathons so that people can actually use these. They try to find the right balance of hacking and yacking. They did an analysis of the interactions and found increased networking. The final projects were brilliant.

The catch with datathons is that they are ephemeral. How can we

Case Three: WALK

WALK stands for Web Archiving for Longitudinal Knowledge. The idea is to create the infrastructure for people to be able to study Canadian web archives.

There is currently a poor interface for finding collections. We need to not just find that there are things, but be able to peek in and get access. Webarchives dot ca is the prototype.

What's Next?

Ian thinks Canada is decently positioned to lead in this area. We have decent funding streams. We have government support. We have a critical mass of digital humanists. Historians need to learn from the digital humanities, but they are not ready. Historians will face an existential crisis if they don't confront the digital.

Distance Technologies, Distant Reading, and Literary Pedagogy

I then went to a session that was joint with ACCUTE.

Meaningful Play: New pedagogical approaches to research in the humanities for undergraduate students

Matt Bouchard talked about the challenges of teaching the subtle skills of the humanities. For students it is hard to make the transition to university courses and their expectations. A lot of people think that gamification can help, but you need to understand how games work to layer them on. To do this they have developed a course where students rapidly make games and then break them. The course is taught at DHSI and it is led by Matt Bouchard and Andrew Keenan.

They don't call what they create games, but "meaningful play." Once you put it into a classroom it isn't voluntary which makes it not a game.

Digital Zombies is the name of the resulting game that came out of DHSI 2014. The play experience was meant to get students into the library and using it. The game had problems being scaled for real use as the DHSI participants who developed the game created assignments that were too complex and "real." Matt and Andrew then provided consulting to scale it (down) to work with large classes. They simplified the assignments/missions and so on. The course version of the game seems to be working well, but it is too early to tell.

The New Medievalism

Andrew Bretz talked about "Binary Implementation, Blended Results: A Case Study in Blended Teaching of The English Literary Tradition (Beginnings to 1660)". He talked about how the digital has enabled a new form of itinerant academic who can move their digital courses from place to place. He mentioned the recent Ontario "Focus on Outcomes, Centre on Students" report that suggested that academics should be entrepreneurs.

He draws on his experience bringing a Beowulf to Milton course to Laurier. Every English student has to take it and the class is well populated. He adapted the course to be a blended-learning course. The "lectures" were multi-modal and were hosted on a Laurier web server, but a lot of materials were on YouTube and students had to use other online tools.

What's interesting is that he developed the course without much help from central. To meet with the office of learning would have taken too much time.

He talked about how terrible it is to record video of important profs for lectures. His videos use the medium more fruitfully. He has had thousands of viewers on YouTube. For Laurier students it was easier to use YouTube than the He also brought in open access tools like Zotero, Timeline, WordPress and so on. These open tools problematize the relationship with the university which often has an investment in proprietary tools. Tenure track profs are encouraged to use the university CMS. Sessional instructors can choose to work outside the infrastructure as they will want to move their stuff.

Sessional instructors are disincentivized from using specialized tools rather than the open common ones. This then further divides the tenure track from the sessional instructors.

What if a course is not tied to a university? What if a sessional instructor can move their course from place to place? Alas this isn't done, even though there would be savings if many universities drew on such courses.

He feels DH tends to get caught up in the tools and so on, without paying attention to the changes the tools configure. Of course it is hard to predict how the tools will disrupt or not the configurations. I think it is great that

Margaret Linley: Lake District Online Pedagogy Prototype or Reading Digitally: The Book Up Close and At a Distance

The prototype is in the early phase and it adapts a database - see . This prototype lets one study individual books. It lets students think about texts, the archive, and databases. It puts them in a new relationship to knowledge. It also lets them think about how the digital changes research allowing both macro and micro analytics.

Students used WordPress to do projects. When she introduced analytical tools the students didn't use them as there wasn't the support. She is now adapting the database to provide the analytics. The prototype has 8 books loaded. Students now can use tools on the relevant books. There are lots of statistics available if students want to approach the book from a quantitative point of view.

She then demoed some of what she is doing. There is a named-entity tagger. There are visualizations, maps, and so on.

Épuisement de la Transcanadienne Montréal-Calgary : representation ou production de l’espace?

After lunch there was a panel led by Marcello Vitali Rosati on the overlaps between digital and physical space. They wanted to use digital tools to understand the trip from Montreal to Calgary. Different participants took different routes (driving, plane, train and one person who just read about it.)

The presentation is being live streamed at

He situated the project in terms of earlier more spatial travel projects. He mentioned a trip from Paris to Marseilles.

When reading a space or trip - i.e. reading about places or watching the road go by - one is actually producing a space. One creates an imaginary. They travelled not as sociologists or ethnographers, but as innocents, open to developing their own imaginary.

Erwan Geffroy then talked about the image side of the project. They have a Flickr site with photos from the road trip. They geolocated the images. He showed some images taken, I think, by mistake, that had a lomographic quality. They are almost abstract, striped photos taken out the side window.

Marcello reflected on the political dimension of the project where the public space traversed is stored and made available on private services (like Flickr.) How can we regain a

Julie T.-Devirieux tweeted across the country. See @julietrembalyde or #transcan16. Of course she often had trouble connecting as they drove through blind spots without cell coverage.

They brought poetry and other books about the places they were traversing. She reflected on how the window frames the landscape.

Next up was Claire Legendre who took the train and was posting to Instagram. Marcello introduced her, commenting on how each commercial service jealously guards the content you put up there. Given more time one could produce standardized and interoperable data. For Claire this trip let her discover Canada and try a type of travel with the digital. She wanted to document the different events of the trip. Her imaginary of the trip was structured by ideas of train travel, especially European train travel with all its dangers.

She talked about not having network connection and thus being digitally nowhere, even though she was in a community space (the train.) Without Google maps you are nowhere - if you can't connect then you aren't there - you are in an imaginary.

One of the neat things of the project is all the representations produced. Marcello has been tweeting links I should look at like and - I love the circularity of having my notes live commented by a presenter.

Marcello joked about how the digital never works when you need it to (as the next presentation failed to work.)

Marie-Christine Corbeil showed a Google map with the route some of them took. The map can then be linked to the other media. She then talked about a Storify that recombines the materials. They had a community that gathered around the hashtag (#transcan16) and sent them suggestions. It's neat how followers joined in.

Servanne Monjour talked about the construction of the Canadian countryside. She took her trip on Google street view taking postcards on her virtual way that are gathered at . She combines her postcards with what she was reading. The strange thing for me was the winter scenes.

What does it mean to say that the Canadian countryside is beautiful? Is the beauty constructed? Do the digital views available change things? How often does Street View show snow? Rarely, which is strange when you consider how Canada is often represented.

She talked about how such virtual travel differs in other ways from how people describe it. There is no noise in Street View, no congestion ... no Canadian Tire trucks to accompany you.

She also talked about the bilingualism of Google Street View. She was interesting on trying to compare Street View and texts. It can be hard to find the exact place from a text. The texts deform the image and vice versa.

And I thought I documented things when I travelled! This raises it to a whole new level.

I tried to ask about the compression of time. Marcello answered talking about rhythms and architectures of time. The digital maps time onto visual space.

Finally I asked whether the project has an end, which it doesn't.

Digital Demos

There was a great set of digital demonstrations. For example there was an interesting demo by Reiner Kramer about doing analytics on music XML using PD and Python. He was building interfaces using PD for Python analytics, which struck me as a neat combination of technologies.

Tim MacKay and Elizabeth Banks talked about "Re:Collection: A Digital Tool for Presenting Firsthand Accounts of the Holocaust." This project has built a beautiful interface to Canadian Holocaust testimony.

Mihaela Ilovan was showing CWRC which is now in beta.

Hermeneutica Book Launch

We had a book launch for Hermeneutica: Computer-Assisted Text Analysis in the Humanities, the book Stéfan Sinclair and I wrote.

Tuesday, May 31st

Trace of Theory Panel

I was part of a panel on the Trace of Theory project so there are no notes.

Tara McPherson: DH by Design

McPherson is the genius behind Vectors and Scalar.

She started by talking about the Neoliberal Tools essay in the LA Review of Books and how she hopes to show that developing a tool like Scalar isn't necessarily neoliberal.

She is turning away from big data and focusing on the connection of DH and the arts/design. She has led a team with designers since

Her story overlaps with the story of DH, which can often seem overly focused on textual and corpus computing. Her story is different. An alternate story might begin with various experiments in screen culture. E.A.T. (Experiments in Art and Technology) was developing ideas and collaborative work processes at the same time as Busa. They looked at how artists might contribute more than text. This might be connected to virtual and augmented reality. In LA the Eames office also explored multimedia. Think of the IBM pavilion at the New York fair in 1964. They pitched a multimedia lecture idea to UCLA (with smell.) Their work dealt with speed, visualization and public media. Computation was intertwined with images and other media, not just text. The origin stories of DH often focus only on heroic moments (of textual computing.)

McPherson talked about how mappings can have problems. Does periodizing collapse the variety of work? Strategies of periodizing recapitulate the dominant stories of waves of technology.

The Vectors lab entangles theory in the digital humanities. Theory was brought into tension with media. She talked about the value of using not just theory to engage but also making. "Start wordy in nerdy"

They draw particular inspiration from feminists who are undertaking creative practice as a way to intervene. Curiosity about the material - about what material can do - is different from discursive interventions. What could words do with the database? She showed some examples from Vectors.

When we work from composing to compositing practices it defamiliarizes critique. It is not surprising that people seem optimistic about making compared to dry and negative theory. For McPherson it isn't productive to separate theory and making.

She then described a set of projects that are in a feminist register. She started with Vectors that published multi-modal works. They had to develop new processes as they rarely got finished works. Their early projects were speculative in the sense Johanna Drucker talked about. She talked about some of these, like Public Secrets. Digital Dynamics Across Cultures by Kim Christen is another. It resists openness - frustrating the scholar. Christen sees openness as not necessarily good - for aboriginal cultures opacity is often better. Christen has since worked on Mukurtu is

Emily Thompson's The Roaring Twenties is about noise complaints in the 1920s in New York.

Trevor Paglen Unmarked Planes + Hidden Geographies

Like an art or design studio they are learning by doing. There are a lot of issues too. The works are often very intensive custom collaborations. Many of the projects were done in Flash. Things that are customized are more likely to break. None of the projects were attentive to ADA requirements. These lessons led to Scalar, a project that has built an easier to use tool for people to be able to make their own multimedia/multimodal works. Scalar has an open API and ways to export things. Version 2 has just been released. They have about 18,000 users.

She described some projects done with Scalar. For all sorts of scholars like sound scholars it is ridiculous to publish only into print. The digital can return us to close reading of media (sound, video and so on.)

Freedom's Ring by Evan Bissell with Erik Loyer is about Martin Luther King. Nicole Starosielski's Surfacing is about the cables that move data around the world.

Samantha Gorman's Pry doesn't use Scalar, but is a neat interactive fiction for iOS.

Web Archives for Digital Humanities panel

Ian Milligan and Nick Ruest: Enabling Access to Canadian Archival Collections through WebArchives.ca

Ian started by talking about the importance of web archives to history. This recapped a lot of what he said on Monday. He took a research question about recent changes to Canadian society. Historians seem to only use elite sources like the Globe rather than web archives. U of Toronto has been collecting archives on Canadian Political Parties and Political Interest Groups - why not use that? The problem is the site has a lousy search interface that doesn't facet results. Nick Ruest talked about how they have created an alternative interface that is much better for researchers. See . Ian then quickly gave an example of how you can use the new environment.

They got lots of media attention doing this:

  • Parties erase materials from their sites
  • Parties flirted with user comments
  • Absences can be more informative than presences
  • You can follow prominent politicians
  • Enabling user access is transformative - he gave an example of something that was found by users

Nick then closed with something on dealing with messy data. He showed how their tool can push stuff to Voyant. Very cool.

Todd Suomela: Creating just-in-time corpora

Todd talked about our projects to create corpora of stuff that is happening now. Most people working with web archives are not gathering the materials. Further, web archiving tools like Archive-It are not good at certain social media. The IIPC (International Internet Preservation Consortium) has done a lot of work on tools like Heritrix. For just-in-time corpora there are other tools.

He talked about event-driven corpora where you build a corpus around a recent event like blacklivesmatter, Charlie Hebdo and so on.

The two examples that he talked about were gg and the Fort McMurray Wildfire 2016. He didn't talk about our results so much as about how to crawl these materials. He mentioned "crawler traps" that catch your crawl and distort the record.

Web archiving tools like Heritrix are more document oriented. For our projects we had to use tools that would track Twitter.

He had some nice slides showing the workflows. For example for the Twitter data this is the flow:

  • We use Twarc to interact with the Twitter API scraping twice a day and saving JSON to the cloud
  • We use a JSON extractor to pull CSVs from the JSON and R scripts to do simple processing
  • We then use custom scripts to strip out the text and pass to tools like Voyant
  • We create custom visualization tools or analytical tools to answer specific questions
  • We use institutional archiving tools to then hold the data
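The JSON-to-CSV step in the flow above could look something like the following. This is my own minimal sketch, not the extractor the team uses: it assumes the harvest is saved as one JSON tweet per line, and the field names (`id_str`, `created_at`, `user.screen_name`, `text`) and the function names are assumptions for illustration.

```python
import csv
import io
import json

# Columns to keep; an assumption about which tweet fields matter for text analysis.
FIELDS = ["id_str", "created_at", "screen_name", "text"]


def tweets_to_csv(json_lines, out):
    """Write one CSV row per tweet found in an iterable of JSON lines.

    Returns the number of tweets written. Blank lines are skipped.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        writer.writerow({
            "id_str": tweet.get("id_str", ""),
            "created_at": tweet.get("created_at", ""),
            "screen_name": tweet.get("user", {}).get("screen_name", ""),
            # Flatten newlines so each tweet stays on one CSV line.
            "text": tweet.get("text", "").replace("\n", " "),
        })
        count += 1
    return count


def tweets_to_csv_text(json_lines):
    """Convenience wrapper that returns the CSV as a string."""
    buffer = io.StringIO()
    tweets_to_csv(json_lines, buffer)
    return buffer.getvalue()
```

The resulting CSV can then be read by R scripts, or the text column can be stripped out and passed to a tool like Voyant.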

Can we make this easier? No. The web archivists are outgunned by all the people creating innovative web sites and services. There are also commercial silos that you can't get at like Facebook.

There was a question about languages. Ian talked about using Tika for language recognition.

Jeremy Wiebe: Enabling Analytics through the Warcbase Platform

Jeremy talked about the Warcbase tool they have been developing. This can be downloaded and run locally. It now has a scriptable data analytics API. How does one work through a large web archive? You can't read it so you need exploratory tools. That is where warcbase fits in.

He showed code from a Spark notebook that does analytics. He talked about how you can filter all the web pages. He talked about the cool D3 stuff.
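To give a sense of the kind of filter-and-count analytics described, here is a small sketch in plain Python. It is not Warcbase's API (which I did not record); it only assumes you already have a list of page URLs extracted from an archive, and `count_domains` and its `keep` parameter are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(urls, keep=None):
    """Count pages per domain, optionally filtering URLs first.

    `keep` is an optional function that takes a URL and returns True
    if the page should be counted.
    """
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if not host:
            continue  # not an absolute URL, so no domain to count
        if keep is not None and not keep(url):
            continue
        counts[host] += 1
    return counts
```

A table of domain counts like this is the sort of metadata summary that can be handed to a visualization layer.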

Florence Chee: Ethics and Web Archiving

Florence talked about how we are thinking through the ethics of gathering all the gg twitter stuff. Who are the subjects of research? Should we archive tweets that might embarrass people in the future or cause pain to the targets?

Who are the stakeholders? She talked about the types of stakeholders:

  • The gg tweeters
  • The people harassed or discussed
  • The research team

She then talked about how we are thinking through the ethical issues guided by the ethics of care. Care seemed the right lens.

She talked about how one needs to take care of oneself if one gets attacked. The attacks are not micro aggressions but macro aggressions. The attacks cause affective hidden work.

She talked about how the ethics of care changed our position on ethics, especially leading to the deleting of the 4chan/8chan data.

She then talked about relationships and the variety of them. Relationships are central to care. Understanding how to care in the relationships is important. How do we craft careful relationships with the stakeholders?

There was a great question about speaking to power. Who has power? As academics we have the power of the academy and the power of the archive.

Afrofuturism, Black Speculative Art and the Digital Humanities

Reynaldo Anderson: Afrofuturism, Black Speculative Art and the Digital Humanities

Anderson is at Harris-Stowe State University, which is in St. Louis, so he has been involved in what is happening there. He and others organized an exhibition on "Unveiling Visions: The Alchemy of the Black Imagination" at the Schomburg. He talked about the intersections of artivism, or art activism, in the exhibition.

Where do arts, artivism, discursive space, digital humanities and futurity intersect? Afrofuturism is often focused on a small number of people like Octavia Butler.

The California Ideology is a term for the social context in which Afrofuturism, a term coined by Mark Dery, emerges. It was defined as speculative fiction that "treats African American themes ... " There are some weaknesses - it tended to deal in imaginary spaces which was American-centric.

Afrofuturism 2.0 and the Dark Enlightenment is the "early 21st century technogenesis of Black Identity ..." The Dark Enlightenment is this era of techno-libertarian determinism characterized by reactionaries and neo-social Darwinism. See "Afrofuturism 2.0" collection.

Where do the humanities come in? Critical race design could connect to the digital humanities. Afrofuturism is a theoretical framework and critical race design is the methodology. They are making comics to gain audience. He talked about Bruce Sterling's diegetic prototypes and their lack of history, but critical race design brings the history back in. He showed how they are designing the exhibit in SketchUp. They used tablets to share art that wasn't on the wall. They had cosplay at the exhibit.

He ended with a definition of his own, a definition of Black Speculative Art. See parts of the manifesto at

They are fighting for control of their own imagination. They want to define how they see themselves and not be defined by others.

He talked about opacity - how sometimes you need to hide what you are doing. Transparency is not for all situations.

Toniesha L. Taylor: Womanist Rhetoric, Afrofuturism and the Speculation of Respectability

Toniesha started by promoting her Debates in Digital Humanities 2016 co-authored essay on "Pedagogies of Race: Digital Humanities in the Age of Ferguson".

She then defined social justice and talked about how, if you were a Black woman in the US, the summer of 2015 seemed to be never ending bad news and public grieving. It seemed an endless summer of violence. The violence is structured and continuous. She talked about some of the events and about how the hashtag #whoisburningblackchurches was getting people to see how things were not random.

Toniesha Taylor has been gathering tweets and other materials over the summer. Many had to do with the Sandra Bland case. She talked about the suicide story and the likelihood.

Taylor developed a theory of womanist rhetoric - what is critical is "womanist voices, gendered cultural knowledge, and womanist ethics that includes an ethical discourse of Black love, ethical discourse of salvation and an ethical discourse of social justice."

She talked about how you employ all the things you learn through gendered spaces. She talked about embodiment and how one protects oneself. The ethics includes an idea of safety.

She talked about the data they gathered and tools. She uses hashtags, even though they have problems. There is also the problem of what hashtags to capture and staying on top of changing tags.

She then passed the data to Voyant.

Bree Newsome is an activist who climbed up the flagpole at the South Carolina State House in Columbia and took down the Confederate flag.

She also used Pinterest to gather images she liked of, for example, Afrofuturism. Bree Newsome, for example, was reimagined as a superhero.

She likes the changing of the story from one of victimization to heroism.

Bree Newsome tweets out when she gets out of jail to tell her own story.

Taylor talked about the pushback rhetoric of respectability. If only X was more respectful they would be alive. This idea needs to be pushed back.

Visualization is the future. It is where the important interactions are. How do people interact with the videos of some of these events?

Taylor also wants to move to the intersections of Afrofuturism and womanist rhetoric. How do we imagine the future and tell stories about it.

There was a great discussion after about affect - how hard it can be emotionally to study certain stuff.

Wednesday, June 1

Milena Radzikowska: Materializing Text Analytical Experiences: Taking Bubblelines Literally

Milena presented a project with what looks like a neat team. She had an opening slide with pictures of the team - nice touch. She had an interesting machine in front of her which tempted me to sit right at the front. She told a story about how this project got all sorts of people talking about the digital humanities in a bike shop.

The project took the Bubblelines and created a material instantiation. She described the brainstorming process of bringing together a team. They saw it as an exploratory adventure.

Their question: Which features of select DH systems are more amenable to direct translation to the physical and which benefit from new forms of physical expression?

She commented about the suspense of physical systems. You have to wait and watch. Suspense gives time to think about results as they appear.

Second question: How can physical space be used to strengthen users' sense of size and complexity of a collection?

Third question: How does the use of various materials, for example ceramics, LEDs, canvas, or string, change the experience of the user?

She talked about the materiality of time and how the physical reminds us of time in a way that the virtual is out of time. She also talked about how this is approachable in ways that the online tools are not.

She then ran the machine and it poured dark or blue sand as it hit places in the text with the two keywords "open" and "free". This produced transparent tubes with different striations of sand that show where the words are. The machine runs off an Arduino that opens and closes spigots as the different sands fall. At the moment it is hard coded from a table.
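The machine is hard coded from a table at the moment, but one could imagine generating that table from the text. Here is a minimal sketch of my own, not the team's code: it splits a text into equal segments (one per layer of sand) and records which keywords occur in each. The function name and the default number of segments are assumptions.

```python
import re


def sand_table(text, keywords, segments=10):
    """Return, for each segment of the text, the keywords found in it.

    Each entry in the returned list corresponds to one layer in the tube;
    an empty entry means no keyword (and so no coloured sand) for that layer.
    """
    words = re.findall(r"[a-z']+", text.lower())
    table = [[] for _ in range(segments)]
    if not words:
        return table
    for position, word in enumerate(words):
        if word in keywords:
            # Map the word position onto one of the equal-sized segments.
            segment = position * segments // len(words)
            if word not in table[segment]:
                table[segment].append(word)
    return table
```

For the demo keywords this would be called as `sand_table(text, {"open", "free"})`, and each row could then be translated into which spigot to open.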

What fun! She invited us to think of interfaces we want materialized.

Jason Boyd: Reading Texting Wilde: Interfaces for Analyzing a Digital Corpus of Texts

Jason started with a slide on the research challenge showing four different versions of the same story. He showed how the different stories told about Oscar Wilde reflect their specific contexts. The stories

The Texting Wilde project is exploring computer-assisted methods for the study of large (not massive) corpora of life writing. He is developing a TEI customization for exegetical analysis. He showed some of the encoding that uses <seg>s. I wonder if he found he needed overlapping elements. They also use the <said> tags for speech reported by others.

He is working on the paragraph as the unit of pulling results. He wants to match similar stories across the corpus as different people tell different stories about Wilde. Can entities be extracted and matched automatically?

One of his ideas for an interface for the open ended interpretation is the "evidence wall". He wants to enable a freestyle process. I wonder how CWRC-Writer might help. He wants something that finds clusters of entities (that would make up a story) and matches them. I wonder if sequence alignment might work.
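To make my sequence alignment musing concrete, here is a rough sketch. It is my own speculation, not anything Jason showed: it treats each story as an ordered list of entity names and uses Python's `difflib.SequenceMatcher` (a matching-block comparison, standing in for a proper alignment algorithm) to score how similar two stories are. The story ids, entity names and threshold are placeholders.

```python
from difflib import SequenceMatcher


def story_similarity(entities_a, entities_b):
    """Return a 0..1 similarity between two ordered lists of entity names."""
    return SequenceMatcher(None, entities_a, entities_b, autojunk=False).ratio()


def best_matches(stories, threshold=0.6):
    """Pair up stories whose entity sequences score at or above the threshold.

    `stories` maps a story id to its ordered list of entity names.
    Returns (id_a, id_b, score) tuples, highest score first.
    """
    names = sorted(stories)
    pairs = []
    for index, a in enumerate(names):
        for b in names[index + 1:]:
            score = story_similarity(stories[a], stories[b])
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return sorted(pairs, key=lambda pair: -pair[2])
```

Candidate pairs found this way would still need a human reader to decide whether they are really the same story.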

Another type of match would be on speech units. Can one find story telling moves?

Then Jason moved to the question of what one does with matches. He has developed a canonical chronology and then connects the matches to it. He wants to be able to go back and forth between stories and chronology. He wants to see things on a timeline so one can compare stories to what we know. He showed an example where an event told is hard to place and may say more about the storytellers.

They actually need a double timeline - one of publication and one of Wilde's life.

He talked about a classification system that allows sources of stories to be tracked. He uses the nesting of OCHO and that nicely shows stories said by someone in stories said by someone. The saids within saids could be visualized.
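As a sketch of how the saids within saids could be pulled out for visualization, the following walks an XML fragment and records the chain of speakers for each nested `<said>`. This is my own illustration, not the project's code: it assumes the speaker is recorded in a `who` attribute and ignores the TEI namespace for brevity, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def said_chains(element, chain=()):
    """Return the chain of speakers for every <said> under `element`.

    A <said> nested inside another <said> yields a longer chain, e.g.
    ("#a", "#b") for something #b said as reported by #a.
    """
    results = []
    for child in element:
        if child.tag == "said":
            new_chain = chain + (child.get("who", "unknown"),)
            results.append(new_chain)
            results.extend(said_chains(child, new_chain))
        else:
            # Keep descending through non-<said> elements such as <seg>.
            results.extend(said_chains(child, chain))
    return results


def parse_said_chains(xml_text):
    """Parse an XML string and return its speaker chains."""
    return said_chains(ET.fromstring(xml_text))
```

Each chain could then be drawn as a path in a tree or an indented list, showing who reports whom.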

The ultimate objective is not to create the ultimate biography, but to foreground the ambiguity in the sources about Wilde. His life has become grist for different interpretations. He wants to show how Wilde challenges biography rather than being exploited by it.

Enhancing Physical Analysis Tools with Virtual Affordances

Stan Ruecker wasn't here to present so Milena read his paper - or to be more accurate she presented the slides. The project uses three-dimensionality. They have plexiglas plates that people can construct as the plates have magnets. The panels can then take post-it notes or be drawn on. You can have two sides of an issue.

The next phase of the project is taking that model and building a virtual version. You can make your own 3D shapes and then attach things to those surfaces. You gain the computing power, but lose the interaction of the physical.

Now they are trying to have the physical plates, but have that reflected on the screen and vice-versa. In a related project they are using cameras that, in effect, scan the shape physically assembled and represent it in a game engine. Another idea is to replace the small panels with big panels that can be repositioned as needed.

We then had general questions. Milena talked about how doing a physical project gave her a much better appreciation of the hard work of coding and the time it takes.

Dominic Forest: Conception d’une grille d’évaluation de référence pour les outils de fouille de textes

Dominic started by talking about how they are documenting the variety of electronic documents and an increase in the diversity of text mining tools. How can the tools be described, evaluated, and organized? He is focusing on tools that can extract new information on structured patterns.

He had a nice graph that shows the flow in text mining from corpus to filters to statistics and so on. Some of the places they found tools:

  • Kdnuggets - Lots of stuff, but chaotic
  • DiRT - More academic

He then talked about TAPoR 3.0, which has only 7 text mining tools. He talked about the faceted searching.

How can one evaluate text mining tools? Dominic talked about how fragmented the evaluation literature is. There are different perspectives, from interface evaluation to evaluation of algorithms. It is hard to get a global perspective. He builds on Collier et al. (1999). This is important, and we need to learn from Forest's review of evaluation strategies/criteria. He suggested we need to develop a more mature culture of evaluation.

How can one do evaluations that can be understood by humanists? He suggests a three-level strategy:

  • Factual evaluation - basic information
  • Functional evaluation - what can you do with it
  • Ergonomic evaluation - usability

I asked if one can also think about the community of evaluation. How can we create a culture of people paying attention and sharing insight? We

Grady Zielke: TAPoR 3: Code as Tool

Grady talked about the TAPoR project. He mentioned that we are on the third iteration. This iteration has a new code base to make future development possible.

He talked about the code as tool feature that allows us to document useful chunks of code that usually disappear.

He then gave a live demo. He talked about the home page and the rich prospect browser. He showed how to search or filter tools. He showed a tool page and how we use the comments for adding all sorts of other information.

He talked about how the code-as-tool feature is trying to find a niche between GitHub and Stack Overflow. Then he showed the list feature.

We talked about the possibility of a French version. How can one create the community that would support a multilingual version?

Ian Lancashire: LEME's Hidden Tools

LEME was supported by TAPoR (CFI) and SSHRC. It is hosted by U of Toronto Library IT. It is published by the U of Toronto Press and free, but they sell licenses to support it. It offers 1300 lexicons of the period (1480 to 1755). He showed an entry.

He talked about the editorial tools in the system that are available to only the editor - hence hidden. He can get reports on tags. Lists of different words/spellings and other fields.

He showed the screen he uses to interrogate his tagging. He showed the raw entry page and talked about how the raw text isn't copyrighted, but they regard the tagged text to be copyrighted.

This data can be used in interesting ways. You can study the evolution of English vocabulary. You could use the dictionaries in your own projects. (Could one use it to create historically accurate search patterns?)

He showed how LEME has helped answer the question about the increase in English vocabulary. Other estimates are way off as they use the OED and suffer from not separating different spellings. His findings show a different growth with moments of decreases. He talked about how lexicographers can inflate a language. You take English and Italian and combine them. You hire staff to collect quotations. The OED had hundreds of lexicographers working for years.

Ian then talked about another way of measuring. What can individuals read, write, and speak? They (Ian and Elisa Tersigni) studied letter writers to see how vocabulary richness in letters changed. The vocabulary richness increases much less. Dictionaries become monstrous distortions of the English language as spoken, read, and written by living individuals.
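The talk did not say which measure of vocabulary richness they used. As an illustration only, here is a minimal sketch of one common measure, the type-token ratio (distinct words divided by total words), plus a windowed variant that is less sensitive to letter length. The function names and the crude tokenizer are my assumptions.

```python
import re


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Distinct words divided by total words (0.0 for empty text)."""
    words = tokens(text)
    return len(set(words)) / len(words) if words else 0.0


def windowed_ttr(text, window=100):
    """Mean type-token ratio over consecutive full windows of words.

    Plain TTR falls as texts get longer, so averaging over fixed-size
    windows makes long and short letters easier to compare.
    """
    words = tokens(text)
    chunks = [words[i:i + window]
              for i in range(0, len(words) - window + 1, window)]
    if not chunks:  # text shorter than one window
        return type_token_ratio(text)
    return sum(len(set(c)) / len(c) for c in chunks) / len(chunks)


print(type_token_ratio("the cat and the hat"))
```

Early modern spelling variation would need normalizing before a measure like this meant much, which is the same problem he raised about estimates based on the OED.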

One of the things that changed was our notion of what should be named in dictionaries. Early on nouns were things and reflected the things experienced. Later lexicographers added new ideas of what a noun was.

Rockwell and Sinclair: Thinking Through Analytical Things

I gave a paper on work Stéfan Sinclair and I are doing around replicating analytics and argued it is one way to develop analytical literacy - a literacy needed in the face of what we are learning after Snowden.

Maria Martinez Gragera: Smart cities: Anatomie philosophique des représentations de l’intelligence de la ville numérique.

Maria is an architect who teaches in an architecture school and works on mathematical tools. She is interested in the culture of math. She is interested in how AI is projected. She talked about "smart" cities where AI is being used to manage us. She mentioned Songdo International Business District in Korea. Montreal has stuff on "Ville Intelligent et Numerique." These projections of human characteristics onto the city interest her.

She talked about the history of the idea of urbanism. How and when did people think about managing cities to manage the social? Le Corbusier had a rationalist position that everything could be understood and managed through urban planning. It is a plan that doesn't make clear the force and power.

She talked about Rem Koolhaas and Bruce Mau on "S,M,L,XL". This is full of fragments, some of which are defeatist before neoliberal management. She talked about the language of the discussion of smart cities. She mentioned Hollands (2008) "Will the real smart city please stand up?" City, 12:3, 303-320. He talks about how we have to pay attention to capital and power rather than believing IT will solve things. Antoine Picon is someone else she mentioned, as is the MIT Senseable City Lab.

She finished with a call for thinking about the quantitative.

Daniel O'Donnell: The Tip of the Iceberg: Transparency and Diversity in Contemporary DH

Daniel talked about how a student (Shealeigh Brandford) raised some issues that he has been thinking about. He presented the problem.

"For a group that emphasizes opennessness, collaboration, and sharing, you sure do have a lot going on under the water."

The student found the stuff about John Nerbonne resigning as ADHO Steering Committee Chair. Attached to the statement was a discussion about the need for a protocol to deal with communication across cultures.

Whatever was going on, these messages were all one could read about, but the messages made it clear that something happened and that something was not actually discussed. There was a tip of an iceberg.

His guess about what was going on was:

  • Debate between and among members of various ADHO committees about the proper place of and deference to issues of diversity and inclusivity.
  • Discussion about appropriate professional behaviours.
  • Arguably, a very close call for the survival of ADHO itself.

There have been a lot of diversity statements in the last six months. You can kind of tell what is happening from these, including a discussion about a path for redress of grievances. All this got Dan thinking:

  • Why this opacity?
  • Is this opacity unusual?
  • Is this opacity important?

He referenced Domenico Fiormonte's article "Towards a Cultural Critique of the Digital Humanities" to the effect that we are very centralized with a small number of people. Dan's view is that we have had a concentration that has benefited us, though. A small number of people make a lot of decisions.

Does being in management roles really make a difference? One difference is that we don't want to air dirty laundry. We want to present a common and friendly front.

Does it matter that we in the digital humanities are opaque? Dan feels it does matter, as we are a field that calls for openness. What is being protected is the older generation (like us.) That this is a field about organization and infrastructure means that we should be open about our real organizational challenges.

What can be done? He talked about Unsworth's summary of the TEI leadership spat, which was more open. That didn't help because the board was attacked for months.

Some answers that came up included:

  • Can we use our tools (visualization and analytics) to understand what is happening?
  • There should be mentorship for people new to committees

CSDH-SCHN Annual General Meeting

Here are my notes on the agenda:

Tri-Agency Data Management Policy Initiative

We first heard about the Tri-Agency Data Management Policy Initiative from the folk at SSHRC. They are encouraging the planning for data management. There is now a Tri-Agency Statement of Principles. They have been running a consultation that got mostly positive feedback, but concerns about the lack of infrastructure.

He mentioned how associations can help by having a policy, by fostering workshops, by developing resources. Next they are moving towards determining an actual policy with implementation timelines. They are working with the Portage Network.

What would it mean for data management plans to be reviewed, and by whom?

They emphasized that they are in a listening mode. SSHRC doesn't own this.

We heard from the Treasurer and

Guidelines on Digital Scholarship in the Humanities

I proposed a policy to CSDH which passed.

Be it resolved that the Society/Association adopt the Guidelines on the Evaluation of Digital Scholarship with the understanding that the Executive can edit and translate the text as needed to align the guidelines with that of other professional organizations. Further, be it resolved that the Guidelines should be edited by the Executive as scholarly technologies change. Let the Executive be empowered to revise and extend these guidelines in the future as long as the membership is informed.

The document then included Guidelines.

CSDH/SCHN Draft Inclusiveness Statement

We then briefly discussed the Draft Inclusiveness Statement. Here is the opening of the statement:

As representatives of digital humanities scholars in Canada, CSDH/SCHN already strives to be open and inclusive in a way that welcomes newcomers to our discipline. This means working to include people from academic communities adjacent to DH, engaging with the perspectives of newcomers, and extending our welcome beyond academia, so that we can learn from others doing digital work.

Diane Jakacki: How do we Teach? Digital Humanities Pedagogy in an Imperfect World

Diane gave the final Keynote. She started by talking to us about how we epitomize the possibilities for progressive DH.

She gave the keynote without slides and talked about how she wanted to emphasize the humanity of teaching. No amount of technology can replace the relationship between us in learning.

As we become more attractive as a brand, more pressure is being put on us to develop and implement curricular DH. Most of us experienced two educations. We fell in love with and learned the humanities and in our second education we had to learn without disciplinary structures. In our first education we took survey courses; in our second education we read what we found or people told us about.

At this point, those of us who are self-taught have to learn how to teach. It is important that we pay attention to instruction when all these expectations are on us. It is important that it is digital humanists that teach the digital humanities.

She then framed some questions:

  • Who do we teach?
  • Why do we teach?
  • Who are we to teach?

She described some of the courses she learned from and taught. She talked about how there is no shame in teaching a course that helps people get a job.

She talked about responses to the “Neoliberal Tools (and Archives): A Political History of Digital Humanities” article. In particular she talked about Brian Greenspan's excellent The Scandal of Digital Humanities where he argued that DH uncovers the gears of the digital rather than being just a tool of the system.

She talked about how many of those employing progressive teaching models are untenured and casual instructors. Why are senior faculty not sharing?

We can't expect students to teach themselves and we can't expect ourselves to learn everything including teaching.

She proposed a co-teaching idea. She recognized how co-teaching may not scale. We need to train the teachers to train the teachers. She suggested that DHSI can provide a model. We professionalize and recognize that our teaching should not be only aimed at the tenure-track job. We need to better prepare our students for the variety of jobs out there.

In discussion we talked about issues like:

  • How to teach students to learn digital on their own?
  • Why do people see the humanities PhD as antithetical to data-driven culture?
  • How can we bring digital humanities research into the classroom?

I was struck by the paradox of trying to teach the digital humanities which many of us learned on our own. If how we learned is the best way, then we shouldn't teach it; we should resist teaching recipes. We need to create the conditions for scrappy self-taught digital humanities.

In the paper Stéfan and I gave, we concluded by arguing that, after Snowden, it is now a citizenship issue to teach analytics (so citizens understand how big data analytics is being used to manage them). We are called to teach analytics in a way that allows people to understand the messiness of the practices.

We ended by talking about carework as a way of supporting graduate students, but also that we should recognize all the carework that makes possible what we do.

This was the end of CSDH 2016. I love the idea of a keynote after the AGM.

Thursday, June 2

CWRC presentation at DHSI @ Congress

I introduced a session presenting the CWRC beta to the DHSI group at Congress. The Canadian Writing Research Collaboratory is led by Susan Brown. We are building a distributed scholarly editing environment that is in beta now and will be released in the Fall.

Some of the goals of the project were to:

  • Facilitate collaboration, both within teams and across teams
  • Overcome data silos so that projects can share standards and linked data
  • Support the life-cycle of a project from design to publication to archiving
  • Build sustainable infrastructure

Mihaela then followed with a demo. She showed:

  • Some of the neat seed projects and how a project can use CWRC to manage the project, but have their own interface.
  • How you can publish objects to the web with different levels of permission. You can, for example, let citizen researchers read and annotate documents.
  • There is a rich faceted search tool.
  • There are bookmark lists that can be shared. Users can look at the history of an object.
  • How you can pass subsets to Voyant
  • The entity aggregation pages where named entities that have been tagged across projects can be explored. You can find all the tagged references to E. Pauline Johnson, for example.
  • How CWRC supports electronic texts and other objects (images, sound, video, xml files)

She demoed the CWRC-Writer and talked about how it is a freely available XML editor in the browser. It is easier to use than Oxygen, but not as powerful. It is meant for teams where you have research assistants performing different editing roles. CWRC allows you to design a workflow that controls who gets what document when.

Mihaela then demonstrated the dashboard feature - both the personal one and the project one. This lets one see the status of documents, communications and so on. You can manage locked documents, user roles, and so on.

Constance Crompton then talked about the human side of the project. We are building human review and support mechanisms. She talked about how we learn these tools/practices through mentorship and how CWRC will support mentorship.

Friday, June 3rd

On Friday I went to CGSA (Canadian Game Studies Association). Their programme is here

Evgeniya Kuznetsova: This Is Why We Can't Have Nice Things: Online Harassment of Game Developers

Evgeniya talked about the harassment of game designers. She talked about how easy it was for her to blame a game designer about whom she knew nothing. She has started to reflect on how fans attack designers they think are responsible for disappointments.

She talked about some example attacks on writers and designers. These attacks included death threats, professional attacks, and personal attacks questioning their integrity. The earliest case she found was from 2007, when a woman was targeted even by journalists who would discuss not the game, but her "scent". The situation culminated in a pornographic comic.

The best known case is the one that triggered gg. Fans felt that the critical acclaim for the game was not deserved and that the designers had corrupted the system. Another case involved a developer who ended up quitting designing games.

The types of attacks include:

  • Personal attacks
  • Swatting - abuse of authorities
  • Attacks on target's family/friends
  • Sexual harassment
  • Hacking
  • DDoSing
  • Forcing resignations
  • Threatening boycott

She talked about how games are online and developers are assumed to be "gamers", so they are considered fair game and part of the community in which abuse is normal. Also, games are published in an unfinished state and players are consulted as if they were important, which leads players to feel they have a right to engage.

There has been a reaction with a statement from the IGDA, a legal guide, a movement to appreciate developers and an online guide. A lot of this is aimed at the harassed rather than the harassers. Companies should back their staff better.

Chantal Robillard and Sylvain Payen: Toxicité et cohésion sociale dans League of Legends

Sylvain looked at disruptive behaviour in League of Legends, a real-time, team-based strategy game. He is part of the CIBRG project that is looking at gendered experience and social interaction in LOL. They interviewed players (men = 20 and women = 8), half French-speakers and half English-speakers. The interviewees were from Quebec and between 20 and 30.

What is the disruptive behaviour and how was it represented? A common pattern was insults/harassment in the game chat or team speak or other communication platform. Players get harassed for not honouring the game contract (which could be due to technical issues.) There are also gameplay issues that lead to harassment when a player doesn't play in a cooperative fashion or as expected.

League of Legends has some cultural aspects that may facilitate the brutal communication behaviours. It is Free-to-Play, there is a lack of control, and an evolving game culture. Now the game controls behaviour more and rewards good behaviour.

We also see YouTube hotshots who set the tone for their fans. Flaming is seen as a form of gameplay that is meant to disrupt the other player.

Almost all players report getting abused and almost none stop playing. They tolerate it and treat it leniently. The game is stressful and abuse is a consequence of the game design.

The most disturbing behaviour is when a player decides to destroy the experience of another player. Beyond insulting they force agony through revenge tactics. This leads to reports, bans and rewards. It seems that the interviewees recognized these disruptive players, but didn't think they were at fault, even though they had done disruptive things. Everyone blames others while still doing it. They blame others, even when they have been banned. They describe their bans as due to others or conspiracies.

There is also a degree to which abusive communications get claimed to be "just a joke." Even some who accept that their communication is abusive see it as a tactic and accept the consequences.

In conclusion, there is abuse, but it is always someone else's fault.

Emma Vossen: I was Vivian James: The Involvement of Girls and Women in Conservative Video Game Movements

Vossen began by asking us to be discreet with Twitter.

She talked about who she was at 16 and how she has changed. She wasn't a feminist (as she is now) as she was the child of a men's rights activist (MRA). Not much is known about MRA and it is growing.

  • A Voice for Men
  • Men Going Their Own Way
  • Pick Up Artistry


The rhetoric is usually "some rights for women is OK, but now feminism has gone too far!" This is followed by some screed. Social media has enabled MRA. People turn to MRA as a way of justifying their feeling of oppression.

She talked about what MRA taught her - that gender is policed, that it is important, and that men and women are not equal. Strangely MRA led to feminism due to the "gender politics wormhole".

When you're accustomed to privilege, equality feels like oppression. When you are accustomed to being the normal consumer

She talked about growing up treated like sh*t, including

What are the pressures put on girls to get them to separate from feminism? When you can't beat them, join them. Young women get pressured in various ways. They want to be Vivian James, a character created by the gg community to show how women should act.

The opposite of a gamer is a feminist according to . The only ones who could be both are the males. Females feel they have to choose. Now that game studies is seen as SJW, by definition we can't really be gamers and we can't understand games.

Policing women's behaviour has a long history. She is a construct, but she exists because of the ways women are trained to be like her. But, there can only be one real cool geek girl. Men encourage girls to try to be the exception to the rule and be the cool girl. The cool girl doesn't challenge and lets others feel comfortable.

Being cool, alas, you can't win, you can't be a feminist ...

Many think gg is just all guys with no lives. The reality is that the ggers are not losers. They have jobs, they have wives, they have children. They are often quite successful. We should not talk about them as losers. They often are in relationships with real power over women. The women are very real and we need to understand what choices they have.

Google "vivian james cosplay" to see how there really are women in gg.

Joining gg for women is often a survival tactic. Many women conform to the culture of gg to be safe.

The thing that scares me is the degree to which reactionary movements often appropriate and adapt the rhetoric and tactics of progressive movements. This goes beyond gg - one could argue that the California ideology drew on progressive ideas of the sixties.

G@merG@te Origins

A panel I was involved in then presented a data-driven study of gg. I'm not going to name people for obvious reasons and just use "we."

We started by talking about the gathering of data and the archive we have built at

We talked about workflows and what we were getting.

We then talked about the ethics of this research. One issue is that researchers are themselves becoming accused of ethical lapses by gg folk.

We had a short paper on the tweeters. You see a drop in casual tweeters while the hard core tweeters seem to be sticking to their guns. We looked at the frequent tweeters. We developed a rubric to study the top tweeters. The top tweeters referenced gg positively. (Only one negative reference.) A number of accounts reference other sites from which they make money.

We then talked about block lists. The block lists became a source of pride for top authors. They got a sense of belonging. There were also frequent references to being evil, monsters, oppressors of women, etc. The authors seem to be appropriating the criticism aimed at them.

We then talked about the use of neoliberal economic ideological language. GG is a movement that has been continually redefining itself, which can make data-driven research important. We talked about high profile links to conservative news like Forbes that had a story about how gg is not a hate group, but a consumer movement. Is the right wing establishment taking advantage of gg to bring in a demographic? Are some commentators using gg?

We looked at words like "consumer" and their collocates. We found consumer discourse goes down and there is a focus on SJW. There is a lot of use of "rational" - as in gg is rational and the SJWs are not.
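For readers unfamiliar with collocates, here is a minimal sketch of the idea, assuming the tweets are available as plain strings. It counts the words that fall within a few positions of a target word. The function name, the window size, and the sample sentences are hypothetical, and this is not the workflow we actually used.

```python
import re
from collections import Counter


def collocates(texts, target, window=3):
    """Count the words appearing within `window` words of `target`."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"\w+", text.lower())
        for i, word in enumerate(words):
            if word == target:
                start = max(0, i - window)
                # Words before and after the target, excluding the target.
                neighbours = words[start:i] + words[i + 1:i + 1 + window]
                counts.update(neighbours)
    return counts


# Invented sample sentences, not taken from the archive.
sample = ["gg is a consumer revolt", "the consumer revolt is rational"]
print(collocates(sample, "consumer", window=1).most_common(3))
```

Running this over tweets grouped by month would show whether the company a word like "consumer" keeps changes over time.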

We talked about how alt-right hashtags show up, but later in our data.

Whether or not the consumer revolt story was a shield, it is now important and connects gaming to various right wing ideas.

We then talked about gg and military campaign discourse. GG applied and organized using campaign language. This language comes from games like Call of Duty. There has been a lot of rhetoric about how games corrupt children, but it is interesting to look at the language used within gg. We gathered words that were used in gg and looked at them in the data, finding collocates that explained some of the semantic field.

The gg discourse makes significant use of the argument-is-war topos. There seems to be a level to which the gg community likes to see what they are doing as war. How do games frame the imagination of gamers when it comes to other aspects of their lives? Or, do gamers bring metaphors from outside into gaming?

Then we looked at hashtags as gg itself is a hashtag. The highest frequency hashtag is notyourshield which seems to be an orchestrated shield itself. We looked at the top hashtags to see if there are patterns. Few of the top tags are about ethics in journalism, and a lot are about other things like feminism and sjw. There were a number of tags referencing events when gg was discussed rather than events of corruption. Finally there were a lot of spam tags.
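Counting the high frequency hashtags is simple enough to sketch. This is a minimal example, assuming the tweets are plain strings and that tags differing only in case should be merged. The function name and sample tweets are hypothetical, and it is not the code we ran.

```python
import re
from collections import Counter


def hashtag_counts(tweets):
    """Count hashtags across tweets, ignoring case."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", tweet))
    return counts


# Invented sample tweets, not taken from the archive.
sample = ["#notyourshield and #gg", "more #NotYourShield here"]
print(hashtag_counts(sample).most_common(10))
```

The categorizing of the top tags was done by reading them, which is where the interpretive work lies.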

Some of the categories are:

  • Generic
  • Event
  • Media
  • Community building
  • Meme
  • Speme
  • Organizational
  • Trending
  • Person
  • Company
  • Games
  • Side-markers

Many of the tags were opaque - you had to know what was being referenced. People perceived as opposition often were referred to by name, while others became

We finished by talking about the polarization, or war. Looking at the data shows a more nuanced situation than gg good or bad. We too have to avoid the polarization. We academics come from a position of privilege and need to take on some of the work of interpreting the nuances of the situation rather than continuing the war.

Then we talked about how we gathered materials other than Twitter to use as a comparison. We gathered corpora from YouTube video comments - some of which had thousands of comments. Some of these started before gg, which allowed us to track the effect of gg on the larger discourse.

We showed how gg could be mentioned in a video about something else. It explodes on some forums. These people

The market language showed up a lot in these non-twitter corpora, but campaign language less so. We looked at the sentiments in some of the video comments and found a much more negative tone.

We talked then about campaign language - how military games seem to form the imagination of gg.

We then talked about an analysis of high frequency hashtags. We talked about the polarization and how we should try to avoid furthering it. Perhaps there is more nuance to gg than many think there is.

The last short talk focused on how we built and analyzed non-twitter corpora as control and origin collections. These confirmed some of the theses drawn from the twitter data.

Geoffrey Rockwell and Keiji Amano: The Show of Play: Pachinko in Japan

I presented a paper written with my collaborator Amano on representations of pachinko in movies and novels. As I was presenting I couldn't take notes. Here is the abstract:

"These silver balls are you. They’re your life itself." (Kurosawa, Ikiru, 1952)

Pachinko is the most popular game in Japan in economic terms and in terms of urban presence (Brooks et al. 2007) so it is no surprise that it is shown in other arts from cinema to literature. Notable works range from Kurosawa’s Ikiru (1952) to Miri Yu’s novel Gold Rush (2002). But what do these representations show us and do the representations differ over time and culture? This paper will look at a selection of both Japanese and Western cinematic representations of pachinko both as a way to understand the history of the culture of pachinko, but also to understand how pachinko becomes a metaphor for Japan.

Japanese Representations

Pachinko is so pervasive in Japan it no longer needs to be explained in Japanese films or novels the way it does in the West. That was not true in the 1950s when there was the first boom of parlours and pachinko became widely popular. It is in this context that we can look at how Yasujiro Ozu’s The Flavor of Green Tea over Rice (1952) repeatedly shows pachinko being discussed and played at a time when skill mattered and the professionals played against the parlour and its nail doctor (kugi-shi). There are scenes of characters talking about the pleasures or dangers of this changing form of entertainment and scenes of play in context. It also includes a scene with a small pachinko parlour owner of the time.

We will contrast Green Tea Over Rice with Kurosawa’s Ikiru (“Life”) where pachinko is a metaphor for a frivolous approach to life. Watanabe’s “mephistopheles”, who leads him into the worlds of play, compares pachinko to life as we watch a ball dribble down bouncing off the nails, “Listen, These silver balls are you. They’re your life itself.” (Ikiru 1952)

Western Representations

By contrast, in many non-Japanese movies and novels, the pachinko parlour is a glittery setting that stands in for a neon and exotic Japan. Sofia Coppola’s Lost In Translation (2003) treats Japan as the untranslatable other that can be safely mocked (King 2005). She is not really interested in Japan, let alone understanding it. Pachinko is just another flashing neon space in the night of Japan that serves as a backdrop of alienation. Ridley Scott’s Black Rain is better, but it still trades in stereotypes. That pachinko would serve as a backdrop is understandable, after all, it is designed to stand out visually to Japanese customers. As an architectural designer put it in an interview, “the building itself as a sign for the parlour…” (Verena 2008). These spaces become signs of potential play, attractive and alien.

Wim Wenders and Tokyo-Ga

We will conclude with Wim Wenders’ documentary in search of director Yasujiro Ozu titled Tokyo-Ga (1985). The documentary, like Barthes’ Empire of Signs (1982), is a sympathetic exploration of Tokyo and Japanese culture, and it includes a meditation on pachinko. Wenders tries to understand Ozu and Japan through pachinko and settles on the game as an understandable way of forgetting in the face of trauma, national and personal. Wenders shows a moment of transition between the era of mechanical pachinko of the post-war period (as shown by Ozu and Kurosawa) and the era of increasingly electronic machines. This is the pachinko that serves as a different sort of backdrop in Miri Yu’s dark novel Gold Rush from the late 1990s after the economic bubble burst.

Mark R Johnson: Deep Play and Dark Play in Contemporary Cinema

Mark looked at a set of movies where there is a life-and-death game like Battle Royale, Cube and The Game. He looked at how such deep play is often shown as the sick extension of an aristocratic and unfair world. The games are a way of managing the oppressed in deeply unequal societies. He suggested that there are more and more of these, and this might be because game culture is

We were fascinated by his list of movies and there was some discussion about whether other movies might fit like Run Lola Run. The criteria for selection left a grey area, but that doesn't take from the political uses of the deep game.

I wondered if these movies don't draw on myths or types of quests, like that of Theseus and the Minotaur in the labyrinth. Not quite a game like the others, though it is a sort of puzzle and battle.



