These are my notes on the Canadian Society for Digital Humanities 2017 conference at Congress 2017 at Ryerson University.

Note that they are written live and therefore will be full of misunderstandings, confusion, and oversights. Read between the lines.

Monday, May 29th

Laura Horak: Developing a Collective Database of Trans and Gender Variant Filmmakers

Dr. Horak talked about how the history of transgender people has mostly been written by others. Horak is trying to recover a history of trans films. She wants this to be collaborative, as much of the knowledge is not held by academics. She wants to build something that amateurs could contribute to. She also wants to preserve the rich history of independent films by transgender people. The films risk disappearing if not preserved.

One resource is the Transgender Archive, but it doesn't have a lot about film, and what it has is more in the nature of finding aids.

She has applied to SSHRC and hopes to get funding for the project. She showed mockups of what she would build with trans developers. She plans to have levels of access that make sure that no one gets outed.

Abigel Lemak, Colin Faulkner, Kim Martin, Robert Warren, Susan Brown: Translating and Relating: Contentious Cultural Forms in the Development of a Linked Open Data Ontology

Dr. Brown started by talking about the context of the project to create a linked open data ontology. She talked about how difficult it is to arrive at interoperability in the humanities. She gave us some background on the Orlando project and how they are trying to bridge interoperability and local context.

Robert Warren talked about the shift from XML to RDF - the shift from strings to links. With links you don't need to enforce normalized strings - you can link to common entities. He also talked about how ontologies are for machines, not people. He showed how you can disagree with other ontologies and how you don't have to commit to things.
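To make the strings-to-links point concrete, here is a minimal sketch in plain Python. It is not anything shown in the talk: the records, the `example.org` URIs and the `createdBy` and `label` predicates are all invented for illustration and are not Orlando data or a real ontology.

```python
# String-based records (hypothetical): matching depends on exact spelling.
string_records = [
    {"author": "Woolf, Virginia", "work": "Orlando"},
    {"author": "Virginia Woolf", "work": "To the Lighthouse"},
]


def works_by_string(records, name):
    """Match on the literal string; variant spellings are missed."""
    return [r["work"] for r in records if r["author"] == name]


# Link-based statements: (subject, predicate, object) triples.
# The person is one URI; name variants are just labels attached to it.
EX = "http://example.org/"  # invented namespace
triples = [
    (EX + "work/orlando", EX + "createdBy", EX + "person/woolf"),
    (EX + "work/lighthouse", EX + "createdBy", EX + "person/woolf"),
    (EX + "person/woolf", EX + "label", "Woolf, Virginia"),
    (EX + "person/woolf", EX + "label", "Virginia Woolf"),
]


def works_by_uri(statements, person_uri):
    """Follow the link instead of comparing strings."""
    return sorted(
        s for s, p, o in statements
        if p == EX + "createdBy" and o == person_uri
    )


# The string query finds only one of the two works ...
found_by_string = works_by_string(string_records, "Virginia Woolf")
# ... while the link query finds both without normalizing any names.
found_by_uri = works_by_uri(triples, EX + "person/woolf")
```

The point of the sketch is only that the entity, not the spelling, carries the identity; a real project would use an RDF library and published vocabularies rather than tuples.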

Colin Faulkner then talked about how they have been converting Orlando data to linked data. They are not using SameAs as much as DerivedFrom and trying to connect to other ontologies.

Then we heard about the issues of multilingualism in linked data when the concepts and technologies of linked data are deeply English.

Abigel Lemak then talked about cultural criticism and the project. They have developed a cultural formation tag set for Orlando. They are trying to develop an ontology that explores the messiness of identity. They don't want to create a networked nuanced discussion.

At the end Kim Martin showed HuViz, which lets one see the links between people in a visualization that reminded me of Ruecker's Mandala project. A nice feature of HuViz is that you can call up the context text from the visualization.

This project does a great job of exploring the issues around naming in their shift from strings to things. I'm not sure exactly what the ontological shift is, because a string is a thing and a URI is a string. I guess it is more a shift from a string that is human interpreted to a string that can be machine interpreted. In both cases you have shifting interpretations, but with a string like "jewish" the human interpretation can only be handled by the usual humanities practices while with a link one can draw on a larger set of practices.

Another interesting question is what sort of thing an RDF entity is. I feel that the URI is not the thing; the thing is an abstract real thing, which reminds me of Allen Renear's work.

A question they hinted at is the question of what sort of thing is their ontology. They seemed to want to say that it wasn't a theory, but I may have got it wrong. There is also something to be asked about how the history/versions of ontologies are tracked. If these are to have interpretative integrity they need to be anchored in time so that a link made to such and such a definition tracks

Chad Gaffield made an important intervention about how the humanities reminds us that everything has a location and one of the things we study is the evolution of labels/categories/concepts.

Mark Kaethler: The Problem with Prompt-books, or, The Problem with TEI? Tagging Time and Space

Kaethler talked about the challenges of editing TEI-encoded editions of prompt-books from Stratford. They found that the TEI was limited and needed to be extended. They stuck with the TEI as they wanted their texts to be in conversation with other electronic texts. Kaethler quoted Julia Flanders to the effect that the TEI is a mechanism for negotiating dissent.

Aaron Mauro: Machine Learning Training Data and Algorithmic Bias

Dr. Mauro started by talking about how mobile apps are declining and AI is getting more important. 2016 has been called the year of the chatbot.

AIML is a markup language for AI and Amy was a chatbot that won a Loebner prize. Cleverbot is a nice example.

There are a bunch of ways to build a bot. Cheap Bots, Done Quick is one place to build one. Kik is another place for youth bots. A lot of these are on Twitter. You can talk to Kim Kardashian on Twitter.

There is a lot of cultural froth around bots. The problem is that many are political and may have political implications. Trumpbot amplifies the voice of people. The Democrats seem to have been less capable. The Oxford Internet Institute showed how pro-Trump bots outperformed the Clinton-based bots. They use services like the Botometer or BotOrNot, but these are not that sophisticated. They are tree

Tay (Thinking about you) was one of the first really modern bots. She was a neural net that learned naively. In about 16 hours she was trained to be homophobic and antisemitic. Microsoft has produced another one, Zo, that is a-political and less

Geoffrey Hinton has made a library that goes on top of TensorFlow that can be used to quickly model more sophisticated bots. Mauro has built one about Faulkner based on information they have at the U of Virginia. He then used the Cornell Movie-Dialogs Corpus to further train the Faulkner bot. He used the Bechdel test as a criterion to extract specific dialogs from the Cornell corpus. Alas there weren't too many.

Machine learning is a political space. The training data is important; we need trained humanists. Backpropagation can magnify aggressive opinions, even if offered ironically.

Bridget Christine Moynihan: Looking Out to Move Ahead: Reading and Digitizing the “Prismatic Fringes” of Archives

My laptop was being used for this presentation so my notes are from memory. Moynihan presented two projects that are working with anthologies or scrapbooks full of copyrighted materials. One is out of the University of Calgary. Calgary has a collection of hand-crafted anthologies of speculative fiction by Bob Gibson that were donated to the university. These provide a picture of science fiction or speculative fiction different from the canonical one, as Gibson collected stuff from all sorts of sources that don't normally get studied. As Gibson didn't secure copyright, the digitized materials can't be published online, so they built a visualization.

The second project is a set of scrapbooks by a poet that are beautiful, but again contain copyrighted material. Moynihan's thesis is on these scrapbooks and she is collaborating with a design student to visualize the pages in different ways. Neat use of prototyping as thinking through.

Moynihan nicely theorized the two projects by talking about the fringe and quoting Benjamin.

Ian Milligan: Building a National Web Archiving Collaborative Platform: The Web Archives for Longitudinal Knowledge Project

Dr. Milligan talked about what we can do with the fabulous resources. The web archives are siloed by collections. Most are using Archive-It to collect WARCs. What WALK is doing is getting the WARCs from other Canadian libraries, then using Warcbase to transfer them and generate scholarly derivatives. Researchers are looking at the data to check it and then setting it up on a server so people can search directly.

Then Milligan talked about where they are going. They had been using Shine from the UK web archives community. Now they are using Blacklight to build a web archiving back end. They are moving toward a SolrCloud architecture. They are also putting up derivative collections on Dataverse so you can pass the data to Gephi, Voyant, or other tools.
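As an illustration of what a derivative that can be passed to Gephi might look like, here is a small sketch that collapses page-to-page links into a weighted domain-to-domain edge list. This is an assumption about one plausible derivative, not WALK's or Warcbase's actual pipeline; the sample links and the function names `domain_edges` and `edges_to_csv` are made up, and the `Source,Target,Weight` header is the edge-list layout Gephi's spreadsheet import usually expects.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def domain_edges(links):
    """Collapse (source URL, target URL) pairs into weighted domain edges."""
    counts = Counter()
    for source_url, target_url in links:
        source = urlparse(source_url).netloc.lower()
        target = urlparse(target_url).netloc.lower()
        # Skip malformed URLs and links that stay within one domain.
        if source and target and source != target:
            counts[(source, target)] += 1
    return counts


def edges_to_csv(counts):
    """Serialize the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(counts.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical links, standing in for links extracted from archived pages.
sample_links = [
    ("http://party-a.example/news/1", "http://party-b.example/"),
    ("http://party-a.example/news/2", "http://party-b.example/platform"),
    ("http://party-a.example/news/2", "http://party-a.example/about"),
    ("http://party-b.example/", "http://union.example/"),
]

edge_counts = domain_edges(sample_links)
edge_csv = edges_to_csv(edge_counts)
```

Working at the domain level keeps the graph small enough to explore visually, at the cost of hiding which pages did the linking.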

Katie MacKinnon: Research "Cyberkids:" Geocities Archives and How to Use Its Data

MacKinnon talked about how she studied cyberkids in Geocities. She talked about Geocities and the Enchanted Forest, which was an area for kids. She had to use the Wayback Machine. She found lots of gaps. What helped was following communities. In Enchanted Forest many sites were diaries of a sort.

She then talked about the ethics of studying kids' web sites that are so recent. She then gave some examples of cyberkid sites.

Todd Suomela: GamerGate and Digital Humanities: Applying Ethics of Care to Internet Research

Dr. Suomela started by talking about Gamergate and our project to archive the controversy. He then talked about the types of ethical issues:

  • The gamergaters claimed the issue was about ethics
  • As researchers we have to think about the ethics of gathering data about contemporary subjects
  • We have to think of the ethics and datafication
  • And finally think about ethics, big data and analytics

He talked about some of the ethical frameworks one can use:

  • Deontological - what duties do we have
  • Utilitarian - What is useful
  • Virtue - What would being a virtuous researcher mean
  • Legal - What can we legally do

He then talked about how we settled on an ethics of care to guide us and how we applied it to the project. We had to think about our relationships with ourselves, with the Gamergate community, and with colleagues.

There was a great discussion at the end about ethics and social media.

Ray Siemens: Social Knowledge Formation, in Medieval and Renaissance / Early Modern Studies and in Today's Digital Humanities

Dr. Siemens started by talking about the Henrician Lyric (lyrics by Henry VIII). Gibbon called it amphibious - half words and half music. He talked about how everyone working on lyric would have been mixing and remixing words and music. They were part of a commonality - a network that fostered creation and curation. This was a very popular form and traveled across Europe.

Siemens then gave some examples and talked about the echoes and repercussions of the music and words. He then talked about social authoring and collective (vs individual) agency. Notion of the coterie.

Siemens then talked about the humanities and social formation. The humanities focus on what it is to be human through artefacts that are the evidentiary remains of that experience. The sciences and humanities have differences in how they are formed. The digital humanities are at an intersection of the two. We sit at the intersection point of methods and computation.

He asked an important question: "Do we understand our computational methods as something that can be outsourced to other fields?" In this context the production of data/knowledge in the humanities is important. The transition of scholarship from early modern to modern was dramatic and involved the foundation of an idea of a professional academic. Frye talked about the Wissenschaft period until 1935 which was like an assembly line. It was a time of assembly of knowledge. What is this epoch? Is it also a time of the assembly of digital knowledge? Siemens then looked at questions about the humanities:

  • How do we produce knowledge? It seems to be localized piece work.
  • How do we stockpile it? Same as always?
  • How do we retrieve it? This has changed.
  • How are we connected through scholarship? One way is through a methodological commons.

Siemens talked about the methodological commons as a model and how we model our data, our processes, and our dissemination.

A second model is the community of practice. Wenger (2008) Communities of Practice. Shared infrastructure, journals, conferences make us a community of practice.

Then Siemens talked about trends:

  • More data
  • workflow speeding
  • Accelerated communication

Does this improve the questions or answers? We want to be prepared for some very real changes. How do we prepare ourselves? He suggested active engagement and initiatives:

  • DHSI - consensus-driven pedagogical community
  • The sweet spot was international skills-based workshops - they are cheaper and easier to implement than formal programs
  • Now they are working on internationally distributed curriculum for a community of practice

Then he talked about a community enterprise - a social edition of the Devonshire Manuscript. He talked about the Dynamic text (1980s), the Hypertextual edition (early 1990s), and the Dynamic Edition (early 2000s). The social edition builds on these earlier forms of editing. They have documented what they have done.

  • Collaborative annotation
  • User-derived content
  • Folksonomical tagging
  • Analytics

They started their social edition by creating an entry on Wikipedia. They put their edition into the wikibooks platform and all the information is there open to all. All this went online while they did it. They had a number of interesting interactions with academics and vandals. People left annotations. Someone set up a monitor that warned about vandalism.

They also tweeted the poem and blogged about it. Finally a peer-reviewed print volume was published.

Siemens then talked about a third experiment, the Iter Community, that gathers all sorts of materials into a community.

He also talked about the INKE project. One of the ideas is to find ways to use tools at hand for editing rather than building new ones - like using Wikibooks for editing rather than building a custom tool. Ultimately they want to make the practices to expand the community of practice.

Tuesday, May 30

Geoffrey Rockwell: The Beginnings of Content Analysis: From the General Inquirer to Sally Sedelow

I presented a paper about work we (Stéfan Sinclair and myself) are doing replicating old methods. In this paper we looked at General Inquirer, an early Content Analysis tool. Then we looked at Sedelow's Via.

Dominic Forest: De quoi est-il question dans le discours en art contemporain? La fouille de textes appliquée à l’art contemporain dans les centres d’artistes

Forest is looking at contemporary art in Quebec and using text analysis. He is interested in the galleries and arts organizations. He is using a "distant viewing" approach (K. Bender) that adapts distant reading to the arts. He is looking at "art speak."

There has been work on artificial intelligence applications to art for classification or style generation. Distant viewing for Forest is looking at the texts around the art, not the art itself. Their corpus typically has metadata and is divided between the texts from arts organizations and commercial galleries. They have then developed a map of the themes over time.

Their approach is then to use Topic Modelling to extract themes. He showed how different types of art (new media, sculpture) were popular and growing in subsidized organizations. Other themes like the dream, body of women, were dropping. For commercial galleries memory and local culture was higher and growing. Painting was more popular with commercial galleries. Installations with public galleries.

He closed with ideas for next steps.

Sergii Gorbachov: Comparative Analysis of Terms Denoting Donetsk (DNR) and Luhansk (LNR) People’s Republics in Russian and Ukrainian Media

Gorbachov talked about media in conflict, specifically the conflict in Ukraine. He gave a nice summary of what happened in Eastern Ukraine in 2014.

Unlike normal dialogue, in media discourse there is a power difference between producers of media and consumers. Gorbachov looked at the discourse about the newly formed republics in the east of Ukraine. He compared the discourse between Ukrainian and Russian media on the subject.

He built a corpus of articles in Ukrainian and Russian news. He used Fairclough's framework to frame his study. He talked about the key words and how the different types of media painted the events differently. They legitimize the break-away republics.

Nandita Dutta: The Fandom Speaks: Redditors’ Interaction with Star Trek

Dutta is part of the CulturePlex team at Western. She talked about how they have been studying comments about Star Trek on Reddit across various subreddits (including startrek.) Star Trek is viewed positively, but some characters can be seen negatively. She showed a diachronic graph of all the comments and talked about spikes. She also looked at how Star Trek is handled in different subreddits.
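A diachronic view with spikes can be sketched very simply: bucket comments by month and flag months that sit well above the average. This is only an illustration, not the CulturePlex method; the dates are invented, and the mean-plus-1.5-standard-deviations cutoff is an arbitrary assumption.

```python
from collections import Counter
from statistics import mean, stdev


def comments_per_month(dates):
    """Count comments per 'YYYY-MM' bucket from ISO 'YYYY-MM-DD' strings."""
    return Counter(date[:7] for date in dates)


def find_spikes(counts, threshold=1.5):
    """Return months whose count exceeds mean + threshold * stdev."""
    values = list(counts.values())
    if len(values) < 2:
        return []  # stdev needs at least two data points
    cutoff = mean(values) + threshold * stdev(values)
    return sorted(month for month, n in counts.items() if n > cutoff)


# Invented comment dates: five quiet months and one busy one.
sample_dates = (
    ["2016-01-10"] * 2
    + ["2016-02-11"] * 3
    + ["2016-03-05"] * 2
    + ["2016-04-20"] * 3
    + ["2016-05-02"] * 2
    + ["2016-06-15"] * 12
)

monthly = comments_per_month(sample_dates)
spikes = find_spikes(monthly)
```

Each flagged month is then a prompt for closer reading of what was being discussed at the time.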

Jonathan Ilan Armoza: Probabilistic Matrix Factorization for Digital Humanists: Modeling the Parts of Speech of Emily Dickinson's Fascicles

Armoza started by talking about Dickinson and the fascicles that she sewed up. He then showed how we can get a lot of data about texts.

  • Dimension reduction (PCA)
  • Clustering (K-Means)
  • Machine Learning
  • Probabilistic Graphical Models like Topic Models

He then talked about counting things, creating vectors, and then stacking them into matrices. We can do operations on the matrices like scalar multiplication. We can take matrices and try to find factors. He talked about the Netflix challenge.

He uses Nimfa - a Python library for nonnegative matrix factorization. Here is his process:

  • Tag POS using spaCy
  • Count categories
  • Categorize using Nimfa

He showed what we can do with this. He showed how you could see anomalies. He is doing a close reading through factorization methods.
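The count-then-factor idea can be sketched in a few lines of plain Python. To be clear about the assumptions: this is not Armoza's code, the part-of-speech tags are hand-written stand-ins for spaCy output rather than Dickinson data, and a hand-rolled nonnegative matrix factorization with multiplicative updates is swapped in for Nimfa (and for the probabilistic variant named in the title).

```python
import random
from collections import Counter


def count_matrix(tagged_poems, tags):
    """Stack one POS-count vector per poem into a matrix (list of rows)."""
    rows = []
    for poem_tags in tagged_poems:
        counts = Counter(poem_tags)
        rows.append([float(counts[tag]) for tag in tags])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def error(v, w, h):
    """Squared reconstruction error between V and the product W x H."""
    approx = matmul(w, h)
    return sum((x - y) ** 2
               for row_v, row_a in zip(v, approx)
               for x, y in zip(row_v, row_a))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) into nonnegative W (n x rank) and H (rank x m)
    with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hand-made tag sequences (hypothetical): two noun-heavy, two adjective-heavy.
tags = ["NOUN", "VERB", "ADJ"]
tagged_poems = [
    ["NOUN", "NOUN", "VERB", "NOUN"],
    ["NOUN", "VERB", "NOUN", "NOUN", "NOUN"],
    ["ADJ", "ADJ", "VERB", "ADJ"],
    ["ADJ", "VERB", "ADJ", "ADJ", "ADJ"],
]

v = count_matrix(tagged_poems, tags)
w, h = nmf(v, rank=2)
```

Each row of `w` says how strongly a poem loads on each of the two factors, and each row of `h` says which parts of speech make up a factor. A poem whose row is poorly reconstructed by the factors would be a candidate anomaly to read closely.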

He mentioned an article by Koren et al.

Joel Kalvesmaki: Introduction to Text Alignment Network (TAN) XML

Kalvesmaki started by talking about a complex text that he and a team were translating that quoted lots of other texts. They started technically with TEI. Anything you wanted to say you could. It was powerful and had an enthusiastic support community. But the TEI had no shortage of tags, was ambiguous and idiosyncratic, and other people's TEI was different. TEI invites complexity. It focuses on the scriptum or the work. He also called it inert. You need the tools to make it work. So, he started to reimagine what he wanted: "a web of sources with stand-off annotation with strong normalization that is heavily regulated." Such a primary source web would be tethered to IRIs - an RDF syntax that is easier.

His second approach is now the TAN (Text Alignment Network XML).

  • Scholarly freedom - keep it simple
  • Scholarly responsibility - unique citability
  • Useful to humans and computers

He then talked about the types of files: transcriptions, annotations and everything else. For example TAN-KEY provides IRIs for key concepts. TAN-C is for claims and assertions.
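The general idea of stand-off annotation, where the transcription stays bare and the claims live in a separate file that points into it, can be sketched as follows. This is a generic Python illustration, not TAN syntax: the reference scheme, the sample text, the `example.org` IRI and the function names are all invented.

```python
# Transcription: only the source text, keyed by a citable reference.
transcription = {
    "1.1": "In the beginning was the word",
    "1.2": "and the word was quoted by others",
}

# Stand-off claims: kept apart from the text and pointing into it by
# reference and token number. Concepts are named by IRI, not by string.
claims = [
    {"ref": "1.1", "token": 6, "concept": "http://example.org/concept/logos"},
    {"ref": "1.2", "token": 3, "concept": "http://example.org/concept/logos"},
]


def resolve(claim, source):
    """Return the token a claim points at (token numbers start at 1)."""
    tokens = source[claim["ref"]].split()
    return tokens[claim["token"] - 1]


def passages_for(concept, all_claims):
    """List the references of every passage linked to a concept."""
    return sorted({c["ref"] for c in all_claims if c["concept"] == concept})


resolved = [resolve(claim, transcription) for claim in claims]
linked = passages_for("http://example.org/concept/logos", claims)
```

Because nothing is embedded in the transcription, several people can publish competing sets of claims about the same text without touching it, which is where unique citability starts to matter.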

He showed a demo of what his system could do.

The status is that it is a work in progress, but he worries about what will happen if he moves on.

He imagines a Napster for primary sources. He imagines an oXygen and XML framework.

Poster Session and Award

At the end of the day we had a poster/demo session. At the session Stéfan Sinclair and I were awarded the Outstanding Achievement Award for Computing in the Arts and Humanities for Voyant Tools 2.0 and Hermeneutica.

Wednesday, May 31st

Annual General Meeting

Tracy Fullerton: Finer Fruits: A Game as Participatory Text

Fullerton started by talking about the quest to develop the Walden game. We are coming up to the 200th anniversary of Thoreau's birth. He embodies one of the myths of the American dream. The tiny house movement can be traced back to Thoreau's experiment. Fullerton's game deals with the first year of his experiment on Walden Pond. The game is partially a survival simulation, but goes beyond to inspiration, which is what mattered to Thoreau. If all you do is grind away you will lose inspiration and miss out on parts of the game. The question Fullerton asked was:

Can we make a game that embodies Thoreau's experiment at Walden Pond - i.e. to reduce life to its lowest terms and see if it is "mean" or "sublime"?

Some of the themes of Thoreau and the game Walden include:

  • Self-reliance and simplicity - They based the game on the economy outlined by Thoreau. They have created different possibility spaces such that you can live entirely in the wild (and have no time to think) or depend on your mother and money (and then be able to think in a subsidized fashion.) The game presents tradeoffs between pure living off the land and having time to philosophize. These tradeoffs are a form of critique of Thoreau (who compromised.)
  • Cycles and rhythms of Nature - The game turns the woods into a character who changes over the year. It's a different type of character that touches on personal issues in Thoreau's life like the death of his brother. There are 8 seasons (3 winter seasons.)
  • Transcendentalism and American Romanticism. As much as Thoreau wasn't a joiner, he did participate a bit in Emerson's Dial journal. In the game the letters are drawn (and condensed) from textual research. Thoreau was influenced by many women including Margaret Fuller and his mother and sisters. Fuller was also a critic of Thoreau and the game introduces her and her criticism. We see Thoreau not only through the eyes of the important transcendentalists, but also through the eyes of everyday fictional people. Issues around the abolition of slavery were heating up. Also you can find texts that inspired the transcendentalists (Confucius, Vedas ...) and these will inspire you.
  • Civil Disobedience. The transcendentalists debated how far to go in civil disobedience. The issues come up in the game and you can get seized and put into jail. Thoreau eventually writes on "Civil Disobedience."
  • Encroaching Technology and the Pace of Life. When Thoreau wrote, the railroad and telegraph had just come through. He was sceptical of technology but good at it. The game has a to-do list including surveying quests that bring in the tension between technology and nature. Do you want to do surveying that may lead to development? The to-do list becomes a site of reflection. In most games crafting is good. In this game

These issues show why Fullerton thinks it is interesting to play a text rather than just read it. She talked about how students assigned Walden (the book) find it boring. As a game it can be engaging in a different way.

Fullerton talked about the irony of virtual nature. She recognizes that the game is no substitute, but it might give context to your next trip outdoors.

She talked about the deep reading and deep mining of the text that led to the translation effort. She sees the book itself as a reflecting surface, like a pond, that reflects back what you bring to it. Fullerton wanted to honour that in the game. She has taken away so much from the surface and hopes that others will reflect on it as she has.

In the discussion she talked about sustainability and how to cut things out strategically, especially things that you are passionate about.

Thursday June 1

Canadian Sociology Roundtable on Theorizing Culture:

Sonja Sapach: How Mario Gave Me Meaning and Skyrim Gave Me Structure: Exploring the Resolution of Trauma and Alienation through Participation in Video Game Culture

Sapach started by talking about alienation as powerlessness, meaninglessness, normlessness, and isolation. She is using trauma as a lens/story telling tool to look at social theory. She reads social theory in light of trauma. She sees in trauma a way of understanding both personal and societal alienation.

She is now doing an autoethnography of her experience with trauma and videogames to understand how videogames helped with trauma and alienation. She has found that she now has friends with whom she can perform knowledge of games. She can grow through gaming and experiment with identities.

Chris Martin: The Social Semiotics of Sacred Tattooing: An Analysis of Body Art in Liquid Modern Times

Martin is studying tattoos. He showed a daisy (after his daughter) on his hand. He worried if he had gone too far. The fear is real. Tattoos are permanent.

Michael Atkinson talked about 6 eras of tattooing. Martin feels that we have to go beyond Atkinson. He talked about Bauman and liquid modernity and how it would change things for tattooing. Why do people commit to being permanently tattooed despite a changing world?

Tattoo enthusiasts use tattoos to map out their identity. Martin talked about Susie Scott and her theory of identity. He then talked about a particular interviewee - Dr. Harry (Hairy?). To get a tattoo is to get separated from the norm, which delays the gratification of participation. Or tattoos are not rebellion as they are common, but they can separate one from previous selves. Perhaps the permanent nature of the tattoo separates oneself from a previous self in the face of liquid modernity.

Mike Folbert: On Deal-Making & Diplomacy: Emerging & Persisting Social Forms

Folbert has taken a keen interest in handshakes recently. He talked about Harper shaking Putin's hand. Then there is Trump.

The handshake is key to diplomacy - it brings the other close and keeps them far. Trump is supposed to be the great deal maker, he should be good at diplomacy. The handshake is important to both deal making and diplomacy. Folbert then compares two texts on deal making (The Art of the Deal) and the early 18th century Art of Diplomacy.

He talked about differences between deal making and diplomacy. Authenticity is important to both, but in different ways. Ironically now deal making has become the model for diplomacy.

Sheldon Richmond: Reconstructing Lost Cultural Memory from a Void

Richmond started by talking about how we (all of us at the table) are talking about truth. Richmond dedicated his talk to Bauman, Wiesel, and the people in his family killed in the Holocaust.

His project is a social memory project that tries to recover the culture that was lost. He talked about Jews who are rediscovering their Jewishness in Poland. Many are trying to recover the culture of Judaism in Poland. Rebuilding Judaism will take institutions. Should it rebuild a simulation?

In the discussion there was an interesting question about digital dualism. Do we want the virtual to be separate from the real? Or do we think it is continuous? If we want to think of playing as a way of experimenting then it needs to be separate, but is it really? Can one practice being bad and it not change you?



