philosophi.ca : Theoretical Issues In Humanities Computing

Outline to an Approach to Theoretical Issues in Humanities Computing

Note this is a work in progress and is both truncated and rough.

Introduction

What are the theoretical issues of humanities computing?

In January of 2009 there was a discussion on Humanist about whether things like prototypes could be theories. [1] "Thing theory," as the issue was called, nicely captures the problem of theory in humanities computing, a discipline committed to creating digital works as research practice. If we theorize in humanities computing then we must be doing it in some way around the digital things we make, which raises the question of whether things can be theoretical and, if they are, how we theorize through them. This seemed a new question, as humanities computing has typically concerned itself not with theory but with methods and practices in the creation and analysis of electronic texts and hypermedia. Humanities computing evolved as a particularly pragmatic field in an academic context that has privileged theory. We used to be content to be the fiddlers, fixers, and doers with our digits firmly grounded in the matter rather than in the face of the theoretical turn. Now we are turning to wonder about the things we have wrought and how they are theoretical. Is it time to turn to theory? What would be the theoretical issue we should turn to?

This document outlines an approach to theoretical issues through questions. It is not an essay except in the sense that it proposes a way to move the discussion forward. It is an outline since it is written like a course outline of questions that is meant to provide an overview of what could be discussed. It is an approach because I will propose a way to start asking about the theoretical issues rather than an exhaustive topology. I prefer to think of theory as a path rather than a map, which is why I prefer words of movement like "approach," for reasons that I hope will become clear. For that matter, I don't believe one should try to exhaustively list the theoretical issues in any discipline. One can try to organize the issues that have been raised, but to anticipate what will be issued is hubris. What matters is how we start and orient ourselves, not closing off where future students will go. Lastly, this approach is organized around questions: questions that I hope are important, questions that are recognizable as having been asked already in the discipline, and questions without obvious answers so that the discussion would be something theoretical. [2]

Doing Theory

This outline is certainly not the first attempt to approach theoretical issues in humanities computing. Willard McCarty regularly floats theoretical issues across our screen through Humanist, usually followed by a request for comment. He even called for the top issues once, though he called them "questions." [3] In his book Humanities Computing he raises issues, notably in the last chapter "Agenda" where he sets out an agenda of study for the discipline. [4] These are ways of doing theory whether by using the technology of the discussion list to ask questions or the technology of the book to propose an agenda.

This document takes another approach, that of outlining a sequence of questions, but how can questions approach theory? This outline assumes that a theoretical issue is some form of challenge or question for which a theory is the response. Theory is thus "issue" here in an active sense: theorizing forth is what you can do as part of the conversation of a community. For example, if I raise the issue of the nature of computation by asking "What is computation?" the response could be a theory of computation which might start with "Computation is ..." and that response would further a conversation we would recognize as belonging to the community. Of course, one could also answer by pointing to some black box and responding, "Computation is what that box over there does." The questions are, after all, not the issues, but prompts to theoretical conversation. In some cases they are provocations, and if they work they will provoke you to theorize rather than just tell you about theories past.

There is a pedagogical aspect to this outline. It was born in the course of discussion and was framed to be put down before it runs its course. Questions and responses that approach theory have the advantage that they can be used when teaching theoretical issues. I can ask a class a question like "What is computation?" and expect that they understand the game even if they don't understand what a theory is or what computation is. Understanding the game and wishing to delay the lecture part of the class they will typically answer with something that we can treat as a prototype theory to toss around like a thing as we probe deeper. [5] To be honest the most difficult part about starting to theorize is the starting, especially in the face of broad and deceptively simple questions like "What is computation?" To avoid answers that are not theories and to help us issue trial theories, it is helpful to identify different tactics for theory exploration. Here are some:

List examples of the phenomenon to be explained by theory. While pointing at an example of something is not theorizing, one can usefully start by listing the breadth of examples that we would expect a theory to explain. Scientific theories are based on observations of
Figure out what other people have proposed as theories in response to similar questions. To do this you can use reference tools from the Wikipedia to the Stanford Encyclopedia of Philosophy to look up the phenomenon in question and see if the article mentions something that could serve as a theory. Even better would be to go back over the literature in the field to recover (or research) candidate theories.
Unpack the etymology of the words in question. Dictionary entries, especially those that provide etymologies, can help develop theories. "Computation" comes from "compute," which comes from the Latin for "to reckon, to count". We could then start with a theory that computation is counting and other mathematical operations like reckoning. From there we can ask how we need to adapt that simple theory to account for counterexamples. How is spell-checking computation as counting?
Ask what people tend to say about the phenomenon. When people talk about computation, what theoretical claims do they make? Such a linguistic approach doesn't have to be tested by gathering an appropriately representative sample of discourse around computations - you can just ask yourself what people typically say or would say about computation. Does it sound natural to say that my computer is counting things? Does it sound right to say that a computer can scratch your belly when you itch? Philosophers play such language games all the time to develop theories based on everyday discourse. Linguists take such games seriously, developing representative corpora to show systematically how limited the imagination of philosophy is.
Ask what people "really" do. Similarly we can ask about what people do with a phenomenon (rather than what they say about it). What do people do with computers and how is that different from what they say they do? Does that give us a clue as to what computation is? If most people around us use computers for communication, can we develop a theory of computation as communication? How does such a theory account for all things done for us on computers like

Theoretical Issues

What's in a discipline's name? The Path of Naming

One way to approach the theoretical issues in "humanities computing" would be to work with the two terms of the phrase. We could start by asking:

What are the humanities?
What are computers and what is computing?
And then ask some sort of question that involves the two, like: How can computing assist humanities inquiry?

This approach has a panoptic advantage that I will return to later, but also suffers in that the first two questions are not really asked in the field. Humanities computing does not generally ask what the humanities are or what computers are. It isn't until the third question that we get something that comes up in the field regularly. It is assumed that we know what the humanities are and what computing is, and therefore the new thing is the combination.

I suspect that the reason for our avoiding asking about the humanities is that the "humanities" is, by and large, an administrative category for a collection of disciplines that typically are gathered in faculties called "humanities" in the modern Anglo-American university. In the French tradition they talk about the "human sciences" that include what we call the social sciences. I don't think anyone wants to limit humanities computing to an administrative tradition in certain universities. The "humanities" is the best word in English for those disciplines interested in human expression, artifice, interpretation, and (yes) theorizing.

Our reason for avoiding defining the computer is that there are other disciplines that do that. Computer engineers, computer scientists and information scientists are concerned with the possibilities of computing and its definitions. Humanists can offer interesting perspectives, but we tend to start pragmatically with computing as a given opportunity and then ask about its application to problems that interest us. This brings us to the third question, about the application of information technology to our practices, methods, and questions.

A second problem is that the question involving "humanities" and "computing" could be framed in different ways. I framed it so that the answer might be the types of computing artifacts that computing humanists typically produce, but that runs the danger of defining the question by what we have done rather than defining it in a way that opens new vistas for what we could do. (Not that we don't want questions that acknowledge what happens on the ground, so to speak.) Other ways we could frame the joining question could be:

It could be framed historically: How *has* computing been used in the humanities? What are the issues that have been central to the field of humanities computing (digital humanities)?
It could be framed by placing the humanities perspective first: How can the humanities think through computing? What insights do the humanities have to offer computing?
It could be that the two perspectives inform each other and there is a dialogue between them.

What is our subject? The Path of Matter

Another way to develop a taxonomy of questions is to look at the materials of individual disciplines in the humanities and ask about their representation on the computer. We could go through each discipline in the humanities and ask:

What materials are consumed as evidence or subject matter by the discipline and how can these materials be best represented digitally?
How could digital representation of these materials enhance the activities of the discipline from teaching to research?
How can computing help return the new knowledge of the discipline to the community?

Thus for literary studies the questions might read:

How can literature be best represented digitally? This would then lead to questions about encoding and hypermedia editions.
How can computers assist in the interpretation of electronic editions of literature? This would lead to questions about scholarly reading and text analysis.
How can literary studies research be published electronically? This would lead to questions about online journals and other forms of scholarly return like blogging.

These are good starting questions, especially since they are accessible to people new to humanities computing. Most academics have probably wondered about access to digital texts and whether they want to publish in online venues. The second question about interpretation is less likely to arise spontaneously, but colleagues can understand it as a variant of concerns in media and technology studies about the relationship between medium and message. There are, however, problems with this framework:

This frames questions by discipline rather than for humanities computing as a whole. Surely there are common questions, especially where there are common materials, interpretative practices and publishing practices. Rather than promote disciplinary insularity and thereby fragment humanities computing into computing and X (where X is your discipline) we should start by assuming that there are common questions.
This framework uses as an organizing principle the idea that disciplines have an input (subject matter), practices of interpretation of that input (methods), and an output (publication venues). While this may be true of some disciplines to some degree, it overlooks all the interesting questions that don't map onto this model of the disciplines. For example, to paraphrase Jerome McGann, humanities disciplines are autopoietic - their output is an input and therefore we need to ask about the ways disciplines negotiate themselves in dialogue. We also have disciplines whose subject matter is not necessarily material and therefore difficult to frame as a problem of digitization. If philosophy, for example, is concerned with ideas and theories as they are held by people, it is difficult to see how philosophy's subject can be digitized without agreeing to treat cognition as measurable (and hence digitizable), which is itself a question in philosophy. To try to digitize thought would be more of a hypothesis than an assistance to philosophy.

Doing this we find that there are common materials across the humanities. Humanities computing, as opposed to *choose your discipline* computing as in "computational linguistics," is concerned with sharing questions and methods across disciplines. Thus we can look at the types of materials commonly handled:

Text
Images
Time-dependent media from audio to video
Buildings and spaces
Interactions?

There are also common forms of scholarly apparatus that we use to track knowledge:

Notes
Annotations on another work
Bibliographies

The Outline of an Approach through Questions

I propose to end with an organic itinerary of questions. These are not arranged according to some framework, but organized for a purpose, namely to provide a sequence of questions that introduce the reader to the theoretical issues in humanities computing so as to open room for new questions. If posing questions is half the work, I want to avoid halving the work by first following questions that have been raised and then stopping right where you can ask what the next questions are. Some of the reasons for this itinerary and not another are:

I wanted a relatively short list so that each question could, in a sustained discussion, be treated thoroughly. If you want to theorize seriously you want to avoid rushing through questions as if they are appetizers before serious work.
I wanted a list that would map onto the real practices in humanities computing. Each question should, if possible, have consequences for what we do. When Renear et al. propose a theory of text as Ordered Hierarchy of Content Objects they do so because it maps onto a technology of representation, namely SGML and XML.
I wanted a list that would lead back into theoretical essays in the field. Where possible I wanted the questions to open a window onto the rich literature that we share. That is not to say there is a canon, but there are landmark essays. For that matter there are gaps that these questions might point to.
I wanted questions that would distinguish between humanities computing and its neighbours like game studies, new media studies, cyberculture, and so on. I take it that what is important to humanities computing is the computability of the questions. In other words we are the field that applies theory or thinks theory through computable applications. This limits the questions that are properly our own, though it should never limit the questions we acknowledge are important.

1. Electronic Texts and Markup

What is a text and how should it be represented on a computer?

I start with this question because it was one of the founding questions of the field, at least the second part was. It is a question that carries the particular relationship between the theoretical and practical that distinguishes what we do when using computing in the humanities. We make decisions about what is important in the textual traditions of the humanities as we represent texts on the computer. Actually I think the practices often go the other way - in developing electronic text models we discover theories of texts.

Associated Questions

How have we understood text by representing it on a computer?
What is an electronic text and what is its relationship to other types of text?
What can we do with an electronic text?
What is markup? Is markup part of the text or beyond it? What does markup add to an electronic text? How is markup a theory of text and its interpretation?
What is structure in a text?
Are digital artefacts interpretations? If they are, what are they interpretations of? Are they theories? Can there be theoretical things?
How does the representation of a text in TEI/XML facilitate interpretation? How would it constrain it? How does it facilitate access and electronic publishing?
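
The last question, about how XML both facilitates and constrains, can be tried out in a few lines. What follows is only a sketch under stated assumptions: the two fragments are invented for illustration, the element names (`lg`, `l`, `s`) are borrowed loosely from TEI conventions without claiming to be valid TEI, and the helper names `outline` and `is_single_hierarchy` are hypothetical. It uses Python's standard `xml.etree.ElementTree` parser to show that a text encoded as one hierarchy parses into a tree, while a sentence that crosses a verse-line boundary cannot be expressed as well-formed XML at all.

```python
import xml.etree.ElementTree as ET

# Invented fragment: a line group holding two verse lines (one hierarchy).
NESTED = "<lg><l>The text is a tree,</l><l>or so the markup says.</l></lg>"

# The same words with a sentence <s> that opens in the first line and
# closes in the second, so the verse and sentence hierarchies overlap.
OVERLAPPING = (
    "<lg><l><s>The text is a tree,</l>"
    "<l>or so the markup says.</s></l></lg>"
)


def outline(element, depth=0):
    """Return the element hierarchy as a list of indented tag names."""
    rows = ["  " * depth + element.tag]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


def is_single_hierarchy(xml_text):
    """Return True if the markup parses as one well-formed tree."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print("\n".join(outline(ET.fromstring(NESTED))))
print("nested parses:", is_single_hierarchy(NESTED))
print("overlapping parses:", is_single_hierarchy(OVERLAPPING))
```

The parser accepts the first fragment and rejects the second. In that small way the representation already takes a position on what a text is, which is one way of reading the question of whether markup is a theory of text.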

Readings

Sperberg-McQueen, C. M. (1994). The Text Encoding Initiative: Electronic Text Markup for Research. Literary Texts in an Electronic Age. B. Sutton. Urbana-Champaign, IL, University of Illinois at Urbana-Champaign, Graduate School of Library and Information Science: 35–55.
Sperberg-McQueen, "The State of Computing in the Humanities: Making a Synthesizer Sound like an Oboe" - Paper given at a colloquium at Tübingen in 1995. Sperberg-McQueen has an interesting survey of "Document Geometries" that nicely shows just how different the ways of representing text on a computer are. He also, with the synthesizing an oboe problem, has a nice metaphor for the challenges of humanities computing.
Renear, Allen; Mylonas, Elli; and David Durand, "Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies" - Available online here. "Final version, January 6, 1993. A slightly edited version of this paper was published in 1996 in Research in Humanities Computing, Oxford University Press, Nancy Ide and Susan Hockey, eds." This is a follow-up to the classic "What is Text, Really?" paper that argued that text is an Ordered Hierarchy of Content Objects (OHCO). This paper moves nicely to counterexamples and then to alternative theses.
Hayles, N. K. (2003). "Translating Media: Why We Should Rethink Textuality." The Yale Journal of Criticism 16(2): 263-290. DOI: 10.1353/yale.2003.0018

2. Computers and Interpretation

What is interpretation and how can tools assist in the interpretation of texts?

This is the companion question to the question of text. The two are intertwined in the trajectory of the field which tacks back and forth between text representation and text analysis as strategies for advancing humanities research. Of course the two are connected - assumptions about what one might do with an electronic text inform the design of the representation of the text and vice versa. But while intertwined, the two approaches are different and draw on different backgrounds - that of the editor for representation and that of the tool developer or programmer for automatic interpretation. If you will, the dance of representation and interpretation is the dance of humanities and computing, though the humanities does not necessarily map onto representation in this equation.

Associated Questions

How does a concordance enable or constrain interpretation? What are the interpretative presuppositions of a concordance? How are text analysis tools different from a concordance?
What questions can we ask of electronic texts and how can we ask them?
What are tools? Are tools an interpretation or a theory of interpretation?
What are the practices of interpretation in the humanities and how are we already using tools?
Is there really a difference between representation and interpretation? Is there really a difference between an electronic text and a tool?
What is code? What is the difference between code and text?
Is markup code or text? How is markup different from executable code?
Is markup part of the text or external to the text? Is markup metadata?
How can markup be used to enrich a text?
What is metadata and how can it be used to enrich knowledge?
Is the writing of code different from the writing of prose? What are the differences between the tools of writing? How are writing and programming similar or different?
How is computation different from reading?
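
One way to make the questions about the concordance concrete is to build a small one. The sketch below is a minimal keyword-in-context (KWIC) display, not a description of any particular tool: the function names `tokenize` and `kwic`, the sample sentence, and the decision to treat a word as a run of letters and apostrophes are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase tokens (an assumed, crude notion of 'word')."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = tokenize(text)
    rows = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


# Invented sample text for illustration.
SAMPLE = "Markup is a theory of the text. A tool is a theory of reading."

for left, key, right in kwic(SAMPLE, "theory"):
    print(f"{left:>20} | {key} | {right}")
```

Even this toy makes interpretative choices: it discards case and punctuation, ignores sentence boundaries, and fixes the context at three words on each side. Those choices are the kind of presupposition the first question above asks about.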

Readings

This is, of course, the question Stéfan Sinclair and I have dedicated ourselves to in Hermeneuti.ca, the book (MIT Press, 2016) and project. See the bibliography of that book for more.

3. Annotation and Enrichment

How can a community of interpretation maintain, enrich, extend and critique its texts over time? How can computers assist a community of interpretation?

Computers have not only changed the representation and analysis of texts, they have also changed the way a community of interpretation maintains and deforms texts as tradition over time. This question gets at how we use computers to distribute information, how we use them to sample, edit and remediate information, and how we use them to enrich a text in a tradition.

Associated Questions

How can we enrich a text? What are the best strategies for enrichment?
What are we trying to achieve through the annotation and enrichment of electronic texts? Are we adding to a text, or interweaving new texts, or embedding our interpretations?
How would representation of the history and context of a text including its reading facilitate or constrain interpretation?
What should be preserved and what should be allowed to change over time in research data/text? How should we represent our knowledge in a text, if at all? Should we keep our interpretations out and separate or intertwingle them?
How should research data/text and community data/text be archived? How can computers assist in archiving? What are the different models for digital archives and who should do it?
How do digitization and archiving constrain a community and what do they enable?
What are the best practices in data/text preservation at all stages in the creation and distribution? What should each of us be doing?
What should be lost?
Are we being overcome with information and what, if anything, should we do about it?

4. The Matter of Computing

What is the digital and why does it matter?

We need to ask about the objects of humanities computing. Humanities computing has to do its ontology and ethics. We are therefore interested in what matters and our subject (or object) matter.

Associated Questions

What are the things of humanities computing?
What is the subject matter?
How is a digital text a thing and what sort of thing is it?
How can we represent (other) evidence of interest in the humanities on a computer? How can we digitize images, buildings, historical events, ideas, and processes? What is gained and lost by digitization?
What sort of thing is data? What is code? What is hypertext and interactivity?
What is a program and what is computation?
What sort of thing is the computer and what is a network?
What are the responsibilities of humanities computing?
What are we expected to do and by who?
Who are the stakeholders in humanities computing and how do we interact with them?
What would a code of ethics look like for computing humanists?
What would be the consequences of there not being a field like humanities computing?

5. Multimedia and Transcoding

Is hypermedia a metamedium capable of including and remediating all other media?

Associated Questions

When is the level of digitization enough and what would that mean for different media?
What are the standards and best practices for scholarly digitization and scholarly multimedia works? How were those developed and how will they be maintained?
What is a multimedia work?
What media can multimedia remediate?
How is multimedia different from other media?
What is new in new media?
If multimedia works are new unities of other media then what are the structures that integrate multiple media?
Is code a medium? Is interactivity new?
How can we visualize a text?
What is to be learned if we sonify an image?
Do different media have different rhetorical effects? Are different media better for different uses?
What is gained or lost when you convert information from one medium to another?

6. Visualization and Interactivity

What is a visualization and what is its relationship to the original?

Associated Questions

How does a distribution graph work, really? What does it help us do?
How is a word cloud read? What are the assumptions we bring to reading a word cloud?
How have we used illustrations of texts? What can an illustration show?
What is a visualization and how is it different from an illustration?
What are the different types of visualizations? What can they show? How do they assist research?
How are visualizations read? How is that reading different from reading a text?
What are the expectations about visualizations like distribution graphs?
What is the difference between your standard KWIC and a visualization like a distribution graph? How is the reading of them different?
What do we lose when we translate a code from one sensory medium (like text) to another (like sound)? What are the dangers and opportunities for transcoding?
What is interactivity? What does it add to a visualization?
How is interacting a way of interpreting?
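
The question of how a distribution graph works can be approached by computing one. This is a sketch only, assuming that such a graph counts the occurrences of a word in equal-sized segments of a text; the function name `distribution`, the sample text, and the choice of four segments are illustrative assumptions rather than a description of any existing tool.

```python
import re


def distribution(text, keyword, segments=4):
    """Count occurrences of keyword in each of `segments` equal slices of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = [0] * segments
    for position, token in enumerate(tokens):
        if token == keyword.lower():
            # Map the token's position onto a segment number from 0 upward.
            counts[position * segments // len(tokens)] += 1
    return counts


# Invented sample text for illustration.
SAMPLE = (
    "theory first then practice then practice again "
    "and theory at the end"
)

# A crude text rendering of the graph: one row per segment.
for number, count in enumerate(distribution(SAMPLE, "practice"), start=1):
    print(f"segment {number}: {'#' * count}")
```

What the reader sees is not the text but a series of counts laid out in the order of the text, which is one way to begin asking about the relationship of a visualization to its original.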

7. Hypertext and Online Publication

How do hypertext technologies change the relationship between author and reader, if at all?

Associated Questions

Who is the author of the Wikipedia? How is authorship changed by distributed web technologies?
How are online hypertexts read and by who?
How has the community of knowledge been changed by distributed electronic media and the web?
Is hypertextuality just another form of interactivity?
What is interactivity and why does it matter?
What sort of rhetorical artifacts are computer games? How do they communicate, if at all?
How should interactive works, including computer games, be studied and interpreted?
How can one evaluate new media work for publication?

Associated Questions

How can we publish a text on the internet?
What opportunities does online publishing offer? How could an online essay be different?

8. Methods and Practices

What are the different strategies we can use to analyze a text with a computer?
What types of new questions can we ask with computers? What questions can we ask with computers? How can questions approach theory? How have these questions approached theory?
How does computing change how we think about research and our research practices? How could computing support research?

9. Primitives and Principles

What primitives of the humanities can be computed?

Associated Questions

What is a word and how is that a primitive of language? How can words be identified in text by computer? What is the relationship between words and meaning such that indexes of words can help understand meaning?
What are the primitives of the genres, types of information, and media studied across the humanities?
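
The question of how words can be identified by computer has no single answer, which a short experiment can show. The sketch below compares two assumed definitions of a word and the indexes they produce; the function names, the regular expressions, and the sample sentence are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Invented sample sentence for illustration.
SAMPLE = "The reader's text isn't the editor's text."


def words_simple(text):
    """Treat any run of letters as a word, splitting at apostrophes."""
    return re.findall(r"[a-z]+", text.lower())


def words_with_apostrophes(text):
    """Keep internal apostrophes, so that "isn't" counts as one word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def index(tokens):
    """Build a frequency index mapping each word form to its count."""
    return Counter(tokens)


print(index(words_simple(SAMPLE)))
print(index(words_with_apostrophes(SAMPLE)))
```

The first definition finds ten words, including two occurrences of a "word" s; the second finds seven. An index of words is therefore already the product of a decision about what counts as a primitive.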

9. Interface and Virtuality

Why do we flip the pages of a book and scroll down in a word processor?

Associated Questions

Is there any reality to the virtual?
What is a virtual machine? Is there a virtual human in the machine?
How does software model its audience?
What are the affordances of software?
What are models for the relationship between human and computer?
Why do we notice the interface?
How can one learn from users about interface? How can one probe possibilities for interface? When should usability testing take place in a digital humanities project?
How can you know what users will do with technology they don't have?
How do computers manage myths of space and time?
Have computers compressed space and time?

10. Instructional Technology

How can we teach the humanities with computing?

Associated Questions

How are computers used in teaching and learning?
How are computers used in the administration of education?
How would we know if computers helped learning? How should we evaluate the use of instructional technology?
What are the costs of instructional technology?
Is innovation and experimentation with computers in education a form of research?
Is there a level of computer literacy we should expect of our students?
Should we offer courses in computer literacy for the humanities?
How should we integrate computing into the humanities curriculum, if at all?

11. Social Technology

How can computers change the ways academic communities form and conduct their business?

Associated Questions

What are the types of institutions that characterize the field or sustain it? How are the institutions changing?
What could the institutions of digital humanities be?
How does computing change how we interact?
How does it change research and its practices?
Will computing change the questions asked in the humanities and the answers?
What is an institution? Are institutions a technology? Can humanities computing treat its institutions as subjects for study, research and creation?

12. Discipline and History

Is humanities computing a discipline?

A question that raises its head on a regular basis is whether we are a discipline or not.

Associated Questions

What is the difference between humanities computing and the digital humanities?
Who studies humanities computing and in what context? Who teaches it?
What is the subject of study in humanities computing?
What is the relationship to method in the field?
How does the field reproduce itself?
Does the field have a canon or common practices that should be taught?
How does the field use computing to extend its reach? How should it?
What should be the relationship between humanities computing and the humanities?
Should new media work be considered for tenure and promotion? If so by who and how?
What is the history of the field?
How should the history of humanities computing be discovered?

What's Left

It is only honest to ask what is left out by this approach through questions. Some of the things left out have been touched on in the paths not taken, like the first path of asking about the humanities and computing. Humanities computing could be a discipline that asks again about what is obvious, like what it is to be human and how we are reflected in computation. Humanities computing could actually be about the human(ities) and computing. It says something about our discipline that we have left such questions to others, whether it be those interested in cyborg studies or the philosophers of science and technology.

One of the returning characteristics of humanities computing is that in developing electronic representations of traditional works we are forced to re-examine what we know about traditional technologies of representation like the book. To create a digital edition of a novel you have to choose what to represent in code and what to leave behind, which forces you to ask what was important in the reading of the novel. Thus humanities computing is by practice led back to the technologies of the humanities and then to what it is to be human in a conversation of traditions that stretch thousands of years. This turning back and re-searching what we thought we knew in light of the challenge of computing is strangely not a theoretical issue for us to be raised by questions, but a humanities issue raised in our case through practice, that of making theoretical things. It may be that humanities computing is the craft way of making rather than theorizing, in which case the theoretical issues outlined here are not properly the subject of humanities computing unless pursued through fictio and poesis.

How can theoretical issues be thought through the making of interpretative things?

In the end this outline too is a "theoretica" or theoretical thing. This outline is about theory and presents a theory of what is important to theory in humanities computing. It does it on a computer, but not through a particularly innovative use of the computer as this page could be printed.

This outline is also a theoretical thing in another sense, closer to the type of theorizing we do in humanities computing: it is meant to be implemented or run through a course. This thing grew out of a course and it is a reflection on the path that could be taken by the next course. It is meant to be implemented and tested in the type of community you get in a graduate class where researchers work through something. The emphasis on questioning as hints to theory is to stay the course.

Notes

[1] Link to excerpted discussion.

[2] I should add that this document originated as an outline for a course I had been assigned just then on "Theoretical Issues in Humanities Computing." Thing theory nicely anchored the course connecting what we were talking about to what the community was discussing.

[3] McCarty call for questions.

[4] McCarty, Willard. Humanities Computing. London: Palgrave, 2005.

[5] There is a distinct anti-theoretical instinct in humanities computing, perhaps because many of us felt alienated by a theoretical turn in our home disciplines that delegitimized the very materiality of the subjects we were studying. We might elsewhere consider the anti-theoretical moves of those skeptical about theorizing and take seriously the suggestion that theory is not what humanities computing does and therefore there are no theoretical issues.