
What Is Infrastructure

At a Bamboo meeting I met a man who felt there was a difference between "infrastructure" and "supplies" and who felt strongly that that difference was significant. Alas, in the dance of a workshop where we move from table to table I didn't get the chance to pursue the distinction so I have to make it up, again, out of whole cloth. So, let me rephrase the question,

What is computing infrastructure in humanities research and how is it different from supplies?

This will seem a funny question to ask; after all, supplies would seem trivial and infrastructure important. But that is the point - what is trivial and what is important when it comes to computing for research in the humanities?

Before looking specifically at computing infrastructure, it is useful to survey some of the models we have for infrastructure in general and see how they might apply. At first glance infrastructure seems to be something that cities have and therefore is applied only metaphorically to research computing. Perhaps that's what is at stake; we want to know what we should expect to have and who will run it for us, just as we expect roads, sewers, and water maintained by cities.

Infrastructure Between, as in Roads

Infrastructure like roads and hydro that makes modern life possible seems an obvious candidate paradigm for what infrastructure is. We each buy a plot of land and the state creates the infrastructure that connects our house to others or to services like the water supply, so our house works.

Applying this paradigm to computing we can see how the internet is infrastructure like roads for the movement of information from personal computers. You can also see where the articulation between infrastructure and personal computing is located. The infrastructure connects the personal. Anything that is needed to connect more than one person, project, or entity is infrastructure. Anything used exclusively by a project is not.

Utilities that Service, as in Water

We also tend to think of services like electricity and water as infrastructure. Obviously the transmission lines, the water mains, and the sewers are infrastructure in the sense of what is between, but we also think of the provision of electricity and water and the disposal of waste as a service type of infrastructure.

Utilities are an obvious extension of the civic paradigm of infrastructure in part because they are closely connected to the infrastructure-between needed to transmit electricity or water. Utilities like sewers are also an obvious case of something that is best done for everyone. We could each dispose of our own sewage, generate our own electricity and get our own water, but these functions are better provided as services on a large and efficient scale.

There are a number of computing services that we have come to expect as infrastructure beyond the provision of the internet. There are services like DNS that are needed to make the internet work; there are services like e-mail that work over the internet and that we have all come to expect; and there are services like web servers that are more efficiently provided centrally, but have not become expectations yet.

Organizations like Governments

When we look closely at civic infrastructure we see that the physical infrastructure and service infrastructure are dependent on organizations. In fact, if it is important that infrastructure last and be public (see below) then the organization that maintains it is more important than the item itself. A good government that builds and maintains bridges is more important than any one bridge. A bridge might be built, but it won't be safe to cross if there isn't a government behind it, and that means specifically the management, staffing, budgets, and equipment that make infrastructure work and keep on working.

Policies like Standards

Digging another level down one finds that essential to infrastructure are the standards, policies, and procedures that allow us to build on infrastructure. It matters that electricity is provided at a standard and advertised voltage. Governments have zoning laws, policies, and procedures for handling construction, both of the infrastructure they will maintain and for those who build new developments on that infrastructure.

In computing we see the importance of standards in technologies like the World Wide Web. What makes the web work is not one server or one web browser, but the W3C standards that let different tools work together. This is the lightest type of infrastructure where there is no material or software base on which others build, but a base of definitions and standards on which others build the tools. Perhaps works like the Text Encoding Initiative Guidelines are the real infrastructure of humanities computing.

Infrastructure and Supplies

What then is the difference between infrastructure and supplies? The argument for civic infrastructure would be that infrastructure facilitates the production of goods and services while supplies are transformed into goods and services. Infrastructure is the public works that are needed for a functioning modern society while supplies are those things that are often delivered over the infrastructure to private businesses to produce goods for market. While there are problems as to what exactly a service is such that we can distinguish between the infrastructure and the thing itself (because infrastructure services look a lot like good services), we still think we know the difference. Electricity is a service that needs significant investment in physical generation and transmission infrastructure. Coal is a supply, even if for the power generating station that provides the electrical service. Some of the things that distinguish infrastructure from supplies are:

  • Infrastructure should be designed to be sustainable. You build a bridge so that it will last and you plan to maintain it. Infrastructure is not designed for ad hoc or one-off use; it is meant to be used over and over. Supplies are meant to be used once (and possibly recycled). They are those things that we assume will be consumed in the usage. Even if we end up needing more of them, any one supply item is not expected to last. In this sense infrastructure is the computing stuff that we need to always have at hand, like the internet and the basic services.
  • Infrastructure should support public as opposed to private interests. Infrastructure is what is needed in common and therefore that which it makes sense to provide efficiently through one service. Supplies are consumed individually; they are the things we need for our individual jobs. In this sense supplies might be things like software licenses that are used by one person for the duration of the license. Software infrastructure might be the things that other software depends on.

Defining Cyberinfrastructure

It is worth asking why defining infrastructure is important at this juncture. One reason is that the act of defining things as infrastructure positions them as things that someone should provide and provide in an efficient and sustainable fashion. Calling something infrastructure is not a neutral act; it positions that thing as something that:

  • is broadly useful to a public,
  • should be funded by the public for the public,
  • is further expected to be a high priority for funding as opposed to something that would be nice to have,
  • may be paid for first by special subvention, but then needs to be maintained, and
  • is maintained by some organization that has ongoing funding to maintain the infrastructure.

In effect, calling something infrastructure is a way of changing the urgency of its provision and changing the perception of who should fund it and maintain it. It is, in short, a great way to argue that some organization like a university or government should fund something in perpetuity rather than fund it as a grant would for a particular period and group. Calling something cyberinfrastructure distinguishes it from that which only a project needs and which is needed only for the duration of the research.

The call for infrastructure is not new. The Wikipedia article on infrastructure argues that it gained prominence in the 1980s with the publication of America in Ruins by Choate and Walter, a book which argued that the US had underinvested in public works.

In the case of research in the humanities, the report of the Mellon-supported ACLS Commission on Cyberinfrastructure in the Humanities and Social Sciences, Our Cultural Commonwealth (PDF), answers the question thus on the home page,

"Cyberinfrastructure" is more than just hardware and software, more than bigger computer boxes and wider pipes and wires connecting them. The term was coined by NSF to describe the new research environments in which capabilities of the highest level of computing tools are available to researchers in an interoperable network. These environments will be built, and ACLS feels it is important for the humanities and social sciences to participate in their design and construction. Ed Ayers has commented that much of the work of developing the Valley of the Shadow was analogous to building a printing press when none existed. Effective cyberinfrastructure for the humanities and social sciences will allow scholars to focus their intellectual and scholarly energies on the issues that engage them, and to be effective users of new media and new technologies, rather than having to invent them.
"Cyberinfrastructure" becomes less mysterious once we reflect that scholarship already has an infrastructure. The foundation of that infrastructure consists of the libraries, archives, and museums that preserve information; the bibliographies, finding aids, citation systems, and concordances that make that information retrievable; the journals and university presses that distribute the information; and the editors, librarians, archivists, and curators who link the operation of this structure to the scholars who use it. All of these structures have both extensions and analogues in the digital realm. The infrastructure of scholarship was built over centuries with the active participation of scholars. Cyberinfrastructure will be built more quickly, and so it is especially important to have broad scholarly participation in its construction: after it is built, it will be much harder to shift, alter, or improve its foundations.

Note how the commission drew on the work of the 2003 National Science Foundation report Revolutionizing Science and Engineering through Cyberinfrastructure. In Our Cultural Commonwealth they draw on the Atkins report (as the NSF report is known) for a definition of infrastructure - one that to some extent determines the outcome.

In other words, for the Atkins report (and for this one), cyberinfrastructure is more than a tangible network and means of storage in digitized form, and it is not only discipline-specific software applications and project-specific data collections. It is also the more intangible layer of expertise and the best practices, standards, tools, collections and collaborative environments that can be broadly shared across communities of inquiry. (Page 6)

It is also worth noting how the commission writes a history for cyberinfrastructure, arguing that libraries, finding aids, journals and so on are already existing infrastructure and that cyberinfrastructure is just the extension of what we expect into the digital realm. Many of the things like concordances and dictionaries we would call (following SSHRC) "research tools." Others like digital editions of content I would call just editions. Few would have called them infrastructure except in the weakest sense of something that others build on. What changed was how these things have to be funded. A good print concordance or critical edition can be treated as a project. Once it is done you print it, sell it to libraries, close down the project and move on. Not so with digital editions or digital tools. They, it seems, need to be maintained perpetually to be accessible at all - you can't print a bunch of copies, put them in libraries, and let the librarians deal with the maintenance.

Infrastructure is a change in funding model

This means we have a problem in the digital humanities, and one that has been noted under a different rubric. The crude perspective on the problem is that we are drowning in our own output. The more sophisticated digital works we create, the more there is that has to be maintained, and maintained at much greater cost than just shelving a book and occasionally rebinding it. Centres and institutes get to the point that they can't do anything new because maintaining what they have done is consuming all their resources. One way to solve that problem is to convince libraries to take your digital editions, and the Fedora Commons project is an admirable attempt to define a repository technology so that it is possible for libraries to maintain digital content (and so that there is an organization, the Fedora Commons, to maintain the technology). Another way is to define certain tools as cyberinfrastructure so that they are understood as things that need ongoing support by organizations funded over the long term. If the scale is right we might even have an economy of scale so that we would all pay for a common organization to maintain the commonwealth of infrastructure, and that is one reading of what Bamboo is trying to do: determine what things are needed in common for research and develop a consortium that could develop and sustain them for us at a cost we could afford. It is a worthy goal that may be too late or just in time, given the fiscal storm that could redefine higher education.

Another Infrastructure Model

Bamboo may be the last ambitious project in the digital humanities - the last attempt to do something for the common good on a large scale for a few decades. I write this in March of 2009 when the fiscal prognostications are for a long recession that will mean deep cuts to universities and therefore to new infrastructure projects. Perhaps by redefining infrastructure we can imagine how to meet the ambitious recommendations of Our Cultural Commonwealth. I therefore provisionally propose the following:

  • We should *not duplicate infrastructure that is provided by industry*. I am thinking of things like Google Scholar or Google Documents. Industry tools and services have all sorts of problems, but they are at least free or cheap. As much as we know how industry services have problems, we still should try to work around them not against them.
  • We should propose *open infrastructure for the public*, not just academic researchers. I have two reasons for this. First, if infrastructure is to be public works then it should be available to the broader public. If we ask the public to pay for it, why not let them use it? It certainly makes it easier to argue for cyberinfrastructure for the humanities if we argue for it for all. Interestingly, this is where humanities infrastructure is different from scientific infrastructure. Most of the infrastructure we need is also usable by the public as a whole, just as they can use libraries. From which follows my second reason: the public can use cyberinfrastructure and will use it if it is opened to them. The humanities are distinguished as those disciplines that all are potentially interested in and all are potentially able to contribute to. Everyone is a philosopher, everyone has read literature, everyone knows something about history. Let's include the public of everyone in the project of the humanities by imagining infrastructure that involves them.
  • We should work on the *infrastructure as policies, interface definitions, and candidate standards*. The least expensive infrastructure, and that which we are best suited to maintain, is the infrastructure of standards, policies, and guidelines. Let us take a moment to sit back and imagine that others will build things if we can define what we need in a standard form. This is consensual and communicative work that we should be good at. Alas, the hard part lies in getting people to contribute once you do define a way things could interface, but again, we should know about rhetoric.

Afterword on Bamboo

Infrastructure moves from stories of need to recipes for provision

I should admit at this point that I am one of the volunteers working with the Bamboo project in the Tools and Content Working Group. That means that I believe in it enough to contribute to it and to let my name stand as associated with it. This essay in fact was provoked by ongoing reflections as to what Bamboo could be and the challenge of infrastructure. Once you commit to a large participatory project like Bamboo you switch from asking "what is it?" to "what could it be that would be useful?" I have in fact taken the opportunities provided by being a volunteer to argue for and contribute to one thing I think Bamboo should do, and that is develop a sustainable means for researchers to describe what they need so that others can try to imagine services. In particular I am arguing that the most important part of Bamboo is that it:

  • Provide a means for researchers to describe what they want to do that could use computing,
  • Provide a process for drawing general recipes out of these stories for general academic tasks and prioritizing them,
  • Provide a means to connect these recipes to existing tools and services where they exist, or to define new ones genuinely needed where the tools and services don't exist, and
  • Recognize those others who do develop new tools and services that meet the needs drawn thus.

Bamboo would thus keep our collective cookbook in such a way that would celebrate what has been done, connect researchers to recipes for getting things done, and give us a way to imagine what could be done.

This is not an expensive Bamboo; it is perhaps the "Lemongrass" that flavors what others cook, but it is doable and it takes advantage of what is at hand, which was the subject of last week's essay.




Page last modified on February 28, 2010, at 12:32 PM