Science 2.0

These are my live notes on the Science 2.0 conference held in Hamburg, Germany. The Twitter hashtag is #sci20conf. See the tweets at https://twitter.com/search?q=%23sci20conf&src=tyah

Note: these are being written live, so they will have all sorts of omissions, silly typos, strange constructions and missing parts.

This conference is organized by the Leibniz Research Alliance Science 2.0 and Leibniz Information Centre for Economics. Klaus Tochtermann was the conference chair.

Wednesday March 25th

Klaus Tochtermann: Introduction

Tochtermann began by describing the organization of the conference and the growing interest in Science 2.0 and forms of participation. There has been a major public consultation on the subject. Libraries and other information organizations can play an important role in Science 2.0 because they have the archives/information and they are in contact with the scientists. The consultation has also opened their view of what is important to the public. It is not just a matter of social media, but also of big data and open access - issues beyond social media.

Jean-Claude Burgelman: Open Science: outcome of the public consultation on ‘Science 2.0: science in transition’

Burgelman delivered the results of the public consultation. He started by saying how important the library community is to the explosion of knowledge. We need to try to make sense of all the information.

What is Science 2.0? It is about the whole ecosystem of science and research. It is about changing how research is done. There is an ecosystem of organizations and standards that is emerging. It is a real and irreversible change. It has to do with the explosion of data and a changing relationship with data.

It is also about the globalization of the science community. There are more and more people competing and doing science around the world. There is a pressure to do science faster and solve problems like Ebola in almost real time.

There are also changing expectations among our publics. They expect to be able to participate and they want transparency. Science isn't happening in isolation - it is being affected by other changes and disruptions.

The public consultation was meant to test awareness, to find out what the opportunities are and whether stakeholders agree on them, and to identify policy implications. The responses came from stakeholders - organizations and authorities.

Some of the outcomes are:

stakeholders preferred the term "open science" to "science 2.0"
stakeholders recognized the key trends from their position paper
the drivers are perceived to be important
citizens acting as scientists did not rank as high as other drivers

Some perceived barriers are:

the main barrier was perceived to be maintaining quality assurance
another was lack of incentives (I'm for open data, but I won't open my data)
lack of infrastructure
there are lots of concerns about privacy

The perceived impacts include:

It will make science more reliable - open data should make science reproducible
It will make science faster
Crowdsourcing was seen as less important

The desirable policy initiatives identified included policies around infrastructure, improving incentives (for researchers), and making science more responsive to societal challenges.

They are now developing Open Science as a priority action under the Digital Single Market initiative of the European Commission. They are launching a European Open Science Agenda.

They are thinking of trying to establish a stakeholders forum at the European level to deal with issues (rather than at the national level).

They want to propose a European "code of conduct" setting out how Open Science should affect the roles and responsibilities of researchers and employers.

They also see the need for development of common standards/interfaces and a European Research Cloud. The idea of the cloud is to harmonize data standards, services and governance. The cloud will be distributed and co-created.

Whenever I hear about European initiatives I am impressed at the ambition and centralization imagined. I am also struck by how different the rhetoric in Europe is from Canada (and the US). The word standard isn't used as much in the circles I run in - I think there is less interest in policy and planned change in North America. We have more belief in the invisible hand of lots of folk competing and experimenting without government intrusion. I am also struck by how different the discussion is among researchers compared to research administrators/funders. As researchers we forget all the people involved in research administration and how they can (and do) manage us.

One can see more at http://scienceintransition.eu

Isidro F. Aguillo: Metrics 2.0 for a Science 2.0

Aguillo started with the recent San Francisco declaration which attacks the current bibliometrics (see DORA http://www.ascb.org/dora-old/files/SFDeclarationFINAL.pdf ). He argued for a shift from bad bibliometrics to webometrics or Metrics 2.0.

He talked about how open data leads to big data and how big data is about more than just science data.

He talked about how we need librarians to pay more attention to bibliometrics and their misuse. We need multi-source metrics. We need non-bibliometric metrics. We also need better attributions - the way we define authorship is too simple and lacks context. We also need to move from publish or perish to a situation where only excellence is recognized.

We are moving from journal-level metrics to article-level metrics. Likewise we are seeing a move from university-level metrics to professor-level metrics.

We have also moved from bibliometrics to altmetrics (not about publications) to webometrics and so on.

His recommendation is to move to usagemetrics where we learn to understand the variety of different types of metrics. In the ACUMEN project the idea of a personal portfolio was considered. A portfolio can be a richer representation. We can now have richer profiles on the web with the various web sites like academia.edu . He also talked about introducing the idea of ranks rather than using quantitative metrics only.
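To make the idea of ranks drawn from several sources a bit more concrete, here is a minimal sketch in Python. The researchers, metric names and numbers are all invented for illustration, and averaging per-metric ranks is just one simple way of combining sources; it is not a method Aguillo or the ACUMEN project proposed.

```python
from statistics import mean

def rank_positions(scores):
    """Map each name to its rank (1 = highest score); ties share the best rank."""
    ordered = sorted(scores.values(), reverse=True)
    return {name: ordered.index(value) + 1 for name, value in scores.items()}

def combined_ranking(profiles):
    """Rank every person on each metric, then order them by mean rank."""
    metrics = sorted({m for p in profiles.values() for m in p})
    per_metric = {
        m: rank_positions({name: p.get(m, 0) for name, p in profiles.items()})
        for m in metrics
    }
    mean_rank = {
        name: mean(per_metric[m][name] for m in metrics) for name in profiles
    }
    # Lower mean rank is better; names break ties alphabetically.
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))

# Hypothetical portfolio data, invented for this example.
profiles = {
    "ana": {"citations": 120, "downloads": 900, "mentions": 15},
    "ben": {"citations": 300, "downloads": 400, "mentions": 5},
    "eva": {"citations": 80, "downloads": 1500, "mentions": 40},
}

ranking = combined_ranking(profiles)
for name, score in ranking:
    print(f"{name}: mean rank {score:.2f}")
```

With these made-up numbers "ben" leads on citations alone but drops to last once downloads and mentions are ranked as well, which is the kind of shift a multi-source view is meant to surface.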

He gave a very interesting comparison of the different types of academic profile systems from the university systems to Google Scholar.

Then he talked about university rankings. His Ranking Web is designed to encourage universities to provide open data. See http://www.webometrics.info/ . He pointed out how in his ranking you find a lot of US universities in the top ranks, but you also find thousands at the bottom. By contrast all Swiss universities are good, even if none are in the very top. Which is important for a state - to have a few top ones and lots of

Geoffrey Rockwell: New Publics for the Humanities

I spoke about the need in the humanities to re-engage our publics and how we can use crowdsourcing to do so.

Aletta Bonn: Citizens create knowledge – knowledge creates citizens

Bonn of the Citizens Create Knowledge institute talked about the missed opportunity of citizen science. Lots of data is already contributed by volunteers. There is an old tradition of weather volunteers or bird watching.

Citizen science originated in the Anglo-Saxon world and now is spreading. She talked about various papers about citizen science and its opportunities, Understanding Citizen Science and Environment Monitoring.

It is also a political goal to involve citizens. The goals of citizen science should go from data acquisition and processing to true co-design and coproduction of research. There is tremendous potential for uptake and implementation of science if citizens are involved. Think of climate change.

She had a good hierarchy of participation from passive observation to active participation to co-production to co-design. Co-design is the top. When they asked citizens they found citizens were interested in all aspects of research right up to publication.

The Citizens Create Knowledge programme is developing a platform and guides. She mentioned that there are all sorts of opportunities for new sensors and new forms of research. Citizen science

She argued that CS is a reliable approach now that involves lots of German citizens. It is easy to devalue CS in the academy. The funding system lacks the capacity for co-design and co-production. Further, we need quality standards for data management, protocols for collection, and infrastructure. There are then legal, copyright, and privacy issues to be negotiated. Do we need insurance?

She talked about volunteer management and the need for a culture of recognition of volunteers. We can learn from sectors that see a lot of volunteers - there is a literature on how to manage volunteers with dignity.

She made a fascinating point that we can use media throughout the process in CS. They (news organizations and media) can be important partners.

Who will implement all the policies, infrastructure, and initiative? How can it be organized from the ground up? What stakeholders are there who could be involved?

There were questions about empirical data to back up the call for citizen science. These questions asking for proof seemed to me to miss the point. What proof is there that university funded research works so well? Why should these experiments have to demonstrate empirical data?

Alexander Grossmann: New Perspectives in Scientific Publishing

Grossmann is the President of ScienceOpen. He talked about how scientific communication has changed dramatically. There are now over 20 million active STM scientists worldwide and 8m humanities and social science researchers. 4m manuscripts are submitted in the sciences each year, with half of them rejected.

Scientists are frustrated by the slowness and lack of transparency in the publication system. It is also too expensive.

We need open access, immediate publication systems, global systems, open and transparent post-publication review and public discourse.

He argued that we have the infrastructure for open publication. Libraries have repositories and there are all sorts of accessible tools. The issue is discovery, aggregation and interoperation of all the existing utilities.

He gave us a tour of how you can use the various tools/services at hand. ScienceOpen lets you share an article so that people can help you edit it. It then lets you publish and provides editorial oversight. Once published you can then get peer-review on the article (after the fact). All of this is free to readers, but you (the author) have to pay.

Rodrigo Costas: Altmetrics: What is it, what do we know about it? And what can we expect?

Costas started with an introduction of altmetrics. Altmetrics started as an alternative to the peer review system which has limitations.

There is an altmetrics manifesto at http://altmetrics.org/manifesto

Altmetrics refers to many things - the combination of many different impact metrics, many of which are social. Twitter is a popular source to be combined with others.

When they compared the results of altmetrics with other ways of measuring impact (bibliometrics) they found interesting differences. People often cite a paper in social media for trivial reasons, while citation in an academic journal is different. The good thing is that altmetrics can inform us of new communities of attention. It can show us the impact in real time beyond that on our peers.

Discussion

The final session involved breakout discussions

Science 2.0 platform - what infrastructure is called for?
International School/Seminar on Science 2.0 - what sorts of training should be designed?
The future of the Science 2.0 conference - how should the conference evolve?
Science 2.0 roadmap - what can we learn from consultation and which issues should be addressed?

Thursday March 26

Eric Tsui: Cloud Computing and MOOCs for supporting Knowledge Work, Innovation and Learning

Tsui began by talking about the knowledge economy and Web 2.0.

The cloud allows us to move from a pre-payment to a post-payment system. Whiz kids can take a smart idea and develop an innovative service. Animoto's use of the elastic cloud is an example. They went from 5,000 users to 750,000 in three days. The cloud allowed them to expand without having to pay first (for lots of servers).

The cloud has led to changes in services - the ability to co-create value with others. It allows for open business models with new customer experiences.

The cloud connects computers, data and people at a massive scale. People predict that a third of our content will be in the cloud by 2016. He sees the cloud as an intelligent engine with low upfront costs that is scalable. We can create spaces where people can solve problems. It could be disruptive - allowing new collaborations to emerge.

The cloud intelligence or problem solving takes 3 forms:

Humans to solve (like Mechanical Turk or Galaxy Zoo)
Computation solutions
Human-computer co-operative problem solving

He gave some examples of problem solving in the cloud, like If This Then That.

His own work includes TaxoFolk which is a combination of taxonomy and folksonomy which gives you the best of both. He described a neat application that combines Del.icio.us bookmarking (folksonomy) with the Hong Kong taxonomy of tourist landmarks to get a hybrid that is useful to tourists.
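As a rough illustration of how a taxonomy and a folksonomy might be combined, here is a small Python sketch. It is not the TaxoFolk algorithm, which these notes do not describe; the tiny landmark taxonomy, the bookmarks and the title-matching rule are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical curated taxonomy: category -> landmarks.
TAXONOMY = {
    "Temples": ["Man Mo Temple", "Wong Tai Sin Temple"],
    "Viewpoints": ["Victoria Peak"],
}

# Hypothetical social bookmarks: (page title, user-supplied tags).
BOOKMARKS = [
    ("A morning at Man Mo Temple", ["incense", "quiet", "photo"]),
    ("Victoria Peak at night", ["skyline", "photo", "tram"]),
    ("Victoria Peak hiking loop", ["hike", "skyline"]),
    ("Best dim sum in town", ["food"]),
]

def hybrid_index(taxonomy, bookmarks):
    """Attach folksonomy tags to taxonomy leaves whose name appears in a title."""
    tags_by_landmark = defaultdict(Counter)
    for title, tags in bookmarks:
        for landmarks in taxonomy.values():
            for landmark in landmarks:
                if landmark.lower() in title.lower():
                    tags_by_landmark[landmark].update(tags)
    # Keep the taxonomy's structure and decorate each leaf with its tags,
    # most frequent first (ties broken alphabetically).
    return {
        category: {
            landmark: [
                tag
                for tag, _ in sorted(
                    tags_by_landmark[landmark].items(),
                    key=lambda item: (-item[1], item[0]),
                )
            ]
            for landmark in landmarks
        }
        for category, landmarks in taxonomy.items()
    }

index = hybrid_index(TAXONOMY, BOOKMARKS)
for category, landmarks in index.items():
    for landmark, tags in landmarks.items():
        print(f"{category} > {landmark}: {', '.join(tags) or '(no tags yet)'}")
```

The curated categories give a tourist a stable way to browse, while the user tags add the informal vocabulary ("skyline", "quiet") that a formal taxonomy would not contain.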

He then talked about the use of the cloud for education through MOOCs. He is using MOOCs to flip the classroom. He talked about a MOOC tsunami. He also finds MOOCs are a way of introducing staff to learning. He closed by talking about the Personal Learning Environment & Network (PLE&N) that his team has developed.

Stefanie Lindstaedt: Science 2.0 and Big Data

Lindstaedt is from the Know Center in Austria. The Know Center does research for data-driven business and big data analytics. Their goal is to help Austrian businesses to become data-driven. This is as much about human competencies as anything else. How do people learn and perceive large amounts of data?

She defined Big Data as Volume, Velocity and Variety of data. She prefers the definition: "Data unprecedented in its scale and scope in relation to a given phenomenon which allows for the generation of new knowledge." (Oxford Internet Institute, 2014) The trick is to find interesting patterns in the data. That takes human intelligence and judgement. The end is actionable knowledge - knowledge useful to users.

How then do you build a data-driven business (or project)? There are four central steps:

Provide appropriate data and IT infrastructure - some would argue that that means open data and cloud
Democratize the data within the company and encourage people to use it - this risks the data leaking
Enable experimentation with Data - this could mean creating new data
Support data-driven culture - one then needs to make this gathering

She then adapted this to talk about data-driven science. We have an enormous amount of data in the form of publications - there is too much.

We need to start by asking what we want to look at
We can create systems that give us overviews (She gave an example of a visualization, Kraker, 2013, Overview of Research Domain based on Usage Data)
Integrate facts with existing knowledge
Make facts available for visual analysis - they have created a query wizard and visualization wizard that make it easy to discover things

She talked about socializing research data. There are a lot of big data platforms, but few have social elements. They have created a data flea market called 42-Data where people can ask questions about data and get help. People can then donate the data and insight. This is a customer to customer business which raises the question about what the economic model is. Check out 42-Data - I think the idea has real potential and it reminds me of Many Eyes.

Her group is also looking at the research process. They are looking at processes and asking what can and has been supported by data or computing.

She closed with a question: Is there an open science data value chain? What are the processes that add value to open data (or through open data)?

In the question period Laurent Romary made a point about ethics and legislation regarding big data and mining. The EU is debating new data protection rules.

Eric W. Steinhauer: Some reflections on science 2.0 and the law

Science 2.0 leads to new collaborations. Tools like Mendeley are more than just information management tools; they are also ways of building collaborative communities. These amazing tools have legal implications - who is thinking about the legal implications? The lawyers are the bad guys, but they see and anticipate problems.

Steinhauer noted that legal issues are almost always national. His talk has 3 parts:

Science becomes digital
To quote or not to quote
License it!

Science in the 50s was of books and articles. It was tangible. Communication was by publication. Once you sold a book a library or individual could share the book. The deal was that the author granted rights to publisher and then it was understood what could be done with the book. You couldn't copy it, but you could sell the book or lend it.

Now things are different. Who owns and controls usage of content in Open Science? There isn't exhaustion in the same way. (Exhaustion is the principle that once a copy has been sold the rights holder can no longer control its resale or lending.) Every use of a PDF is actually a copy - even accessing a web page.

Now publishers restrict access by building barriers. They have become gatekeepers. Being a gatekeeper doesn't help Open Science. The publishers have gone from facilitators to blockers.

He gave an example from his life. He took a picture of a portion of a book and tweeted it. Taking a picture of the book is legal, but tweeting is a form of reproduction. The only way what he did could be legal is if one grants a limitation to copyright for quotation. Quotation is allowed if it is in an independent work of science or language. Is a tweet such a work? Steinhauer thinks a tweet isn't really a work. He noted that in the US there is the "fair use" clause which could be used to justify the tweet.

Science 2.0 happens in a specific digital environment with things like Twitter. There are terms and conditions attached to these tools. Facebook has a term that they are granted a license to your data. Same with Twitter. The companies need rights to run their services (and then to make money). But Steinhauer doesn't have the right to grant a right to Twitter for the quote. Open Science and its particular tools have all sorts of legal implications. That doesn't mean that Open Science will grind to a halt.

Open Science needs people to grant licenses that allow the information to flow. We need to consider the legal situations of our content. If we want our content to flow and be quoted then we need to grant licenses.

I asked about moral rights and if they could be used (in Germany) to block/guide how one's work is mined. On Twitter I was given a link to the German language on moral rights.

Laurent Romary: The development of a comprehensive open access 2.0 policy

Romary talked about the French context and the Inria open data policy. He believes that we need open access to get open science. Romary is an advisor to Inria on policy. What do we want?

We want researchers to have access to knowledge and data
We want researchers to have the means to disseminate their new contributions

The problem is that the costs of publication have exploded. We also have a problem that the corpus of knowledge is fragmented. You can't go to one place.

In France there has been a national open access policy. See the declaration on open access - "L'information scientifique est un bien commun qui doit être disponible pour tous." In France they have interesting freemium models like OpenEdition.org. There is a national archive, HAL, where researchers can deposit scholarly work.

Green open access is the baseline. If you want to report a paper in your annual report you have to deposit it. This forces researchers to deposit if they want credit. France is close to success in information sovereignty. It is interesting that it is important to France to have a national system.

Some of the issues include the need for training of digital curation librarians. Embargoes are not part of the policy - they don't have them. Romary believes that a central system is lower cost than the decentralized models.

Some of the enhanced services possible are:

GROBID extracts structured information from flat PDFs to enable analytics. Reconstituting structured data from PDFs is like trying to reconstruct a cow from hamburger. GROBID tries to do that - extracting bibliographic references, for example.
anHALytics provides enhanced reporting and studies on scientific activities. They can track keywords and emerging trends.
They now allow people to do search on HAL. They use NER to tag articles and you can search using the metadata.
They can reduce the costs of starting an electronic journal as HAL has a lot of the functions. The editorial workflow can be handled by HAL.

He talked about episcience issues. There are issues around sustainability - centralized and national archives are more likely to be sustained.

Lightning Talks

We then had a series of lightning talks.

Christoph Schindler: Liquid Semantics in a Collaborative Research Environment

Schindler talked about a research environment that tries to bridge the new possibilities of data analytics with qualitative approaches. Researchers can semantically browse and annotate. Scholars can create their own codings and analyze in Semantic CORa.

Peter Mutschke: PIDs4SOM. Persistence of Scholarly Content on the Social Web

Mutschke talked about the increasing importance of research blogs. They are trying to build infrastructure that will increase the persistence and citation of the social web.

Sylvia Künne: Altmetrics for large, multidisciplinary research groups: Comparison of current tools

Künne compared different altmetric tools. They used as their test set the research of Leibniz institutes. Plum Analytics, ImpactStory, Altmetric Explorer, and Altmetric Explorer @ Webometric Analyst were the tools tested.

Benedikt Fecher: Data sharing in academia. Results from an empirical survey among researchers in Germany

Fecher started by commenting on how people preach data sharing, but few practice it. They surveyed academic researchers. Some of the results:

Some worried that publishing data could lead to others publishing before them
Research is a reputation economy - people are motivated by fame
79% said that data citation would motivate them to share

See http://data-sharing.org/

Sonja Utz: Academics’ Use of ResearchGate

Utz talked about academic use of ResearchGate. They now have a ResearchGate score. RG is perceived as helping promote their research. Few felt it helps them get new ideas.

What matters on RG is your network in terms of your score.

Daniela Pscheida: Social Media and Web-based Tools in Academia 2013/2014. Results of the Science 2.0-Survey

They run an annual survey of the use of web-based tools and social media among scientists. Some results:

Wikipedia is now part of daily work of scientists
Social media have a low level of active usage.
Microblogs and social network sites seem to play a role in communication

You can download their data.

Doreen Siegfried: Professional usage of selected Social Media services. First findings from the Goportis II study 2015

Siegfried talked about a survey Goportis II. In 2013 they found the most used tools were Wikipedia, tools for sharing data, professional networks ... Now they are asking what people use these tools for. Some results:

Wikipedia is used many times a week
Dropbox is used to share research data

There was then a poster session for the lightning talks so we could follow up.

Matthew Hiebert: Social Knowledge Creation in Online Research Environments

Hiebert from the Electronic Textual Cultures Lab at the U of Victoria talked about the work he has been doing on the ITER project. He started by explaining the digital humanities, reminding us that we have been using computers since the 1940s.

His research considers the messiness of knowledge production and representation. He talked about how social media shape research. To some extent researchers have always used social media like correspondence as part of research. Think of the Republic of Letters.

The ETCL has experimented with social media in the Devonshire Manuscript project. They created an experimental social edition of the Devonshire Manuscript. This edition shifted power from editors to a community.

He provided an interesting taxonomy of forms of citizen research:

Contributory projects where the public contributes data through a system designed by researchers
Collaborative projects are designed by professional researchers
Co-Created projects where citizens and researchers co-design a project from the ground up

He then talked about the importance of iterative critical making. At the ETCL they are creating an environment that facilitates co-creation that is geared towards building communities of practice with strong links rather than lots of weak links.

Within the Iter Community they have a project called the Renaissance Knowledge Network. They integrate tools into the Iter content. Humanism for Sale is an example where they put a book up in an open annotation format using CommentPress.

Christin Seifert: Personalised Interactive Access to Digital Library Content — Lessons Learned in the EEXCESS Project

Seifert started by asking us what the problem of the internet is. She says it is the long tail effect. The digital libraries, museums and archives with good content have few users. She gave an example of how we might be able to bridge the distance between the user and reliable data. Google is optimized for highly used resources, not the scholarly resources. What if we could connect users to good content from where they are?

The approach is to locate users, find out about their needs, and then meet them. They start by locating users in the channels they are in. They have browser extensions for users to use to then explore using research resources.

They imagine automating queries if they can watch users. User needs seem to be entity based. They can identify the right content for users using entities. They use visualizations to then help users, but they mock up the visualizations first.
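To picture what an automatic, entity-based query could look like, here is a toy Python sketch. It is an illustration under loud assumptions and not the EEXCESS implementation: the entity detection is a crude capitalised-phrase heuristic, the function names are hypothetical, and the page text is an invented snippet.

```python
import re
from collections import Counter

# Runs of two or more capitalised words, e.g. "Devonshire Manuscript".
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")

def candidate_entities(text):
    """Return multi-word capitalised phrases with their frequency."""
    return Counter(ENTITY_PATTERN.findall(text))

def build_query(text, limit=2):
    """Turn the most frequent candidate entities into a quoted OR query."""
    entities = candidate_entities(text)
    top = sorted(entities.items(), key=lambda item: (-item[1], item[0]))[:limit]
    return " OR ".join(f'"{name}"' for name, _ in top)

# Invented snippet standing in for the page a user is currently reading.
page = (
    "Notes from a talk mention the Devonshire Manuscript twice: "
    "the Devonshire Manuscript social edition came out of the "
    "Electronic Textual Cultures Lab."
)

print(build_query(page))
```

A real system would need proper entity recognition and disambiguation; the point here is only that the query can be derived from what the user is reading rather than typed by them.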

The EEXCESS framework allows them to inject content into social media channels. To quote from the web site: http://eexcess.eu :

The vision of EEXCESS is to push high-quality content from the so-called long tail to platforms and devices which are used every day. Instead of navigating a multitude of libraries, repositories and databases, users will find relevant and specialised information in their habitual environment.

The idea is great. Will it work? I've seen other neat web overlays not last.

Klaus Tochtermann: Conclusion

Tochtermann ended the conference by telling us about the next one which will be in Cologne.