NSF/JISC Repositories Workshop
Bas Cordewener,
SURF Foundation
April 10, 2007
This is a compilation of (slightly edited) existing documents – not
to be reproduced in this form.
For the original documents: see www.surf.nl/en and www.knowledge-exchange.info
Other relevant websites: www.darenet.nl/en; www.hbo-kennisbank.nl (Dutch)
I. Current SURF programme: SURF Share (to be developed)
Free Access to Research Results
Enhanced Publications
Collaboratories
The Digital Workbench
Review, Access and Impact
Research Information Systems and Infrastructure for Universities of Applied Sciences
International Perspective
II. Former SURF programme: SURF Dare (now a service)
Showcase your work
Selective Search
Window to your Field of Expertise
The Power of Standards
Your Work, your Rights
Digital Sustainability
III. Knowledge Exchange recommendations on IR (to be transformed into activities)
e-Theses
Open Archives Protocol for Metadata Harvesting
Usage Statistics
Author Identification
Exchanging Research Information
Research Paper Metadata
Annex A. Petition to European Commission (to support an OA supportive course)
Annex B. License to Publish (OA copyright agreement for HEI publications)
Chapter 1.
Current SURF programme: SURF Share
(Condensed version)
1.1 Management Summary
“Our mission of disseminating knowledge
is only half complete if the information is not made widely
and readily available to society. New possibilities of knowledge
dissemination not only through the classical form but also
and increasingly through the open access paradigm via the
Internet have to be supported. We define open access as a
comprehensive source of human knowledge and cultural heritage
that has been approved by the scientific community.”
Berlin Declaration on Open Access to
Knowledge in the Sciences and Humanities. October 2003. Signed in the Netherlands by
KNAW, NWO, SURF and the universities of Groningen, Leiden,
Amsterdam, Delft, Eindhoven, Wageningen and Utrecht.
It is of paramount importance in an internationally competitive
knowledge economy that the knowledge that is created finds
its way to the research community, to society and to private
enterprise. Providing broad access is a crucial requirement,
and this can only be achieved in a communal approach. The SURF
General Board approved the new SURF Strategic Plan 2007 ‘Thinking
Ahead’ and the SURF Working Plan 2007-2010 ‘Forceful
Action’ on 26 April 2006. This approval shows the commitment
of all universities, universities of applied sciences, the
Royal Netherlands Academy of Arts and Sciences (KNAW), the
Netherlands Organisation for Scientific Research (NWO), The
National Library of the Netherlands (KB) and TNO to a strong
effort from SURF in the field of scholarly and knowledge communication.
The SURFshare programme 2007-2010 provides the structure in
terms of time and funding of the Strategic Plan and the Working
Plan for the Platform ICT and Research. It was approved, conditional
on funding, by the Board of the Platform ‘ICT and Research’ on
13 September 2006.
The SURFshare programme is, among other things, a follow-up
to the SURFdare programme 2003-2006. Notable achievements
of that programme are the creation of a national knowledge
infrastructure built from a network of interoperable
institutional repositories (digital warehouses), the development
of over a dozen decentralised services, and DAREnet,
the national ‘science showcase’, which comprises ‘Cream
of Science’, the National site for Doctoral Theses and
the Knowledge Bank of the universities of applied sciences,
among others. The DARE programme is an international example
and acts as a model for the European Union’s DRIVER programme.
In the preceding DARE programme SURF distinguished between
DARE and the other activities of the Platform ICT and Research.
This distinction is no longer present. The SURFshare programme
presented here comprises all activities of the Platform. This
creates increased synergy and places the activities of the
Platform, such as supporting researchers in exercising their
copyright, in the service of the overall SURFshare programme.
ICT speeds up the traditional communication processes, and
also changes the nature of the knowledge chain. The roles and
tasks of researchers, institutions, university libraries and
publishers are changing. The research and publishing processes
are becoming more interwoven. For this reason, and because of the
increased possibilities for knowledge sharing and dissemination,
tools (models, algorithms, visualisations), research data and
traditional publications are blending together. For the next four
years SURFshare has committed itself to achieving a shared
infrastructure to support the communication of and access to
scientific results. The researcher will be the focus. He or
she must be supported in such a way that the results of the
research are easy to create, record and disseminate, to achieve
as much of an impact as possible.
More than in the preceding programme the research and communication
processes in the SURFshare programme 2007-2010 are viewed as
a unity. The activities comprise several elements in the scholarly
communication cycle:

Figure 1. SURFshare activities by element in the scholarly
communication cycle in 2003-06 and 2007-10
Over the coming period SURFshare will tackle the following
issues:
- Innovation of parts of the research and communication cycle by means
of ‘enhanced publications’, among others;
- Creating and assessing collaboratories that allow researchers to collaborate
and share their sources;
- The development and application of several tools and applications
for quality control, dissemination and impact to support
the researcher in an open access environment;
- Registration of research data and achieving long-term access and data
curation.
Enhanced publications contain data and visual information
that do not fit traditional printed articles. In a sense the
traditional scientific paper itself is transformed into the
metadata of the published research result. Addition of the
underlying data and research tools to the article makes it
easier to verify or reproduce results and to build on them.
A collaboratory, or virtual research environment, is a digital,
web-based collaborative association of researchers at several
locations that allows them to work together and to share knowledge
and sources. SURF takes the position that the administrative
tasks in taking part in a collaboratory and in producing an
enhanced publication should be limited. A digital workbench
will be developed to provide an optimum of support for the
researcher in his work. This workbench will contain several
applications, including the means to create enhanced publications
and to simplify the authoring and review processes.
The added value of the SURFshare programme for researchers
lies in an improved access to research results and an improved
dissemination and impact of their own research results. Additionally,
collaboration through collaboratories and open access result
in increased productivity as well as enhanced and accelerated
research. The added value of SURFshare for universities and
society in general in the Netherlands lies in bringing together
the different public (and private) organisations, the focus
on national coordination, developing national and international
standards and providing an incentive for institutions. Open
access and a closer involvement of universities of applied
sciences contribute to an improved valorisation of knowledge.
SURF guarantees the institutions and funding providers an effective
(and justified) use of the funds through a suitable structure
of the organisation and close monitoring of the programme.
1.2 SURFshare Vision of the Future
1.2.1 Free access to research results
The completion of the DARE programme has provided the foundation
for the SURFshare programme 2007-2010. Over the next few years
the SURFshare programme will focus on scholarly and knowledge
communication and the transition processes that are taking
place in this field. In this vision of the future the researcher’s
workflow takes centre stage (see figure 1 in chapter 1).
The preceding DARE programme emphasised the creation of interoperable
repositories that reinforce the communication cycle, specifically
registration, long-term and dynamic archiving and dissemination.
In the next timeframe these repositories will be the foundation
for further innovation of scholarly communication and collaboration
in the field of research, compiling complex (‘enhanced’)
publications, and the review and publication processes. The
tasks and roles of researchers, institutions, university libraries
and publishers are changing in these processes. The research
and publishing processes become even more interwoven. This
phenomenon and the increased possibilities in the field of
knowledge sharing and dissemination remove the absolute distinction
between research data and the traditional publication as research
output. Internationalisation is not an option but a fact. The
success of DARE has attracted international attention: countries
such as Germany and the United Kingdom are starting similar
projects. The European DRIVER project takes DAREnet as the
model for a European network of at least 50 repositories. The
DRIVER project is funded partly from the 6th EU Framework Programme
and will run until the end of 2007.
1.2.2 Enhanced Publications
Through SURFshare researchers (and teaching staff) will gain
easy and broad access to as many sources as possible, not only
to publications, but also to the underlying collections of
data, models and algorithms in all their shapes and sizes.
The notion of data is interpreted to also include sources for
the humanities and the social sciences such as large bodies
of text, freely accessible statistical collections and simulations.
SURFshare does not focus on the immense field of research data
in general, but restricts itself to access to data that is
related to a publication. Research data, models and visualisations
are shared and included in ‘enhanced publications’ in
combination with the text of the article. Naturally, these ‘enhanced
publications’ are published electronically, and they
comprise data and visual information that cannot fit the traditional
printed articles. In a sense the traditional scientific paper
itself is transformed into the metadata of the published research
result. Addition of the underlying data and research tools
to the article makes it easier to verify or reproduce results
and to use them. Enhanced publications strengthen the quality
and reliability of the publication and make it easier to build
on them.
The supporting infrastructure (architecture, protocols, metadata)
to achieve enhanced publications is generic in nature. The
use and development of the enhanced publications is specific
to each scientific field because the sources and publications
are discipline-specific and the research communities are usually
organised by their discipline. Articles in the field of STM
are published in refereed journals and the topical scientific
output quickly becomes outdated. This makes it essential to
support the dynamic nature of enhanced publications, using
long-term and flexible relations to changing sets of data.
In the humanities publication of results in digital and open
access journals is on the increase, but the traditional monograph
remains current. Reconstruction of the scientific debate and
support for critical review of sources are important in this
field. A good prelude to enhanced publications in the humanities
is provided by the digital support of Frits van Oostrom’s
Stemmen op Schrift.[1]
The website www.stemmenopschrift.nl allows the reader to reconstruct
the creation of the publication, and it includes images of
sources and of Van Oostrom’s notes. The SURFshare programme
aims to go a step further by reinforcing quality control and
knowledge sharing for the various fields at both a national
and an international level. Institutions and research communities
will not only be supported in the creation of enhanced publications
with respect to organisation and technology, but also in Digital
Rights Management and exercising copyrights.
1.2.3 Collaboratories
The collaboratory is an extraordinary application that reinforces
the research process. It is a digital, web-based collaborative
association of researchers at several locations that allows
them to work together and to share knowledge and sources. It
is the right instrument to enhance and accelerate research
in both national and international environments. Collaboratories
transcend the institutional boundaries and they can be created
for specific disciplines or areas of interest. This makes them
suitable for use as a unifying instrument for instance for
graduate schools. The business world has been using collaboratories
for R&D for a while, mainly because they do not depend
on time and location and allow efficient sharing of the research
facilities. Collaboratories are a relatively unexplored territory
in higher education beyond the exact sciences. In principle
they are suited for all domains of research, especially for
the smaller and interdisciplinary areas: they can provide an
incentive by combining knowledge and bringing researchers together.
Components of enhanced publications can be put to use in collaboratories
because they will act both as the published result as well
as the input for new (enhanced) publications.
1.2.4 The Digital Workbench
SURFshare also includes the development of a digital workbench
for researchers. The digital workbench is a working environment
with an integrated set of tools that support researchers in creating,
registering and disseminating their research results. Through
this, SURFshare facilitates the researcher in creating enhanced
publications and in taking part in collaboratories. The applications
that are part of the workbench will include authoring tools that
allow the researcher to combine text, data and visual objects,
as well as showcases, search functions, notification services
and tools for managing copyrights (DRM) and the registration
and measurement of the impact of the publications. SURFshare
will integrate these applications into the workbench in a manner
that achieves a better efficiency of the activities
and administrative tasks that derive from publication, registration
and dissemination of the research results. A number of these
tools will be specific to the research discipline. They will
have to comply with international standards and protocols.
SURF will take a generic approach whenever possible, complemented
with a discipline-specific approach when necessary.
1.2.5 Review, Access and Impact
The development of new systems for quality control (comparable
to peer review for traditional publications) and for measuring
the impact are essential for achieving open access. The use
of new applications for this purpose such as enhanced publications
can be reinforced by developing new and improved tools for
quality assessment. In order to increase the facilities available
to the researcher, SURF is developing review tools within
the framework of the corresponding programme that will support
the researcher in providing formal comments on research results.
These review tools are linked to the development of search
facilities. SURF also stimulates the development of Open
Source Software for measuring the usage of publications (citations,
downloads, requests). The basic premise in the development
of these tools is that the researcher is supported in publishing
his works while retaining his copyrights.
For the next few years the emphasis will be placed on developing
and testing these applications. This development will take
place in an international context. SURF is striving for more
collaboration with academic societies and academic publishers.
SURF does not exclude collaboration with private parties
such as Springer and Google that support open access and
provide increased diversity in the offering of research output.
In these matters SURF takes the position that such collaborations
should lead to open access, and that public costs should
be decreased rather than increased where possible.
In practice it is not just organisational and infrastructural
developments that achieve open access, but also the removal
of formal and legal obstacles. Over the past few years SURF
has achieved results that reinforce open access and the position
of the scientist in exercising his copyrights. For instance,
SURF achieved consensus on the ‘Zwolle Principles’ with
leading publishers. These Principles focus on maximum accessibility
to academic results without impeding their quality or academic
freedom, balancing the various interests[2].
Publication in academic open access journals is not always
possible or desirable. Many articles are therefore still
being published in scientific journals that are distributed
to the readers through subscriptions. In cases where works
are not published in open access periodicals, the recommended
strategy to achieve open access anyway is that the researcher
archives the works himself and makes them available through
a repository. SURF will support the researchers in increasing
the effectiveness of exercising their copyrights through
the upcoming SURFshare programme. Especially the inclusion
of enhanced publications in repositories will result in new
legal challenges.
1.2.6 Research Information Systems and Infrastructure for Universities of Applied Sciences
Good systems management and innovation of the underlying
infrastructure are essential for the development of new applications.
At the local level the repositories are linked to Metis,
the national management information system for research information.
Universities use Metis to register the results of projects
and scientific research, for instance. The DARE programme
has shown that the research information systems are an essential
part of the knowledge infrastructure of the institutions.
The strategic development of Metis takes place within the
SURFshare programme and follows the development of international
standards in order to facilitate the exchange of research
information between countries.
The universities of applied sciences will develop their
own research tradition in the coming years, and they will
want to connect to the repositories. The current DARE project
has already delivered a repository containing exam papers,
research papers, presentations and articles of seven universities
of applied sciences. A number of universities of applied
sciences are also taking part in the corresponding LOREnet
project, which is developing an infrastructure of research
repositories and related services. This fabric of repositories
is being expanded for the universities of applied sciences.
The Knowledge Bank for universities of applied sciences makes
it easy to find and access the output of these institutions.
The applications and services of universities of applied
sciences will be designed differently from those for universities,
as they undertake a large amount of research aimed at the
working practice and they exchange knowledge extensively
with professional practice institutions. For instance, collaboratories
will provide the possibility to reinforce the exchange of
knowledge with private enterprise and public organisations.
Universities could learn from the universities of applied
sciences regarding the dissemination of knowledge to society,
through lectorships, knowledge circles and other knowledge
links.
1.2.7 Internationalisation
The internationalisation of research is not an option but
a fact. The SURFshare programme therefore aims to reinforce
research in the Netherlands by improving the dissemination
of knowledge and by collaborating in an international context.
It has recently become clear that the 7th EU Framework Programme
will focus heavily on expanding the European knowledge infrastructure,
the so-called European Research Area. Networks of (data)
repositories will play an important part. The DRIVER project,
with SURF and participants from seven other countries, is
intended to be a prelude to a European-wide collaboration
in this field. SURFshare is therefore a pathfinder in the
field of institutional repositories.
A consortium of institutions from the Netherlands will be
established to take the initiative in submitting, designing
and undertaking projects in the 7th EU Framework Programme.
These projects will focus on improving the facilities for
researchers to gain access to research results, to undertake
research in collaboration and to register and provide access
to the results of their research within an international
setting. This specifically concerns enhanced publications
and collaboratory applications.
In the coming years SURF will continue the collaboration
with similar organisations in Western Europe (United Kingdom,
Germany, the Scandinavian countries) and the United States
through joint initiatives, projects, conferences and workshops.
A strategic partnership has been established with JISC, SURF’s
sister organisation in the United Kingdom. SURF has founded
the collaborative organisation, ‘Knowledge Exchange’,
together with the Deutsche Forschungsgemeinschaft, Denmark’s
Electronic Research Library and JISC. SURF also participates
in international working groups in the field of standards
and protocols, interoperability and intellectual property
rights (IP).
Chapter 2.
Former SURF programme: SURF Dare
“Our mission of disseminating knowledge is
only half complete if the information is not made widely
and readily available to society.”
Berlin Declaration on Open Access
to Knowledge in the Sciences and Humanities. October
2003. Signed in
the Netherlands by KNAW, NWO, SURF and the universities
of Groningen, Leiden, Amsterdam, Delft, Eindhoven,
Wageningen and Utrecht.
“Freely sharing data increases the productivity
of research”
Maria van der Hoeven, e-data&research, June 2006
All universities in the Netherlands, as well as the
KNAW (Royal Netherlands Academy of Arts and Sciences),
NWO (the Netherlands Organisation for Scientific Research)
and the Koninklijke Bibliotheek (National Library of
the Netherlands) created a joint network of repositories,
in the years 2003-2006. In doing so they have made many
results of scientific research available to the public
through open access.
This joint DARE programme, coordinated by SURF, has
given the Netherlands a prominent position in the innovation
of scientific information provisioning.
2.1 The added value of repositories
2.1.1 Showcase your work
Repositories aim to provide your scientific work with
optimum visibility: not only for colleagues in your field
of expertise, but also for other interested scientists,
students and the public. Thanks to the DARE programme
each university in the Netherlands has such a repository,
which can be accessed by everyone over the Internet.
There are no limitations to the nature of the stored
materials: besides articles and research reports
the repositories also contain datasets and multimedia
content.
The Netherlands is the first country in which all universities
and national research institutions participate, and moreover
in which all repositories have been interconnected into
a single transparent network. The repositories are compliant
with international standards. This means that they are also
easily accessible using search engines such as Google
Scholar.
Research has shown that digital publications are read
and cited more, due to their improved accessibility.
Placing a publication in a repository maximises this
effect. Moreover, it provides a reliable indication of
quality that distinguishes the publications from other
materials that can be found using search engines, resulting
in a positive effect on the ranking within the search
results.

nOA=non-Open Access, OA=Open Access; Gunther Eysenbach, ‘The
open access advantage’, in: Journal of Medical
Internet Research, 2006 vol. 8
2.1.2 Selective search
The Dutch repositories have their own search engine:
www.DAREnet.nl. DAREnet offers direct access to a rich
collection of scientific materials that is guaranteed
to be freely accessible.
DAREnet also allows you to search specific collections.
One such collection is Cream of Science, containing the
complete works of over 200 top scientists; another collection
is the Promise of Science with doctoral e-theses. And
NARCIS, the gateway to Dutch scientific information,
can be accessed through DAREnet.
A derived service is the HBO Kennisbank (‘Knowledge Bank
for the Universities of Applied Sciences’), a network
of seven such universities, which provides access to
theses and other knowledge products.
2.1.3 Window to your field of expertise
Institutions in various fields of science have seized
the DARE programme with both hands to combine their output. ‘Connecting
Africa’, a service for African studies, is a website
with research results of Africanists from both Dutch
and international universities.
Other initiatives are DARLIN (Dutch
ARchive for Library and INformation sciences), which
provides access to publications in the field of library
and information science in the Netherlands; the
e-Depot Nederlandse Archeologie (e-repository for Dutch
Archaeology), which makes accessible primary data of excavations,
regional assessments and material studies; and the Utrecht
Law Review, which offers scientists in the legal domain an international
showcase for digital publications.
2.1.4 The power of standards
The power of DARE lies in its choice of clear standards.
For example, a system of unique authors’ names
has been developed which allows a scientist to have materials
in multiple repositories without losing track.
Due to the standardisation of DARE it is easy to connect
all sorts of services to the repositories that will simplify
your work. A sample of the services that have been developed:
- Several universities offer the possibility
to generate overviews of publications for personal
web pages from the repository.
- ARNEX (Agricultural Repository News Exchange) offers RSS-type subscriptions
to research results at the major international agricultural
institutions.
- PROMAS offers quick and easy creation of your academic profile.
- Text enrichment in Virtual Communities allows you to enrich
documents with annotations that can be used in a virtual
community.
- The repositories are automatically linked to Metis, the Dutch
national university management information
system for research information.
The above mentioned examples of services are available
on DAREnet: Services.
2.1.5 Your work, your rights
You may ask: does my publisher allow publication in
a repository? In principle they do, unless you have an
agreement in which this is explicitly excluded. It is
therefore important to retain control of your own work.
To do so you can use the Licence to Publish
that was developed in the context of DARE. With this licence
you allow your publisher to publish your work, while
retaining all other rights yourself. The Licence is available
at DAREnet and at copyrighttoolbox.surf.nl.
2.1.6 Digital sustainability
All works that are published in the repositories are
automatically included in the e-depot of the National
Library of the Netherlands. This ensures that the digital
materials remain legible and accessible, even with future
technologies, so your work is guaranteed to remain available
in the future.
Chapter 3.
Knowledge Exchange workshop recommendations
3.1 Institutional Repositories: e-theses
Executive Summary
The workshop showed that e-theses are an important part
of a university’s research output, yet they are not
sufficiently integrated into a European repositories
infrastructure, nor sufficiently searchable for e-theses-specific
information. In preparation for the e-theses strand,
a small demonstrator was built to show how an integrated search
using the OAI-PMH could offer a retrieval
portal on a European scale, and where the difficulties
in terms of metadata lie.
The participants in the discussion agreed
that e-theses should be seen as an integral part of broader
institutional repositories. As such they should not have
a metadata set of their own, but be integrated into a metadata
profile for the whole repository. It was suggested to
develop a European e-prints application profile. Such
a profile should work out the very specific metadata
that only apply to doctoral e-theses.
During the workshop a start was made on identifying
key issues in handling doctoral e-theses and prioritizing
those issues. The resulting prioritized list can be used for planning further
European and country-specific activities to reach the
goal: making e-theses in Europe easier to retrieve and
therefore more visible via institutional repositories.
The results of the demonstrator and discussions will
be further elaborated and presented as the
“European e-Theses Demonstrator” at the ETD 2007 conference in
Uppsala.
Summary of recommendations
The discussions at the workshop brought the following
results:
- E-theses have to be seen as part of an overall institutional
repositories infrastructure and content. They should
not be handled differently from other scientific and
scholarly e-papers.
- The demonstrator showed that it is generally possible
to harvest European e-theses using the OAI-PMH.
However, the study and the discussions showed that
Simple Dublin Core is not enough. The participants
agreed that a richer metadata set is necessary to
offer a good-quality retrieval portal.
- Country-specific best-practice examples, such as the
XmetaDiss approach in Germany and the uketd metadata
set in the UK, were presented and compared to
older e-theses-specific metadata sets such as the NDLTD
etd metadata set; the latter was seen as outdated.
- It was suggested to develop a European e-theses
application profile for metadata at a European level,
meaning within a European working group. It was suggested
that the GUIDE group could be a suitable home for
such further activities.
- Investigations have to be made into handling e-theses,
in terms of metadata, as part of e-prints. This means that
e-theses-specific information has to be encoded
into metadata. A first list of e-theses-specific
elements was produced during the workshop.
- National authorities should fund national development
of richer metadata schemes. This makes it possible to build
a demonstrator that shows what can be achieved in the
short term. Building 4 or 5 filters
at the demo-service level is the easiest option, but it is not scalable.
The main goal is to reach a higher level based on
richer national metadata, which is a good first
step towards a European e-prints application
schema.
- The key issues in handling doctoral e-theses have
been identified by the workshop strand participants
and they have been prioritized as follows:
- Richer metadata (Use qualified dc; Compound docs;
Complex objects; Technical and preservation metadata)
- Wider IR perspective (How to integrate richer metadata;
Is document type a good choice to base services
on; How to get them; Compatibility)
- ETD specific issues (Degree, level, definition;
Various dates; Define minimal requirements)
- Cultural aspects (Language; Interpretation and
definitions; Local versus national services)
- Audience (Keep it simple; Focus on added value
metadata; Think about target groups)
- Subject classification
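The point about Simple Dublin Core can be illustrated with a small sketch. It parses an OAI-PMH ListRecords response in the oai_dc format and collects the Dublin Core fields of each record. The sample record, its identifier and the helper names (parse_records, looks_like_thesis) are assumptions made up for this illustration; they are not taken from the workshop demonstrator. Only the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response with one thesis record (illustration only).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
        <datestamp>2007-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Doctoral Thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:date>2006</dc:date>
          <dc:type>Doctoral thesis</dc:type>
          <dc:language>en</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return a list of dicts with the identifier and Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        fields = {}
        if dc is not None:
            for element in dc:
                # Strip the "{namespace}" prefix, keeping e.g. "title".
                name = element.tag.split("}", 1)[1]
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


def looks_like_thesis(record):
    """Heuristic: guess from the free-text dc:type whether this is a thesis."""
    types = record["fields"].get("type", [])
    return any("thesis" in t.lower() or "dissertation" in t.lower() for t in types)


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["identifier"], rec["fields"], looks_like_thesis(rec))
```

In this sketch the only hint that a record is a thesis is the free-text dc:type value, which has to be matched heuristically, and nothing in the record states the degree level or the awarding institution. That is the kind of gap a richer, thesis-aware metadata profile would be expected to close.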
3.2 Institutional Repositories: Open Archives Protocol
for Metadata Harvesting
Executive Summary
The Open Archives Protocol for Metadata Harvesting has
been used for several years as the to ensure
interoperability between institutional repositories.
It has allowed to develop a number of services. The
number of repositories being developed is growing and
the variety of services for scholarly material is increasing.
The OAI-PMH strand aimed to analyze the current implementations
of the protocol in the IRs of the Knowledge Exchange
countries, to identify the major issues they encountered
and, finally, to consider what changes in the
deployment of the protocol are needed to support
the new requirements of scalability and services for
IRs in the next few years.
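For readers unfamiliar with the protocol, the following is a minimal sketch of the harvesting side of OAI-PMH: building a ListRecords request and reading record identifiers and the resumption token from a response. The base URL, the sample identifiers and the token value are invented for illustration, and the response is a hand-written fragment rather than output from any real repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH data provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument in the protocol:
        # it is sent alone with the verb, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (identifiers, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the end of the list.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hand-written sample response (hypothetical repository and identifiers).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:repository.example.org:1</identifier></header></record>
    <record><header><identifier>oai:repository.example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://repository.example.org/oai", resumption_token=token)
```

A harvester repeats the request with each returned token until the token is empty; scalability concerns of the kind raised by the strand typically show up in this loop when repositories grow large.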
Summary of recommendations
- One project of national OAI knowledge bases
- Communication with the DRIVER project to coordinate
the creation of data provider guidelines for IRs.
Creation of mechanisms, maybe through the knowledge
bases, to enforce higher degrees of compliance of
KE data providers with the guidelines.
- Contact with the DRIVER project on the establishment
of data provider guidelines
- Potential additions or definition of objectives
for compliance levels
- Definition of mechanisms to encourage/enforce
the guidelines in the KE countries.
- One project on persistent identifier implementation
- A meeting of KE stakeholders on persistent identifiers
to establish a common solution and a common strategy
- Creation of a test bed that would involve actors in the
different countries
3.3 Institutional Repositories: Usage Statistics
Executive Summary
An understanding of the use of repositories and their
contents is clearly desirable for authors and repository
managers alike, as well as those who are analysing
the state of scholarly communications. A number of
individual initiatives have produced statistics of
various kinds for individual repositories, but the
real challenge is to produce statistics that can be
collected and compared transparently on a global scale.
This report details the steps to be taken to address
the issues to attain this capability.
Summary of recommendations
1) ACTION: determine a practical definition of ‘usage’
- Decide meaning of ‘use’
- Produce an event-based, web-log-based format for sharing
‘usage events’ to deliver many profiles (COUNTER, awstats,
JISC Monitoring, DRIVER etc)
2) ACTION: define objects to be counted
- Lobby COUNTER to add article level stats
- Create vocabulary of academic output types in conjunction
with Research Paper Metadata group
3) ACTION: standard reports
- Agree on a small set of standard useful statistical
reports that repositories should produce
4) ACTION: agree policies for stats
- Compliance with local laws on e.g. privacy
- Enhance SHERPA policy tool
5) ACTION: collection and aggregation
- Agree de-spidering process (first draft agreed)
- Specify issues of aggregation and deduplication for
later study
6) ACTION: collation with external sources
- Talk to COUNTER about complex objects
- Set up COUNTER-IR to shadow the publisher group
- JISC Project IRS to provide initial COUNTER-style
reports
- Talk to COUNTER about aggregating COUNTER stats at
consortium level
- Investigate SUSHI interoperability with repositories
and OAI-PMH + OpenURL ContextObjects
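The idea of event-based usage data with a de-spidering step (actions 1 and 5) can be sketched as follows. This is an illustration under stated assumptions, not the process agreed at the workshop: the event fields, the list of robot markers and the 30-second double-click window are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical substrings used to recognise robots; a real list would be
# agreed and maintained jointly by the participating repositories.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


@dataclass(frozen=True)
class UsageEvent:
    timestamp: datetime
    client_ip: str
    user_agent: str
    item_id: str  # e.g. the repository identifier of the requested object


def is_robot(event):
    """Treat an event as robot traffic if its user agent contains a marker."""
    agent = event.user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def count_downloads(events, double_click_window=timedelta(seconds=30)):
    """Count events per item, dropping robots and rapid repeat requests."""
    counts = {}
    last_seen = {}  # (client_ip, item_id) -> time of that client's last request
    for event in sorted(events, key=lambda e: e.timestamp):
        if is_robot(event):
            continue
        key = (event.client_ip, event.item_id)
        previous = last_seen.get(key)
        last_seen[key] = event.timestamp
        # A repeat request inside the window is treated as a double click.
        if previous is not None and event.timestamp - previous <= double_click_window:
            continue
        counts[event.item_id] = counts.get(event.item_id, 0) + 1
    return counts


t0 = datetime(2007, 1, 16, 10, 0, 0)
events = [
    UsageEvent(t0, "192.0.2.1", "Mozilla/5.0", "oai:repository.example.org:1"),
    UsageEvent(t0 + timedelta(seconds=10), "192.0.2.1", "Mozilla/5.0", "oai:repository.example.org:1"),
    UsageEvent(t0 + timedelta(minutes=5), "192.0.2.1", "Mozilla/5.0", "oai:repository.example.org:1"),
    UsageEvent(t0, "192.0.2.9", "Googlebot/2.1", "oai:repository.example.org:1"),
    UsageEvent(t0, "192.0.2.2", "Mozilla/5.0", "oai:repository.example.org:2"),
]
counts = count_downloads(events)
```

Keeping the raw events and applying the filters afterwards is what allows the same data to feed several reporting profiles; statistics become comparable across repositories only when the filtering rules themselves are shared.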
3.4 Institutional Repositories: Author Identification
Executive Summary
The group formulated two main recommendations:
Firstly: capitalise on knowledge from ongoing initiatives;
not only from the library community but also from the
Internet community and from standardisation activities
within e.g. the universities.
Secondly: address the need for a sustainable model for
author identification. The level of ambition should match
the expected funding for the activity.
Summary of recommendations
To support these recommendations, the group proposes
four initiatives:
1) initiate two studies, one on the need for author
identification and one addressing the
potential business models for author-ID-based services,
2) establish a prototype for cross institution
use of author ID, which can serve as a blueprint,
3) arrange workshops for experts from the library
field, from universities and from the Internet community,
to identify relevant initiatives and potential architectures
for a service,
4) establish a working group looking into the flow
of relevant metadata.
3.5 Institutional Repositories: Exchanging Research
Information
Executive Summary
The objective of the strand “Exchanging Research
Information” was to bring together CRIS (Current
Research Information Systems) and OAR (Open Access Repositories).
Both applications deal with a specific segment of the
academic information domain – notably the specifications,
products or outcomes of academic research. Substantial
commonalities exist between the two. Rooted in different
units of the university (research administration vs.
library), they nevertheless have their individual characteristics:
CRIS primarily have an institutional scope and mainly
refer to the context of research, whereas OAR refer
to the content of research and are by definition internationally
oriented.
Given their affinity, achieving interoperability between
CRIS and OAR is desirable and will benefit all parties
involved, including the researchers. A joint approach
will avoid double input and the management of redundant data,
as well as redundant services and processes, and will
enhance both the efficiency and the quality (mutual enrichment)
of the services offered by CRIS and OAR to their users.
Looking at the current situation in the KE-countries,
significant differences can be noticed between Denmark
(unified system), the Netherlands (strong national CRIS
solution METIS with first integration with repositories),
the UK and Germany (heterogeneous landscapes of institutional
and subject-based repositories and less standardized
CRIS). Successful integrated solutions can be found at
an institutional or subject-based level, but integration
becomes less probable moving towards complex landscapes
at the national and supra-national level.
As a specific consequence of this situation, it is currently
recommended that ad hoc technical developments to support
interoperability between CRIS and OAR at large scale be
tightly focused on a specific entity in the academic
information domain (e.g. managing the full-text-link
of a research paper). As a broader consequence, a sustainable
and optimal solution for the combination of CRIS and
OAR at large scale requires a thorough analysis and specification
capable of representing the heterogeneity of the two
respective landscapes. It requires a flexible, service-oriented
approach based on an integrated institutional policy
concerning the academic information domain, and targeting
both organizational aspects (taking account of business
processes and services) and technical aspects (implementation
of service oriented architecture).
One step to take in this respect – and the first
part of a follow-up activity of the strand – would
be the delineation of the academic information domain,
and notably the part within the academic information
domain that is covered by CRIS and OAR. This would involve
an analysis of the information elements (entities and
attributes) and the workflows and services involved with
CRIS and OAR. Another, parallel, step – and the
second part of a follow-up activity of the strand – would
focus on ad hoc technical development for managing a
specific entity in the academic information domain in
both CRIS and OAR. Once this work is done and the results
of both steps are integrated, the definition of an optimized
services model, integrating CRIS and OAR becomes more
feasible and can be based on the principles of reuse
of services (also on a supra-institutional level) and
proper ownership of data. Such a new services model may
have an impact on existing technical solutions (decomposition
of systems) and even on organizational units (restructuring
of business processes and workflows).
Summary of recommendations
The strand recommends that the KE Board initiate, and
provide ways of funding, the follow-up activities
identified above. A suitable instrument is a sequel
to the workshop. The overall goal should be preserved,
but the strand should be split into two groups: one
group working on policies and services in the academic
information domain, with experts in high-level and
middle-level research management, and another group
working ‘hands-on’ to build a demonstrator
for achieving interoperability between CRIS and OAR
for a specific entity in the academic information domain.
Responsibility for a joint report should ensure mutual
exchange between the two groups.
1) Work against the background of an integrated information
policy and management in the institution, concerning
the Academic Information Domain
- Integration and optimization of business processes
and workflows
- Institutional policy for integrated management
- Researcher-centred approach
- Open for re-use outside of the institution
- Apply a service oriented architecture
2) Think against the background of a distributed architecture
- ‘Contracts’ between data providers (e.g.
service-level agreements)
3) Apply a service reference framework (e.g., e-Framework
by JISC and DART)
- Start a follow-up activity within the KE framework
to delineate the entities, attributes, services and
workflows of the academic information domain as far
as CRIS and OAR are concerned, as a concrete follow-up
to the strand’s discussion.
4) Work out an operational example of a technical solution
- Start a second and parallel follow-up activity to
develop – as an example and demonstrator - a
technical solution for the management of a specific
entity within the academic information domain (e.g.
the full-text-link of a research paper), common to
both CRIS and OAR.
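The kind of demonstrator described in recommendation 4 can be pictured with a small sketch: a CRIS record and a repository (OAR) record that share an identifier, with the full-text link maintained once in the repository and reused by the CRIS. The record layout, field names and URLs below are hypothetical and are not taken from METIS or any other system mentioned in this report.

```python
def attach_full_text_links(cris_records, oar_records):
    """Copy the repository's full-text link into matching CRIS records.

    Records are matched on a shared "identifier" field (an assumption of
    this sketch); CRIS records without a repository match get None.
    """
    links = {rec["identifier"]: rec["full_text_url"] for rec in oar_records}
    merged = []
    for rec in cris_records:
        enriched = dict(rec)  # copy, so the CRIS input is left untouched
        enriched["full_text_url"] = links.get(rec["identifier"])
        merged.append(enriched)
    return merged


# Hypothetical sample data for the two systems.
cris_records = [
    {"identifier": "paper-001", "project": "Project A", "author": "Example, A."},
    {"identifier": "paper-002", "project": "Project B", "author": "Sample, B."},
]
oar_records = [
    {"identifier": "paper-001",
     "full_text_url": "http://repository.example.org/paper-001.pdf"},
]

merged = attach_full_text_links(cris_records, oar_records)
```

The link is entered once, in the system that owns it, and read by the other; this is the "avoid double input" and "proper ownership of data" principle applied to a single entity.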
3.6 Institutional Repositories: Research Paper Metadata
Executive Summary
The Knowledge Exchange organised a workshop in January
2007 on the interoperability of institutional repositories,
at which the exchange of research paper metadata was
one of six thematic strands. The group covering this
topic included representatives from Denmark, Germany,
the Netherlands and the UK. Current practice was reviewed,
and found unsatisfactory, although it is difficult
to be precise about exactly where the shortcomings
are without a clearer idea of the services that are
intended to use the metadata. The group also concluded
that the other thematic strands at the workshop were
likely to recommend work that would be relevant to
research paper metadata. Therefore, the group recommended
further work that makes up an iterative programme, which
should be considered in conjunction with the work recommended
by other strands, and which should both clarify the aims and
drivers toward greater interoperability and achieve some tangible
steps along the way.
Summary of recommendations
The group recommends that:
- a small piece of work be done to build a framework
within which the business drivers can be identified
for improved sharing of richer descriptions of research
papers.
- a range of scenarios be agreed and documented,
which would be made possible or easier or more effective
by the sharing of richer descriptions of research papers.
- a piece of work be scoped and commissioned
that:
- compares the metadata formats in common use within
KE countries with the scenarios previously collected,
to identify the strengths and weaknesses of each
with respect to those scenarios and the services
they might imply.
- based on this analysis, makes recommendations
on how metadata exchange and interoperability may
be improved; the recommendations should refer to
both “quick wins” and steps toward longer term goals.
- as well as looking at metadata structures, also
looks for opportunities with respect to:
i. time-stamped factual authority relations
ii. dependable (resolvable, persistent) identifiers
for entities described
iii. cataloguing rules
iv. shared vocabularies for common elements such
as “resource type” and the maintenance of such vocabularies
v. shared understanding on the use of XML wrappers
or containers
This work should be based on the more detailed
discussion in the main text of this report, and
should not be undertaken in isolation from similar
work following from other strands of the KE workshop.
- an interoperability demonstrator / testbed be
established, in collaboration where appropriate
with existing initiatives, to provide a realistic
environment in which to monitor and evaluate the
problems and progress of the area.
References:
- F.P. van Oostrom, Stemmen op schrift (Amsterdam, 2006)
- See www.surf.nl/copyright