April 17 - 19, 2007   
Hyatt Regency Phoenix   
Phoenix, Arizona   

 

Position papers

Institutional Repositories in the Netherlands,
a national and international perspective

 
   

NSF/JISC Repositories Workshop
Bas Cordewener, SURF Foundation
April 10, 2007

This is a compilation of (slightly edited) existing documents – not to be reproduced in this form.
For the original documents: see www.surf.nl/en and www.knowledge-exchange.info
Other relevant websites www.darenet.nl/en; www.hbo-kennisbank.nl (Dutch)

I. Current SURF programme: SURF Share (to be developed)

Free Access to Research Results
Enhanced Publications
Collaboratories
The Digital Workbench
Review, Access and Impact
Research Information Systems and Infrastructure for Universities of Applied Sciences
International Perspective

II. Former SURF programme: SURF Dare (now a service)

Showcase your work
Selective Search
Window to your Field of Expertise
The Power of Standards
Your Work, your Rights
Digital Sustainability

III. Knowledge Exchange recommendations on IR (to be transformed into activities)

e-Theses
Open Archives Protocol for Metadata Harvesting
Usage Statistics
Author Identification
Exchanging Research Information
Research Paper Metadata

Annex A.   Petition to European Commission (to support an OA supportive course)  

Annex B.   License to Publish (OA copyright agreement for HEI publications) 

Chapter 1.

Current SURF programme: SURF Share

(Condensed version)

1.1 Management Summary

Our mission of disseminating knowledge is only half complete if the information is not made widely and readily available to society. New possibilities of knowledge dissemination not only through the classical form but also and increasingly through the open access paradigm via the Internet have to be supported. We define open access as a comprehensive source of human knowledge and cultural heritage that has been approved by the scientific community.

Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities. October 2003. Signed in the Netherlands by KNAW, NWO, SURF and the universities of Groningen, Leiden, Amsterdam, Delft, Eindhoven, Wageningen and Utrecht.

It is of paramount importance in an internationally competitive knowledge economy that the knowledge that is created finds its way to the research community, to society and to private enterprise. Providing broad access is a crucial requirement, and this can only be achieved in a communal approach. The SURF General Board approved the new SURF Strategic Plan 2007 ‘Thinking Ahead’ and the SURF Working Plan 2007-2010 ‘Forceful Action’ on 26 April 2006. This approval shows the commitment of all universities, universities of applied sciences, the Royal Netherlands Academy of Arts and Sciences (KNAW), the Netherlands Organisation for Scientific Research (NWO), the National Library of the Netherlands (KB) and TNO to a strong effort from SURF in the field of scholarly and knowledge communication. The SURFshare programme 2007-2010 provides the structure in terms of time and funding of the Strategic Plan and the Working Plan for the Platform ICT and Research. It was approved, conditional on funding, by the Board of the Platform ‘ICT and Research’ on 13 September 2006.

Among other things the SURFshare programme is a follow-up of the SURFdare programme 2003-2006. Notable achievements of that programme are the creation of a national knowledge infrastructure, constructed from a structure of interoperable institutional repositories (digital warehouses), the development of over a dozen decentralised services, as well as DAREnet, the national ‘science showcase’, which comprises ‘Cream of Science’, the National site for Doctoral Theses and the Knowledge Bank of the universities of applied sciences, among others. The DARE programme is an international example and acts as a model for the European Union’s DRIVER programme. In the preceding DARE programme SURF distinguished between DARE and the other activities of the Platform ICT and Research. This distinction is no longer present. The SURFshare programme presented here comprises all activities of the Platform. This creates increased synergy and places the activities of the Platform, such as supporting researchers in exercising their copyright, in the service of the overall SURFshare programme.

ICT speeds up the traditional communication processes, and also changes the nature of the knowledge chain. The roles and tasks of researchers, institutions, university libraries and publishers are changing. The research and publishing processes are becoming more interwoven. For this reason, and because of the increased possibilities in the field of knowledge sharing and dissemination, tools (models, algorithms, visualisations), research data and traditional publications blend together. For the next four years SURFshare has committed itself to achieving a shared infrastructure to support the communication of and access to scientific results. The researcher will be the focus. He or she must be supported in such a way that the results of the research are easy to create, record and disseminate, to achieve as much of an impact as possible.

More than in the preceding programme the research and communication processes in the SURFshare programme 2007-2010 are viewed as a unity. The activities comprise several elements in the scholarly communication cycle:

Figure 1. SURFshare activities by element in the scholarly communication cycle in 2003-06 and 2007-10

Over the coming period SURFshare will tackle the following issues:

  1. Innovation of parts of the research and communication cycle by means of ‘enhanced publications’, among others;
  2. Creating and assessing collaboratories that allow researchers to collaborate and share their sources;
  3. The development and application of several tools and applications for quality control, dissemination and impact to support the researcher in an open access environment;
  4. Registration of research data and achieving long-term access and data curation.

Enhanced publications contain data and visual information that do not fit traditional printed articles. In a sense the traditional scientific paper itself is transformed into the metadata of the published research result. Addition of the underlying data and research tools to the article makes it easier to verify or reproduce results and to build on them. A collaboratory, or virtual research environment, is a digital, web-based collaborative association of researchers at several locations that allows them to work together and to share knowledge and sources. SURF takes the position that the administrative tasks in taking part in a collaboratory and in producing an enhanced publication should be limited. A digital workbench will be developed to provide an optimum of support for the researcher in his work. This workbench will contain several applications, including the means to create enhanced publications and to simplify the authoring and review processes.

The added value of the SURFshare programme for researchers lies in an improved access to research results and an improved dissemination and impact of their own research results. Additionally, collaboration through collaboratories and open access result in increased productivity as well as enhanced and accelerated research. The added value of SURFshare for universities and society in general in the Netherlands lies in bringing together the different public (and private) organisations, the focus on national coordination, developing national and international standards and providing an incentive for institutions. Open access and a closer involvement of universities of applied sciences contribute to an improved valorisation of knowledge. SURF guarantees the institutions and funding providers an effective (and justified) use of the funds through a suitable structure of the organisation and close monitoring of the programme.

1.2  SURFshare Vision of the Future

1.2.1  Free access to research results

The completion of the DARE programme has provided the foundation for the SURFshare programme 2007-2010. Over the next few years the SURFshare programme will focus on scholarly and knowledge communication and the transition processes that are taking place in this field. In this vision of the future the researcher’s workflow takes centre stage (see figure 1 in chapter 1).

The preceding DARE programme emphasised the creation of interoperable repositories that reinforce the communication cycle, specifically registration, long-term and dynamic archiving and dissemination. In the next timeframe these repositories will be the foundation for further innovation of scholarly communication and collaboration in the field of research, compiling complex (‘enhanced’) publications, and the review and publication processes. The tasks and roles of researchers, institutions, university libraries and publishers are changing in these processes. The research and publishing processes become even more interwoven. This phenomenon and the increased possibilities in the field of knowledge sharing and dissemination remove the absolute distinction between research data and the traditional publication as research output. Internationalisation is not an option but a fact. The success of DARE has attracted international attention: countries such as Germany and the United Kingdom are starting similar projects. The European DRIVER project takes DAREnet as the model for a European network of at least 50 repositories. The DRIVER project is funded partly from the 6th EU Framework Programme and will run until the end of 2007.

1.2.2  Enhanced Publications

Through SURFshare researchers (and teaching staff) will gain easy and broad access to as many sources as possible, not only to publications, but also to the underlying collections of data, models and algorithms in all their shapes and sizes. The notion of data is interpreted to also include sources for the humanities and the social sciences such as large bodies of text, freely accessible statistical collections and simulations. SURFshare does not focus on the immense field of research data in general, but restricts itself to access to data that is related to a publication. Research data, models and visualisations are shared and included in ‘enhanced publications’ in combination with the text of the article. Naturally, these ‘enhanced publications’ are published electronically, and they comprise data and visual information that cannot fit in traditional printed articles. In a sense the traditional scientific paper itself is transformed into the metadata of the published research result. Addition of the underlying data and research tools to the article makes it easier to verify or reproduce results and to use them. Enhanced publications strengthen the quality and reliability of the publication and make it easier to build on them.

The supporting infrastructure (architecture, protocols, metadata) to achieve enhanced publications is generic in nature. The use and development of the enhanced publications is specific to each scientific field because the sources and publications are discipline-specific and the research communities are usually organised by their discipline. Articles in the field of STM are published in refereed journals and the topical scientific output quickly becomes outdated. This makes it essential to support the dynamic nature of enhanced publications, using long-term and flexible relations to changing sets of data. In the humanities publication of results in digital and open access journals is on the increase, but the traditional monograph remains current. Reconstruction of the scientific debate and support for critical review of sources are important in this field. A good prelude to enhanced publications in the humanities is provided by the digital support of Frits van Oostrom’s Stemmen op Schrift.[1]

The website www.stemmenopschrift.nl allows the reader to reconstruct the creation of the publication, and it includes images of sources and of Van Oostrom’s notes. The SURFshare programme aims to go a step further by reinforcing quality control and knowledge sharing for the various fields at both a national and an international level. Institutions and research communities will not only be supported in the creation of enhanced publications with respect to organisation and technology, but also in Digital Rights Management and exercising copyrights.

1.2.3  Collaboratories

The collaboratory is an extraordinary application that reinforces the research process. It is a digital, web-based collaborative association of researchers at several locations that allows them to work together and to share knowledge and sources. It is the right instrument to enhance and accelerate research in both national and international environments. Collaboratories transcend the institutional boundaries and they can be created for specific disciplines or areas of interest. This makes them suitable for use as a unifying instrument for instance for graduate schools. The business world has been using collaboratories for R&D for a while, mainly because they do not depend on time and location and allow efficient sharing of the research facilities. Collaboratories are a relatively unexplored territory in higher education beyond the exact sciences. In principle they are suited for all domains of research, especially for the smaller and interdisciplinary areas: they can provide an incentive by combining knowledge and bringing researchers together.

Components of enhanced publications can be put to use in collaboratories because they will act both as the published result as well as the input for new (enhanced) publications.

1.2.4  The Digital Workbench

SURFshare also includes the development of a digital workbench for researchers. The digital workbench is a working environment with an integrated set of tools that support researchers in creating, registering and disseminating their research results. Through this, SURFshare facilitates the researcher in creating enhanced publications and in taking part in collaboratories. The applications that are part of the workbench will include authoring tools that allow the researcher to combine text, data and visual objects, as well as showcases, search functions, notification services and tools for managing copyrights (DRM) and the registration and measurement of the impact of the publications. SURFshare will integrate these applications into the workbench in a manner that achieves a better efficiency of the activities and administrative tasks that derive from publication, registration and dissemination of the research results. A number of these tools will be specific to the research discipline. They will have to comply with international standards and protocols. SURF will take a generic approach whenever possible, complemented with a discipline-specific approach when necessary.

1.2.5  Review, Access and Impact

The development of new systems for quality control (comparable to peer review for traditional publications) and for measuring the impact are essential for achieving open access. The use of new applications for this purpose such as enhanced publications can be reinforced by developing new and improved tools for quality assessment. In order to increase the facilities available to the researcher, SURF is developing review tools within the framework of the corresponding programme that will support the researcher in providing formal comments on research results. These review tools are linked to the development of search facilities. SURF also stimulates the development of Open Source Software for measuring the usage of publications (citations, downloads, requests). The basic premise in the development of these tools is that the researcher is supported in publishing his works while retaining his copyrights.
For the next few years the emphasis will be placed on developing and testing these applications. This development will take place in an international context. SURF is striving for more collaboration with academic societies and academic publishers. SURF does not exclude collaboration with private parties such as Springer and Google that support open access and provide increased diversity in the offering of research output. In these matters SURF takes the position that such collaborations should lead to open access, and that public costs should be decreased rather than increased where possible.

In practice it is not just organisational and infrastructural developments that achieve open access, but also the removal of formal and legal obstacles. Over the past few years SURF has achieved results that reinforce open access and the position of the scientist in exercising his copyrights. For instance, SURF achieved consensus on the ‘Zwolle Principles’ with leading publishers. These Principles focus on maximum accessibility to academic results without impeding their quality or academic freedom, balancing the various interests[2]. Publication in academic open access journals is not always possible or desirable. Many articles are therefore still being published in scientific journals that are distributed to the readers through subscriptions. In cases where works are not published in open access periodicals, the recommended strategy to achieve open access anyway is that the researcher archives the works himself and makes them available through a repository. SURF will support the researchers in increasing the effectiveness of exercising their copyrights through the upcoming SURFshare programme. Especially the inclusion of enhanced publications in repositories will result in new legal challenges.

1.2.6  Research Information Systems and Infrastructure for Universities of Applied Sciences

Good systems management and innovation of the underlying infrastructure are essential for the development of new applications. At the local level the repositories are linked to Metis, the national management information system for research information. Universities use Metis to register the results of projects and scientific research, for instance. The DARE programme has shown that the research information systems are an essential part of the knowledge infrastructure of the institutions. The strategic development of Metis takes place within the SURFshare programme and follows the development of international standards in order to facilitate the exchange of research information between countries.

The universities of applied sciences will develop their own research tradition in the coming years, and they will want to connect to the repositories. The current DARE project has already delivered a repository containing exam papers, research papers, presentations and articles of seven universities of applied sciences. A number of universities of applied sciences are also taking part in the corresponding LOREnet project, which is developing an infrastructure of research repositories and related services. This fabric of repositories is being expanded for the universities of applied sciences. The Knowledge Bank for universities of applied sciences makes it easy to find and access the output of these institutions. The applications and services of universities of applied sciences will be designed differently from those for universities, as they undertake a large amount of research aimed at the working practice and they exchange knowledge extensively with professional practice institutions. For instance, collaboratories will provide the possibility to reinforce the exchange of knowledge with private enterprise and public organisations. Universities could learn from the universities of applied sciences regarding the dissemination of knowledge to society, through lectorships, knowledge circles and other knowledge links.

1.2.7  Internationalisation

The internationalisation of research is not an option but a fact. The SURFshare programme therefore aims to reinforce research in the Netherlands by improving the dissemination of knowledge and by collaborating in an international context. It has recently become clear that the 7th EU Framework Programme will focus heavily on expanding the European knowledge infrastructure, the so-called European Research Area. Networks of (data) repositories will play an important part. The DRIVER project, with SURF and participants from seven other countries, is intended to be a prelude to a European-wide collaboration in this field. SURFshare is therefore a pathfinder in the field of institutional repositories.

A consortium of institutions from the Netherlands will be established to take the initiative in submitting, designing and undertaking projects in the 7th EU Framework Programme. These projects will focus on improving the facilities for researchers to gain access to research results, to undertake research in collaboration and to register and provide access to the results of their research within an international setting. This specifically concerns enhanced publications and collaboratory applications.

In the coming years SURF will continue the collaboration with similar organisations in Western Europe (United Kingdom, Germany, the Scandinavian countries) and the United States through joint initiatives, projects, conferences and workshops. A strategic partnership has been established with JISC, SURF’s sister organisation in the United Kingdom. SURF has founded the collaborative organisation ‘Knowledge Exchange’ together with the Deutsche Forschungsgemeinschaft, Denmark’s Electronic Research Library and JISC. SURF also participates in international working groups in the field of standards and protocols, interoperability and intellectual property rights (IP).

Chapter 2.

“Our mission of disseminating knowledge is only half complete if the information is not made widely and readily available to society.”

Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities. October 2003. Signed in the Netherlands by KNAW, NWO, SURF and the universities of Groningen, Leiden, Amsterdam, Delft, Eindhoven, Wageningen and Utrecht.

“Freely sharing data increases the productivity of research”
Maria van der Hoeven, e-data&research, June 2006


In the years 2003-2006 all universities in the Netherlands, as well as the KNAW (Royal Netherlands Academy of Arts and Sciences), NWO (the Netherlands Organisation for Scientific Research) and the Koninklijke Bibliotheek (National Library of the Netherlands), created a joint network of repositories. In doing so they have made many results of scientific research available to the public through open access.

This joint DARE programme, coordinated by SURF, has given the Netherlands a prominent position in the innovation of scientific information provisioning.

2.1  The added value of repositories

2.1.1  Showcase your work

Repositories aim to provide your scientific work with optimum visibility: not only for colleagues in your field of expertise, but also for other interested scientists, students and the public. Thanks to the DARE programme each university in the Netherlands has such a repository, which can be accessed by everyone over the Internet. There are no limitations to the nature of the stored materials: besides articles and research reports the repositories also contain datasets and multimedia content.

The Netherlands is the first country in which all universities and national research institutions participate, and moreover in which all repositories have been interconnected into a single transparent network. The repositories are compliant with international standards. This means that their content is also easily accessible using search engines such as Google Scholar.

Research has shown that digital publications are read and cited more, due to their improved accessibility. Placing a publication in a repository maximises this effect. Moreover, it provides a reliable indication of quality that distinguishes the publications from other materials that can be found using search engines, resulting in a positive effect on the ranking within the search results.

nOA = non-Open Access, OA = Open Access; Gunter Eysenbach, ‘The open access advantage’, in: Journal of Medical Internet Research, 2006, vol. 8

2.1.2  Selective search

The Dutch repositories have their own search engine: www.DAREnet.nl. DAREnet offers direct access to a rich collection of scientific materials that is guaranteed to be freely accessible.

DAREnet also allows you to search specific collections. One such collection is Cream of Science, containing the complete works of over 200 top scientists; another collection is the Promise of Science with doctoral e-theses. And NARCIS, the gateway to Dutch scientific information, can be accessed through DAREnet.

A derived service is the HBO Kennisbank (‘Knowledge Bank for the universities of applied sciences’), a network of seven such universities, which provides access to theses and other knowledge products.

2.1.3  Window to your field of expertise

Institutions in various fields of science have seized the DARE programme with both hands to combine their output. ‘Connecting Africa’, a service for African studies, is a website with research results of Africanists from both Dutch and international universities.

Other initiatives are DARLIN (Dutch ARchive for Library and INformation sciences), which provides access to publications in the field of library and information science in the Netherlands; the e-Depot Nederlandse Archeologie (e-repository for Dutch Archaeology), which makes accessible primary data of excavations, regional assessments and material studies; and the Utrecht Law Review, which offers scientists in the legal domain an international showcase for digital publications.

2.1.4  The power of standards

The power of DARE lies in its choice of clear standards. For example, a system of unique authors’ names has been developed which allows a scientist to have materials in multiple repositories without losing track.

Due to the standardisation of DARE it is easy to connect all sorts of services to the repositories that will simplify your work. A sample of the services that have been developed:

  • Several universities offer the possibility to generate overviews of publications for personal web pages from the repository.
  • ARNEX (Agricultural Repository News Exchange) offers RSS-type subscriptions to research results at the major international agricultural institutions.
  • PROMAS offers quick and easy creation of your academic profile. 
  • Text enrichment in Virtual Communities allows you to enrich documents with annotations that can be used in a virtual community.
  • The repositories are automatically linked to Metis, the Dutch national university management information system for research information.

The above mentioned examples of services are available on DAREnet: Services.

2.1.5  Your work, your rights

You may ask: does my publisher allow publication in a repository? In principle they do, unless you have an agreement in which this is explicitly excluded. It is therefore important to retain control of your own work.

In order to do so you can use the Licence to publish that was developed in the context of DARE. With this licence you allow your publisher to publish your work, while retaining all other rights yourself. The Licence is available at DAREnet and at copyrighttoolbox.surf.nl.

2.1.6  Digital sustainability

All works that are published in the repositories are automatically included in the e-depot of the National Library of the Netherlands. This ensures that the digital materials remain legible and accessible, even with future technologies, so your work is guaranteed to be also available in the future.

Chapter 3.

Knowledge Exchange workshop recommendations

3.1 Institutional Repositories: e-theses

Executive Summary
The workshop showed that e-theses are an important part of a university’s research output, and yet they are not sufficiently integrated into a European repositories infrastructure or searchable for e-theses specific information. In preparation for the e-theses strand a small demonstrator was built, showing how an integrated search using the OAI-PMH could be used to offer a retrieval portal on a European scale and where the difficulties in terms of metadata lie.
The participants in the discussion agreed that e-theses should be seen as an integral part of broader institutional repositories. As such they should not follow a metadata set of their own, but be integrated into a metadata profile for the whole repository. It was suggested to develop a European e-prints application profile. Such a profile should work out the very specific metadata that only apply to doctoral e-theses.
During the workshop work was started to identify key issues in handling doctoral e-theses and to prioritise those issues. The resulting list of priorities can be used for planning further European and country-specific activities to reach the goal: to make e-theses in Europe more easily retrievable, and therefore more visible, via institutional repositories.
The results of the demonstrator and discussions will be further elaborated and presented as “European e-Theses Demonstrator” at the ETD 2007 conference in Uppsala.

Summary of recommendations
The discussions at the workshop brought the following results:

  • E-theses have to be seen as part of an overall institutional repositories infrastructure and content. They should not be handled differently from other scientific and scholarly e-papers.
  • The demonstrator showed that it is generally possible to harvest European e-theses using the OAI-PMH. But the study and the discussions showed that simple Dublin Core is not enough. The participants agreed that a richer metadata set is necessary to offer a retrieval portal of good quality.
  • Country-specific best practice examples, such as the XmetaDiss approach in Germany and the uketd metadata set in the UK, were presented and compared to older e-theses specific metadata sets such as the NDLTD etd metadata set. The latter was seen to be outdated.
  • It was suggested to develop a European e-theses application profile for metadata at a European level, meaning within a European working group. It was suggested that the GUIDE group could be a good home for such further activities.
  • Investigations have to be made into handling e-theses, in terms of metadata, as part of e-prints. Therefore e-theses specific information has to be encoded into metadata. A first list of e-theses specific elements has been produced during the workshop.
  • National authorities should fund the national development of richer metadata schemes. This makes it possible to build a demonstrator that shows what can be achieved in the short term. Building four or five filters at the demo-service level is the easiest approach, but it is not scalable. Moving to a higher level based on richer national metadata is a good first step towards a European e-prints application schema.
  • The key issues in handling doctoral e-theses have been identified by the workshop strand participants and they have been prioritized as follows:
    • Richer metadata (Use qualified DC; Compound docs; Complex objects; Technical and preservation metadata)
    • Wider IR perspective (How to integrate richer metadata; Is document type a good choice to base services on; How to get them; Compatibility)
    • ETD specific issues (Degree, level, definition; Various dates; Define minimal requirements)
    • Cultural aspects (Language; Interpretation and definitions; Local versus national services)
    • Audience (Keep it simple; Focus on added value metadata; Think about target groups)
    • Subject classification
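
To illustrate why simple Dublin Core was judged insufficient for e-theses, the following is a minimal sketch that parses an OAI-PMH ListRecords response in the oai_dc format. The sample record, its identifier and the repository host are invented for illustration; only the three XML namespaces are taken from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response (assumption); a real one would be returned
# by a repository's OAI-PMH endpoint.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:thesis-1</identifier>
        <datestamp>2007-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Doctoral Thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:type>Doctoral thesis</dc:type>
          <dc:date>2006-11-30</dc:date>
          <dc:date>2007-01-10</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per record: its identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for element in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            # Strip the "{namespace}" prefix to get the bare element name.
            name = element.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["identifier"], rec["fields"])
```

In this sample the two dc:date values cannot be told apart (defence date versus publication date, for instance), and there is no element for the degree or its level. These are the kinds of gaps listed under "ETD specific issues" above.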
3.2 Institutional Repositories: Open Archives Protocol for Metadata Harvesting

Executive Summary
The Open Archives Protocol for Metadata Harvesting (OAI-PMH) has been used for several years to ensure interoperability between institutional repositories (IRs). It has enabled the development of a number of services. The number of repositories being developed is growing and the variety of services for scholarly material is increasing. The OAI-PMH strand aimed to analyze the current implementations of the protocol in the IRs of the Knowledge Exchange (KE) countries, to identify the major issues they encountered and, finally, to consider how the deployment of the protocol needs to evolve to support the new requirements of scalability and services for IRs in the next few years.
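
For readers less familiar with the protocol, the following is a minimal sketch of how a harvester might build ListRecords requests. The base URL and the set name are placeholders, not real endpoints. The function name is hypothetical. The sketch relies on one rule from the OAI-PMH specification: when a resumptionToken is sent, it is the only argument allowed besides the verb.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, resumption_token=None):
    """Build the URL for an OAI-PMH ListRecords request (hypothetical helper)."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: no other arguments
        # may accompany it apart from the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
        if from_date:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    base = "http://repository.example.org/oai"  # placeholder endpoint
    print(list_records_url(base, set_spec="theses"))
    print(list_records_url(base, resumption_token="abc123"))
```

A harvester would repeat the second form of the request until the repository returns an empty resumptionToken, which is one place where scalability issues tend to surface.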

Summary of recommendations

  • One project of national OAI knowledge bases
  • Communication with the DRIVER project to coordinate the creation of data provider guidelines for IRs. Creation of mechanisms, maybe through the knowledge bases, to enforce higher degrees of compliance of KE data providers with the guidelines.
    • Contact with the DRIVER project on the establishment of data provider guidelines
    • Potential additions or definition of objectives for compliance levels
    • Definition of mechanisms to encourage/enforce the guidelines in the KE countries.
  • One project on persistent identifiers implementation
    • A meeting of KE stakeholders on persistent identifiers to establish a common solution and a common strategy
    • Creation of a test bed that would involve actors in the different countries
3.3 Institutional Repositories: Usage Statistics

Executive Summary
An understanding of the use of repositories and their contents is clearly desirable for authors and repository managers alike, as well as those who are analysing the state of scholarly communications. A number of individual initiatives have produced statistics of various kinds for individual repositories, but the real challenge is to produce statistics that can be collected and compared transparently on a global scale. This report details the steps to be taken to address the issues to attain this capability.

Summary of recommendations
1) ACTION: determine practical definition of ‘usage’

  • Decide meaning of ‘use’
  • Produce an event-based, web-log-based format for sharing ‘usage events’ from which many profiles can be delivered (COUNTER, awstats, JISC Monitoring, DRIVER etc.)

2) ACTION: define objects to be counted

  • Lobby COUNTER to add article level stats
  • Create vocabulary of academic output types in conjunction with the Research Paper Metadata group

3) ACTION: standard reports

  • Agree on a small set of standard useful statistical reports that repositories should produce

4) ACTION: agree policies for stats

  • Compliance with local laws on e.g. privacy
  • Enhance SHERPA policy tool

5) ACTION: collection and aggregation

  • Agree de-spidering process (first draft agreed)
  • Specify issues of aggregation and deduplication for later study

6) ACTION: collation with external sources

  • Talk to COUNTER about complex objects
  • Set up COUNTER-IR to shadow the publisher group
  • JISC Project IRS to provide initial COUNTER-style reports
  • Talk to COUNTER about aggregating COUNTER stats at consortium level
  • Investigate SUSHI interoperability with repositories and OAI-PMH + OpenURL ContextObjects
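
As an illustration of actions 1 and 5, the following is a minimal sketch that turns web-server log lines into ‘usage events’ and applies a simple de-spidering step. It assumes the Apache "combined" log format. The list of robot markers, the sample log lines and the function names are invented for illustration; the de-spidering process actually agreed by the strand is not reproduced here.

```python
import re
from collections import Counter

# Apache "combined" log format (assumption about the server configuration).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical substrings used to recognise robots by their user agent.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# Invented sample log lines.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Jan/2007:10:00:00 +0100] "GET /eprints/42.pdf HTTP/1.1" '
    '200 51234 "-" "Mozilla/5.0"',
    '192.0.2.99 - - [10/Jan/2007:10:00:05 +0100] "GET /eprints/42.pdf HTTP/1.1" '
    '200 51234 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.11 - - [10/Jan/2007:10:01:00 +0100] "GET /eprints/7.pdf HTTP/1.1" '
    '404 312 "-" "Mozilla/5.0"',
]


def usage_events(log_lines):
    """Yield one event per successful, non-robot GET request."""
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["method"] != "GET" or match["status"] != "200":
            continue  # count only successful retrievals
        agent = match["agent"].lower()
        if any(marker in agent for marker in ROBOT_MARKERS):
            continue  # de-spidering: drop known robots
        yield {"item": match["path"], "host": match["host"], "time": match["time"]}


def downloads_per_item(log_lines):
    """Aggregate usage events into a simple per-item download count."""
    return Counter(event["item"] for event in usage_events(log_lines))


if __name__ == "__main__":
    print(downloads_per_item(SAMPLE_LOG))
```

Keeping the events separate from the counts is the point of an event-based format: the same events could later be aggregated into different report profiles once the definition of ‘use’ has been agreed.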
3.4 Institutional Repositories: Author Identification

Executive Summary
The group formulated two main recommendations:
Firstly: capitalise on knowledge from ongoing initiatives; not only from the library community but also from the Internet community and from standardisation activities within e.g. the universities.
Secondly: address the need for a sustainable model for author identification. The level of ambition should match the expected funding for the activity.

Summary of recommendations
To support these recommendations, the group proposes four initiatives:
1)  initiate two studies, one on the need for author identification and one addressing the potential business models for an author-ID-based service,
2)  establish a prototype for cross-institution use of author IDs, which can serve as a blueprint,
3)  arrange workshops for experts from the library field, from universities and from the Internet community to identify relevant initiatives and potential architectures for a service,
4)  establish a working group to look into the flow of relevant metadata.

3.5 Institutional Repositories: Exchanging Research Information

Executive Summary
The objective of the strand “Exchanging Research Information” was to bring together CRIS (Current Research Information Systems) and OAR (Open Access Repositories). Both applications deal with a specific segment of the academic information domain – notably the specifications, products or outcomes of academic research. Substantial commonalities exist between the two. Being rooted in different units of the university (research administration vs. library), however, they also have their individual characteristics: CRIS primarily have an institutional scope and mainly refer to the context of research, whereas OAR refer to the content of research and are by definition internationally oriented.
Given their affinity, achieving interoperability between CRIS and OAR is desirable and will benefit all parties involved, including the researchers. A joint approach will avoid double input and the management of redundant data, as well as redundant services and processes, and will enhance both the efficiency and the quality (through mutual enrichment) of the services offered by CRIS and OAR to their users.
Looking at the current situation in the KE-countries, significant differences can be noticed between Denmark (unified system), the Netherlands (strong national CRIS solution METIS with first integration with repositories), the UK and Germany (heterogeneous landscapes of institutional and subject-based repositories and less standardized CRIS). Successful integrated solutions can be found at an institutional or subject-based level, but integration becomes less probable moving towards complex landscapes at the national and supra-national level.
As a specific consequence of this situation, ad hoc technical developments to support interoperability between CRIS and OAR at large scale are currently recommended to be highly focussed on a specific entity in the academic information domain (e.g. managing the full-text-link of a research paper). As a broader consequence, a sustainable and optimal solution for the combination of CRIS and OAR at large scale requires a thorough analysis and specification capable of representing the heterogeneity of the two respective landscapes. It requires a flexible, service-oriented approach based on an integrated institutional policy concerning the academic information domain, and targeting both organizational aspects (taking account of business processes and services) and technical aspects (implementation of service oriented architecture).
One step to take in this respect – and the first part of a follow-up activity of the strand – would be the delineation of the academic information domain, and notably the part within the academic information domain that is covered by CRIS and OAR. This would involve an analysis of the information elements (entities and attributes) and the workflows and services involved with CRIS and OAR. Another, parallel, step – and the second part of a follow-up activity of the strand – would focus on ad hoc technical development for managing a specific entity in the academic information domain in both CRIS and OAR. Once this work is done and the results of both steps are integrated, the definition of an optimized services model, integrating CRIS and OAR becomes more feasible and can be based on the principles of reuse of services (also on a supra-institutional level) and proper ownership of data. Such a new services model may have an impact on existing technical solutions (decomposition of systems) and even on organizational units (restructuring of business processes and workflows).

Summary of recommendations
The strand recommends that the KE Board initiate, and provide ways of funding, the follow-up activities identified above. An adequate instrument is a sequel to the workshop. The overall goal should be preserved, but the strand should be split into two groups: one group working on policies and services in the academic information domain with experts in high-level and middle-level research management, and the other group working ‘hands-on’ to build a demonstrator for achieving interoperability between CRIS and OAR for a specific entity in the academic information domain. Responsibility for a joint report should ensure mutual exchange.

1) Work against the background of an integrated information policy and management in the institution, concerning the Academic Information Domain

  • Integration and optimizing of business processes and workflows
  • Institutional policy for integrated management
  • Researcher-centred approach
  • Open for re-use outside of the institution
  • Apply a service oriented architecture

2) Think against the background of a distributed architecture - ‘Contracts’ between data providers (e.g. service-level agreements)
3) Apply a service reference framework (e.g., e-Framework by JISC and DART)

  • Start a follow-up activity within the KE framework to delineate the entities, attributes, services and workflows of the academic information domain as far as CRIS and OAR are concerned, as a concrete follow-up to the strand’s discussion.

4) Work out an operational example of a technical solution

  • Start a second and parallel follow-up activity to develop – as an example and demonstrator - a technical solution for the management of a specific entity within the academic information domain (e.g. the full-text-link of a research paper), common to both CRIS and OAR.
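
As a rough illustration of what such a demonstrator might involve, the following is a minimal sketch that copies the full-text link held by a repository into the matching CRIS record. It assumes that both systems share a publication identifier, which the strand's report does not state. All class names, field names and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrisPublication:
    """Hypothetical CRIS record: describes the context of the research."""
    publication_id: str
    title: str
    project: str
    full_text_url: Optional[str] = None


@dataclass
class RepositoryItem:
    """Hypothetical OAR record: holds the content (the full text)."""
    publication_id: str
    full_text_url: str


def attach_full_text_links(cris_records, repository_items):
    """Fill in full-text links on CRIS records; return the IDs left unmatched."""
    links = {item.publication_id: item.full_text_url for item in repository_items}
    unmatched = []
    for record in cris_records:
        url = links.get(record.publication_id)
        if url is None:
            unmatched.append(record.publication_id)
        else:
            record.full_text_url = url
    return unmatched


if __name__ == "__main__":
    cris = [
        CrisPublication("pub-1", "Paper one", "Project A"),
        CrisPublication("pub-2", "Paper two", "Project B"),
    ]
    repo = [RepositoryItem("pub-1", "http://repository.example.org/pub-1.pdf")]
    print(attach_full_text_links(cris, repo))  # IDs with no full text yet
    print(cris[0].full_text_url)
```

The list of unmatched identifiers matters as much as the links themselves: it shows where the two systems disagree about which publications exist, which is the kind of evidence the parallel delineation activity would need.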
3.6 Institutional Repositories: Research Paper Metadata

Executive Summary
The Knowledge Exchange organised a workshop in January 2007 on the interoperability of institutional repositories, at which the exchange of research paper metadata was one of six thematic strands. The group covering this topic included representatives from Denmark, Germany, the Netherlands and the UK. Current practice was reviewed and found unsatisfactory, although it is difficult to be precise about exactly where the shortcomings are without a clearer idea of the services that are intended to use the metadata. The group also concluded that the other thematic strands at the workshop were likely to recommend work that would be relevant to research paper metadata. Therefore, the group recommended further work forming an iterative programme, which should be considered in conjunction with the recommendations from other strands, and which should both clarify the aims and drivers toward greater interoperability and achieve some tangible steps along the way.

Summary of recommendations
The group recommends that:

  • a small piece of work be done to build a framework within which the business drivers can be identified for improved sharing of richer descriptions of research papers.
  • a range of scenarios be agreed and documented, which would be made possible or easier or more effective by the sharing of richer descriptions of research papers.
  • a piece of work be scoped and commissioned that:
    • compares the metadata formats in common use within KE countries with the scenarios previously collected, to identify the strengths and weaknesses of each with respect to those scenarios and the services they might imply.
    • based on this analysis, makes recommendations on how metadata exchange and interoperability may be improved; the recommendations should refer to both “quick wins” and steps toward longer term goals.
    • as well as looking at metadata structures, also looks for opportunities with respect to:

    i. time-stamped factual authority relations
    ii. dependable (resolvable, persistent) identifiers for entities described
    iii. cataloguing rules
    iv. shared vocabularies for common elements such as “resource type” and the maintenance of such vocabularies
    v. shared understanding on the use of xml wrappers or containers
    This work should be based on the more detailed discussion in the main text of this report, and should not be undertaken in isolation from similar work following from other strands of the KE workshop.

  • an interoperability demonstrator / testbed be established, in collaboration where appropriate with existing initiatives, to provide a realistic environment in which to monitor and evaluate the problems and progress of the area.
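
The comparison of metadata formats against scenarios recommended above could be organised along the following lines. This is a minimal sketch only: the scenario names, the element names and the two formats ("format_a", "format_b") are invented placeholders and do not describe any real metadata format in use in the KE countries.

```python
# Invented scenarios: each maps to the metadata elements it would require.
SCENARIOS = {
    "find theses by degree level": {"title", "degree_level"},
    "link paper to full text": {"title", "identifier"},
}

# Invented formats: each maps to the elements it can express.
FORMATS = {
    "format_a": {"title", "creator", "identifier"},
    "format_b": {"title", "creator", "identifier", "degree_level"},
}


def coverage_gaps(scenarios, formats):
    """For each format, list the elements each scenario needs but cannot get."""
    report = {}
    for format_name, provided in formats.items():
        report[format_name] = {
            scenario: sorted(required - provided)
            for scenario, required in scenarios.items()
        }
    return report


if __name__ == "__main__":
    for format_name, gaps in coverage_gaps(SCENARIOS, FORMATS).items():
        print(format_name, gaps)
```

An empty list means the format supports the scenario as far as element coverage goes; it says nothing about cataloguing rules, identifiers or vocabularies, which the recommendation lists as separate points to examine.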

References:

  1. F.P. van Oostrom, Stemmen op schrift (Amsterdam, 2006)
  2. See www.surf.nl/copyright