This chapter draws upon discussions with developers and users in the UK Research Councils (RCUK) E-Science Programme, JISC VRE Programme and the user requirements studies they have conducted. We have ourselves carried out such surveys and have participated in several workshops on usability and requirements. Results of this work have been provided in a number of reports to JISC (Allan et al., 2006a, 2006b, 2006c, 2006d, 2006e, 2006f, 2006g).
We consider the activities involved in conducting research to be ultimately driven by knowledge creation. In October 2004, we made the following definitions:
■ data: bits and bytes arising from an observation (non-repeatable), an experiment (repeatable) or a computer simulation (calculated);
■ information: relationship between items of data of the form ‘A is always associated with B in some way’;
■ knowledge: understanding of causality in relationships, e.g. ‘B happens after A because of X’ – this knowledge is shared globally.
The research activity could be described at the highest level as creating knowledge from data. In sharing or reusing data, information and knowledge in collaborative research, we meet the problem of metadata and representation. Bits and bytes, no matter how well curated, are of no use unless they can be interpreted.
Figure 2.1 shows our own version of the steps in the research lifecycle which we believe to be appropriate in e-research activities.
For simplicity, we have omitted from this the all-important administrative activities involved with grant proposals and funding, project management, collaboration forming, actual collaboration and computing. While it is at least in theory possible and desirable to link VREs to institutional and organisational processes, this is currently very challenging.
This book does not discuss knowledge management, but has a lot to say about data and information, the latter typically in the form of published research outcomes such as journal papers and technical reports.
In discussions of data and information management, the term ‘metadata’ is frequently encountered. Metadata is often defined as data about data. Classical metadata includes information such as description, provenance and location referring to a data set or collection. Each of these can be a complex set of information; for instance, location could refer to one or more copies of the data with different access rights, which is the sort of information required for a link resolver such as OpenURL. Metadata is typically accessed in the form of a catalogue and is extremely useful for search purposes. A particularly important aspect of e-research is the capability to generate and maintain accurate metadata automatically, as it has been found that many users do not like entering metadata or are prone to making mistakes.
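As a minimal sketch of automatic metadata generation, the Python fragment below derives size, checksum and modification time from a file and leaves only the description and the list of copies to the user. The record layout, the function names `generate_metadata` and `search`, and the example URL are assumptions made for illustration; they are not taken from any particular VRE or catalogue standard.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def generate_metadata(path, description, copies):
    """Build a catalogue record for one data file.

    Size, checksum and modification time are read from the file itself,
    so the user only supplies what cannot be computed.
    """
    with open(path, "rb") as handle:
        checksum = hashlib.sha256(handle.read()).hexdigest()
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "description": description,
        "provenance": {
            "modified": modified.isoformat(),
            "sha256": checksum,
            "size_bytes": stat.st_size,
        },
        # Location may list several copies, each with its own access rights.
        "location": list(copies),
    }


def search(catalogue, term):
    """Return records whose description contains the term (case-insensitive)."""
    return [r for r in catalogue if term.lower() in r["description"].lower()]


# Demonstration with a throwaway file standing in for a real data set.
with tempfile.NamedTemporaryFile(delete=False, suffix=".dat") as tmp:
    tmp.write(b"abc")
record = generate_metadata(
    tmp.name,
    description="Neutron scattering run, sample 7 (hypothetical)",
    copies=[{"url": "https://example.org/data/run7.dat", "access": "public"}],
)
os.remove(tmp.name)
print(record["provenance"]["size_bytes"])
```

A catalogue is then simply a list of such records, and `search(catalogue, "neutron")` returns those whose description matches.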
XML, the Extensible Markup Language, which generalises the tag-based approach of HTML for web pages, is now popular (W3C, 2007). It is only really applicable to data and information in the form of text and is of no use for binary data such as images, except for representing their metadata. For the meaning and relationships of its ‘tags’ to be published, XML requires a ‘schema’ to be defined. For HTML this is not a problem as the underlying standard is well known and fixed. Another problem with XML is that it can only represent a tree-like organisational structure. As such, it is useful for catalogues, but not so useful for representing complex procedures that may include cyclic behaviour, such as a laboratory workflow.
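The tree-like structure can be illustrated with a short Python sketch using the standard library's `xml.etree.ElementTree` module. The element and attribute names (`catalogue`, `dataset`, `location`, `access`) are invented for this example and do not come from a published schema.

```python
import xml.etree.ElementTree as ET

# Build a small catalogue; the tag names are illustrative only.
catalogue = ET.Element("catalogue")
dataset = ET.SubElement(catalogue, "dataset", id="ds-001")
ET.SubElement(dataset, "description").text = "Benchmark sample measurements"
ET.SubElement(dataset, "location", access="public").text = "https://example.org/ds-001"

# Serialise to text, as would be stored or exchanged.
xml_text = ET.tostring(catalogue, encoding="unicode")

# Parsing recovers the same tree: every element has exactly one parent.
parsed = ET.fromstring(xml_text)
descriptions = [d.text for d in parsed.iter("description")]
print(descriptions)
```

Because nesting gives each element a single parent, a workflow step that loops back to an earlier step cannot be expressed by nesting alone; it would need some agreed convention, such as referring to the earlier step by its identifier.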
Other XML-based computer languages must be used to convey the relationships required to represent data and form deductions from it, i.e. knowledge. These include the Resource Description Framework (RDF) and the Web Ontology Language (OWL). Both of these will be met in discussions of the Semantic Web. RDF and OWL are also important for data interoperability, as they permit terms (XML tags) to be given meaning (e.g. via a dictionary). UK e-science projects have used a family of XML-based technologies, most notably RDF, to provide a mechanism for representing resource metadata. Ontologies capture the meaning of metadata terms and their interrelationships. OWL provides a vocabulary for describing classes of RDF resources and their properties (including relations between classes, cardinality, etc.).
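The idea of statements plus class relationships can be sketched without any RDF library. In the Python fragment below, plain tuples stand in for RDF subject-predicate-object triples and a small function follows subclass links to infer additional types. The `ex:` terms are hypothetical, and this is only an illustration of the kind of deduction an ontology makes possible, not an OWL reasoner.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
triples = {
    ("ex:run7", "rdf:type", "ex:ScatteringDataset"),
    ("ex:ScatteringDataset", "rdfs:subClassOf", "ex:Dataset"),
    ("ex:Dataset", "rdfs:subClassOf", "ex:ResearchOutput"),
    ("ex:run7", "ex:derivedFrom", "ex:sample7"),
}


def superclasses(cls, graph):
    """Return every class reachable from cls via rdfs:subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, graph):
    """Return the asserted classes of a resource plus those inferred from them."""
    asserted = {o for s, p, o in graph if s == resource and p == "rdf:type"}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls, graph)
    return asserted | inferred


print(sorted(types_of("ex:run7", triples)))
```

Only one type is stated for `ex:run7`; the other two follow from the class hierarchy, which is what allows a search for all research outputs to find a data set that was catalogued under a more specific term.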
In addition to its use for information and metadata, XML forms the basis of web services that use the SOAP protocol, because it can be ingested and consumed by software written in other computer languages. Web services are a bit like e-mail, with a header and data packet. Binary data can be transported as attachments. Security assertions can be provided via the Security Assertion Markup Language (SAML), which again is XML-based. We will refer to web services again when discussing a service-oriented architecture (SOA) for e-research systems.
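The header-and-body structure of a SOAP message can be sketched with Python's standard XML library. The envelope namespace below is the SOAP 1.1 one; the service namespace, the `SessionToken`, `SearchRequest` and `Query` elements, and both function names are hypothetical. The token is a simple placeholder for illustration and is not a SAML assertion.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "https://example.org/vre/search"  # hypothetical service namespace

ET.register_namespace("soap", SOAP_NS)


def build_request(token, query):
    """Wrap a search query in an envelope with a header and a body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    # The header carries information about the message, here a placeholder token.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}SessionToken").text = token
    # The body carries the payload intended for the service.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}SearchRequest")
    ET.SubElement(request, f"{{{APP_NS}}}Query").text = query
    return ET.tostring(envelope, encoding="unicode")


def read_query(message):
    """Extract the query text, as a receiving service might."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}SearchRequest/{{{APP_NS}}}Query"
    return root.find(path).text


message = build_request("abc123", "protein folding")
print(read_query(message))
```

Since the message is plain XML text, the receiving side could equally be written in Java or any other language with an XML parser, which is the interoperability point made above.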
We have found that the key areas that need to be addressed are those of integrating information (in the form of publications, notes, etc.) and data; long-term archival and persistent access with appropriate access control; seamless search and discovery from a (portal) interface alongside other research tools; publication of data following peer review from personal and group information management systems; and collaborative working in discovering, interpreting and using data and information. These areas, with subject-specific differences in detail and usage pattern, are constituents in the generic research lifecycle and some aspects overlap with e-learning and digital information management. We refer to the infrastructure required for a VRE to support these areas as the ‘wider information environment’.
In fact, it turns out that while the lifecycle is similar, the activities of creating research data are extremely varied from one domain to another and there is little opportunity to share services at this level. Within an individual domain, such as bioinformatics, certain procedures can however be encapsulated and reused as workflows, e.g. with the Taverna tool in the myGrid and myExperiment projects. Each domain is therefore likely to have specialist tools built into a VRE customised for its specific research community.
We therefore focus on a simple all-embracing generic use case for ‘discovery to delivery’ in research which might be as follows:
A researcher wants to carry out a subject-specific search via one or more portal interfaces, to find relevant publications and data associated with their studies, and to find other papers that cite them. They may also want to find associated grant references and appropriate funding opportunities for related work.
The researcher then wants to access and download some of the data sets and carry out a similar piece of work using a new model, new insight or adding new data to the previous study. In an experimental study, they might be repeating a recommended procedure on one or more new samples or applying an improved procedure to a benchmark sample. This might involve downloading and using an existing workflow description.
The researcher will afterwards discuss and share results with a peer group, using appropriate personal and group information management software, and will eventually create reports and publish the results together with data and information related to their model or hypothesis.
Klyne (2005) describes some more specific requirements for VREs, drawn from the Sakai VRE project, that affect the provision of repository services and personal information systems. These include:
■ access to best practice documentation, and support for best practices, within the VRE;
■ capture and storing of collaborative discussions;
■ support in training new researchers;
■ searchable list of conferences, lectures and other events;
■ ability to locate other researchers;
■ selective delivery of information;
■ supporting grant applications;
■ forums and ‘spaces’ for internal communication and recruitment;
■ access to searchable databases of digital (or digitised) artefacts;
■ data repositories.
Sergeant et al. (2006) describe another set of stated requirements, this time from the EVIE VRE project:
■ find and acquire published information such as articles, conference proceedings, literature;
■ find out about funding opportunities, apply for funding, manage funded projects;
■ collaboration with partners in the university or at other institutions;
■ share or archive research results such as preprints, postprints, technical...