Comment on:
The following comment refers to these guidelines:
Guideline 7
Cross-phase quality assurance
Researchers carry out each step of the research process lege artis. When research findings are made publicly available (in the narrower sense of publication, but also in a broader sense through other communication channels), the quality assurance mechanisms used are always explained. This applies especially when new methods are developed.
Explanations:
Continuous quality assurance during the research process includes, in particular, compliance with subject-specific standards and established methods, processes such as equipment calibration, the collection, processing and analysis of research data, the selection and use of research software, software development and programming, and the keeping of laboratory notebooks.
If researchers have made their findings publicly available and subsequently become aware of inconsistencies or errors in them, they make the necessary corrections. If the inconsistencies or errors constitute grounds for retracting a publication, the researchers will promptly request the publisher, infrastructure provider, etc. to correct or retract the publication and make a corresponding announcement. The same applies if researchers are made aware of such inconsistencies or errors by third parties.
The origin of the data, organisms, materials and software used in the research process is disclosed and the reuse of data is clearly indicated; original sources are cited. The nature and the scope of research data generated during the research process are described. Research data are handled in accordance with the requirements of the relevant subject area. The source code of publicly available software must be persistent, citable and documented. Depending on the particular subject area, it is an essential part of quality assurance that results or findings can be replicated or confirmed by other researchers (for example with the aid of a detailed description of materials and methods).
Guideline 12
Documentation
Researchers document all information relevant to the production of a research result as clearly as is required by and is appropriate for the relevant subject area to allow the result to be reviewed and assessed. In general, this also includes documenting individual results that do not support the research hypothesis. The selection of results must be avoided. Where subject-specific recommendations exist for review and assessment, researchers create documentation in accordance with these guidelines. If the documentation does not satisfy these requirements, the constraints and the reasons for them are clearly explained. Documentation and research results must not be manipulated; they are protected as effectively as possible against manipulation.
Explanations:
An important basis for enabling replication is to make available the information necessary to understand the research (including the research data used or generated, the methodological, evaluation and analytical steps taken, and, if relevant, the development of the hypothesis), to ensure that citations are clear, and, as far as possible, to enable third parties to access this information. Where research software is being developed, the source code is documented.
Handling research data in the humanities and social sciences
In the humanities and social sciences, all documents, materials, images, audio and video recordings, texts, measurement and evaluation data that are generated or processed can be considered research data in the broadest sense of the word. They form an integral part of the research results and are important both for verifiability and, very often, for reuse in further research (e.g. in source editions or longitudinal social science studies).
Very frequently, data generated by research in the humanities and social sciences either cannot be replicated, or else its recovery is barely feasible from a practical point of view. Examples of the first instance include surveys of political attitudes at specific points in time or excavations in a certain archaeological context, while the second instance might include lengthy text editions or documentation of museum objects which would be virtually impossible to carry out or finance more than once.
Effective and reliable data management is crucial to quality assurance across all phases in every research project in the humanities and social sciences where research data (as defined above) is generated or processed on a significant scale. Given the increasing importance of larger volumes of data, data management itself has a growing impact on the quality of research results. For this reason, good research practice in the humanities and social sciences should involve not only devoting the necessary attention to data management but also recognising researchers' contributions in this regard as a relevant performance criterion. Individual academic achievement is reflected not only in publications: it is increasingly linked to the processing of research data and to the initial or further development of research software.
Even though data in general is becoming increasingly important in research, its role can vary greatly due to the enormous diversity of methods and project constellations within the humanities and social sciences. While certain projects explicitly aim to obtain and process large amounts of data, in other cases only certain sections or phases of the research involve data analysis. Although the effort involved can vary considerably, research data management always involves systematic preparation and organisation of the entire data handling procedure – from collection, processing and documentation through to storage, archiving and provision for reuse. The analytical steps carried out using various (software) tools are therefore an integral part of research data management.
Ideally, research projects can be based on standards and best practices that are recommended by learned societies or other relevant organisations or institutions. In recent years, a number of learned societies and DFG review boards have formulated guidelines and recommendations on the handling of research data, also addressing the specific requirements of various subject areas and research approaches. These can be found on the DFG website (see link below).
Data has to be reliably saved (or, in the case of analogue materials, put into safe storage) and – depending on context and re-usability – archived on a long-term basis. Ideally, in addition to securing the data, external access should also be made possible for verification purposes and for reuse, unless there are particular reasons to the contrary. If reuse is to be enabled, access must not be limited to merely viewing the data but should allow further processing according to current requirements. However, certain fundamental preliminary considerations apply when it comes to archiving and reuse. Decisions have to be made about what is of “archival value”, what effort and/or cost can realistically be incurred and, lastly, what legal provisions have to be observed.
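One common way to support reliable storage is to record a checksum for every file at the time of archiving and to compare against it later. The following is a minimal sketch in Python, not a prescribed procedure: the function names (`sha256_of`, `build_manifest`, `changed_files`) and the choice of SHA-256 are assumptions made for illustration, and subject-specific repositories may require other mechanisms.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built when the data is deposited can be stored alongside the documentation. Running `changed_files` at a later date then shows which files, if any, have been altered or lost since deposit.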
In some cases, the archiving and provision of the processed data is obligatory (e.g. heritage conservation) or is part of the goal of a project (e.g. source editions). In such cases, decisions still have to be made regarding the state (versioning) in which data is to be recorded and to what extent all data generated during the research process is to be included. In many other instances, when deciding whether and under what conditions data is to be made available for scientific reuse, it will be important to take into account the level of demand within the research community and to assess the cost/benefit ratio of data preparation and documentation for dissemination purposes.
In the case of long-term social science projects, for example, it would be reasonable to expect a high level of demand for the data. In addition, the type of data or content will often determine whether the data is suitable for disclosure. Disclosure is often problematic in the case of sensitive data from qualitative interviews or video material for reasons of data privacy: such data is more difficult to anonymise, or essential information may be lost when it is anonymised.
Where data is released for reuse, the question of licensing arises. There is a range of different usage licences that regulate the extent to which data may be used by third parties, for example whether data may only be viewed or also altered, and for what purposes it may be used. One overview of various usage licences and the rights involved is provided by the Consortium of European Social Science Data Archives (see link below).
Finally, permission for reuse can also be restricted, for example to the investigation of a specific question rather than any kind of analysis. Data producers can also set embargo periods for the reuse of data, for instance to ensure that the results obtained from the project are published and any qualification work is completed first.
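Reuse conditions of this kind (licence, embargo period, restriction to a specific purpose) can be recorded in machine-readable form alongside the data. The following Python sketch illustrates the idea only. The class `ReuseTerms`, its fields and the function `may_reuse` are hypothetical, and the licence string is a free-text label rather than a legal evaluation of the licence.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReuseTerms:
    """Illustrative record of the conditions under which data may be reused."""
    licence: str                              # free-text label, e.g. "CC BY 4.0"
    embargo_until: Optional[date] = None      # None means no embargo
    permitted_purpose: Optional[str] = None   # None means any purpose


def may_reuse(terms: ReuseTerms, on: date, purpose: str) -> bool:
    """Check the embargo and purpose restrictions for a reuse request."""
    if terms.embargo_until is not None and on < terms.embargo_until:
        return False  # still under embargo on the requested date
    if terms.permitted_purpose is not None and purpose != terms.permitted_purpose:
        return False  # reuse is limited to one specific question
    return True
```

For instance, with `ReuseTerms(licence="CC BY 4.0", embargo_until=date(2030, 1, 1))`, a request dated before 1 January 2030 would be refused and a request on or after that date would be allowed.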
As a matter of principle, researchers would be expected to answer the following questions individually for any given project and handle the research data accordingly: Do archiving and provision obligations apply? Does the research community have an interest in the data? Is the data suitable for publication, and if so, to what extent, in what form and at what point in time? In social science research projects in particular, it is good research practice to observe data protection requirements from the start when carrying out studies on or with individuals (see the current German Data Forum guide; the link is provided below).
The comment belongs to the following categories:
GL7 (Humanities and social sciences), GL12 (Humanities and social sciences)
Keywords:
performance assessment, usage rights, archiving, publication, data protection/data privacy, replication/reproduction, documentation, repository, FAIR principles, research data, research software