At the Digital Infrastructures for Research Conference in Krakow, Poland, we had the opportunity to talk with Sarah Jones from the Digital Curation Centre in Edinburgh. Sarah Jones originally has a background in physical archives. She worked with the collections at the University of Glasgow and the National Health Board archive. Through an archives master's course she became involved with digital curation. She has worked on various projects, but for the last eight years she has been with the Digital Curation Centre, which deals specifically with research data management in the United Kingdom's higher education sector. The Digital Curation Centre helps universities to support researchers. The Centre works largely with intermediaries, such as library, IT and research offices within universities, to build capacity and skills within those institutions.
Sarah Jones works with all types of research data: whatever is important for evidencing the research findings published by the institution. The data range from audio-visual material and spreadsheets to Big Data. When people are working with Big Data, there is usually a European infrastructure or an infrastructure at a much higher level available to them. Therefore, they are not as reliant on the university services. It is important to coordinate across those different levels of provision.
The Digital Curation Centre doesn't run a repository service. The Centre neither stores nor curates data. Its role is that of a national coordination and advisory body. The actual storage will be either in the institution or through services like EUDAT. The Digital Curation Centre supports best practices. A practical task for an individual researcher is often writing a Data Management Plan (DMP). In the United Kingdom there are a number of funding requirements. As part of the grant application a researcher will have to write a DMP. The Centre has a service called DMP Online that helps researchers to write that Data Management Plan. It is really a process of reflection: researchers think about what data they are going to create; what standards they should use; how they are going to store the data, so they can collaborate with their partners in other institutions; and what happens in the long term with regard to publishing the data, making sure it is FAIR, so that others can find and access the data and that it is interoperable and reusable. The Data Management Plan is really a way for researchers to reflect on that. Through all of the different infrastructure providers they can get the actual support and services to manage the data.
The Data Management Plan is required for grant funding in the United Kingdom. The majority of the funders in the United Kingdom are Research Councils. Most of them have specific requirements at the application stage or they encourage plans to be written as part of the overall project.
In Horizon 2020, the European programme, there are now also a number of work programmes in which you need to have a Data Management Plan. Perhaps they looked at the United Kingdom and thought this was something they could use. Sarah Jones thinks that the European Commission is aware of the UK policy. The United Kingdom has been doing a lot of work on research data management for several years now, as has the Netherlands. Sarah Jones thinks other countries have looked at those models and picked up on them. The European Commission has its Open Data pilot, which initially covered only a few work areas. It has expanded each year as new Work Programmes came out. From 2017 it will be open data by default. All areas need to consider being part of the pilot. This is not to say that all data needs to be made open, because not all data can be, but everybody needs to think about open data and what they can do with it in their individual project. It is a really big development. What Sarah Jones particularly likes about the European Commission's policy on the Data Management Plan is the fact that it is a deliverable: it is part of the project and it gets updated as a kind of living document throughout that research project.
There are a number of infrastructure providers in the European context. The Digital Curation Centre is a member of two of those, OpenAIRE and EUDAT. The Digital Curation Centre supports these infrastructure providers on the Data Management Planning aspects and with some general training on research data management. EUDAT and OpenAIRE will recommend tools like DMP Online, which researchers can use. Because DMP Online is an open source project, some countries have taken the code and now run and host national instances, so you may find other versions of DMP Online, like DMP Tuuli in the Finnish case. There may well be a local tool that can be used, but otherwise people can use the main instance of DMP Online. You will find that there are guidelines provided by different organisations within individual countries as well, like DANS in the Netherlands. They provide a lot of support on data management in general. Through OpenAIRE and EUDAT, there are guidelines and support on data management planning.
Collaboration with big infrastructure projects such as EUDAT and OpenAIRE is critical for the Digital Curation Centre. The issues around managing data, curating it in the long term and making sure others can reuse it are global. Sarah Jones is convinced that the United Kingdom is not facing different challenges from other countries. The more the different projects and infrastructure providers come together, the better. The Digital Curation Centre has found that the Research Data Alliance is a good vehicle for doing that. Some initiatives started in the United Kingdom. The Digital Curation Centre, for instance, started to create a metadata standards registry. There was a catalogue on the website of the Digital Curation Centre which started to record different metadata standards in different disciplines. The Digital Curation Centre was able to take this to a Research Data Alliance group that was interested in metadata standards and get more community input on it. The registry now sits within the Research Data Alliance, but the Digital Curation Centre is also embedding it into the DMP Online tool so that the Centre can help researchers to identify relevant standards and can capture those standards in a machine-readable format to be fed out to other services. Mutual collaboration, sharing outputs and reusing each other's output is critical.
In a presentation during the Digital Infrastructures for Research Conference it was said that in Australia researchers do not really do anything about data storage or about preserving data after a project has finished because there is no incentive to do this. Sarah Jones thinks it can be very difficult. There really is a need to focus on rewards and recognition because at the moment, particularly in the United Kingdom context, there are a lot of policies. There is a strong requirements landscape but there are no benefits to balance that. Sometimes, data don't get reused. Researchers are being forced to share their data, but they see the data lying in repositories, unused, after they have put a lot of time and investment into documenting the data in order to share it. It does lead people to question: "What is the value of this and why are we doing it?" The reuse of data needs to be incentivized, but researchers also need to receive recognition for good practice. It needs to become part of promotion criteria. It needs to be recognized on grant applications. If somebody says: "I am not just recording my publications. Here is a data output", they need to get recognition for that on the same scale as a publication, or possibly a greater one, because a data set can be a much more all-encompassing output.