Friday 1 June 2012
RSP Scholarly Communications: New Developments in Open Access, RIBA

Encouraging data publication - the JISC Managing Research Data Programme
Simon Hodson
JISC Programme Manager, Managing Research Data

Finally, a lamentable element of the culture in social psychology and psychology research is for everyone to keep their own data and not make them available to a public archive. This is a problem on a much larger scale, as has recently become apparent. Even where a journal demands data accessibility, authors usually do not comply (Wicherts et al. 2006). Archiving and public access to research data not only makes this kind of data fabrication more visible, it is also a condition for worthwhile replication and meta-analysis.
Recommendation
Far more than is customary in psychology research practice, research replication must be made part of the basic instruments of the discipline. Research data that underlie psychology publications must be held on file for at least five years after publication, and be made available on request to other scientific practitioners. This rule is to apply not only to raw laboratory data, but also to completed questionnaires, audio and video recordings, etc. The publication must state where the raw data reside and how to access them.
INTERIM REPORT REGARDING THE BREACH OF SCIENTIFIC INTEGRITY COMMITTED BY PROF. D.A. STAPEL
Tilburg, 31 October 2011

Data Reuse: asking new questions
Papers based upon reuse of archived observations now exceed those based on the use described in the original proposal.
– http://archive.stsci.edu/hst/bibliography/pubstat.html

Combining data from disparate sources
‘New technologies for sharing data and for combining data from disparate sources are particularly valuable in multidisciplinary fields such as earth science and nanoscience. ...
The challenge of federating, mining, analysing and interpreting these data will be a key focus in coming years.’
http://www.rin.ac.uk/ourwork/using-and-accessinginformation-resources/physicalsciences-case-studies-use-anddiscovery-

Data Management and Data Publication
Good data management is good for research
– More efficient research process, avoidance of data loss
Data sharing / data publication is good for research
– Verification of research findings
– Benefits of data reuse: new questions
– Metastudies; integration of data in interdisciplinary research
Research funder policies, legislative frameworks, good practice, open data agenda
– The outputs of publicly funded research should be publicly available.
– The evidence underpinning research findings should be available for validation.
Alignment with university missions
– Universities want to provide excellent research infrastructure.
– Universities want to have better oversight of research outputs.

Research Data Principles (1)
– Publicly funded research data are a public good, produced in the public interest, which should be made openly available in a timely manner with as few restrictions as possible. (RCUK, EPSRC)
– Data with acknowledged long-term value should be preserved and remain accessible and usable for future research. (RCUK, EPSRC)
– Sharing research data is an important contributor to the impact of publicly funded research. (EPSRC)
– Research data of future historical interest, and all research data that represent records of the University, including data that substantiate research findings, should be deposited. (Edinburgh)
– Published results should always include information on how to access the supporting data. (RCUK)

Research Data Principles (2)
– To recognise the intellectual contributions of researchers who generate, preserve and share key research datasets, all users of research data should acknowledge the sources of their data and abide by the terms and conditions under which they are accessed.
(RCUK, EPSRC)
– There are legitimate reasons for restricting access to data (legal, ethical, commercial). Researchers and research institutions should ensure data is well managed throughout the lifecycle in order to guard against inappropriate release of research data. (RCUK, EPSRC)
RCUK Common Principle on Research Data: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
EPSRC Research Data Principles: http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx
University of Edinburgh RDM Policy: http://www.ed.ac.uk/schoolsdepartments/information-services/about/policies-and-regulations/researchdata-policy

EPSRC Research Data Policy Expectations
– Research organisations to have RDM policy, advocacy and support functions. (i, iii)
– Research data to be effectively managed and curated throughout the life-cycle. (viii)
– Research organisations to maintain public catalogue of research data holdings, adequate metadata and permanent identifier. (v)
– Publications to indicate how research data can be accessed. (ii)
– Data to be retained for 10 years from last access. (vii)
– Research data management to be adequately resourced from appropriate funding streams. (ix)
– Roadmap in place by 1 May 2012
– Compliance by 1 May 2015

Dryad Data Repository

JDAP: Joint Data Archiving Policy
Joint Data Archiving Policy: http://datadryad.org/jdap
Joint declarations, Feb 2010, in American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology: http://www.journals.uchicago.edu/doi/full/10.1086/650340
This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity.
Allows embargos of up to one year; allows exceptions for, e.g., sensitive information such as human subject data or the location of endangered species.
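The two date-based expectations above (EPSRC: retain data for 10 years from last access; JDAP: embargos of up to one year) can be sketched as simple date checks. This is only an illustration, not part of either policy: the function names are hypothetical, measuring the embargo from the publication date is an assumption, and so is the handling of 29 February.

```python
from datetime import date

EPSRC_RETENTION_YEARS = 10  # "Data to be retained for 10 years from last access"
JDAP_MAX_EMBARGO_YEARS = 1  # "Allows embargos of up to one year"


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years (assumption: 29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reached for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def retain_until(last_access: date) -> date:
    """Earliest date at which the retention period has elapsed."""
    return add_years(last_access, EPSRC_RETENTION_YEARS)


def embargo_allowed(published: date, data_release: date) -> bool:
    """True if the data are released within one year of publication.

    Assumption: the embargo clock starts at the article's publication date.
    """
    return data_release <= add_years(published, JDAP_MAX_EMBARGO_YEARS)
```

Because the retention period runs from the last access, every new access pushes the date returned by `retain_until` forward, so a repository would need to record access events and not only deposit dates.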
‘Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. For more idiosyncratic data, the data can be placed in a more flexible digital data library such as the National Science Foundation-sponsored Dryad archive at http://datadryad.org.’

‘Some BioMed Central journals now additionally encourage or require authors, as a condition of publication, to include in some article types a section that provides a permanent link to the data supporting the results reported in the article. … The aim is to provide links in a consistent place within an article to supporting data - regardless of the location or format of the data and to make it clear to readers when they can also access the data as well as the article.’
Adopted by 20 BMC journals between Aug 2011 and Mar 2012 (most encourage…)
http://www.biomedcentral.com/about/supportingdata

Challenges and Questions…
What research data should be kept and for how long?
– Relates to the use cases: verification and reuse…
How do we best support the good management and publication of research data?
Where should research data be archived?
– Discipline data centres, institutional data repositories…
– How do we ensure discoverability?
Where this requires change in practice, how do we motivate researchers to make data available in a usable form?

Barriers to data sharing…
Recognition Recognition Recognition
Practical:
– Lack of infrastructure.
– Lack of support or expertise.
– Technical challenges (data types, metadata, systems to support RDM).
Behavioural:
– Concern that data may be misused.
– Concern that they will lose their scientific edge.
– Concern that they will not be credited.
– Lack of career rewards for data publication.
See ODE report, using Parse.Insight findings: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf
RIN Report, ‘To Share or not to share’: http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-notshare-research-data-outputs

Supporting the Research Data Lifecycle
[Diagram labels: Store, Plan, Reuse, Annotate, Create, Access, Discover, Describe, Identify, Publish, Use, Appraise, Hand Over?, Discard, Select]

A holistic approach…
– Leadership and Policy Development
– Publication, Citation and Discovery Mechanisms
– RDM Systems and Infrastructure
– Guidance and Training
– Support for Data Management Planning

Citing and linking to research data
DCC Briefing Paper: Ball, A. & Duke, M. (2011). ‘Data Citation and Linking’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefingpapers/
DCC How to Guide: Ball, A. & Duke, M. (2011). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/howguides

Dryad-UK Project
– Expand the number of journals: BMJ Open, titles from PLoS and BioMed Central.
– Prepare a business model for long term funding of the data repository: e.g. supported by payments from journals, in turn recouped from subscription or author-pays OA fees.
[Diagram labels: Research Funder, Research Project, Gold OA Fee, Journal, Repository]

Costs
– Estimated costs of archiving (curation and preservation) datasets in Dryad: $25-75 per publication
– Estimated full costs of research and publication per OA article: $2500
– Costs of data archiving in Dryad: 1-3% of the costs of producing the article.
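The 1-3% figure follows directly from the two estimates on the slide. A minimal check of the arithmetic, using only the slide's numbers (the function name is made up for illustration):

```python
def archiving_share(deposit_cost_usd: float, article_cost_usd: float) -> float:
    """Archiving cost as a percentage of the full cost of producing an article."""
    return 100.0 * deposit_cost_usd / article_cost_usd


# Figures from the slide: $25-75 per deposit, $2500 per OA article.
low = archiving_share(25, 2500)
high = archiving_share(75, 2500)
print(f"{low:.0f}% to {high:.0f}%")  # prints "1% to 3%"
```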
See Piwowar http://researchremix.wordpress.com/page/2/ and Vision, ‘Open Data and the Social Contract of Scientific Publishing’, http://www.bioone.org/doi/full/10.1525/bio.2010.60.5.2

New Funding Model (officially adopted May 2012)
A range of payment plans to be offered in order to account for differing usage models:
1. Journal-based
• One or more journals from a publisher prepay based on number of research articles published
• Deposit price of ca. $25 per research paper
2. Vouchers
• Any organisation may pay in advance for a fixed number of deposits
• Deposit price of ca. $50 per research paper
3. Pay-as-you-go
• Any organisation may pay retrospectively for deposits
• Deposit price of ca. $50 + surcharge per research paper
4. Author-pays
• For authors submitting via journals not having one of the above plans
• Deposit price of ca. $50 + surcharge per research paper
• Additional curation charges if not involving an integrated journal
Slide Credit: Brian Hole

New Governance Model (officially adopted May 2012)
1. Incorporation of Dryad as a US tax-exempt not-for-profit organisation.
2. A twelve-person Board of Directors, vested with legal and financial responsibility for Dryad
a. Directors to serve for staggered three-year terms and able to serve multiple terms
b. Directors to be elected by (but not necessarily from) the members
c. Board members to appoint the officers of the Board, as well as to appoint ex officio members as required.
3. Dryad executive staff to be hired as employees, reporting to the Board of Directors.
4. Membership of Dryad to be open to any legitimate organisation that supports its mission.
5. Members to pay a modest annual fee, not to be depended on for revenue.
6. Members to vote on amendments to the Articles and By Laws, and serve as an advisory body to the Board of Directors.
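To compare the four payment plans in the funding model above, a rough cost sketch may help. The base prices are the approximate ("ca.") figures from the slide. The slide gives no amounts for the surcharge or the additional curation charge, so both are caller-supplied, hypothetical parameters here, and all names in the code are illustrative, not Dryad's.

```python
from enum import Enum


class Plan(Enum):
    JOURNAL = "journal-based"
    VOUCHER = "vouchers"
    PAY_AS_YOU_GO = "pay-as-you-go"
    AUTHOR_PAYS = "author-pays"


# Approximate deposit prices per research paper, as given on the slide.
BASE_PRICE_USD = {
    Plan.JOURNAL: 25,
    Plan.VOUCHER: 50,
    Plan.PAY_AS_YOU_GO: 50,
    Plan.AUTHOR_PAYS: 50,
}

# Plans that the slide lists as "$50 + surcharge".
SURCHARGED = {Plan.PAY_AS_YOU_GO, Plan.AUTHOR_PAYS}


def deposit_cost(plan, papers, surcharge=0.0, curation_charge=0.0,
                 integrated_journal=True):
    """Estimated total for `papers` deposits under `plan`.

    `surcharge` and `curation_charge` are hypothetical per-paper amounts;
    the slide does not state their values.
    """
    per_paper = BASE_PRICE_USD[plan]
    if plan in SURCHARGED:
        per_paper += surcharge
    if plan is Plan.AUTHOR_PAYS and not integrated_journal:
        per_paper += curation_charge
    return per_paper * papers
```

For example, `deposit_cost(Plan.JOURNAL, 100)` gives 2500, while the same 100 deposits under `Plan.VOUCHER` give 5000, which illustrates the incentive for journals to prepay.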
Slide Credit: Brian Hole

Data and publications
Linking / integrating data and publications:
– helps the data to be better discoverable
– helps the data to be better interpretable
– provides the author better credit for the data
– and reversely: the data add depth to the article and facilitate better understanding.
ParseInsight findings: 85% of researchers are in favour of linking data with literature.
Recognition of data as a ‘first class research object’, requiring preservation, recognition, validation and dissemination just like articles.
ODE Report on integration of data and publications: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf

Data Publication Initiatives
Initiatives for linking data to literature, for data papers and for peer review of data.
– BMC Research Notes (‘encourages the publication of software tools, databases and data sets and a key objective of the journal is to ensure that associated data files will, wherever possible, be published in standard, reusable formats’): http://www.biomedcentral.com/bmcresnotes/
– Earth System Science Data aims ‘to publish data according to the conventional fashion of publishing articles, applying the established principles of quality assessment through peer-review to datasets’: http://www.earth-system-science-data.net/
– RDMF8 ‘Engaging with the publishers’: http://www.dcc.ac.uk/events/research-datamanagement-forum-rdmf/rdmf8-engaging-publishers

PRIME
Publisher, Repository and Institutional Metadata Exchange
UCL Library Services / UCL Institute of Archaeology / [email protected] / www.ubiquitypress.com / @ubiquitypress

PRIME: Project focus
• Developing a system to exchange metadata between:
  • the UCL Discovery EPrints institutional repository
  • the Archaeology Data Service subject repository
  • the Journal of Open Archaeology Data
• Focusing on archaeology data only to pilot the system
• Building on other successful JISC projects:
  • DryadUK
  • REWARD
  • SWORD-ARM

PRIME: Use Case #1
• A UCL Researcher deposits data in an external subject repository.
• The subject repository sends the metadata and DOI of the data to the UCL institutional repository so that it has a record of the output.

PRIME: Use Case #2
• A UCL Researcher deposits data in their institutional repository.
• The institutional repository sends the metadata and DOI of the data to the appropriate subject repository so that it has a record of the output.

PRIME: Use Case #3
• A UCL Researcher submits an article to a journal, and is asked to archive the data as a precondition of publication.
• The journal sends the metadata to the subject repository so that the author does not have to re-enter it.
• The subject repository sends the metadata and DOI of the data to the institutional repository so that it has a record of the output, and the DOI back to the journal to link the article with the data.

PREPARDE: Peer REview for Publication & Accreditation of Research Data in the Earth sciences
• Lead Institution: University of Leicester
• Partners
  – British Atmospheric Data Centre (BADC)
  – US National Centre for Atmospheric Research (NCAR)
  – California Digital Library (CDL)
  – Digital Curation Centre (DCC)
  – University of Reading
  – Wiley-Blackwell
  – Faculty of 1000 Ltd
• Project Lead: Dr Jonathan Tedds (University of Leicester, [email protected])
• Project Manager: Dr Sarah Callaghan (BADC, [email protected])
• Length of Project: 12 months
• Project Start Date: 1st July 2012
• Project End Date: 30th June 2013
• Total Funding Requested from JISC: £135,025
• Total Institutional Contributions: £80,207

Geoscience Data Journal, Wiley-Blackwell and the Royal Meteorological Society
• supported by NERC – in particular the British Atmospheric Data Centre
• partnership formed between Royal Meteorological Society & academic publishers Wiley-Blackwell
• develop a mechanism for the formal publication of data in the Open Access Geoscience Data Journal
• builds on JISC funded OJIMS (Overlay Journal Infrastructure for Meteorological Sciences) project

Example of (potential) steps/workflow required for a researcher to publish a data paper
Items in orange will be investigated in PREPARDE
• Author guidelines for data papers and submissions
• Repository accreditation
• Scientific review of data
• Linking mechanisms
• Divisions of responsibility between journals and data repositories.
Solutions will be tested with partners.
Three workshops in early 2013.
Recommendations to broader community

PREPARDE objectives
• capture and manage workflows required to operate the Geoscience Data Journal
  – from submission of a new data paper and dataset, through review and to publication
• develop procedures and policies for authors, reviewers and editors
  – allow the Geoscience Data Journal to accept data papers as submissions for publication
  – focus on guidelines for scientific reviewers who will review the datasets
• incorporate some technical developments at the point of submission
  – data visualisation checks
  – interface improvements
  – enhance the resulting data publications
• put in place procedures needed for data publication in the California Digital Library
• interact with the wider scientific and data community
  – provide recommendations on accreditation requirements for data repositories
• engage the user and stakeholder community
  – promote long-term sustainability and governance of data journals

Project team, roles and responsibilities
• University of Leicester (UoL): project lead, academic liaison and administration
  – liaise with range of academic stakeholders internationally => input to workshops and guidelines
• British Atmospheric Data Centre (BADC): project management
  – provide technical input into the cross-linking, workflows and scientific review work packages
• Wiley-Blackwell: provide publishing platform
  – work with partners to develop and implement workflows & support technical enhancements to enhance authorial and readership experiences
• California Digital Library (CDL): investigate a lightweight data paper convention
  – informal (crowd-sourced) review
  – contribute a broader international perspective from outside the Earth Sciences
• US National Centre for Atmospheric Research (NCAR): use cases of data review methods
  – feedback on development of peer review guidelines and workflows for the Geoscience Data Journal
  – contribute to the data repository accreditation report and workshops
• F1000: contribute broader perspective from the biomedical sciences
  – extend the impact from PREPARDE to the biomedical community through launch of F1000 Research
  – use of F1000 Advisory Panel and F1000 Faculty & co-organise a workshop
• Digital Curation Centre: assess peer review models & overlap with data repository appraisal and ingest processes
  – respective roles of main stakeholders
  – workflows and interoperability with data centres and publishers
  – Trusted Digital Repository certification.

Journal Data Availability Policies
Research shows that journal data availability and sharing policies are influential upon researcher data archiving.
– Piwowar HA (2011) Who Shares? Who Doesn't? Factors Associated with Openly Archiving Raw Research Data. PLoS ONE 6(7): e18657; doi:10.1371/journal.pone.0018657
Useful for researchers, research support staff, data repository managers, librarians and policy makers to have readily accessible summary of journal policies (à la Sherpa RoMEO).
Feasibility study and analysis of business models for a registry of journal research data availability and sharing policies.

Thank You!
First JISC MRD Programme, 2009-11: http://bit.ly/jiscmrd2009-11
JISC MRD Outputs Page: http://bit.ly/jiscmrd2009-11-outputs
Second JISC MRD Programme, 2011-13: http://bit.ly/jiscmrd2009-11
Programme Blog: http://researchdata.jiscinvolve.org/
E-mail: [email protected]
Twitter: simonhodson99
Acknowledgements for slides, content: Andrew Treloar, Eefke Smit, Brian Hole (DryadUK, PRIME), Jonathan Tedds (and others from PREPARDE).