Bloomsbury Conference UCL, London 6.25.10 Fourth Bloomsbury Conference on e-Publishing and e-Publications Valued Resources: Roles and Responsibilities of Digital Curators and Publishers Conceptualizing Library Data Curation.
Fourth Bloomsbury Conference on e-Publishing and e-Publications
Valued Resources: Roles and Responsibilities of Digital Curators and Publishers
Conceptualizing Library Data Curation and Publishing Services at Purdue University
Charles Watkinson, Director, Purdue University Press
D. Scott Brandt, Assoc Dean for Research, Purdue University Libraries
Bloomsbury Conference, UCL, London, 6.25.10

Structure of the Presentation
I. Some Background & Context
II. Exploring Library’s Role in the “Data Deluge”
III. Data Curation Profiles: what we’re learning
IV. What a Publisher can learn from the Profile
“Data curation is the activity of managing and promoting the use of data from the point of creation, to ensure its fitness for contemporary purposes and availability for discovery and reuse.”

Purdue University and Purdue Libraries
• ~38K students, ~1.8K faculty
• Strengths in science, technology, agriculture, & engineering.
• 12 subject-oriented Libraries + units
• University press a unit (only 11% of US presses report within Libraries)
[Organization chart: Dean; Assoc Dean for Research (D2C2, Data Research Scientist, Interdiscip Research Librarian); Directors of Office of Copyright, Finance, and the University Press; Assoc for Academic Affairs (Library Faculty, 32); Assoc Dean for Digital Programs and Information Access; Assoc Dean for Planning & Administration]

[Diagram labels: published data/datasets; unpublished research; published research; non-traditional; traditional; secondary/tertiary resources]
analyzed data/datasets: Analyzed data might need to be reviewed prior to publication, or in case of questions after publication.
It is increasingly linked as “supplementary data” by publishers.
processed data/datasets: Quite often data must be scrubbed/anonymized, or processed to format prior to analysis; some disciplines share this data widely within their communities (e.g., astronomy, physics, etc.)
“raw” data/datasets: Some raw data are shared readily (e.g., genetics), but also quite often are discarded, depending on discipline
Modified from: Brandt, D.S. “Scholarly Communication” (in To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering: Final Report of Workshop New Collaborative Relationships: Academic Libraries in the Digital Data Universe. ARL, Washington, DC, September 2006.)

PUL response to “data deluge”
• Investigating research data needs and building relationships with faculty, in order to:
  • Design, build, assess prototype infrastructure, tools and services to handle digital data.
• This approach recognizes the disciplinary-specific nature of faculty needs, though there is a tension between this and the practical requirements of building a sustainable suite of services/digital infrastructure.

Our organization to achieve this vision
[Diagram: Distributed Liaison and Centralized Services; support across the research lifecycle]
• Faculty Liaison: subject librarians
• Publishing: e-Pubs & Press
• Data Management: D2C2
• Rights Management: University Copyright Office
• disciplinary faculty

1. Investigating Research Data Needs
• Strategy 1: Embedding data scientists in research projects; D2C2 provides this expert consultancy.
• Strategy 2: Creating tools to structure conversations about data; Data Curation Profiles help liaison librarians structure their conversations.
[Diagram labels: DCP; D2C2; librarians; researchers]

2. Solving Problems and Developing Prototype Tools, Systems, Services
[Research lifecycle stages: Study Concept & Design; Data Collection; Data Processing; Data Access & Dissemination; Analysis; Research Outcomes]
• Ingest, Preservation and Access for Water Quality Datasets in an Institutional Repository
• Developing a Data Management and Curation Workflow for Camp Calcium
• Developing a Content Organization Framework for Regenstrief Center Healthcare Delivery Hub
• Enabling end-to-end geospatial data modeling workflows via INPort: The Isotope Networks Portal
• Leveraging Relational Information in the HUBs using Linked Data
• Investigate and Implement Persistence for HUB Resources
• DataCite (founding member)
• Prototype publications linked to data through e-Pubs and Purdue University Press.
• Integrating Spatial Educational Experiences (ISEE) into Crop, Soil, and Environmental Science Curricula
• INTEROP: Developing Community-based DRought Information Network Protocols and Tools for Multi-disciplinary Regional Scale Applications
Adapted from: e-Science and the Life Cycle Model of Research http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc

Data Curation Profiles

Profiling Data
• Research Data Lifecycle (what’s the story of the data from producer's perspective)
• Data Management / Storage
• Disposition of the Data
• Data Dissemination and Sharing
• Data Preservation and Repositories
• Roles for Libraries, Librarians, and Publishers
Sample Profile link

Disposition of the Data
• Willingness / Motivations to share – feelings/reservations/willingness towards sharing
• Access control – need to restrict or control access to/from others
• Target data for sharing – stage in the lifecycle the data should be shared
• Value of the data – real or potential value, from their perspective
• Embargo (and reasons why/why not)

What data curators can learn
• Advancing university-based cyberinfrastructure is dependent on our understanding of how to support data practices and needs
• Sharing is at the heart of success: collecting, storing, and making use of data can only come after the means for sharing are in place
• We cannot collect and curate all data, particularly in a way that facilitates effective re-use
  – We will need to work with researchers to develop selection and appraisal guidelines, and data services
from: M. Cragin. (2009) “Data Sharing, Small Science, and Institutional Repositories.” UK e-Science All Hands Meeting: Oxford, UK

Data Curation Proliferation
[Slide labels: DCP; 12 workshops; dataconservancy.org]

What publishers can learn
• Researchers want to disseminate outputs, but ranges in scope, format, use
• They are generally willing to share data with others, but not without certain restrictions, or benefits for themselves
• They hold on to their data but do not do much to curate it; what is most easily or willingly shared is not always the data that has the most re-use value

Purdue UP lesson learned 1
“Researchers want to disseminate outputs, but ranges in scope, format, use”
• Print books and subscription-based journals, PUP’s traditional focus, are not enough
• PUP / Libraries need to offer a range of different channels to fit different needs
• PUP / Libraries need a venue to experiment with hybrid or new models

“A Continuum of Scholarly Content” in the IR
(with thanks to J.G. Bankier, Berkeley Electronic Press)
[Chart: content types plotted by source of scholarship (Faculty, Student, Admin, Unaffiliated) against scholarly impact of content (Low to High). Labels include: Book, Pre Print, Post Print, Datasets, Faculty Journal, Faculty Conference, Primary research, Nonresearch output, Research Finding, Committee Meetings, Research Reports, Newsletter, Dissertation, Masters Thesis, Graduate Journal, Honor Papers, Undergrad Conference, Undergrad Journal, Admin Report, Alumni Magazine, Historical Collection, Commencement address, Symposium, Society Journal, Policy Report. Red stars = Purdue UP? Blue stars = Purdue e-Pubs?]

Purdue UP lesson learned 2
“Researchers willing to share data with others, but not without certain restrictions/benefits”
• PUP provides a layer of editorial services for credentialing that can incentivize data sharing
• PUP needs to make it easy to link to and cite data in publications (DataCite so important!)
• PUP / Libraries need to be nuanced in their Open Access messages (OA is not always right strategy)

[Screenshot captions]
• Read the full text of the book on your portable device
• Follow in-text URLs to supplementary data
• View spreadsheets on-site or download them from your personal computer

Purdue UP lesson learned 3
“What is most easily/willingly shared is not always data that has the most re-use value”
• Move away from producing data supplements for publications to producing supplementary publications to drive re-use of data
• Take advantage of being “inside the tent” to have deeper conversations with scholars about what is most important data for reuse

Next Steps
• Spreading the use of DCPs so that we can get a more complete picture of faculty behavior variations around data
• More clearly defining library-based publishing services, and building relevant skills and tools in Libraries and Press
• Communicating to faculty the full range of library services they have access to, and changing their old views of what Purdue Libraries and Purdue UP “do”

Thank you!
D. Scott Brandt [email protected]
Charles Watkinson [email protected]