Transcript Slide 1
HathiTrust and Its Role in Building the Universal Collection
John Wilkin
2 June 2009

Presentation structure
• Quick background on where we are
• A few pieces of what’s in the hopper
• Development work underway
• New collaborative structures
• Explore HathiTrust as a vehicle for collaboration in the realm of collections
www.hathitrust.org

Mission and Goals
• to contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge
  – materials converted from print
  – improve access …to meet the needs of the co-owning institutions
  – reliable and accessible electronic representations
  – coordinate shared storage strategies
  – “public good” … free-riders.
  – simultaneously …centralized …open

current members
• California Digital Library
• Indiana University
• Michigan State University
• Northwestern University
• The Ohio State University
• Penn State University
• Purdue University
• UC Berkeley
• UC Davis
• UC Irvine
• UCLA
• UC Merced
• UC Riverside
• UC San Diego
• UC San Francisco
• UC Santa Barbara
• UC Santa Cruz
• The University of Chicago
• University of Illinois
• University of Illinois at Chicago
• The University of Iowa
• University of Michigan
• University of Minnesota
• University of Wisconsin-Madison
• University of Virginia

Governance Model
• Executive Committee
• Strategic Advisory Board
• Coordinated input from groups of members
  – Hathi/CIC Steering Committee
  – UC library directors

Executive Committee
• Paul Courant, University Librarian and Dean of Libraries, University of Michigan
• Laine Farley, Executive Director, California Digital Library
• Paula Kaufman, University Librarian and Dean of Libraries, University of Illinois at Urbana-Champaign
• John King, Vice Provost for Academic Information, University of Michigan
• Brian Schottlaender, University Librarian, University of California, San Diego Libraries
• Patricia Steele, Dean of Libraries,
  Indiana University
• Brad Wheeler, Chief Information Officer, Indiana University
• John Wilkin, Executive Director of HathiTrust and Associate University Librarian, Library Information Technology, University of Michigan

Strategic Advisory Board
– Ed Van Gemert (Chair), Director of Libraries, University of Wisconsin-Madison
– John Butler, Associate University Librarian for Information Technology, University of Minnesota
– Patricia Cruse, Director, Preservation, California Digital Library
– Robin Dale, Associate University Librarian for Collections and Library Information Systems, University of California, Santa Cruz
– R. Bruce Miller, University Librarian, University of California, Merced
– Sarah Pritchard, University Librarian, Northwestern University
– Paul Soderdahl, Director, Library Information Technology, University of Iowa
– John Wilkin, Executive Director, HathiTrust (ex officio)

Preservation: OAIS Reference Model
[Architecture diagram; labels: GROOVE (JHOVE); MARC record extensions (Aleph); Rights DB; Page Turner; HathiTrust API; OAI; GeoIP DB; CNRI Handles; [Solr]; Google; [OCA]; In-house Conversion; GRIN; Internal Data Loading; METS/PREMIS object; TIFF G4/JPEG2000; OCR; MD5 checksums; Isilon Site Replication; TSM; MD5 checksum validation; METS object; PNG; OCR; PDF]

growth trajectory?

accomplishments to date
1. 25 partners
2. successful ingest and millions of vols online
3. mirroring and backup
4. rich access
5. collection builder
6. Catalog beta and WCL working group

What next?
• Data API and other strategies for increased openness
• Internet Archive/OCA ingest followed by misc. non-Google ingest
• Full text search over entire repository
• Extending out services through Shib
• Creating research corpus
• Deeper collaborative strategies

Where next with collaboration?
• Begin sharing actual development, cf. ingest of Internet Archive content
  – Specifications
  – Validation routines?
  – Packaging?
• Collaboratively develop a collaborative framework
  – SAB and working group charges

Working groups?
• Security
• Collection management
• Non-Consumptive Research
• Digital preservation
• Discovery (bibliographic and full text)
• Externally-facing repository APIs
• Bibliographic metadata management
• Rights Management

Universal collection
• What is a collection?
• Bibliographic identity
• Certification (and for specific or purposes)
  – Object as content
  – Object as artifact

Toward a Cloud Library
• Shared Print repository or repositories with all the best attributes (service, treatment, management)
• Shared digital repository with all the best attributes (compliance with TRAC, accessible in every sense, a foundation for services)
• … and even some coordination between the two
• … and even (particularly for in-copyright works where we don’t have permissions) a viewable copy in GBS

Expectations and plans?
• How would we define our requirements for satisfaction with each?
• What would the business model be?
• How would we build our local collections in light of the presence of something like this?
• What would we do on the “core” or shared collections?

Next steps for libraries
• Case study library: NYU Library
• ReCap storage facility in Princeton, NJ
• HathiTrust digital repository
• CLIR as broker and OCLC Research as agent
• Futures that depend on looking beyond the local to the shared, from the shared as “you” to the shared as “we”

Needed infrastructure
• More refined bibliographic identification
• Relationship of digital to partner print holdings, including withdrawn volumes
• Certification of digital
• Rights determination
• Rights clearance

Further info/updates
• http://www.hathitrust.org/
• RSS feed for updates
• [email protected]