eXtensible Catalog: Tools for the creation and use of RDA, FRBRized and linked data
David Lindahl
eXtensible Catalog Organization
University of Rochester, River Campus Libraries, Rochester, NY
LITA National Forum, September 30, 2011

Funders and Sponsors
Major Funding
• Andrew W. Mellon Foundation
Sponsors
• Consortium of Academic and Research Libraries in Illinois (CARLI)
• Kyushu University
• University of North Carolina at Charlotte
• University of Rochester

User Research
Problem:
• User research is of limited value if a library doesn’t have control over its discovery environment
• Our solution:
– Develop our own software (eXtensible Catalog)
– Offer a modular architecture (4 “toolkits”)
– Build in tons of configurability
– Use established standards and protocols
– Give it away (open source)

XC User Research Approach
• What articles, books and other resources had researchers used most recently?
– How did they know the items existed?
– How did they obtain them?
– How did they use them?
• How do they keep current in their fields?

User Research Findings
• Users want to choose between versions of a resource, see relationships between resources
– Underlying XC metadata is based on FRBR model: works, expressions, manifestations, etc.
– Use some RDA data elements in FRBR structure
– Metadata services to aggregate/group FRBR entities in the User Interface

User Research Findings
• Users have preferred material and format types, depending upon their projects
– Show online materials only
– Exclude microforms
• Users want to know why items appear on a search result list
– Show keywords in context

Acting on User Research Findings

XC: “Taking Control” of metadata
• More Control over Metadata
• More Options for Customizing the User Interface

XC Schema
• Dublin Core terms (DCMI) – all
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to contain MARC vocabularies, linking fields, etc.
Discovery Interface
Translating User Research Findings into XC Functionality

FRBR Structure - Pyramid
[Diagram: 1 Work, 2 Expressions, 3 Manifestations, 4 Holdings]

FRBR Structure - Hourglass
[Diagram: 3 Works, 3 Expressions, 1 Manifestation, 3 Holdings]

Software Overview
Discovery, Metadata Management, and Connectivity

XC Software
• Drupal Toolkit: User Interface
– Search
– Browse
• MST Toolkit: Metadata Services
– Cleanup
– Format Convert
• OAI Toolkit: ILS Connectivity
– Synchronize data with XC
• NCIP Toolkit: ILS Connectivity
– Circ. status
– Account info
Each toolkit is eXtensible with add-on packages:
• User Interface Features
• More Metadata Services
• ILS Export Scripts, XSLT Scripts
• ILS connectors

XC Software
[Diagram labels: Voyager ILS, Voyager “Driver” (two), Metadata, Live Circ. Data, User Interface]

Drupal Toolkit

Drupal Toolkit Features
• Search/Browse
• Customization and theming
• Platform for applications
– Library website
– Modules add functionality

Drupal Toolkit In Use: Cute.Catalog @ Kyushu University

Drupal Toolkit In Use: “Creating Communities” @ Denver Public Library

Metadata Services Toolkit (MST)

MST Features
• Collect metadata from repositories
• Process metadata with services:
– Normalize
– Convert
– Merge
– Add identifiers
• Platform for building new services

MST In Use: Demonstration Server @ Rochester

MST In Use: Perseus Digital Library @ Tufts University (dev.)
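The MST feature list above (collect records, then process them with services such as Normalize, Convert, Merge and Add identifiers) can be pictured as a chain of functions applied in order. The following Python sketch is only an illustration under stated assumptions: it is not the MST's actual code or API, and the record layout, the field keys `245a` and `100a`, the `example:` identifier prefix and every function name are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Service = Callable[[List[Record]], List[Record]]


def normalize(records: List[Record]) -> List[Record]:
    """Cleanup: trim whitespace and trailing punctuation from each value."""
    return [{k: v.strip().rstrip(" :/,") for k, v in r.items()} for r in records]


def convert(records: List[Record]) -> List[Record]:
    """Format conversion: rename source fields to target names (hypothetical map)."""
    field_map = {"245a": "title", "100a": "author"}
    return [{field_map.get(k, k): v for k, v in r.items()} for r in records]


def merge(records: List[Record]) -> List[Record]:
    """Merge: keep one record per (title, author) pair."""
    seen: Dict[tuple, Record] = {}
    for r in records:
        seen.setdefault((r.get("title"), r.get("author")), r)
    return list(seen.values())


def add_identifiers(records: List[Record]) -> List[Record]:
    """Add identifiers: give each surviving record a made-up sequential id."""
    return [{**r, "id": f"example:{n}"} for n, r in enumerate(records, start=1)]


def run_services(records: List[Record], services: List[Service]) -> List[Record]:
    """Apply each service in the configured order."""
    for service in services:
        records = service(records)
    return records


incoming = [
    {"245a": " E.E. Cummings : ", "100a": "Sawyer-Lauçanno, Christopher,"},
    {"245a": "E.E. Cummings :", "100a": "Sawyer-Lauçanno, Christopher"},
]
result = run_services(incoming, [normalize, convert, merge, add_identifiers])
```

Because every service has the same signature, a new service could be added to the list without changing the others, which is one way to read “platform for building new services”.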
MST In Use: Union Catalogue @ Ministerio de Cultura, Madrid, Spain

Metadata Services Toolkit
[Diagram labels: ILS, ILS, IR, Digital Repository, OAI-PMH, XC Metadata Services Toolkit, OAI-PMH, Discovery Service]
MST decides which services and in which order to process incoming records
• Cleanup: MARC Normalization, DC Normalization
• Format conversion: MARC to XC Transformation, DC to XC Transformation
• Merge: XC Aggregation
• Add Identifiers: XC Authority

Creating XC Schema data from MARC
• Parse MARCXML records into linked FRBR-based records
• Holdings can be separate or embedded
• Manage uplinks
[Diagram: a MARCXML Bibliographic record yields an XC Work, XC Expression and XC Manifestation, linked by “Work Expressed” and “Expression Manifested”; a MARCXML Holdings record yields XC Holdings, linked by “Manifestation Held” and the 004 “Uplink”]

Following one MARC record through XC
[Diagram: MARC, MARCXML (dirty), MARCXML (clean), XC, Index; W, E and M records are matched and merged against other XC records, producing WEMs]
Data is ready for search and faceted browse
Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)

Connectivity Tools
• OAI Toolkit
– Synchronizes metadata with XC
– Cleans up MARC data
– Uses export scripts
• NCIP 2 Toolkit
– Looks up circulation status
– Places requests (renew, hold)
– Retrieves user account information
– Enables resource sharing
• Evergreen ILS
• OCLC WorldCat Navigator
• SirsiDynix Symphony
• PALCI’s EZBorrow
– Test bed available now!

NCIP 2 Toolkit: Testbed

RDA and FRBR
Helping libraries make the transition

U.S. RDA Test Coordinating Committee
Overall Recommendation: “…the Coordinating Committee recommends that RDA should be implemented by LC, NAL, and NLM no sooner than January 2013.”

Bottom line…by January 2013…
Libraries will be able to use RDA in MARC and RDA in a non-MARC environment at the same time.
XC provides one option for doing this

U.S. RDA Test Coordinating Committee
Recommended Tasks and Action Item: “Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
Timeframe for completion: within 18 months.

Breaking down the Recommendation
“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
What XC Provides:
• prototype: XC is near production-ready
• input: MARC data (bulk)
• discovery: XC has a discovery interface
• RDA element set: Uses subsets of RDA elements and roles to date
• including relationships: Primary relationships between work, expression and item so far

XC: Facilitating the Transition
• XC enables risk-free experimentation with RDA data while the library community develops a successor to MARC
• XC can serve as a “bridge” between using RDA in MARC-based systems and in emerging applications

Linked Data in XC

Library of Congress statement, May 13, 2011
Transforming our Bibliographic Framework
“Experiment with Semantic Web and linked data technologies to see what benefits to the bibliographic framework they offer our community and how our current models need to be adjusted to take fuller advantage of these benefits.”

Semantic Web and Linked Data
• The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web
• Linked data is a mechanism for exposing, sharing and connecting data on the web

Semantic Web and Linked Data
• If everything has a unique identifier, then information from one website can be related to information from another via a computer program
• Everything includes people, places, things, vocabularies, metadata elements, web documents, …

Getting Started
To create Linked Data, we need:
– Software to transform legacy data
– Analysis: mapping of legacy metadata to Linked Data properties

Converting MARC to Linked Data
• What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records

Best Practices for Linked Data
By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.
- Unique identifiers for XC metadata records
- Data elements from registered schemas
- Registered vocabularies

RDF Triple
Subject: This resource
Predicate: has subject
Object: Poets, American
URIs for each?
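The question on the RDF Triple slide above (“URIs for each?”) can be made concrete with a small Python sketch. It stores the slide's triple as plain labels and reports which positions are not yet URIs. The `Triple` type and the `is_uri` check are illustrative assumptions, not part of XC; the check is deliberately rough (any value with a scheme and no spaces passes).

```python
from typing import List, NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_uri(value: str) -> bool:
    """Rough test: a URI has a scheme (such as http or oai) and no spaces."""
    return " " not in value and bool(urlparse(value).scheme)


def missing_uris(triple: Triple) -> List[str]:
    """Name the positions of the triple that still hold plain labels."""
    return [name for name, value in triple._asdict().items() if not is_uri(value)]


# The triple exactly as worded on the slide: three human-readable labels.
slide_triple = Triple("This resource", "has subject", "Poets, American")
unresolved = missing_uris(slide_triple)
```

Here all three positions are reported as unresolved, which is the gap the identifier, data-element and vocabulary practices listed under “Best Practices for Linked Data” are meant to close.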
RDF Triple – Record identifiers
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: has subject
Object: Poets, American

Identifiers for XC Schema records
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:rdvocab="http://rdvocab.info/Elements"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:rdarole="http://rdvocab.info/roles">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
    <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>
A persistent, globally unique identifier for each XC Schema record

RDF Triple - Registered Data Elements
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: Poets, American

XC Schema Elements
• Dublin Core terms (DCMI) – all
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to enable XC system functionality

XC Schema “work” record: data elements
(The same work record as under “Identifiers for XC Schema records”.)
Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC

RDF Triple - Registered Vocabularies
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)

XC Work record with embedded URI for LCSH “Poets, American”
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" … xmlns:subjid="id.loc.gov/authorities">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    …
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
    <xc:temporal>20th century</xc:temporal>
    <xc:type>Biography</xc:type>
  </xc:entity>

RDF Triple
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)

XC Software is “Linked Data Ready”
• Converts metadata to FRBR entities with RDA elements and roles
• Adds identifiers for “things”
• Provides a platform for service development
• Synchronizes with existing tools
– Cataloging staff client
– Institutional repository

Download XC software at eXtensibleCatalog.org
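
The RDF triple slides above build one statement step by step: the record identifier as subject, a registered element as predicate, and a registered vocabulary URI as object. As a rough sketch of how such triples might be read out of an XC work record, the Python below parses an abbreviated copy of the slide's record with the standard library. This is not XC's RDF output (the talk describes RDF output only as an eventual goal). The rule of joining namespace and element name with “/”, the `SUBJECT_URIS` lookup table and all function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Abbreviated from the work record on the slides.
XC_RECORD = """<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:rdvocab="http://rdvocab.info/Elements"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American</xc:subject>
  </xc:entity>
</xc:frbr>"""

XC_NS = "http://www.extensiblecatalog.info/Elements"

# Hypothetical lookup table; its single entry is the URI shown on the slides.
SUBJECT_URIS = {"Poets, American": "http://id.loc.gov/authorities/sh85103735#concept"}


def predicate_uri(tag: str) -> str:
    """Turn ElementTree's '{namespace}local' tag into 'namespace/local'."""
    namespace, local = tag[1:].split("}")
    return namespace.rstrip("/") + "/" + local


def record_to_triples(xml_text: str) -> List[Tuple[str, str, str]]:
    """One triple per child element; the entity id is the subject."""
    root = ET.fromstring(xml_text)
    triples = []
    for entity in root.iter("{%s}entity" % XC_NS):
        subject = entity.get("id")
        for element in entity:
            value = (element.text or "").strip()
            # Swap in a vocabulary URI when one is known, else keep the literal.
            triples.append((subject, predicate_uri(element.tag), SUBJECT_URIS.get(value, value)))
    return triples


def to_ntriples(triple: Tuple[str, str, str]) -> str:
    """Serialize one triple as an N-Triples line (URI or plain literal object)."""
    s, p, o = triple
    if o.startswith("http://"):
        obj = "<%s>" % o
    else:
        obj = '"%s"' % o.replace("\\", "\\\\").replace('"', '\\"')
    return "<%s> <%s> %s ." % (s, p, obj)


triples = record_to_triples(XC_RECORD)
lines = [to_ntriples(t) for t in triples]
```

The last triple produced is the one on the final RDF Triple slide; the other two keep literal objects because no vocabulary URI is known for them in this sketch.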