Topic: LD4P3 (Linked data for production: closing the loop) team presenting on their integration of Wikidata information around musical works into a Cornell library catalog prototype with a focus on the Wikidata aspects of work
Presenters: Huda Khan, Cornell University; Steven Folsom, Cornell University, Kevin Kishimoto, Stanford University, Astrid Usong, Stanford University
Wizard of Oz example as theme! Is a journey taken with brain, heart & nerve
Background
LD4P3: Closing the loop: aims to create a working model of a complete cycle for library metadata creation, sharing & reuse.
Discovery:
Using linked data to support & enhance discovery
Work on integrating into production
BAM-WOW: Trying to leverage work of MLA Linked Data Working Group
Capturing thematic catalog identifiers in Wikidata: information not usually found in catalogs.
Music Library Association Linked Data Working Group (LDWG):
Emphasis on exploring & experimenting with linked data
15 members, mostly MLA music catalogers
Goals include
build practical skills
learn & understand concepts & theory
build connections
Initially focused on BIBFRAME but then branched out into Wikidata
Projects focused on individual or institutional interest
One project: thematic catalog number concordance in Wikidata
A thematic catalog is a music reference book that aims to be a comprehensive list of a composer’s works, like catalogues raisonnés in art
Each work includes other info, such as historical information, musical characteristics
Most important: thematic catalog authors often assign a number (identifier) to each work
Different authors assign numbers that do not line up
Antonio Vivaldi
Wrote more than 800 compositions, most of them instrumental (500 concertos with the title "concerto" and 100 sonatas with the title "sonata")
Different catalogs for Vivaldi have different numbering depending on who published
Example of one work: Vivaldi, Antonio, $d 1678-1741. $t Estro armonico. $n N. 6 (uniform title). Known by many designations! (op. 3, no. 6 ; RV 356, etc.)
Title pages for the same work show differing numbering systems–a frustration of many music catalogers!
Needed a way to look these up easily as structured data, so they can be reused
Data Harvesting & conversion to Wikidata
Found list of Vivaldi works by RV on Internet: copy & paste into spreadsheet #1
Searched id.loc.gov for works by Vivaldi and exported results (Atom to CSV)
Batch searched LCCNs in OCLC Connexion & exported authority records
Imported batch NAR file into MarcEdit & converted fields/subfields into spreadsheet (spreadsheet #2)
Queried Wikidata to find what items existed for Vivaldi works and exported to a spreadsheet (spreadsheet #3)
Imported the three spreadsheets into OpenRefine to join data on various match points
Humans verified/corrected data in final spreadsheet
Created/edited Wikidata items using OpenRefine Wikidata extension (most of the works didn’t have items)
Wikidata item for Violin concerto in A minor (RV 356): https://www.wikidata.org/wiki/Q116050394
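A minimal sketch of the join step, for illustration only. The team did this in OpenRefine; the Python below just shows the same idea of joining three exports on match points. The column names (rv, title, lccn, qid), the LCCN placeholder and the second sample row are assumptions made up for the example; only RV 356 and Q116050394 come from the notes.

```python
import csv
import io

# Made-up stand-ins for the three spreadsheets; column names are assumptions.
RV_LIST = "rv,title\nRV 356,Violin concerto in A minor\nRV 999,Placeholder work\n"
LC_EXPORT = "rv,lccn\nRV 356,LCCN-PLACEHOLDER-1\n"
WD_EXPORT = "lccn,qid\nLCCN-PLACEHOLDER-1,Q116050394\n"


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def index_by(rows, key):
    """Map each non-blank value in column `key` to its row."""
    return {(row.get(key) or "").strip(): row
            for row in rows if (row.get(key) or "").strip()}


def join_sheets(rv_rows, lc_rows, wd_rows):
    """Left-join the RV list against LC authority data, then Wikidata results."""
    lc_by_rv = index_by(lc_rows, "rv")
    wd_by_lccn = index_by(wd_rows, "lccn")
    merged = []
    for row in rv_rows:
        rv = row["rv"].strip()
        lccn = (lc_by_rv.get(rv, {}).get("lccn") or "").strip()
        wd = wd_by_lccn.get(lccn, {}) if lccn else {}
        merged.append({
            "rv": rv,
            "title": row["title"],
            "lccn": lccn,
            "qid": wd.get("qid", ""),  # empty: no existing item was matched
        })
    return merged


merged = join_sheets(read_rows(RV_LIST), read_rows(LC_EXPORT), read_rows(WD_EXPORT))
```

Rows that come back with an empty qid are the ones a human would check before creating a new Wikidata item.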
Started working from spreadsheet from Kevin
Screenshots from catalog grabbing some information from Wikidata and supplementing the catalog
Included works have work info buttons next to them that give you knowledge panel with some data–can click through to author/work browse page
How does this work?
Solr index that sits behind the catalog has a field where you can obtain the author title headings associated with an item.
Then query id.loc.gov with LCCN to find Wikidata entity
On Wikidata work items, find the catalog code numbers and which catalog edition each is in
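A hedged sketch of the last step, pulling catalog numbers out of a Wikidata entity once it has been found. It assumes the standard Wikidata entity JSON layout, with catalog code statements under P528 and the catalog given as a P972 qualifier. The sample entity, its values and the catalog identifier are placeholders, not real data, and this is not the prototype's actual code.

```python
# Placeholder entity in the shape of Wikidata's entity JSON; values are made up.
SAMPLE_ENTITY = {
    "claims": {
        "P528": [
            {
                "mainsnak": {"datavalue": {"value": "RV 356", "type": "string"}},
                "qualifiers": {
                    "P972": [{"datavalue": {"value": {"id": "Q_CATALOG_PLACEHOLDER"}}}]
                },
            },
            {
                # A statement without a catalog qualifier.
                "mainsnak": {"datavalue": {"value": "op. 3, no. 6", "type": "string"}},
            },
        ]
    }
}


def catalog_codes(entity):
    """Return (code, catalog id or None) pairs from an entity's P528 statements."""
    pairs = []
    for statement in entity.get("claims", {}).get("P528", []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if value is None:
            continue  # "no value" / "unknown value" snaks carry no datavalue
        catalogs = statement.get("qualifiers", {}).get("P972", [])
        catalog_id = None
        if catalogs:
            catalog_id = catalogs[0].get("datavalue", {}).get("value", {}).get("id")
        pairs.append((value, catalog_id))
    return pairs


codes = catalog_codes(SAMPLE_ENTITY)
```

Returning the catalog alongside each code lets a display layer label every number by its source catalog instead of showing a bare string.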
Participants: 5 total (3 grad students, 2 staff members)
Timeline: December 2022
Think alouds with feedback questions
Virtual
Given tasks to find specific information
Usability testing: Outcomes:
Very much appreciated & well received
Easy to find included work knowledge panels
Wanted identifying info & labels higher up
Usability testing: Wikidata properties
Which would you find interesting?
Catalog numbers (all)
Instrumentation (three)
Librettist (three)
Tonality (two)
Opus (one)
Caveat: not a survey; the counts rely on participants’ memories, but they provide starting points for asking more questions
Lessons & questions: Design
Generating use cases: Would be great to display information, but what about search?
Would require more focus and work: entire design/indexing work cycle on its own
Generic (catalog numbers very helpful) vs distinctive title (incorporating multiple languages would be useful)
Typical design: bio panel and then holdings to the right
Existing author buttons are driven by presence of an authority record, then they can look for equivalencies in external data
Did design brainstorming
Add expandable section for each included work
Inline option?
Landed on work info button
Lessons & questions: Models
Prototype makes it look easy, but is jumping across multiple sources of information.
Library catalog item
LC item
Wikidata item
There’s often not a one-to-one correspondence while jumping to multiple data sources: how to deal with discrepancies
Tag your WEMI levels: Follow the yellow brick road…To Where?
Lessons & questions: Data
Data connections are like yellow brick road
“Selections” in music uniform title is often a catch-all: a bucket that doesn’t always map to LC heading.
Goal: Find points of connections
Somewhere over the rainbow…
BAMWOW into production: models
Everything seen here is in prototype and the next step is to bring it into production
Is there ever a need for a work info button for the main entry or is inline integration preferable?
If in-line is preferred, should we commit to sorting fields into a WEMI-like order?
What will happen when expanding to non-music works (such as Wizard of Oz uniform title example)
BAMWOW into production: Data Quality
Catalogs are built in ways to disallow connections
Wikidata has qualifiers on many statements that we may want to take advantage of
Trying to figure out how to drop questionable statements if there are constant violations
More possibilities and questions:
Adding Identifiers in MARC: Cornell catalog
FOLIO working group for entities
Would thematic catalogs provide additional context, such as musical incipits
Questions
Is it accessing Wikidata dynamically?
Yes, it is accessing Wikidata dynamically. No Wikidata is stored in Cornell’s catalog; when page is rendered, there’s a call to Wikidata and it’s brought into the page
Have you determined the core set of statements for sheet music, in this case, Vivaldi? Every work item should/must have? Have you considered when you have the physical sheet in the collection to add a statement to highlight Cornell library’s collection? Or is this not a focus of the project?
We’re not focusing on published manifestations, but rather the intellectual works themselves
Have you run up against inconsistently used qualifiers or properties in Wikidata? Did that create challenges for querying?
Yes! Some of the inconsistencies we ‘correct’ and others we just leave and add our own data. Often the inconsistencies are misinterpretations of the property constraints
Regarding the catalog numbers, the most inconsistent thing is which prefix is used. Our group is trying to use a language-agnostic form–take the number that is in the book (in most cases)
Is the question of useful information solely based on the question of searching/identifying for the work? I find all of the information useful, though not always necessary for searching
Regarding Wikidata properties that were chosen, we basically had a group discussion in one of our LDWG meetings: “Which Wikidata properties would be cool to add to a MARC-based catalog?” These choices were based on properties we were using in our own project
The choices are not intended to be authoritative on the topic; they are simply what the group came up with
Can you show again how your catalog displayed the Wikidata info retrieved by the query?
How was the info button inserted in your catalog? Is it on the discovery layer only?
It’s client-side code in the prototype, so yes. It checks for information in the Solr index and then matches with the included works list and places it appropriately
Did the question arise of what people want a library catalog to do? Some of these examples suggest to me that a kind of mini-Wikipedia page would be more useful. (And is my question constrained by the relatively primitive nature of current library catalogs?)
A lot of what we focused on is identifying/disambiguation use cases, but there are some properties that lean more toward a broader context for the work
Some of these attributes are also in MARC authority records. Are you just ignoring them if they are there? What if they are there and not in Wikidata?
Prototype just displays everything right now, but we considered similar questions for Discogs. In that case, the catalog actually checks for equivalent fields that are already populated, and only shows supplemental Discogs info when those fields are not populated
There’s also Syndetics info in the Cornell catalog, it’s interesting how much smarter we can be with the Discogs vs. that
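The Discogs behaviour described in this answer can be sketched roughly as follows. The field names and sample records are hypothetical, and this is an illustration of the "only supplement empty fields" rule, not the catalog's actual code.

```python
def supplemental_fields(catalog_record, external_record):
    """Return external values only for fields the catalog record leaves empty."""
    return {
        field: value
        for field, value in external_record.items()
        if value and not catalog_record.get(field)
    }


# Hypothetical records: the catalog already has a label, but no track list.
catalog_record = {"label": "Example Records", "tracks": ""}
external_record = {"label": "Example Recs.", "tracks": "1. Allegro", "notes": ""}

shown = supplemental_fields(catalog_record, external_record)
```

With these inputs only the track list is shown from the external source: the label is already populated locally, and the empty external notes field is skipped.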
Trial on a sizeable scale to determine what is possible
Has about 100,000 items on Wikidata for items about or connected to the collection
50,000 items are about books in the Welsh bibliography
Example of book in bib in library catalog (MARC-21), fields and text strings
Subjects and authors can link to authority files, but no guarantee it’s a unique item
A lot of publishers (for example) are text strings, which need to be mapped to Wikidata “things”
In Wikidata text strings => items (places, publishers, language, etc.) (example: Cardiff)
Three editions in a library catalog are three records, no guarantee they’re presented the same way
In Wikidata there is a central literary work, then editions or translations are separate items, and all editions connect to same data, connectivity and structured data compared to MARC-21 catalog records
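A small sketch of the contrast drawn above: one central work with editions hanging off it, versus three unconnected records. The class and field names are made up for illustration and are not Wikidata's data model; in Wikidata itself the link is made with "edition or translation of" (P629) and its inverse "has edition or translation" (P747).

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Work:
    """The central literary work; shared facts such as the author live here once."""
    label: str
    author: str
    editions: list = field(default_factory=list)


@dataclass(eq=False)
class Edition:
    """One edition or translation, pointing back at its work (compare P629)."""
    label: str
    publisher: str
    language: str
    work: Work


def add_edition(work, label, publisher, language):
    """Create an edition and register it on the work (compare P747)."""
    edition = Edition(label, publisher, language, work)
    work.editions.append(edition)
    return edition


# Placeholder data: three editions that would be three separate catalog records.
work = Work("Example novel", "Example Author")
for year in ("1950", "1975", "2001"):
    add_edition(work, f"Example novel ({year} edition)", "Example Press", "cy")

authors = {edition.work.author for edition in work.editions}
```

Because every edition resolves its author through the same work, the three editions cannot drift apart the way three independently keyed records can.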
Mapping Book Data
Chose easiest data, 50,000 most popular authors
Exported to CSV
Needed to disambiguate authors and publishers
Used OpenRefine to match authors to existing authors in Wikidata; there are multiple ways to do this (names, titles, dates, etc. are all possible matching points)
VIAF etc. also allowed them to create items for authors
Not possible to match for all authors, some items still have “author string”
Then created Wikidata item for Works
They then need to be connected to Editions/Translations/Etc.
Did with a combo of uploading directly and QuickStatements to add additional data
Lot of help from Simon Cobb, visiting Wikidata scholar
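A minimal sketch of generating QuickStatements (V1, tab-separated) commands for one work item, including the fallback to an author string when no author item could be matched. The property and class choices (P31 instance of, Q7725634 literary work, P50 author, P2093 author name string) are assumptions about how such a batch might look, not the project's actual mapping; the title, author name and author QID are placeholders.

```python
def quickstatements_for_work(title, lang, author_qid=None, author_name=None):
    """Build QuickStatements V1 commands that create one work item."""
    # Note: titles containing double quotes would need escaping; not handled here.
    lines = [
        "CREATE",
        f'LAST\tL{lang}\t"{title}"',   # label in the given language
        "LAST\tP31\tQ7725634",         # instance of: literary work (assumed class)
    ]
    if author_qid:
        lines.append(f"LAST\tP50\t{author_qid}")        # matched author item
    elif author_name:
        lines.append(f'LAST\tP2093\t"{author_name}"')   # unmatched: author name string
    return "\n".join(lines)


# Placeholder inputs: one matched author, one unmatched.
matched = quickstatements_for_work("Example title", "cy", author_qid="Q_AUTHOR_PLACEHOLDER")
unmatched = quickstatements_for_work("Example title", "cy", author_name="Example Author")
```

The resulting text can be pasted into QuickStatements as a batch; edition items would be created the same way and then linked to the work.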
Challenges
Finding information on authors and publishers
Looking at national bibliography of Wales (lots of modern books, earlier books don’t have unique identifiers, obscure publishers, etc.)
Finding good information challenging, focused on 100 most common publishers
Potential copyright and license issues with catalog data
Some catalogers nervous about re-using data purchased from OCLC or other sources
Third party data may have copyright issues (gray area)
There may also be licensing issues
Scale of project, Welsh bibliography has 1,000,000+ works
Data maintenance? How do we automate catalog updates => Wikidata
If we want to round trip that data how do we do it, and how do we monitor quality of data added in Wikidata
Benefits
Having richer, more accurate, structured data
Easily accessible and reusable data
People can interact and explore a huge collection
Connecting with and building a larger dataset
Can crowdsource data improvements to the community
Working towards WikiCite and sharing data
Some quite lovely diagrams and visualizations about publishers and publishing can be created
Can begin to explore relationships between authors and items
And authors/items/the world/history of publishing in Wales
Identifiers
Can connect identifiers from different datasets
One of the most powerful/useful things you can do, especially when advocating sharing data
Useful for round-tripping, pulling identifiers back into own catalog
The more institutions that match data to external datasets the more we can share and enrich catalogs
Already seeing data being enriched in this way
People adding identifiers
Wikidata is multi-lingual
Can be very powerful for people working in a country with more than one language
Encourages reuse of data
Potential future projects
Recently created rich metadata for ~1000 manuscripts in a separate project
Added subject and genre, which can allow visualization of manuscripts organized by subject and genre
Shows how you could build very powerful search and discovery tools by linking to entities for particular genres and subjects
A lot of books shared on Wikidata will be digitized and OCR’ed
Once you have OCR data you can use AI to determine things like subject and genre
Can also pull out entities from text (names, places, events, etc.)
National Library of the Netherlands has done this for their newspapers
Text can then be tagged and connected to items and external identifiers
Use of IIIF can allow you to overlay information onto text
Structured data can transform how libraries look at data
Questions
Saving time in MARC to Wikidata workflows
People want a programmatic way to do this, but creating mapping for authors without unique identifiers or works can be difficult. Make the initial cataloging as clean as possible (example: adding ISNI identifiers)
Is modelling the manuscript extensively (slide) labor intensive?
Were able to semi-automatically take out names and match them to people, many already in Wikidata. Fairly labor intensive but there may be ways to automate much of the work to a good degree of accuracy. Would be tricky for a giant collection of books.
Any plans to apply process to materials beyond books?
Always trying new datasets, discussing musical scores. Would love to do sound recordings or video, but you can’t share the actual recordings (likely to have copyright issues) which takes away some benefits of sharing data.
Any advice on theses and subject headings in Wikidata?
University of Edinburgh has shared a thesis collection (Ewan McAndrew)
Adam: Looking at converting ETDs into Wikidata, both proposing new subjects to Library of Congress, and also creating Wikidata items (often there are already items, but creating them when needed). Trying to figure out if LCSH headings can be mapped, especially free-floating subdivisions (use main part, use entire field). If Wikidata URIs can go in MARC fields, that would document the exact Wikidata item.
Library of Congress has done mapping work, some have links some don’t. So many subject headings don’t have items that could be created.
National Library of Wales has volunteers tagging photographs, would be cleaner and easier with identifiers.
When do you switch to Wikibase for things you can’t describe with Wikidata? Is this all scalable? Think about what you’re trying to achieve, are Wikibase or Wikidata the best option?
Advice for mapping books not in English?
Tried to make sure the language was there, and labels were correct (English versus Welsh)
MARC to Wikidata mappings?
Will do a lot of heavy lifting when it’s done, people are working on it. Universal mapping would be very useful, take care of basic stuff.
Any pushback from Wikidata folks?
No pushback; asked in several areas if it would be acceptable to add that much information. No pushback or complaints, but uploads will get bigger and bigger and changes may be needed. One of the reasons this was done was to advocate for structured data generally, and Wikidata can be used to take a sample and show what can be possible in the future at a larger scale. That doesn’t mean everything goes in Wikidata, but it’s a fantastic showcase.
Are some of the visualizations online?
Shared slides in the agenda, a couple may be on Commons
Some charts could be interpreted as music
Did a hackathon where someone turned data into music
Has anyone considered putting preferred terms (over problematic subject headings) in Wikidata?
Not something Jason has had to deal with on Wikidata
Jim: would be interested in collaborating on how to open up these preferred labels. It seems the lists for preferred labels are closed or internally managed currently.
Working on Alison Turnbull:
She studied from 1975-1977 at the Academia Arjona in Madrid, from 1977-1978 at the West Surrey College of Art and Design, and from 1978-1981 at the Bath Academy of Art in Corsham.[1]