Create default metadata profile #896
Comments
We're using the sandbox described in https://groups.google.com/forum/#!topic/islandora/Me8J0aXhjjw to build something up. I had created a bunch of vocabularies, then exported the config, before a vagrant destroy. For three vocabularies, I imported the following files:
And that doesn't even attach them to the repository object... or fill the vocabs with values. I have some CSVs of values to go into these, scraped or cherry-picked from existing vocabularies. Not sure what to do with them...
I believe I have added the small number of field names in Drupal and mapped them to the correct Type in that sandbox. @rosiel had already added the few that are controlled/reconciled, and I didn't touch those. If we are able to give you a list of LCSH (with or without their URIs), or some other list that would need to be included in one of the taxonomies above, can someone spend some time tomorrow morning showing us how to import a CSV to those? Also, how do we import a CSV of metadata objects into CLAW, and what format should the CSV be in?
Oh! I didn't see the sandbox there. Okay. So, to import a CSV of taxonomy terms we need to define a migration config. We can talk about it during tomorrow's "XML2CSV to Islandora CLAW" call. If you want to send me a CSV I can prep for it.
Thanks @rtilla1! And @seth-shaw-unlv, I didn't mean for this to override your modelling, I just wanted to explore the "Vocabulary Encoding Schemes" in the DCMI document [1]. It seems to be how we're supposed to model mimetypes: as classes, not strings. I based it off of the third image in this document [2], which has a custom-minted "value URI" for a string "Biology"@en which comes from a (custom) vocabulary. In our case, the dc:format's value URI will be the custom-minted entity-thing representing a mimetype (e.g. application/pdf), but that entity-thing will have an rdf:value "application/pdf" and a dcam:memberOf http://purl.org/dc/terms/IMT. Anyway, I saw that the other values of "Vocabulary Encoding Schemes" included DCMI Type, so I thought I'd map that out to see if we like it... The "MARC Resource Types Scheme" vocabulary seems like a better fit with less data loss, so we might use it instead. But unlike mimetypes, the MARC Resource Types Scheme has URIs for the values, so we might be able to model those more straightforwardly. The thing with mimetypes is that there are wayyyyyy too many for a dropdown list, so I wrote in the description of the vocab "This is a subset add if needed", which parallels what we're doing with LCSH/subjects. Hence I made a trial vocab for LCSH subjects. Do we want to model parent/child relations of subject terms? It's built into the taxonomy hierarchy in Drupal. The problem is: LCSH terms often have multiple parents, so the analogy breaks down. [1] http://dublincore.org/documents/dcmi-terms/#section-4
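To make the mimetype modelling above concrete, here is a minimal Python sketch of the three triples it implies, written out as N-Triples. The object and term URIs under example.org are hypothetical placeholders, and reading the "dc:" prefix as http://purl.org/dc/terms/ is an assumption; this is not output from the sandbox.

```python
# Namespaces. Expanding "dc:" to DCMI Terms is an assumption made for this sketch.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
DCAM = "http://purl.org/dc/dcam/"


def mimetype_triples(object_uri, term_uri, mimetype):
    """Return N-Triples lines: object -> minted term -> literal value and scheme."""
    return [
        # The repository object points at the minted term, not at a bare string.
        f"<{object_uri}> <{DCTERMS}format> <{term_uri}> .",
        # The minted term carries the mimetype string as its rdf:value.
        f'<{term_uri}> <{RDF}value> "{mimetype}" .',
        # The minted term is declared a member of the IMT encoding scheme.
        f"<{term_uri}> <{DCAM}memberOf> <{DCTERMS}IMT> .",
    ]


# Hypothetical URIs, for illustration only.
triples = mimetype_triples(
    "http://example.org/node/1",
    "http://example.org/taxonomy/term/42",
    "application/pdf",
)
print("\n".join(triples))
```

A vocabulary whose values already have URIs (such as the MARC Resource Types Scheme mentioned above) would presumably only need the first triple, pointing straight at the published URI.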
@rtilla1 It doesn't really matter what structure the CSV has, since you can update the migration to match (also, we haven't defined one for the new demo content model, so you tell us! 😁 ). For terms, all you really need is the LCSH term. You don't even really need a CSV. For example, I have a proof-of-concept module that defines a new content model and configures a migration based on some MADS RDF/XML records and a CSV of object records. One of those migrations looks for all the Topic records and creates "subject" nodes. The image metadata objects are migrated from a CSV. The columns of the CSV are defined as part of the source section and their mappings as part of the process section. This example uses the term value to look up the subject node, but it could be changed to use the URI fairly easily if that is what you have. I may come back and plop some of the code bits directly into this comment later.
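The source/process idea described above can be sketched in plain Python. This is only an illustration of the mapping logic, not the Drupal Migrate API and not the code from the proof-of-concept module; the CSV columns, field names, node IDs, and example.org URIs are all hypothetical.

```python
import csv
import io

# Hypothetical CSV of object records; the column names are illustrative only.
OBJECTS_CSV = """id,title,subject
1,Desert sunset,Deserts
2,Hoover Dam,Dams
"""

# Stand-in for the "subject" nodes created by an earlier term migration.
SUBJECTS = [
    {"nid": 10, "label": "Deserts", "uri": "http://example.org/subjects/deserts"},
    {"nid": 11, "label": "Dams", "uri": "http://example.org/subjects/dams"},
]

# "process"-style mapping: destination field -> source column.
FIELD_MAP = {"title": "title", "field_subject": "subject"}


def build_lookup(subjects, key="label"):
    """Index subject nodes by term value (default) or by URI (key="uri")."""
    return {subject[key]: subject["nid"] for subject in subjects}


def migrate_rows(csv_text, field_map, lookup):
    """Map each CSV row to destination fields and resolve the subject reference."""
    rows = []
    for source in csv.DictReader(io.StringIO(csv_text)):
        dest = {field: source[column] for field, column in field_map.items()}
        # Swap the subject string for the matching node ID; None if nothing matches.
        dest["field_subject"] = lookup.get(dest["field_subject"])
        rows.append(dest)
    return rows


nodes = migrate_rows(OBJECTS_CSV, FIELD_MAP, build_lookup(SUBJECTS))
print(nodes)
```

If the CSV carried URIs instead of labels, the only change in this sketch would be building the lookup with `build_lookup(SUBJECTS, key="uri")`, which mirrors the point that switching from term value to URI is a small adjustment.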
@rosiel and @rtilla1, on the Sandbox we have two taxonomies for types (MARC types and DCMI types) with separate repository object fields for each. How would you feel about the following changes?
Based on the migration sprint wrap-up call, it looks like "yes" to the first two and "maybe" for the last one. We will have to try it out. Also, @whikloj suggested using a view for the auto-complete to support disambiguation. We'll see how it goes.
In case anyone is interested, the current WIP branch I'm using is at https://github.com/seth-shaw-unlv/islandora_demo/tree/issue-896. I think it is almost ready for a PR, although it requires PR Islandora/controlled_access_terms#8.
@seth-shaw-unlv Can you please push this as a PR? I can test it.
I think attaching fields directly to the Repository Item object as a default would work. The PR is close. We do have to consider the larger Application Profile architecture at some point. Not sure if this is the ticket to discuss that. In 7.x, we had customizable forms per content model, and a user could select from multiple profiles as well. In 8.x, how can we handle that? Would it be cloning a content type (say Repository Item - MODS, Repository Item - DC)? Or developing the metadata profile into its own bundle or content type, then using an inline entity form to insert it? Here, RDF mapping can get tricky.
@Natkeeran "cloning a content type (Say Repository Item - MODS, Repository Item - DC)" is what we were planning to do here (although probably dropping the "Repository Item - " prefix).
Any objections to closing this now that https://github.com/Islandora-CLAW/islandora_demo/pull/2 is merged?
@seth-shaw-unlv Feel free to close this. If we have improvements to make to the default metadata profile, we'll open separate issues.
See https://docs.google.com/spreadsheets/d/18u2qFJ014IIxlVpM3JXfDEFccwBZcoFsjbBGpvL0jJI/edit#gid=0
The 'Repository Item' content type is pretty bare bones. Let's add to it using the spreadsheet above as guidance. This means adding fields for what's listed (either as strings, entities, or taxonomy term references) and mapping them to RDF as best you can (I'm sure there'll be discussions; we can talk it out using this issue).
After configuring the metadata profile for the 'Repository Item' content type, export its field and field storage definitions into the islandora_demo_feature feature. You'll also want to re-export the RDF mapping with any changes you've made.