Dear colleagues,
As an attempt to keep the discussion on very practical terms, I would like to follow up on Dag’s message below. Setting the specific ontology aside for a moment, can we compile a short list of “kernel” fields that:

- are readily available to users who will be registering PGRFA in GLIS, e.g. genebanks
- can be used to discriminate two different PGRFA
- convey enough information to the querying user or application
- can be marked as mandatory to register a new PGRFA in GLIS

My initial proposal would be, using MCPD terminology:

INSTCODE
ACCENUMB
GENUS
SPECIES
SPAUTHOR
SUBTAXA
SUBTAUTHOR

I would also include CROPNAME but, without a reference to a controlled vocabulary, it cannot be used as a key field, given the wide variations in common crop names we are already experiencing in Easy-SMTA. It can very well be an optional field instead, although it would be great to have it translated…

What about the collection information? Would it be useful to discriminate two different PGRFA coming from the same institute? Isn’t the Accession Number enough?

Do we add SAMPSTAT?

Once we define this “kernel” description, we can extend it as we see fit with optional fields.
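
To make the proposal a bit more concrete, here is a minimal Python sketch of how the kernel fields could be checked at registration time and compared between two records. It is only an illustration under stated assumptions: the function names, the sample values and the normalisation (trimming whitespace and ignoring case) are hypothetical and not part of GLIS or MCPD, and whether every kernel field must be non-empty is still an open question in this discussion.

```python
# Kernel fields as proposed in this message, using MCPD names.
KERNEL_FIELDS = (
    "INSTCODE", "ACCENUMB", "GENUS", "SPECIES",
    "SPAUTHOR", "SUBTAXA", "SUBTAUTHOR",
)


def missing_fields(record, required=KERNEL_FIELDS):
    """Return the required fields that are absent or blank in the record."""
    return [f for f in required if not str(record.get(f) or "").strip()]


def kernel_key(record):
    """Build a comparison key from the kernel fields.

    Assumption: values are compared after trimming whitespace and
    ignoring case. The thread does not define any normalisation rule.
    """
    return tuple(
        str(record.get(f) or "").strip().casefold() for f in KERNEL_FIELDS
    )


# Made-up sample records, for illustration only.
a = {"INSTCODE": "XXX001", "ACCENUMB": "ACC 0001",
     "GENUS": "Hordeum", "SPECIES": "vulgare"}
b = {"INSTCODE": "xxx001", "ACCENUMB": " acc 0001 ",
     "GENUS": "hordeum", "SPECIES": "vulgare"}

print(missing_fields(a))                 # kernel fields still to be filled in
print(kernel_key(a) == kernel_key(b))    # same PGRFA under this key?
```

Passing a shorter tuple as `required` would let us experiment with different choices of mandatory fields without touching the comparison key.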

Best regards

M


On 10 Mar 2015, at 22:02, Dag Endresen wrote:

Dear all,

Francisco has encouraged me to share the mapping table between the MCPD [1] and the Darwin Core standard [2] as input to the discussions for week 3 on metadata. When this mapping was developed [3], we started with the Darwin Core and created an extension to include the descriptors from the MCPD not already covered by terms established in the Darwin Core. The mapping has later developed into a SKOS vocabulary of terms [4]. An overview of this mapping (Germplasm Vocabulary) was also presented at the ECPGR Information and documentation network meeting last year [5] (see slide number 27).
 
MCPD to DWC mapping

MCPD             DWC
-                dwc.datasetID
-                dwc.occurrenceID
INSTCODE         dwc.institutionCode
ACCENUMB         dwc.catalogNumber
COLLNUMB         dwc.recordNumber
COLLCODE         g.collectingInstituteID
COLLNAME         dwc.recordedBy
COLLINSTADDRESS  dwc.recordedBy
COLLMISSID       dwc.collectionCode
GENUS            dwc.genus
SPECIES          dwc.specificEpithet
SPAUTHOR         dwc.scientificNameAuthorship
SUBTAXA          dwc.infraspecificEpithet, dwc.taxonRank
SUBTAUTHOR       dwc.scientificNameAuthorship
CROPNAME         dwc.vernacularName
ACCENAME         g.breedingIdentifier
ACQDATE          g.acquisitionDate
ORIGCTY          dwc.countryCode
COLLSITE         dwc.locality
DECLATITUDE      dwc.decimalLatitude
LATITUDE         dwc.verbatimLatitude
DECLONGITUDE     dwc.decimalLongitude
LONGITUDE        dwc.verbatimLongitude
COORDUNCERT      dwc.coordinateUncertaintyInMeters
COORDDATUM       dwc.geodeticDatum
GEOREFMETH       dwc.georeferenceSources
ELEVATION        dwc.minimumElevationInMeters
COLLDATE         dwc.eventDate
BREDCODE         g.breederInstituteID
BREDNAME         g.breedingInstitute
SAMPSTAT         g.biologicalStatus
ANCEST           g.ancestralData, g.purdyPedigree
COLLSRC          g.acquisitionSource
DONORCODE        g.donorInstituteID
DONORNAME        g.donorInstitute
DONORNUMB        g.donorsIdentifier
OTHERNUMB        dwc.otherCatalogNumbers
DUPLSITE         g.safetyDuplicationInstituteID
DUPLINSTNAME     g.safetyDuplicationInstitute
STORAGE          g.storageCondition
MLSSTAT          g.mlsStatus
REMARKS          dwc.occurrenceRemarks

Extensions

Term                             Description
dwc.relatedResourceID            Allow for definition of any relation type between the current entity and another entity
dwc.relationshipOfResource
dwc.relationshipRemarks
dwc.relationshipAccordingTo
dwc.relationshipEstablishedDate
dc.references                    Allows for additional targets to be associated to the Entity for multiple resolution
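
As a rough illustration of how the mapping above could be applied, the following Python sketch renames the fields of an MCPD record to their Darwin Core or germplasm extension terms. It is a sketch under assumptions, not an existing tool: the function name and the sample record are hypothetical, and only a subset of the one-to-one rows from the table is included. Rows that map one descriptor to several terms (SUBTAXA, ANCEST) or several descriptors to one term (SPAUTHOR and SUBTAUTHOR, COLLNAME and COLLINSTADDRESS) are left out on purpose, because they need merging rules that the table alone does not define.

```python
# Subset of the one-to-one rows from the MCPD to DWC mapping table.
MCPD_TO_DWC = {
    "INSTCODE": "dwc.institutionCode",
    "ACCENUMB": "dwc.catalogNumber",
    "GENUS": "dwc.genus",
    "SPECIES": "dwc.specificEpithet",
    "CROPNAME": "dwc.vernacularName",
    "ORIGCTY": "dwc.countryCode",
    "SAMPSTAT": "g.biologicalStatus",
    "MLSSTAT": "g.mlsStatus",
}


def to_dwc(mcpd_record):
    """Rename mapped MCPD fields to their DwC terms.

    Returns a pair: the translated record, and the list of MCPD field
    names that this sketch does not know how to map.
    """
    translated = {}
    unmapped = []
    for field, value in mcpd_record.items():
        term = MCPD_TO_DWC.get(field)
        if term is None:
            unmapped.append(field)
        else:
            translated[term] = value
    return translated, unmapped


# Made-up sample record, for illustration only.
record = {"INSTCODE": "XXX001", "ACCENUMB": "ACC 0001", "SUBTAXA": "var. x"}
translated, unmapped = to_dwc(record)
print(translated)
print(unmapped)
```

Returning the unmapped field names separately, instead of silently dropping them, makes it visible which descriptors still need a mapping rule.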

 


I would also like to highlight two other recent developments that I find particularly relevant to this week's discussion on metadata. The first activity is championed by Ramona Walls and seeks to develop an ontological anchoring for the Darwin Core terminology [6,7] for specimens such as genebank accessions (PGRFA). The second activity, championed by Steve Baskauf, is the development of an ontology for describing relationships between Darwin Core entities [8] and an RDF guide for Darwin Core [9].

Best regards
Dag Endresen

[1] Alercia A., S. Diulgheroff, and M. Mackay (2012). FAO/Bioversity Multi-crop passport descriptors v.2 [MCPD v.2]. Food and Agriculture Organization of the United Nations (FAO), and Bioversity International, Rome, Italy. 11 pp. Available at [http://www.bioversityinternational.org/index.php?id=19&user_bioversitypublications_pi1%5BshowUid%5D=6901]

[2] Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, Giovanni R, Robertson T, Vieglais D. (2012). Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7:e29715. [http://doi.org/10.1371/journal.pone.0029715]

[3] Endresen DTF and Knüpffer H (2012). The Darwin Core extension for genebanks opens up new opportunities for sharing genebank datasets. Biodiversity Informatics 8:12-29. [http://doi.org/10.17161/bi.v8i1.4095]

[4] Germplasm vocabulary [http://terms.tdwg.org/wiki/Germplasm] [https://code.google.com/p/darwincore-germplasm/]

[5] http://www.slideshare.net/DagEndresen/european-agrobidioversity-ecpgr-network-meeting-on-eurisco-central-crop-databases-and-users-prague-may-2014

[6] Walls R, Deck J, Guralnick R, et al. (2014). Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies. PLoS ONE 9(3):e89606. [http://doi.org/10.1371/journal.pone.0089606]

[7] http://biocodecommons.org/

[8] Baskauf SJ, & Webb CO (2014). Darwin-SW: Darwin Core-based terms for expressing biodiversity data as RDF. Semantic web [#995-2206]. [http://www.semantic-web-journal.net/system/files/swj995.pdf]

[9] Baskauf SJ, Wieczorek J, Deck J, Webb CO (2014). An RDF guide for the Darwin Core standard. Semantic web [#636-1846] [http://www.semantic-web-journal.net/system/files/swj635.pdf]


-----------------------------
From: Lopez, Francisco (AGDT)
Sent: 10 March 2015 16:56
To: Dag Endresen
Subject: Week 3 - MCPD to DWC mapping - the DwC germplasm extension
 
Dear Dag,
I think it is very relevant for the week 3 discussions we started today that we share this mapping table (the DwC germplasm extension) with all the participants. As it is very much based on your work, I think it is more appropriate for you to circulate it, unless you prefer that we do it.

 

Thank you.
Regards,
Francisco



To unsubscribe from the GLIS-PGRFA-L list, click the following link:
https://listserv.fao.org/cgi-bin/wa?SUBED1=GLIS-PGRFA-L&A=1