Dear Colleagues,

 

Some of the terminology used in this discussion, such as “ontology”, has a very different meaning for someone with a biological background than it does in the database context. It is very interesting and educational for me to follow the conversations, but I may not understand all the issues. I also do not understand some of the abbreviations used (and I am too lazy to look them up).

 

I have the impression that we are now converging on genebank accessions identified by DOIs, and that makes very good sense to me, as the Treaty and in particular the Multilateral System are based on them.

 

It seems to me that those who work with more static collections (herbarium specimens, museum entries, DNA strings, etc.) are more advanced regarding databases than the genebank people, who work with these volatile living samples. Perhaps the databases as such are more compatible with dead material. Hence, the Darwin Core and the other systems quoted are much more rigid, structured, logically and hierarchically elaborated than what we are used to in genebanks.

 

The idea of starting the GLIS from the well-established Multi-Crop Passport Descriptors from FAO makes very good sense, as all genebank people in the world will have been exposed to this by now traditional concept. Picking some core descriptors, as suggested by Marco and improved by others in this discussion, is in my opinion very good. It will also keep the links alive to the World Information and Early Warning System (WIEWS) on Plant Genetic Resources for Food and Agriculture (PGRFA) (http://apps3.fao.org/wiews/wiews.jsp), which is for me, besides EURISCO and GRIN, the most useful database when looking for global germplasm sources. We will definitely need each dataset (DOI entry) to allow for more than one alternate identifier.

 

Many regards,

 

Axel

 

Axel Diederichsen
Biodiversity and Collections - Biodiversité et collections

Telephone | Téléphone ++1-306-385-9465

 

 

 

From: Global Information System on PGRFA [mailto:[log in to unmask]] On Behalf Of Lopez, Francisco (AGDT)
Sent: March-12-15 4:12 AM
To: [log in to unmask]
Subject: Week 3 - MCPD to DWC mapping - the DwC germplasm extension

 

Dear Marco,

I would like to add MLSSTAT to this initial set of fields to be associated with the global identifier, so that we know whether or not the material is in the Multilateral System.

 

I also think that DONORCODE and DONORNUMB, as well as OTHERNUMB (e.g. the collecting number in GRIN-Global), could be very useful for finding potential duplicates and for helping the Global System track relationships when material is transferred.
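
To illustrate how the donor fields could surface potential duplicates, here is a minimal Python sketch. It is not part of GLIS or GRIN-Global; the function names, the normalisation rule (trim and upper-case) and the institute codes in the example are all assumptions made for illustration. OTHERNUMB is left out because its free-text format would need parsing rules that have not been agreed.

```python
from collections import defaultdict


def _key(record, code_field, numb_field):
    """Build a normalised (institute code, number) pair, or None if a part is missing."""
    code = str(record.get(code_field) or "").strip().upper()
    numb = str(record.get(numb_field) or "").strip().upper()
    return (code, numb) if code and numb else None


def potential_duplicates(records):
    """Group records that point at the same source accession (candidates only)."""
    groups = defaultdict(list)
    for rec in records:
        own = _key(rec, "INSTCODE", "ACCENUMB")
        donor = _key(rec, "DONORCODE", "DONORNUMB")
        if own:
            groups[own].append(rec)
        # Avoid counting a record twice if it names itself as donor.
        if donor and donor != own:
            groups[donor].append(rec)
    # Only groups with more than one member are interesting.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical records; the institute codes are made up.
records = [
    {"INSTCODE": "AAA001", "ACCENUMB": "A 1"},
    {"INSTCODE": "BBB002", "ACCENUMB": "B 7", "DONORCODE": "AAA001", "DONORNUMB": "A 1"},
    {"INSTCODE": "CCC003", "ACCENUMB": "C 9", "DONORCODE": "AAA001", "DONORNUMB": "a 1"},
]
dups = potential_duplicates(records)
for (inst, numb), members in dups.items():
    print(inst, numb, [m["ACCENUMB"] for m in members])
```

The groups are only candidates: material received from the same donor accession is not necessarily genetically identical, so a curator would still need to confirm each case.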

 

Following on the discussions we had in the previous weeks, I also support the inclusion of SAMPSTAT in the list.

 

Thank you.

Regards,

Francisco

 

From: Global Information System on PGRFA [mailto:[log in to unmask]] On Behalf Of Marsella Marco
Sent: 12 March 2015 10:13
To: [log in to unmask]
Subject: Re: Week 3 - MCPD to DWC mapping - the DwC germplasm extension

 

Dear colleagues,

As an attempt to keep the discussion on very practical terms, I would like to follow up on Dag’s message below. Setting aside the specific ontology for a moment, can we compile a short list of “kernel” fields that:

 

- are readily available to users who will be registering PGRFA in GLIS, e.g. genebanks

- can be used to discriminate two different PGRFA

- convey enough information to the querying user or application

- can be marked as mandatory to register a new PGRFA in GLIS

 

My initial proposal would be, using MCPD terminology:

 

INSTCODE

ACCENUMB

GENUS

SPECIES

SPAUTHOR

SUBTAXA

SUBTAUTHOR

 

I would also include CROPNAME but, without a reference to a controlled vocabulary, it cannot be used as a key field given the wide variation in common crop names we are already experiencing in Easy-SMTA. It could very well be an optional field instead, although it would be great to have it translated…

 

What about the collection information? Would it be useful to discriminate two different PGRFA coming from the same institute? Isn’t the Accession Number enough?

 

Do we add SAMPSTAT?

 

Once we define this “kernel” description, we can extend it as we see fit with optional fields.
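
To make the “kernel” idea concrete, here is a minimal Python sketch of how a registration could be checked and how two PGRFA could be told apart. Nothing here is an actual GLIS interface: the function names are hypothetical, the choice of which kernel fields are required is an assumption (only INSTCODE, ACCENUMB and GENUS are treated as mandatory), and the subtaxon authority is spelled SUBTAUTHOR as in MCPD v.2.

```python
# Kernel fields from the proposal, using MCPD v.2 field names.
KERNEL_FIELDS = (
    "INSTCODE", "ACCENUMB", "GENUS", "SPECIES",
    "SPAUTHOR", "SUBTAXA", "SUBTAUTHOR",
)
# Assumption for illustration only: which kernel fields must be filled in.
REQUIRED_FIELDS = ("INSTCODE", "ACCENUMB", "GENUS")


def missing_required(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in required if not str(record.get(f) or "").strip()]


def kernel_key(record):
    """Normalised tuple of kernel values used to compare two records."""
    return tuple(str(record.get(f) or "").strip().casefold() for f in KERNEL_FIELDS)


def same_pgrfa(a, b):
    """True if two records agree on every kernel field."""
    return kernel_key(a) == kernel_key(b)


# Hypothetical records; the institute code is made up.
a = {"INSTCODE": "XXX001", "ACCENUMB": "PI 1", "GENUS": "Triticum", "SPECIES": "aestivum"}
b = dict(a, ACCENUMB="PI 2")

print(missing_required({"INSTCODE": "XXX001"}))  # fields still to be supplied
print(same_pgrfa(a, b))                          # differ in ACCENUMB
```

Optional fields such as CROPNAME or SAMPSTAT stay outside the key, so adding them later would not change which records are considered distinct.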

 

Best regards

 

M

 

 

On 10 Mar 2015, at 22:02, Dag Endresen <[log in to unmask]> wrote:

 

Dear all,

Francisco has encouraged me to share the mapping table between the MCPD [1] and the Darwin Core standard [2] as input to the discussions for week 3 on metadata. When this mapping was developed [3], we started with the Darwin Core and created an extension to include the descriptors from the MCPD not already covered by terms established in the Darwin Core. The mapping has since developed into a SKOS vocabulary of terms [4]. An overview of this mapping (Germplasm Vocabulary) was also presented at the ECPGR Information and Documentation Network meeting last year [5] (see slide number 27).

 

MCPD to DWC mapping

MCPD             DWC
---------------  ---------------------------------------
-                dwc.datasetID
-                dwc.occurrenceID
INSTCODE         dwc.institutionCode
ACCENUMB         dwc.catalogNumber
COLLNUMB         dwc.recordNumber
COLLCODE         g.collectingInstituteID
COLLNAME         dwc.recordedBy
COLLINSTADDRESS  dwc.recordedBy
COLLMISSID       dwc.collectionCode
GENUS            dwc.genus
SPECIES          dwc.specificEpithet
SPAUTHOR         dwc.scientificNameAuthorship
SUBTAXA          dwc.infraspecificEpithet, dwc.taxonRank
SUBTAUTHOR       dwc.scientificNameAuthorship
CROPNAME         dwc.vernacularName
ACCENAME         g.breedingIdentifier
ACQDATE          g.acquisitionDate
ORIGCTY          dwc.countryCode
COLLSITE         dwc.locality
DECLATITUDE      dwc.decimalLatitude
LATITUDE         dwc.verbatimLatitude
DECLONGITUDE     dwc.decimalLongitude
LONGITUDE        dwc.verbatimLongitude
COORDUNCERT      dwc.coordinateUncertaintyInMeters
COORDDATUM       dwc.geodeticDatum
GEOREFMETH       dwc.georeferenceSources
ELEVATION        dwc.minimumElevationInMeters
COLLDATE         dwc.eventDate
BREDCODE         g.breederInstituteID
BREDNAME         g.breedingInstitute
SAMPSTAT         g.biologicalStatus
ANCEST           g.ancestralData, g.purdyPedigree
COLLSRC          g.acquisitionSource
DONORCODE        g.donorInstituteID
DONORNAME        g.donorInstitute
DONORNUMB        g.donorsIdentifier
OTHERNUMB        dwc.otherCatalogNumbers
DUPLSITE         g.safetyDuplicationInstituteID
DUPLINSTNAME     g.safetyDuplicationInstitute
STORAGE          g.storageCondition
MLSSTAT          g.mlsStatus
REMARKS          dwc.occurrenceRemarks

Extensions

Term                             Description
-------------------------------  ---------------------------------------
dwc.relatedResourceID            Allows for the definition of any relation type between the current entity and another entity
dwc.relationshipOfResource
dwc.relationshipRemarks
dwc.relationshipAccordingTo
dwc.relationshipEstablishedDate
dc.references                    Allows for additional targets to be associated with the entity for multiple resolution
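
As an illustration of how the MCPD to DWC mapping table above could be applied, here is a minimal Python sketch that renames the fields of one MCPD record. It is not taken from the Darwin Core germplasm extension tooling; the function name and the example record are hypothetical, and only a subset of the table is included. Descriptors that need more than a rename are deliberately left out: SUBTAXA maps to two terms, and SPAUTHOR and SUBTAUTHOR (like COLLNAME and COLLINSTADDRESS) share one target term.

```python
# Subset of the mapping table, using the same "dwc." / "g." prefixes.
MCPD_TO_DWC = {
    "INSTCODE": "dwc.institutionCode",
    "ACCENUMB": "dwc.catalogNumber",
    "COLLNUMB": "dwc.recordNumber",
    "GENUS": "dwc.genus",
    "SPECIES": "dwc.specificEpithet",
    "SPAUTHOR": "dwc.scientificNameAuthorship",
    "CROPNAME": "dwc.vernacularName",
    "ORIGCTY": "dwc.countryCode",
    "SAMPSTAT": "g.biologicalStatus",
    "DONORCODE": "g.donorInstituteID",
    "DONORNUMB": "g.donorsIdentifier",
    "OTHERNUMB": "dwc.otherCatalogNumbers",
    "MLSSTAT": "g.mlsStatus",
    "REMARKS": "dwc.occurrenceRemarks",
}


def mcpd_to_dwc(record):
    """Rename MCPD fields to their mapped terms.

    Returns (mapped, unmapped): empty values are dropped, and fields
    without an entry in MCPD_TO_DWC are reported rather than lost.
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        term = MCPD_TO_DWC.get(field)
        if term is None:
            unmapped[field] = value
        elif value not in (None, ""):
            mapped[term] = value
    return mapped, unmapped


# Hypothetical record; the institute code is made up.
record = {
    "INSTCODE": "XXX001",
    "ACCENUMB": "PI 1",
    "GENUS": "Triticum",
    "SPECIES": "aestivum",
    "MLSSTAT": "1",
    "CROPNAME": "",
    "STORAGE": "13",
}
mapped, unmapped = mcpd_to_dwc(record)
print(mapped)
print(unmapped)
```

Returning the unmapped fields separately keeps the sketch honest about what the subset does not cover; here STORAGE is reported because its row was not copied into the dictionary.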

 

 

I would also like to highlight two other recent developments that I find particularly relevant to this week's discussion on metadata. The first activity, championed by Ramona Walls, seeks to develop an ontological anchoring for the Darwin Core terminology [6,7] for specimens such as genebank accessions (PGRFA). The second, championed by Steve Baskauf, is the development of an ontology for describing relationships between Darwin Core entities [8] and an RDF guide for Darwin Core [9].

 

Best regards

Dag Endresen

[1] Alercia A., S. Diulgheroff, and M. Mackay (2012). FAO/Bioversity Multi-crop passport descriptors v.2 [MCPD v.2]. Food and Agriculture Organization of the United Nations (FAO), and Bioversity International, Rome, Italy. 11 pp. Available at [http://www.bioversityinternational.org/index.php?id=19&user_bioversitypublications_pi1%5BshowUid%5D=6901]

[2] Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, Giovanni R, Robertson T, Vieglais D. (2012). Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7:e29715. [http://doi.org/10.1371/journal.pone.0029715]

[3] Endresen DTF and Knüpffer H (2012). The Darwin Core extension for genebanks opens up new opportunities for sharing genebank datasets. Biodiversity Informatics 8:12-29. [http://doi.org/10.17161/bi.v8i1.4095]

[4] Germplasm vocabulary [http://terms.tdwg.org/wiki/Germplasm] [https://code.google.com/p/darwincore-germplasm/]

[5] http://www.slideshare.net/DagEndresen/european-agrobidioversity-ecpgr-network-meeting-on-eurisco-central-crop-databases-and-users-prague-may-2014

[6] Walls R, Deck J, Guralnick R, et al. (2014). Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies. PLoS ONE 9(3):e89606. [http://doi.org/10.1371/journal.pone.0089606]

[7] http://biocodecommons.org/

[8] Baskauf SJ, & Webb CO (2014). Darwin-SW: Darwin Core-based terms for expressing biodiversity data as RDF. Semantic web [#995-2206]. [http://www.semantic-web-journal.net/system/files/swj995.pdf]

[9] Baskauf SJ, Wieczorek J, Deck J, Webb CO (2014). An RDF guide for the Darwin Core standard. Semantic web [#636-1846] [http://www.semantic-web-journal.net/system/files/swj635.pdf]

 

 

-----------------------------

From: Lopez, Francisco (AGDT) <[log in to unmask]>
Sent: 10 March 2015 16:56
To: Dag Endresen
Subject: Week 3 - MCPD to DWC mapping - the DwC germplasm extension

 

Dear Dag,

I think it is very relevant for the week 3 discussions we started today that we share this mapping table (the DwC germplasm extension) with all the participants. As it is very much based on your work, I think it is more appropriate for you to circulate it, unless you prefer that we do it.

 

Thank you.

Regards,

Francisco

 

 


To unsubscribe from the GLIS-PGRFA-L list, click the following link:
https://listserv.fao.org/cgi-bin/wa?SUBED1=GLIS-PGRFA-L&A=1

 

 

