# Outline
### Controlled Vocabularies
1. Need Consistent Terminology
    1. ConceptScheme (Collection)
    2. Controlled Vocabulary
    3. Value Domain/Enumerations/Value Lists/Code Lists/Pick Lists/Concept Groups
2. Resources:
    1. [ANSI/NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies](https://www.niso.org/publications/ansiniso-z3919-2005-r2010)
    2. Research Vocabularies Australia: https://vocabs.ardc.edu.au/
    3. GBIF Vocabulary Server: https://registry.gbif.org/vocabulary/search
    4. CGI Vocabulary Registry: https://geosciml.org/resource/def/voc/
    5. GSQ's Vocabulary System: https://vocabs.gsq.digital/v
3. Metadata - [obsidian://open?vault=MainVault&file=Data%2FSchemes%2FValue%20Domains](obsidian://open?vault=MainVault&file=Data%2FSchemes%2FValue%20Domains)
4. Examples:
    1. Proportion - https://cgi.vocabs.ga.gov.au/object?uri=http://resource.geosciml.org/classifier/cgi/proportionterm
    2. Country Codes -
    3. Usage Example: MinextForms
5. Revisions to the TDWG Ratification Process for Value Domains
    1. Simplify the process while maintaining its level of rigor: drop the time-intensive activities and lower the bar without losing quality
    2. Community-focused process
    3. Metadata Scheme
    4. Example of a vocabulary that has gone through the ratification process and is presented using the current TDWG documentation process: Boolean
    5. How can we host vocabularies in a structured and consistent fashion?
    6. How do we present and distribute vocabularies to the community?
        1. See Proportion RDF, CSV, and MinExt Implementation
6. DWC DP Assertions
    1. References/Resources
        1. Tables: https://github.com/ben-norton/dwc-dp-transformation-utils/blob/main/output/dwc-dp/dwc-dp-tables.csv
        2. Columns: https://github.com/ben-norton/dwc-dp-transformation-utils/blob/main/output/dwc-dp/dwc-dp-columns.csv
        3. MaterialAssertion Tableschema: https://github.com/gbif/rs.gbif.org/blob/master/sandbox/experimental/data-packages/dwc-dp/0.1/table-schemas/material-assertion.json

## Definitions

| Term | Definition | Source |
| --- | --- | --- |
| Codelist | Value domain including a code for each permissible value | ISO 19136-1:2020 |
| Codelist | A type of controlled vocabulary that is comprised of a finite list of standard terms and meanings that represent the allowable values for a data item | NCI |
| Controlled Vocabulary | A collection of selected words and phrases related to a particular domain of knowledge used to permit consistency of metadata annotation and improved retrieval following a search, in which homonyms, synonyms, and similar ambiguities of meaning present in natural language are disambiguated. | FABIO |
| Controlled Vocabulary | An information content entity that is a collection of other information content entities that have been created to identify or annotate things in a specified domain, and where the intention of its creators is that the collection has a one-to-one correspondence with those things. | Information Artifact Ontology |
| Controlled Vocabulary | Finite set of values that represent the only allowed values for a data item | [ISO 11179](#iso_11179) |
| Controlled Vocabulary | Prescribed set of values that represent the only allowed values for a data item (term) | Adopted from ISO 21090 and ISO 25964-1:2011 |
| Controlled Vocabulary | Supplemental vocabulary used to uniquely define potentially ambiguous words or Business Terms | ISO 15000-5:2014 |
| Controlled Vocabulary | Vocabulary for which the entries, i.e., definition/term pairs, are controlled by a Source Authority based on a rule base and process for addition/deletion of entries | ISO/IEC 5394:2024(en) Information technology — Criteria for concept systems |
| Value Domain | Set of permissible values | [ISO 11179](#iso_11179) |
| Value Domain | Specified by a description or specification, such as a rule, a procedure, a range (i.e. interval), or by a set of permissible values | |
| Vocabulary | A collection of "terms" for a particular purpose. | https://www.w3.org/TR/ld-glossary/#ontology |
| Vocabulary | A set of words, either constituting a language or more specifically used to describe a particular domain of knowledge. | FABIO |
| Vocabulary | Terminological dictionary which contains designations and definitions for one or more specific subject fields | ISO 1087:2019(en) Terminology work and terminology science — Vocabulary |
| Pick List | A graphical user interface device that allows the user to select from a pre-set list of terms. Typically the list of terms is shown when the user clicks on a down arrow next to the entry box for the term. | ANSI/NISO Z39.19-2005 |

#### ISO Definitions

[ISO/IEC 11179-3 Section 11.3.2.5](http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html)

**Value_Domain** is a class each instance of which models a value domain (3.2.140), a set of permissible values (3.2.96). A value domain provides representation, but has no implication as to the [Data Element Concept](https://dss.aristotlecloud.io/help/concepts/aristotle_mdr/dataelementconcept) (3.2.29) with which the values are associated, nor what the values mean.

A Value_Domain is an abstract class which is used to denote a collection of Permissible_Values associated with a Conceptual_Domain (11.3.2.1). A Value_Domain has two possible subclasses: an Enumerated_Value_Domain (11.3.2.6) and a Described_Value_Domain (11.3.2.8). A Value_Domain must be either one or both an Enumerated_Value_Domain or a Described_Value_Domain.

**Example:** 'ISO 3166 Codes for the representation of names of countries' describes seven distinct Value_Domains for the single Conceptual_Domain 'names of countries'. The seven Value_Domains are: 'short name in English', 'official name in English', 'short name in French', 'official name in French', 'alpha-2 code', 'alpha-3 code', and 'numeric code'.

[ISO/IEC 11179-3 Section 11.3.2.6](http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html)

**Enumerated_Value_Domain** is a class each instance of which models an enumerated value domain (3.2.61), a value domain (3.2.140) that is specified by a list of all its permissible values (3.2.96). The Enumerated_Value_Domain class is a concrete subclass of the abstract class Value_Domain.

[ISO/IEC 11179-3 Section 11.3.2.8](http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html)

**Described_Value_Domain** is a class each instance of which models a described value domain (3.2.49), a value domain (3.2.140) that is specified by a description or specification, such as a rule, a procedure, or a range (i.e. interval), rather than as an explicit set of permissible values (3.2.96). It is a concrete subclass of the abstract class Value_Domain. As a subclass of Value_Domain, a Described_Value_Domain inherits the attributes and relationships of the former.

## Diagrams

```mermaid
erDiagram
    A["Value Domain"]
    B["Data Value"]
    C["Term"]
    A ||--|{ B : ""
    B ||--|| C : ""
```

```mermaid
classDiagram
    class A["Value Domain"]{
        dcterms:title
        dcterms:description
        valueDomainIdentifier
        dcterms:publisher
        dcat:version
        owl:versionInfo
        dcterms:language
        skos:changeNote
        skos:scopeNote
        skos:historyNote
        skos:editorialNote
        vann:usageNote
        isDefinedBy
    }
    class B["Data Value"]{
        skos:prefLabel
        skos:altLabel
        skos:hiddenLabel
        skos:definition
    }
    class C["Term"]{
        tdwg:termURI
        skos:prefLabel
        skos:altLabel
        skos:definition
        skos:example
        vann:usageNote
    }
    A -- B : "Consists Of"
    C --> A : "Populated By"
```

A Value Domain consists of a set of Data Values.
The values of a Term are limited to members of its Value Domain.

### Mineralogy Extension Forms - Value Domains Model

![[TDWG/TAG/2025-06-10 Meeting (attachments)/value_domains_model.png|value_domains_model]]

### Best Practices (Work in Progress)

#### Semantic Rules

| Rule | Scope | Source | Remarks |
| --- | --- | --- | --- |
| If the same term is commonly used to mean different concepts, then its name is explicitly qualified to resolve this ambiguity. | Requirement for controlled vocabularies | ANSI/NISO Z39.19-2005 | NOTE: This rule does not apply to synonym rings. |
| If multiple terms are used to mean the same thing, one of the terms is identified as the preferred term in the controlled vocabulary and the other terms are listed as synonyms or aliases. | Requirement for controlled vocabularies | ANSI/NISO Z39.19-2005 | |

#### Best Practices List

1. Every value must be defined.
2. Translations must be supported.
3. Definitions must be mutually exclusive: each value has a single, unambiguous, and meaningful definition.
4. There must be minimal semantic overlap between value definitions.
5. All values must be valid for the parent term.
    1. For example: North Carolina or Other cannot belong to a controlled vocabulary for the term dwc:country.
6. As a set, the values in a controlled vocabulary must cover 100% of use cases without the use of "other".
7. All values must be semantically independent, with no dependency on other values to give a value its meaning.
8. Ideally, controlled vocabularies are actively managed by an authoritative body.
9. Controlled vocabularies adhere to a formal versioning process and system.
10. All values belonging to a controlled vocabulary should be at a roughly equivalent level of granularity.
11. Each value must be identified by an IRI/UUID.
12. Required fields: prefLabel, hiddenLabel, definition, identifier

#### Recommendations

1. Controlled vocabularies utilize a date-based version system at the vocabulary level.
2. Controlled vocabularies are published as SKOS resources.

##### Notes

1. Quality assessments of controlled vocabularies can be divided into two types:
    1. Structural
    2. Contextual
2. Structural assessments are strictly an assessment of the structure of a controlled vocabulary. They can be automated, which allows programmatic validation against a set of rules that control the structure of vocabularies. Structural assessments do not evaluate the content of a controlled vocabulary.
3. Contextual assessments evaluate the content of a vocabulary. This often requires domain expertise and cannot be readily automated.
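
Note 2 says structural assessments can be automated. The following is a minimal sketch of what such a check might look like, not an agreed TDWG validator. It assumes each value is a plain dictionary keyed by the required fields from item 12 of the Best Practices List; the function names, the "IRI-like" test, and the duplicate-label rule are illustrative assumptions.

```python
import uuid
from urllib.parse import urlparse

# Required fields taken from item 12 of the Best Practices List.
REQUIRED_FIELDS = ("prefLabel", "hiddenLabel", "definition", "identifier")


def is_iri_or_uuid(identifier: str) -> bool:
    """Return True if the identifier parses as a UUID or looks like an IRI."""
    try:
        uuid.UUID(identifier)
        return True
    except ValueError:
        pass
    parsed = urlparse(identifier)
    # Assumption: a scheme plus a host or path is "IRI-like" enough for a sketch.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def structural_issues(values: list[dict]) -> list[str]:
    """Collect structural problems without judging the content of definitions."""
    issues = []
    seen_labels = set()
    for index, value in enumerate(values):
        # Every required field must be present and non-empty.
        for field in REQUIRED_FIELDS:
            if not str(value.get(field) or "").strip():
                issues.append(f"value {index}: missing {field}")
        # Each value must be identified by an IRI or UUID.
        identifier = str(value.get("identifier") or "").strip()
        if identifier and not is_iri_or_uuid(identifier):
            issues.append(f"value {index}: identifier is not an IRI or UUID")
        # Assumed rule: preferred labels are unique, ignoring case.
        label = str(value.get("prefLabel") or "").strip().lower()
        if label:
            if label in seen_labels:
                issues.append(f"value {index}: duplicate prefLabel '{label}'")
            seen_labels.add(label)
    return issues
```

Calling `structural_issues` on a list of value dictionaries returns an empty list when every rule passes. Whether the definitions are mutually exclusive, or the values cover all use cases, remains a contextual assessment that needs a domain expert.
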
#### Sources

| Title | Abbreviation | URL | Description |
| --- | --- | --- | --- |
| ANSI/NISO Z39.19-2005 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies | ANSI/NISO Z39.19-2005 | https://ils.unc.edu/courses/2015_fall/inls151_002/Readings/NISO.pdf | |