GB/T 29181-2024 Terminology work - Computer applications - Terminological markup framework
1 Scope
This document specifies a framework for representing data recorded in terminological data collections (TDCs). This framework includes a metamodel and methods for describing specific terminological markup languages (TMLs) expressed in XML. The mechanisms for implementing constraints in a TML are defined, but not the specific constraints for individual TMLs.
This document is designed to support the development and use of computer applications for terminological data and the exchange of such data between different applications. This document also defines the conditions that allow the data expressed in one TML to be mapped onto another TML.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 704 Terminology work - Principles and methods
ISO 1087-1 Terminology work - Vocabulary - Part 1: Theory and application
ISO 26162 Systems to manage terminology, knowledge and content - Design, implementation and maintenance of terminology management systems
ISO 30042:2008 Systems to manage terminology, knowledge and content - TermBase eXchange (TBX)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087-1 and the following apply.
3.1
basic information unit
information unit (3.12) that is attached to a component (3.3) of the metamodel and can be expressed by means of a single data category (3.6)
3.2
complementary information; CI
information supplementary to that described in terminological entries (3.22) and shared across the terminological data collection (3.21)
Note: Domain hierarchies, institution descriptions, bibliographic references and references to text corpora are typical examples of complementary information.
3.3
component
elementary description unit of a metamodel to which data categories (3.6) can be associated to form a data model
3.4
compound information unit
information unit (3.12) attached to a component (3.3) of the metamodel that is expressed by means of several grouped data categories (3.6) which, taken together, express a coherent unit of information
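Informative illustration, not part of the definition: the difference between a basic information unit (3.1) and a compound information unit (3.4) can be sketched in Python as follows. The class names, the pairing of /definition/ with /source/, and the rule that "several" means at least two members are assumptions of this sketch and are not prescribed by this document.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BasicInformationUnit:
    """One piece of information expressed by a single data category."""
    data_category: str
    value: str


@dataclass(frozen=True)
class CompoundInformationUnit:
    """Several basic units grouped so that they are read together."""
    members: Tuple[BasicInformationUnit, ...]

    def __post_init__(self) -> None:
        # Assumption of this sketch: "several" means at least two members.
        if len(self.members) < 2:
            raise ValueError("a compound information unit groups at least two data categories")

    def data_categories(self) -> Tuple[str, ...]:
        """Return the names of the grouped data categories, in order."""
        return tuple(member.data_category for member in self.members)


# Illustrative values only.
definition = BasicInformationUnit("definition", "part of a language section giving information about a term")
source = BasicInformationUnit("source", "hypothetical source label")
sourced_definition = CompoundInformationUnit((definition, source))

print(sourced_definition.data_categories())  # ('definition', 'source')
```

Grouping the definition with its source keeps the two values together, which is the sense in which a compound unit expresses one coherent piece of information.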
3.5
conceptual domain
set of valid value meanings associated with a data category (3.6)
Note: For example, the data category /part of speech/ could have the following conceptual domain: /noun/, /verb/, /adjective/, /adverb/, and so forth.
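Informative illustration, not part of the definition: a conceptual domain can be treated as a closed set of permitted values against which a candidate value is checked. The following Python sketch uses hypothetical names (`DataCategory`, `is_valid`) and assumes that an empty conceptual domain means an open, free-text data category. That convention belongs to the sketch, not to this document.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataCategory:
    """Hypothetical model of a data category and its conceptual domain."""
    name: str
    # Assumption: an empty conceptual domain is "open", so any value is accepted.
    conceptual_domain: FrozenSet[str] = frozenset()

    def is_valid(self, value: str) -> bool:
        """Return True if the value is in the conceptual domain or the domain is open."""
        return not self.conceptual_domain or value in self.conceptual_domain


part_of_speech = DataCategory(
    name="part of speech",
    conceptual_domain=frozenset({"noun", "verb", "adjective", "adverb"}),
)
definition = DataCategory(name="definition")  # open domain: free text

print(part_of_speech.is_valid("noun"))         # True
print(part_of_speech.is_valid("preposition"))  # False
```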
3.6
data category
elementary descriptor used in a linguistic description or annotation scheme
Note: In this document, data categories are enclosed in forward slashes (/), e.g. /definition/.
3.7
data category repository; DCR
electronic repository of data category specifications (3.9) to be used as a reference for the definition of linguistic annotation schemes or any other representation model for language resources
Note: A DCR for language resources is available at http://www.datcatinfo.net.
3.8
data category selection; DCS
set of data categories (3.6) selected from a DCR (3.7)
3.9
data category specification
set of attributes used to fully describe a given data category (3.6)
Note: The abbreviation “DCS” is associated with data category selection and is not used for data category specification.
3.10
expansion tree
structured group of XML elements that implement a level of the metamodel in a given TML (3.23)
3.11
global information; GI
technical and administrative information applying to the entire terminological data collection (3.21)
Note: Examples of global information include the title of the terminological data collection, its revision history, and owner or copyright information.
3.12
information unit; IU
elementary piece of information attached to a structural level of the metamodel
3.13
language section; LS
part of a terminological entry (3.22) containing information related to one language
Note: One terminological entry may contain information on one or more languages.
3.14
object language
language being described
3.15
persistent identifier; PID
unique Uniform Resource Identifier (URI) that assures permanent access for a digital object by providing access to it independently of its physical location or current ownership
3.16
structural node
instance of a component (3.3) within the representation of a terminological data collection (3.21)
3.17
structural skeleton
abstract description of an instance of a terminological data collection (3.21) in conformity with the metamodel
3.18
style
specification for the implementation of a data category (3.6) in XML
3.19
term component section; TCS
part of a term section (3.20) giving linguistic information about the components of a term
3.20
term section; TS
part of a language section (3.13) giving information about a term
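Informative illustration, not part of the definition: the nesting described in 3.13, 3.19 and 3.20 (a terminological entry contains language sections, which contain term sections, which may contain term component sections) can be sketched as plain Python data classes. All class names, field names and sample values below are hypothetical and chosen only for this sketch. They are not element names of any TML.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TermComponentSection:
    """Linguistic information about one component of a term."""
    component: str


@dataclass
class TermSection:
    """Information about one term."""
    term: str
    components: List[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection:
    """Information related to one language within an entry."""
    language: str
    term_sections: List[TermSection] = field(default_factory=list)


@dataclass
class TerminologicalEntry:
    """Top of the nesting; holds one or more language sections."""
    entry_id: str
    language_sections: List[LanguageSection] = field(default_factory=list)


def terms_by_language(entry: TerminologicalEntry) -> Dict[str, List[str]]:
    """Walk the nesting and collect the terms listed under each language."""
    return {
        ls.language: [ts.term for ts in ls.term_sections]
        for ls in entry.language_sections
    }


# Illustrative sample data only.
entry = TerminologicalEntry(
    entry_id="e1",
    language_sections=[
        LanguageSection("en", [
            TermSection("hard disk", [TermComponentSection("hard"),
                                      TermComponentSection("disk")]),
        ]),
        LanguageSection("fr", [TermSection("disque dur")]),
    ],
)

print(terms_by_language(entry))  # {'en': ['hard disk'], 'fr': ['disque dur']}
```

Each class holds only a list of the level directly beneath it, mirroring the "part of" wording of the definitions.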