1 Scope
This document specifies the types of human chromosomal genetic markers and the short tandem repeat (STR) loci selected for building the forensic science DNA database, and it specifies the data structure and basic requirements of the file format used for data exchange between the national forensic science DNA database and external systems.
This document applies to the design, development and testing of the forensic science DNA database and of external systems that exchange data with it (e.g. DNA laboratory management information systems and DNA data analysis software).
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including all amendments) applies.
GB/T 2312 Code of Chinese graphic character set for information interchange: Primary set
GB 18030 Information technology: Chinese coded character set
3 Terminology and definitions
The following terms and definitions apply to this document.
3.1
gene locus; locus
The position on a chromosome occupied by a gene or by a segment of genomic DNA.
[Source: GB/T 37226-2018, 2.2]
3.2
allele
One of the alternative forms of a gene located at the same position on a pair of homologous chromosomes.
3.3
short tandem repeat; STR
A class of tandemly repeated DNA sequences found widely in eukaryotic genomes, in which the repeat unit usually consists of 2 to 6 bases and is usually repeated 5 to 60 times.
Note: In the human genome, STRs are classified as autosomal STRs, Y-chromosomal STRs or X-chromosomal STRs, depending on the type of chromosome on which they are found.
3.4
repeat region sequence
The part of a short tandem repeat (3.3) made up of tandemly arranged repeat units, generally running from the 5' end of the first repeat unit to the 3' end of the last repeat unit.
3.5
repeat structure
The pattern in which the repeat units are arranged within a repeat region sequence (3.4).
4 Abbreviations
The following abbreviations apply to this document.
DNA: Deoxyribonucleic Acid
DDEM: DNA Database Exchange Message
NDNAD: National DNA Database
XML: Extensible Markup Language
5 DNA Database selection of genetic loci
5.1 Selection of genetic markers
The DNA typing data in the forensic science DNA database consist of test results for the human chromosomal amelogenin gene, short tandem repeats (STRs) and mitochondrial DNA.
Note: The data structure of the mitochondrial DNA test results is not specified in this document.
5.2 Selection of loci
5.2.1 Class A loci (core loci): STR loci that should be included in the DNA typing data entered into the forensic science DNA database; they should conform to the provisions of Appendices A to C.
5.2.2 Class B loci (preferred loci): STR loci that should be chosen first once all Class A loci are included in the DNA typing data; they should conform to the provisions of Appendices A to C.
5.2.3 Class C loci (alternative loci): additional STR loci that are allowed to be entered into the forensic science DNA database; they shall conform to the provisions of Appendices A to C.
6 DNA Database Common Exchange Information File
6.1 File purpose and structure
The forensic science DNA database exchanges information with external systems through DDEM files; data are imported into the database by defining information packages. DDEM takes XML as its reference, and every XML-defined data type can be mapped to a data type defined in SQL92 or SQL99 by means of the XML mapping summary.
A DDEM file has two parts: the DDEM file header and the samples (Specimen), as shown in Figure 1.
6.2 DDEM file header
The DDEM file header section contains the following information:
a) Version;
b) Type of information;
c) Authorised Recording Laboratory Name;
d) Authorised Recording Laboratory ID;
e) Name of the source laboratory;
f) Source Lab ID;
g) Recording number;
h) Date of submission;
i) Recording lot number;
j) Reagent product name;
k) Reagent product number;
l) Reagent kit barcode number;
m) Sequencer manufacturer;
n) Sequencer vendor ID.
6.3 Samples
The sample section of the DDEM file contains the following information:
a) Sample number;
b) Examiner number;
c) Case number;
d) Sample classification;
e) Whether the sample is partially typed;
f) Sample comment;
g) Genetic locus information.
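The two-part structure described in 6.1 to 6.3 can be pictured with a short Python sketch. It is only an illustration: the element names (`DDEM`, `Header`, `Specimen`, `Locus`, `Allele`), the field names and the sample values are hypothetical and cover only a subset of the listed items; the actual names and formats are those laid down in Tables 2 and 3 and in Appendices D to F.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Header:
    # Hypothetical subset of the header items listed in 6.2.
    version: str
    message_type: str
    source_lab_id: str
    submit_date: str


@dataclass
class Specimen:
    # Hypothetical subset of the sample items listed in 6.3.
    sample_number: str
    case_number: str
    classification: str
    partial: bool = False
    loci: dict = field(default_factory=dict)  # locus name -> list of allele strings


def build_ddem(header, specimens):
    """Return an XML tree with one header followed by one or more samples."""
    if not specimens:
        raise ValueError("a DDEM file needs at least one sample")
    root = ET.Element("DDEM")  # hypothetical root element name
    head = ET.SubElement(root, "Header")
    for name, value in vars(header).items():
        ET.SubElement(head, name).text = value
    for spec in specimens:
        node = ET.SubElement(root, "Specimen", partial=str(spec.partial).lower())
        ET.SubElement(node, "sample_number").text = spec.sample_number
        ET.SubElement(node, "case_number").text = spec.case_number
        ET.SubElement(node, "classification").text = spec.classification
        for locus, alleles in spec.loci.items():
            locus_node = ET.SubElement(node, "Locus", name=locus)
            for allele in alleles:
                ET.SubElement(locus_node, "Allele").text = allele
    return root


# Illustrative values only.
header = Header("1.0", "import", "LAB-001", "2021-12-31T08:05:09")
sample = Specimen("S-0001", "C-0001", "reference", loci={"Amelogenin": ["X", "Y"]})
xml_text = ET.tostring(build_ddem(header, [sample]), encoding="unicode")
```

Keeping the header and the samples as separate record types mirrors the split shown in Figure 1: the header is written once per file, while the sample record repeats.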
6.4 DDEM data types
The data types used in DDEM files are described as follows.
a) Decimal type: represents a number of arbitrary precision; values defined as decimal in the XML document are not stored in SQL92 or SQL99.
b) String type: consists of a group of characters, which may be any letters, symbols and digits, except that the DNA database does not support the pipe symbol "|" or the semicolon ";". Some symbols, such as "<" and ">", have special meanings in XML; if these special characters are used in a DDEM file, they should be replaced by the alternative representations given in Table 1.
c) Date/time type: indicates a specific point in time, using a subset of ISO 8601 in the form "CCYY-MM-DDTHH:mm:ss", where "CC" is the century, "YY" the year, "MM" the month and "DD" the day, "T" is the separator between the date and the time, and "HH", "mm" and "ss" are the hour, minute and second respectively. If a more precise time is required, the seconds may optionally be written with a fractional part ("ss.s..."). In SQL92 or SQL99, a date from the XML document is stored in a date/time type or a short date/time type.
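The string and date/time rules above can be sketched in Python as follows. Table 1 is not reproduced in this clause, so the sketch assumes that its alternative representations are the five predefined XML entities (`&lt;`, `&gt;`, `&amp;`, `&quot;`, `&apos;`); the function names are hypothetical.

```python
from datetime import datetime
from xml.sax.saxutils import escape

# Characters that the DNA database does not support in string values.
UNSUPPORTED = ("|", ";")


def to_ddem_string(value):
    """Reject unsupported characters, then replace XML special characters.

    Assumption: Table 1 matches the predefined XML entities.
    """
    for ch in UNSUPPORTED:
        if ch in value:
            raise ValueError(f"character {ch!r} is not supported")
    # escape() handles &, < and >; the quotes are added explicitly.
    return escape(value, {'"': "&quot;", "'": "&apos;"})


def to_ddem_datetime(moment, fractional=False):
    """Format a naive datetime as CCYY-MM-DDTHH:mm:ss, optionally with fractional seconds."""
    pattern = "%Y-%m-%dT%H:%M:%S"
    if fractional:
        pattern += ".%f"  # six fractional digits
    return moment.strftime(pattern)
```

For instance, `to_ddem_string("a<b>c")` yields `a&lt;b&gt;c`, and `to_ddem_datetime(datetime(2021, 12, 31, 8, 5, 9))` yields `2021-12-31T08:05:09`. The check for "|" and ";" is applied to the raw value before escaping, because the entity references themselves end in a semicolon.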
6.5 File content and format requirements
6.5.1 File format
A DDEM file written in XML contains one DDEM file header and one or more samples. An example of a DDEM file header is given in Appendix D, and an example of a DDEM file is given in Appendix E. Appendix F gives an example of an XML schema definition file for interpreting and validating the XML files.
Note: For ease of description, the information in Appendices D to F is not in strict XML format; tab characters, carriage returns and spaces have been added for readability. In DDEM files, each line ends with a carriage return and a line feed character so that the file can be viewed easily in a text editor. In XML format, line breaks and comments are supported. Leading spaces are not supported in the comment field.
6.5.2 DDEM file header format
The details of each part of the DDEM file header shall conform to Table 2.
6.5.3 Format of the sample section of the DDEM file
6.5.3.1 The details of the sample section in the DDEM file shall conform to Table 3....
6.6 Requirements for locus data in DDEM files
6.6.1 The DDEM file should contain test data for the amelogenin gene.
6.6.2 When the DDEM file contains test data for only one category of STR (autosomal, Y-chromosomal or X-chromosomal STR loci), it should contain all the Class A loci of that category. When the number of loci exceeds the number of Class A loci, the additional loci should be selected from the Class B loci first. When the number of loci exceeds the total number of Class A and Class B loci, the additional loci shall be selected from the Class C loci.
6.6.3 If the DDEM file contains test data for two or more categories of STR (e.g. both autosomal and Y-chromosomal STR loci), it should contain all the Class A loci of each category involved; when the number of loci exceeds the number of Class A loci, the additional loci of each category should meet the requirements of 6.6.2.
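The selection order in 6.6.2 can be expressed as a small check, sketched here in Python. It rests on one reading of the clause, namely that Class C entries are acceptable only once every Class B entry of the same STR category is present. The function name and the placeholder names `A1`, `B1` and `C1` are hypothetical; the real class lists come from Appendices A to C.

```python
def check_locus_selection(tested, class_a, class_b, class_c):
    """Return a list of problems for one STR category (empty list = acceptable)."""
    tested, class_a, class_b, class_c = map(set, (tested, class_a, class_b, class_c))
    problems = []

    # Every Class A entry of the category must be present.
    missing_a = class_a - tested
    if missing_a:
        problems.append(f"missing Class A loci: {sorted(missing_a)}")

    # Anything beyond Class A must come from Class B or Class C.
    extra = tested - class_a
    unknown = extra - class_b - class_c
    if unknown:
        problems.append(f"loci outside Classes A to C: {sorted(unknown)}")

    # Assumed reading: Class C is used only after Class B is exhausted.
    if extra & class_c and not class_b <= tested:
        problems.append("Class C loci used before all Class B loci were included")

    return problems


# Placeholder class lists for a single STR category.
A, B, C = {"A1", "A2"}, {"B1"}, {"C1"}
result = check_locus_selection(["A1", "A2", "C1"], A, B, C)
```

For a file that covers several STR categories, as in 6.6.3, the same check would be run once per category with that category's own class lists.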
Contents of GB/T 41009-2021
1 Scope
2 Normative references
3 Terminology and definitions
4 Abbreviations
5 DNA Database selection of genetic loci
6 DNA Database Common Exchange Information File