LSDB - Controlled vocabulary terms

Contributed by: Juha Muilu
Originally posted: 8th January 2010, 12:01 pm
Last updated: 19th July 2011, 11:36 am
Short URL: http://gen2phen.org/node/11356

Note: This is a working document. Final information will be collected into the VarioML wiki pages.

 

Misc

Evidence codes

  • GO Evidence codes
  • dbSNP validation codes for variants
  • Evidence code ontology

Mutation events

  • Types of mutation events and their discrete effects on RNA and polypeptide

Inheritance pattern (of the phenotype. Note: see also the inheritance ontology, http://www.human-phenotype-ontology.org/index.php/hpo_docu.html)

  • 'familial'
  • 'familial, consanguineous parents'
  • 'familial, autosomal dominant'
  • 'familial, autosomal recessive'
  • 'familial, X-linked'
  • 'sporadic'
  • 'sporadic, consanguineous parents'
  • 'sporadic, consanguineous parents (1st degree)'
  • 'sporadic, consanguineous parents (2nd degree)'
  • 'sporadic, consanguineous parents (3rd degree)'
  • 'sporadic, non-consanguineous parents'
  • 'sporadic, consanguinity parents?'
  • 'sporadic? (parents not tested)'

Genetic source

  • 'de novo'
  • 'de novo, maternal chromosome'
  • 'de novo, paternal chromosome'
  • 'de novo, from either parent'
  • 'inherited'
  • 'inherited, maternal chromosome'
  • 'inherited, paternal chromosome'
  • 'inherited, from either parent'
  • 'somatic'

 

Variant

Consequence

See for example sequence ontology: List of variant effect terms

 

From DMuDB:

Complex frameshift: Frameshift involving insertions and deletions
Exon deletion: Deletion encompassing a whole exon or exons, frameshift status unknown
Exon duplication: Duplication of one or more exons, frameshift status unknown
Frameshift: Deletion or insertion causing reading frame shift
In-frame deletion: Deletion of a whole codon or codons. Can include deletion of one or more exons
In-frame duplication: A duplication that does not change the reading frame. Can include one or more exons
In-frame insertion: An insertion of a whole codon or codons. Can include one or more exons
Intronic variant: A variant in an intron which has not been shown to affect splicing
Missense: Substitution resulting in a change to a different amino acid
Nonsense: Substitution resulting in a change to a stop codon
Out of frame deletion: Deletion of part of a codon or number of codons resulting in a frameshift. Can include one or more exons
Out of frame duplication: A duplication that changes the reading frame. Can include one or more exons
Out of frame insertion: An insertion of part of a codon or number of codons resulting in a frameshift. Can include one or more exons
Silent: A nucleotide change that does not change the amino acid
Splice site variant: A mutation that affects splicing

Number of independent observations of a DNA variant (Frequency in XML)

Example values from the Data sharing between LSDBs paper

found once (should it be "found at least once"? The same question applies to the other terms.)
2–10 times 
11–99 times
over 100 times
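
As a minimal sketch (not taken from the paper or from VarioML), the bins above could be assigned from a raw observation count as shown below. The function name `frequency_term` is hypothetical. Note that the terms as listed leave exactly 100 observations unassigned ("11–99 times" is followed by "over 100 times"); the sketch returns `None` for that case rather than guessing which bin was intended.

```python
from typing import Optional


def frequency_term(observations: int) -> Optional[str]:
    """Map a count of independent observations to one of the example terms.

    Hypothetical helper; the term strings are copied from the list above.
    """
    if observations < 1:
        raise ValueError("a reported variant must have been observed at least once")
    if observations == 1:
        return "found once"
    if observations <= 10:
        return "2–10 times"
    if observations <= 99:
        return "11–99 times"
    if observations > 100:
        return "over 100 times"
    # Exactly 100 is not covered by the terms as listed.
    return None
```

For example, `frequency_term(7)` gives "2–10 times", while `frequency_term(100)` gives `None`, which makes the gap in the example values visible.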

Origin (note: this field will be replaced by genetic source and inheritance pattern. JM, Dec 2010)

Source LOVD

in vitro (cloned)
familial
familial, consanguineous parents
familial, autosomal dominant
familial, autosomal recessive
familial, X-linked
sporadic
sporadic, consanguineous parents
sporadic, consanguineous parents (1st degree)
sporadic, consanguineous parents (2nd degree)
sporadic, consanguineous parents (3rd degree)
sporadic, non-consanguineous parents
sporadic, consanguinity parents?
sporadic? (parents not tested)
uniparental disomy
de novo
de novo, somatic mosaicism
de novo, germline mosaicism
de novo, germline and somatic mosaicism
de novo, in patient
de novo, in patient (maternal allele)
de novo, in patient (paternal allele)
de novo, in mother
de novo, in mother (grandmaternal allele)
de novo, in mother (grandpaternal allele)
de novo, in father
de novo, in father (grandmaternal allele)
de novo, in father (grandpaternal allele)
uniparental disomy, maternal allele
uniparental disomy, paternal allele

 

Tissue distribution (Note: this will be normalized with other fields. JM, May 2011)

  • constitutional
  • mosaic
  • mosaic in germline

 

Parental origin

Source LOVD

Parent #1
Parent #2 
Paternal (inferred)
Paternal (confirmed)
Maternal (inferred)
Maternal (confirmed)
de novo
de novo, on paternal allele 
de novo, maternal allele

Pathogenicity

See the paper.

From LOVD:

No known pathogenicity
Probably no pathogenicity
Unknown
Probably pathogenic
Pathogenic

From DMuDB:

Non-Pathogenic
Probably Not Pathogenic
Not Known
Probably Pathogenic
Pathogenic
Unclassified

From the CMGS/VKGL paper, as used by Alamut:


Class 1 – Certainly not pathogenic
Class 2 – Unlikely to be pathogenic but cannot be formally proven
Class 3 – Likely to be pathogenic but cannot be formally proven
Class 4 – Certainly pathogenic
Class 5 – Unknown (not in spec)

 

Other comments:

Not Known implies that a submitter has given no data on pathogenicity. Unclassified implies that the submitter has specifically indicated that they are unable to classify the pathogenicity of the variant.
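
The distinction between Not Known and Unclassified could be kept explicit in code roughly as follows. This is a sketch only: the class and function names are hypothetical and are not DMuDB's actual implementation. The assumption made here is that a missing or empty submission is recorded as Not Known, while Unclassified is only ever set when the submitter states it.

```python
from enum import Enum
from typing import Optional


class DMuDBPathogenicity(Enum):
    """The DMuDB terms listed above (hypothetical enum, values copied verbatim)."""

    NON_PATHOGENIC = "Non-Pathogenic"
    PROBABLY_NOT_PATHOGENIC = "Probably Not Pathogenic"
    NOT_KNOWN = "Not Known"
    PROBABLY_PATHOGENIC = "Probably Pathogenic"
    PATHOGENIC = "Pathogenic"
    UNCLASSIFIED = "Unclassified"


def record_pathogenicity(submitted: Optional[str]) -> DMuDBPathogenicity:
    """Return the term for a submission.

    Assumption: no data from the submitter means NOT_KNOWN. UNCLASSIFIED is
    never inferred; it must be given explicitly by the submitter.
    """
    if submitted is None or not submitted.strip():
        return DMuDBPathogenicity.NOT_KNOWN
    # Raises ValueError if the text is not one of the controlled terms.
    return DMuDBPathogenicity(submitted.strip())
```

With this shape, `record_pathogenicity(None)` yields `NOT_KNOWN`, and `record_pathogenicity("Unclassified")` yields `UNCLASSIFIED`, so the two cases cannot be confused downstream.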

Patient

Gender

Source: ISO 5218 (codes 0, 1, 2, 9)

not known 
male
female
not applicable
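
The four ISO 5218 codes and the terms above pair up in the order listed. A small lookup sketch, with hypothetical names `ISO_5218` and `gender_term`, might look like this:

```python
# ISO/IEC 5218 codes paired with the terms listed above.
ISO_5218 = {
    0: "not known",
    1: "male",
    2: "female",
    9: "not applicable",
}


def gender_term(code: int) -> str:
    """Return the term for an ISO 5218 code (hypothetical helper)."""
    try:
        return ISO_5218[code]
    except KeyError:
        raise ValueError(f"{code!r} is not an ISO 5218 code") from None
```

Storing the numeric code and deriving the label keeps records independent of the display language.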

Geographical region

  • Country (iso-3166 codes)
  • dbSNP population classes
  • See also the GeoNames database / web services

Ethnicity

  • Continental population groups from MeSH

Detection Technique

From DMuDB:

ARMS
CF20: CF Common Mutation Test
CF29: Analysis of 29 mutations using the Elucigene CF29 kit
CSCE: Conformation sensitive capillary electrophoresis
DGGE: Denaturing gradient gel electrophoresis
dHPLC: Denaturing high performance liquid chromatography
Heteroduplex analysis
Loss of heterozygosity analysis
Meta-PCR
MLPA: Multiplex ligation-dependent probe amplification
MS-PCR: Mutagenically separated PCR
Multiplex PCR
Not Known: The information has not been recorded or provided
Not Specified: Test information cannot be determined
PCR-PAGE
PTT: Protein Truncation Test
RNA: RNA work performed
Sequencing
SNPlex: The SNPlex™ Genotyping System from ABI
SSCP
SSCP/Heteroduplex
Tags:
  • WP3

Comments

#1 LOVD 2.0 handles the

Submitted by Ivo F.A.C. Fokkema on Mon, 11/01/2010 - 17:07.

LOVD 2.0 handles the pathogenicity as a dual value, one for "submitter's opinion" and one for "curator's opinion". Each one has 5 options:

- => No known pathogenicity
-? => Probably no pathogenicity
? => Unknown
+? => Probably pathogenic
+ => Pathogenic
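
The dual value described in this comment could be modelled roughly as follows. This is an illustrative sketch, not LOVD's actual schema or code; the names `LOVD_PATHOGENICITY` and `DualPathogenicity` are hypothetical, and only the symbol-to-term pairs are taken from the list above.

```python
from dataclasses import dataclass

# Symbol => term, copied from the list above.
LOVD_PATHOGENICITY = {
    "-": "No known pathogenicity",
    "-?": "Probably no pathogenicity",
    "?": "Unknown",
    "+?": "Probably pathogenic",
    "+": "Pathogenic",
}


@dataclass(frozen=True)
class DualPathogenicity:
    """One symbol for the submitter's opinion and one for the curator's."""

    submitter: str
    curator: str

    def __post_init__(self) -> None:
        # Reject anything outside the five listed options.
        for symbol in (self.submitter, self.curator):
            if symbol not in LOVD_PATHOGENICITY:
                raise ValueError(f"unknown pathogenicity symbol: {symbol!r}")

    def describe(self) -> str:
        return (
            f"submitter: {LOVD_PATHOGENICITY[self.submitter]}; "
            f"curator: {LOVD_PATHOGENICITY[self.curator]}"
        )
```

For instance, `DualPathogenicity("+?", "+").describe()` spells out both opinions in words.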


#2 Thanks Ivo. I updated the

Submitted by Juha Muilu on Thu, 21/01/2010 - 09:27.

Thanks Ivo. I updated the wiki accordingly. As a new feature, all terms like these have an optional evidence code, as suggested by Mauno. The code is an ontology term (i.e. it has optional attributes giving the source ontology and the accession number of the term). I am not sure what the evidence terms could be in this case; perhaps some kind of assessment methods?


#3 Hi Juha, evidence for

Submitted by Ivo F.A.C. Fokkema on Thu, 21/01/2010 - 18:25.

Hi Juha, evidence for pathogenicity could be a long list of different things. Mostly, wet lab research can confirm pathogenicity by proving loss-of-function or gain-of-function of the mutated protein. But there are also computer-generated predictions, or the "evidence" may be a combination of different knowledge in the head of the curator.


#4 Thanks Ivo. The evidence is

Submitted by Juha Muilu on Fri, 22/01/2010 - 20:42.

Thanks Ivo. The evidence is actually a list. One element can have zero to many evidence codes.
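
A rough sketch of that shape, assuming only what the comments in this thread describe (a term with zero to many evidence codes, each an ontology term with an optional source and accession). The class and field names are hypothetical and do not reproduce the VarioML schema; the ontology source and accession in the usage note are placeholders, not real identifiers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EvidenceCode:
    """An ontology term; source and accession are optional attributes."""

    term: str
    source: Optional[str] = None      # which ontology the term comes from
    accession: Optional[str] = None   # accession number of the term


@dataclass
class ControlledTerm:
    """A controlled vocabulary term with zero to many evidence codes."""

    term: str
    evidence_codes: List[EvidenceCode] = field(default_factory=list)
```

Usage, with placeholder values: `ControlledTerm("Probably pathogenic")` carries no evidence, while appending `EvidenceCode("example evidence term", source="EXAMPLE-ONTOLOGY", accession="EX:0000000")` to its `evidence_codes` attaches one code. Using `default_factory=list` gives every term its own list, so evidence added to one term does not leak into another.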


#5 I added some examples of the

Submitted by Glen Dobson on Wed, 09/06/2010 - 13:49.

I added some examples of the controlled terms in DMuDB. In general these are not supposed to be exhaustive or complete lists, but e.g. new techniques would be added when required.


#6 Hi All I sent the following

Submitted by antbro on Tue, 15/06/2010 - 10:14.

Hi All
I sent the following in an email some time ago, but none of my concerns seem to have been considered in the latest LSDB vocabulary. Please at least give them some serious consideration/discussion, as I fear we may be making some fundamental errors in the current vocabulary that contradict basic genetics knowledge...

PATHOGENICITY
I think this has to be considered on TWO levels:
1) Evidence about the variant IN THE PATIENT
a) being a de novo mutation in a sporadic case argues for pathogenicity (likelihood depends upon how many genes are examined and the completeness of the gene scanning)
b) finding cosegregation of mutation and disease in the patient's family argues for pathogenicity (formally only argues for involvement of the genome region, and not the specific mutation)

2) Evidence about the variant IN GENERAL
a) sometimes there will be accepted 'fact' regarding pathogenicity (e.g. delta F508 in CF)
b) previous reports of being a normal variant (e.g., in dbSNP, LSDBs) would argue against pathogenicity
c) theoretical predictions (e.g., nature of amino-acid or splice site change) can suggest pathogenicity or neutrality
d) functional studies, gene knockouts, animal model data, and so on, can suggest pathogenicity

This split (between considerations of the variant in general, and considerations of the specific occurrence of the variant in the patient) needs to be kept in mind across all aspects of the LSDB record. I'd like to see distinct sections in LSDBs for these two different aspects of a variant. Specifically:
- the 5 categories of pathogenicity are a good starting point, but they could be used to refer to the variant in general or the occurrence of the variant in the patient (some mutations could be pathogenic in some genetic backgrounds and not in others)
- the genetic mechanism (autosomal dominant, autosomal recessive, X-linked, maternally imprinted, paternally imprinted) refers to the variant in general, whereas the zygosity (homozygote, heterozygote, compound heterozygote) refers to the occurrence of the variant in the patient

INHERITANCE
I feel your list of inheritance terms actually covers a mixed bunch of different things...
- familial & sporadic (with consanguinity sub-categories): refer to the disease (not the variant)
- paternally inherited, maternally inherited, de novo mutation (from father), de novo mutation (from mother), consanguineous origin, mosaicism (germline and somatic), uniparental disomy, etc: refer to mode of inheritance (and this list can easily be made complete by some googling ...not forgetting mtDNA)
I definitely feel that all inheritance options should be made available via a controlled vocabulary list, and there may even need to be an option to select >1 item from the list

OTHER
Regarding the categories "Country of origin" and "Ethnicity", we first need to be very clear about what we are trying to achieve. I assume you are trying to ensure that the database eventually allows one to ask about the genetic history of the catalogued inherited variants. If so, then you would need fields to capture one or both of these two bits of information for many/all of the patient's ancestors. It does not seem logical to try to capture this complexity by having one or two fields that refer to the 'origin/ethnicity of the variant'. To eventually query the genetic history of an inherited variant, one would integrate the recorded data on a) the patient's ancestors, and b) the mode of inheritance.

In general, underlying all my comments is a sense that there needs to be crystal clear demarcation between different aspects of the data (patient data, family data, disease data, pathogenicity data, variant data, inheritance data, zygosity data, genetic mechanism data, method data). With the solid data model you have, I assume this should be possible.

G2P Knowledge Centre is part of GEN2PHEN and funded by the Health Thematic Area of the Cooperation Programme of the European Commission within the VII Framework Programme for Research and Technological Development.

© GEN2PHEN 2011