Fixation index

Title: Fixation index  
Author: World Heritage Encyclopedia
Language: English
Subject: F-statistics, Genetic history of Europe, Balding–Nichols model, Andalusian horse, Demographics of India
Publisher: World Heritage Encyclopedia

The fixation index (FST) is a measure of population differentiation due to genetic structure. It is frequently estimated from genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. Developed as a special case of Wright's F-statistics, it is one of the most commonly used statistics in population genetics.

Definition

Two of the most commonly used definitions of FST are based on the variance of allele frequencies between populations and on the probability of identity by descent.

If \bar{p} is the average frequency of an allele in the total population, \sigma^2_S is the variance in the frequency of that allele among subpopulations (weighted by subpopulation size), and \sigma^2_T is the variance of the allelic state in the total population, which equals \bar{p}(1-\bar{p}) for a biallelic locus, FST is defined as [1]

F_{ST} = \frac{\sigma^2_S}{\sigma^2_T} = \frac{\sigma^2_S}{\bar{p}(1-\bar{p})}

This definition, which is due to Wright, illustrates that FST measures the amount of genetic variance that can be explained by population structure.
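As a minimal sketch of this variance-based definition, the following Python snippet computes FST for a single biallelic locus from the allele frequencies in each subpopulation. The function name `fst_from_frequencies` and the optional `weights` argument (relative subpopulation sizes) are illustrative choices rather than part of any library, and the example frequencies are made up.

```python
def fst_from_frequencies(freqs, weights=None):
    """Wright's F_ST = sigma_S^2 / (p_bar * (1 - p_bar)) for one biallelic locus.

    freqs   -- allele frequency in each subpopulation
    weights -- relative subpopulation sizes (assumed equal if omitted)
    """
    if weights is None:
        weights = [1.0] * len(freqs)
    total = float(sum(weights))
    w = [x / total for x in weights]

    # Average allele frequency in the total population.
    p_bar = sum(wi * pi for wi, pi in zip(w, freqs))
    # Variance of the allele frequency among subpopulations.
    sigma_s2 = sum(wi * (pi - p_bar) ** 2 for wi, pi in zip(w, freqs))
    # Variance of the allelic state in the total population.
    sigma_t2 = p_bar * (1.0 - p_bar)
    if sigma_t2 == 0.0:
        raise ValueError("allele is fixed or absent in the total population")
    return sigma_s2 / sigma_t2


print(fst_from_frequencies([0.5, 0.5]))  # identical frequencies: 0.0
print(fst_from_frequencies([0.0, 1.0]))  # fixed for different alleles: 1.0
print(fst_from_frequencies([0.2, 0.8]))  # intermediate case, about 0.36
```

The two extreme cases reproduce the boundary values of the statistic: no differentiation when the subpopulations have the same frequency, and complete differentiation when they are fixed for different alleles.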

Alternatively,[2]

F_{ST} = \frac{f_0-\bar{f}}{1-\bar{f}}

where f_0 is the probability of identity by descent of two individuals from the same subpopulation, and \bar{f} is the probability that two individuals from the total population are identical by descent. Under this definition, FST measures how much more closely related two individuals from the same subpopulation are than two individuals drawn from the total population. If the mutation rate is small, this interpretation can be made more explicit by linking the probability of identity by descent to coalescence times: let T_0 and T denote the average time to coalescence for two individuals from the same subpopulation and from the total population, respectively. Then,

F_{ST} \approx 1-\frac{T_0}{T}

This formulation has the advantage that the expected time to coalescence can easily be estimated from genetic data, which led to the development of various estimators for FST.
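The step from the identity-by-descent definition to the coalescence-time approximation can be sketched as follows. The sketch assumes that two lineages count as identical by descent when neither has mutated since their common ancestor, and that the per-generation mutation rate \mu (a symbol introduced here for illustration) is small enough that first-order terms suffice.

```latex
% t_0, t: coalescence times of two lineages drawn from the same
% subpopulation and from the total population; T_0 = E[t_0], T = E[t].
f_0 = E\left[e^{-2\mu t_0}\right] \approx 1 - 2\mu T_0, \qquad
\bar{f} = E\left[e^{-2\mu t}\right] \approx 1 - 2\mu T

F_{ST} = \frac{f_0 - \bar{f}}{1 - \bar{f}}
       \approx \frac{(1 - 2\mu T_0) - (1 - 2\mu T)}{2\mu T}
       = \frac{T - T_0}{T}
       = 1 - \frac{T_0}{T}
```

The mutation rate cancels in the ratio, which is why the approximation depends only on the two average coalescence times.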

Estimation

In practice, none of the quantities used for the definitions can be easily measured. As a consequence, various estimators have been proposed. A particularly simple estimator applicable to DNA sequence data is:[3]

F_{ST} = \frac{ \pi_\text{Between} - \pi_\text{Within} } { \pi_\text{Between} }

where \pi_\text{Between} and \pi_\text{Within} represent the average number of pairwise differences between two individuals sampled from different subpopulations ( \pi_\text{Between} ) or from the same subpopulation ( \pi_\text{Within} ). The average pairwise difference within a population can be calculated as the sum of the pairwise differences divided by the number of pairs. However, this estimator is biased when sample sizes are small or when they vary between populations. Therefore, more elaborate methods are used to compute FST in practice. Two of the most widely used procedures are the estimator of Weir & Cockerham (1984)[4] and an analysis of molecular variance (AMOVA). A list of implementations is available at the end of this article.
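The simple pairwise-difference estimator can be sketched in Python as below. This is an illustration under stated assumptions, not a reference implementation: the sequences are toy data, the helper names are hypothetical, and within-subpopulation pairs are pooled across subpopulations before averaging, which is only one of several reasonable conventions and matches a per-population average only when sample sizes are equal.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of sites at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_within(pops):
    """Sum of within-subpopulation pairwise differences divided by the number of pairs."""
    diffs = [hamming(a, b) for pop in pops for a, b in combinations(pop, 2)]
    if not diffs:
        raise ValueError("need at least two sequences in some subpopulation")
    return sum(diffs) / len(diffs)


def mean_pairwise_between(pops):
    """Average pairwise difference over pairs drawn from different subpopulations."""
    diffs = [hamming(a, b)
             for p, q in combinations(pops, 2)
             for a, b in product(p, q)]
    if not diffs:
        raise ValueError("need at least two non-empty subpopulations")
    return sum(diffs) / len(diffs)


def fst_pairwise(pops):
    """F_ST = (pi_between - pi_within) / pi_between."""
    pi_within = mean_pairwise_within(pops)
    pi_between = mean_pairwise_between(pops)
    if pi_between == 0:
        raise ValueError("no differences between subpopulations")
    return (pi_between - pi_within) / pi_between


# Toy aligned sequences (made up for illustration).
pop1 = ["AAAA", "AAAT", "AAAA"]
pop2 = ["TTAA", "TTAT", "TTAA"]
print(round(fst_pairwise([pop1, pop2]), 3))  # about 0.727
```

With such small samples the result is exactly the kind of estimate the text warns about; the sketch shows the arithmetic, not a recommended procedure.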

Interpretation

This comparison of genetic variability within and between populations is frequently used in applied population genetics. The values range from 0 to 1. A value of zero implies complete panmixis; that is, the populations being compared are interbreeding freely. A value of one implies that all genetic variation is explained by the population structure, and that the populations do not share any genetic diversity.

For idealized models such as Wright's finite island model, FST can be used to estimate migration rates. Under that model, the migration rate can be estimated as

\hat{M}\approx\frac{1}{2}\left (\frac{1}{F_{ST}} -1 \right ) .
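The relation above is easy to evaluate numerically. The sketch below simply codes the formula and its algebraic inverse; the function names are hypothetical, the input value is made up, and how \hat{M} is scaled relative to population size depends on the model's conventions, which the snippet does not attempt to settle.

```python
def migration_from_fst(fst):
    """Island-model estimate M ~ (1/2) * (1/F_ST - 1)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1] for this estimate")
    return 0.5 * (1.0 / fst - 1.0)


def fst_from_migration(m):
    """Algebraic inverse of the relation above: F_ST ~ 1 / (1 + 2M)."""
    if m < 0.0:
        raise ValueError("migration rate must be non-negative")
    return 1.0 / (1.0 + 2.0 * m)


print(migration_from_fst(0.2))   # 2.0
print(migration_from_fst(1.0))   # 0.0: complete differentiation, no migration
print(fst_from_migration(2.0))   # 0.2
```

Because the estimate grows without bound as FST approaches zero, small errors in a low FST value translate into large changes in the inferred migration rate.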

The interpretation of FST can be difficult when the data analyzed are highly polymorphic. In this case, the probability of identity by descent is very low and FST can have an arbitrarily low upper bound, which might lead to misinterpretation of the data. Also, strictly speaking, FST is not a genetic distance, as it does not satisfy the triangle inequality. As a consequence, new tools for measuring genetic differentiation continue to be developed.
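The low upper bound for highly polymorphic data can be illustrated with the identity-based definition F_{ST} = (f_0 - \bar{f}) / (1 - \bar{f}). The sketch below makes several simplifying assumptions that are not in the source: identity in state (carrying the same allele) stands in for identity by descent, the two subpopulations are equally sized, and each carries k equally frequent alleles that the other lacks entirely. All function names are hypothetical.

```python
def identity_prob(freqs):
    """Probability that two randomly drawn alleles are of the same type."""
    return sum(p * p for p in freqs.values())


def fst_identity(pops):
    """F_ST = (f0 - f_bar) / (1 - f_bar) for equally sized subpopulations.

    pops -- list of dicts mapping allele label -> frequency in that subpopulation
    """
    n = len(pops)
    # Average within-subpopulation identity.
    f0 = sum(identity_prob(p) for p in pops) / n
    # Allele frequencies in the pooled (total) population.
    pooled = {}
    for p in pops:
        for allele, freq in p.items():
            pooled[allele] = pooled.get(allele, 0.0) + freq / n
    f_bar = identity_prob(pooled)
    return (f0 - f_bar) / (1.0 - f_bar)


def disjoint_pops(k):
    """Two subpopulations, each with k equally frequent private alleles."""
    return [{("A", i): 1.0 / k for i in range(k)},
            {("B", i): 1.0 / k for i in range(k)}]


for k in (1, 2, 10, 50):
    print(k, round(fst_identity(disjoint_pops(k)), 4))
# 1 1.0
# 2 0.3333
# 10 0.0526
# 50 0.0101
```

Even though the two subpopulations share no alleles at all, the statistic equals 1/(2k - 1) in this setup and so shrinks toward zero as the number of alleles per subpopulation grows.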

FST in humans

The International HapMap Project estimated FST for three human populations using SNP data. Across the autosomes, FST was estimated to be 0.12. The significance of this value in humans is contentious. An FST of zero indicates no divergence between populations, whereas an FST of one indicates complete isolation of populations. Anthropologists often cite Lewontin's 1972 work, which arrived at a similar value and interpreted it as meaning that there were few biological differences between human races.[5] On the other hand, while an FST value of 0.12 is lower than that found between populations of many other species, Henry Harpending argued that this value implies that, on a world scale, "kinship between two individuals of the same human population is equivalent to kinship between grandparent and grandchild or between half siblings".[6]

Autosomal genetic distances based on SNPs

Intercontinental autosomal genetic distances based on SNPs[7]
                              Europe (CEU)   Sub-Saharan Africa (Yoruba)   East-Asia (Japanese)
Sub-Saharan Africa (Yoruba)   0.153
East-Asia (Japanese)          0.111          0.190
East-Asia (Chinese)           0.110          0.192                         0.007
Intra-European/Mediterranean autosomal genetic distances based on SNPs[7][8]
Italians Palestinians Swedish Finns Spanish Germans Russians French Greeks
Palestinians 0.0064
Swedish 0.0064-0.0090 0.0191
Finns 0.0130-0.0230 0.0050-0.0110
Spanish 0.0010-0.0050 0.0101 0.0040-0.0055 0.0110-0.0170
Germans 0.0029-0.0080 0.0136 0.0007-0.0010 0.0060-0.0130 0.0015-0.0030
Russians 0.0088-0.0120 0.0202 0.0030-0.0036 0.0060-0.0120 0.0070-0.0079 0.0030-0.0037
French 0.0030-0.0050 0.0020 0.0080-0.0150 0.0010 0.0010 0.0050
Greeks 0.0000 0.0057 0.0084 0.0035 0.0039 0.0108

Programs for calculating FST

Modules for calculating FST

References

  1. Holsinger, Kent E.; Weir, Bruce S. (2009). "Genetics in geographically structured populations: defining, estimating and interpreting FST". Nat Rev Genet 10 (9): 639–650.
  2. Durrett, Richard (12 August 2008). Probability Models for DNA Sequence Evolution. Springer.
  3. Hudson, R. R.; Slatkin, M.; Maddison, W. P. (Oct 1992). "Estimation of Levels of Gene Flow from DNA Sequence Data". Genetics 132 (2): 583–9.
  4. Weir, B. S.; Cockerham, C. Clark (1984). "Estimating F-Statistics for the Analysis of Population Structure". Evolution 38 (6): 1358.
  5.
  6. Harpending, Henry (2002-11-01). "Kinship and Population Subdivision".
  7. Nelis, Mari; et al. (2009-05-08). Fleischer, Robert C., ed. "Genetic Structure of Europeans: A View from the North–East". PLoS ONE 4 (5): e5472. See table.
  8. Tian, Chao; et al. (November 2009). "European Population Genetic Substructure: Further Definition of Ancestry Informative Markers for Distinguishing among Diverse European Ethnic Groups". Molecular Medicine 15 (11-12): 371–383. See table.
  9. Crawford, Nicholas G. (2010). "smogd: software for the measurement of genetic diversity".

Further reading

  • Evolution and the Genetics of Populations Volume 2: the Theory of Gene Frequencies, pg 294–295, S. Wright, Univ. of Chicago Press, Chicago, 1969
  • A haplotype map of the human genome, The International HapMap Consortium, Nature 2005

External links

  • BioPerl - Bio::PopGen::PopStats