The fixation index (F_{ST}) is a measure of population differentiation due to genetic structure. It is frequently estimated from genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. Developed as a special case of Wright's F-statistics, it is one of the most commonly used statistics in population genetics.
Definition
Two of the most commonly used definitions for F_{ST} are based on the variance of allele frequencies between populations, and on the probability of identity by descent.
If \bar{p} is the average frequency of an allele in the total population, \sigma^2_S is the variance in the frequency of the allele among different subpopulations, and \sigma^2_T is the variance of allele frequencies in the total population, F_{ST} is defined as^{[1]}

F_{ST} = \frac{\sigma^2_S}{\sigma^2_T} = \frac{\sigma^2_S}{\bar{p}(1-\bar{p})}
This definition, which is due to Wright, illustrates that F_{ST} measures the amount of genetic variance that can be explained by population structure.
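As a rough illustration of the variance-based definition, the following Python sketch computes F_{ST} = \sigma^2_S / (\bar{p}(1-\bar{p})) for a single biallelic locus. The function name `fst_from_frequencies` is hypothetical, and the sketch assumes that the true allele frequency in each subpopulation is known and that all subpopulations carry equal weight; neither assumption holds for real sample data.

```python
def fst_from_frequencies(freqs):
    """Variance-based F_ST for one biallelic locus.

    freqs: allele frequency in each subpopulation (assumed known exactly,
    with every subpopulation weighted equally).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n                             # mean allele frequency
    var_s = sum((p - p_bar) ** 2 for p in freqs) / n   # variance among subpopulations
    var_t = p_bar * (1 - p_bar)                        # variance in the total population
    if var_t == 0:
        raise ValueError("allele is fixed or absent in the total population")
    return var_s / var_t


# Identical frequencies give no differentiation; alternative fixation gives 1.
print(fst_from_frequencies([0.5, 0.5]))  # 0.0
print(fst_from_frequencies([0.0, 1.0]))  # 1.0
```

Intermediate cases fall between these extremes: frequencies of 0.2 and 0.8 give a value of about 0.36.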
Alternatively,^{[2]}

F_{ST} = \frac{f_0 - \bar{f}}{1 - \bar{f}}
where f_0 is the probability of identity by descent of two individuals from the same subpopulation, and \bar{f} is the probability that two individuals from the total population are identical by descent. Using this definition, F_{ST} can be interpreted as measuring how much closer two individuals from the same subpopulation are, compared to the total population. If the mutation rate is small, this interpretation can be made more explicit by linking the probability of identity by descent to coalescent times: Let T_{0} and T denote the average time to coalescence for individuals from the same subpopulation and the total population, respectively. Then,

F_{ST} \approx 1 - \frac{T_0}{T}
This formulation has the advantage that the expected time to coalescence can easily be estimated from genetic data, which led to the development of various estimators for F_{ST}.
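The two alternative forms, F_{ST} = (f_0 - \bar{f}) / (1 - \bar{f}) and F_{ST} \approx 1 - T_0/T, can be written as one-line functions. This is only a sketch with hypothetical function names; the input values in the example are made up for illustration and are not estimates from any data set.

```python
def fst_from_ibd(f0, f_bar):
    """F_ST from identity-by-descent probabilities.

    f0:    probability of identity by descent within a subpopulation
    f_bar: probability of identity by descent in the total population
    """
    if f_bar >= 1:
        raise ValueError("f_bar must be below 1")
    return (f0 - f_bar) / (1 - f_bar)


def fst_from_coalescence(t0, t):
    """Low-mutation approximation F_ST ~ 1 - T0/T.

    t0: mean coalescence time for two individuals from the same subpopulation
    t:  mean coalescence time for two individuals from the total population
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return 1 - t0 / t


# Illustrative values only: lineages within a subpopulation coalesce
# twice as fast as lineages drawn from the total population.
print(fst_from_coalescence(2.0, 4.0))  # 0.5
```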
Estimation
In practice, none of the quantities used for the definitions can be easily measured. As a consequence, various estimators have been proposed. A particularly simple estimator applicable to DNA sequence data is:^{[3]}

F_{ST} = \frac{ \pi_\text{Between} - \pi_\text{Within} } { \pi_\text{Between} }
where \pi_\text{Between} and \pi_\text{Within} represent the average number of pairwise differences between two individuals sampled from different subpopulations (\pi_\text{Between}) or from the same subpopulation (\pi_\text{Within}). The average pairwise difference within a population can be calculated as the sum of the pairwise differences divided by the number of pairs. However, this estimator is biased when sample sizes are small or when they vary between populations. Therefore, more elaborate methods are used to compute F_{ST} in practice. Two of the most widely used procedures are the estimator of Weir & Cockerham (1984)^{[4]} and the analysis of molecular variance (AMOVA). A list of implementations is available at the end of this article.
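The simple estimator (\pi_\text{Between} - \pi_\text{Within}) / \pi_\text{Between} can be sketched in Python for two samples of aligned sequences. The function names and the toy sequences are hypothetical, and the sketch makes two assumptions that the text does not specify: \pi_\text{Within} is taken as the unweighted mean of the two within-population averages, and every sample contains at least two sequences of equal length. It is not a substitute for the bias-corrected estimators mentioned above.

```python
from itertools import combinations, product


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Sum of pairwise differences within one sample divided by the number of pairs."""
    pairs = list(combinations(pop, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def fst_pairwise(pop1, pop2):
    """(pi_between - pi_within) / pi_between for two samples."""
    # Assumption: both populations contribute equally to pi_within.
    pi_within = (mean_within(pop1) + mean_within(pop2)) / 2
    between = [pairwise_differences(a, b) for a, b in product(pop1, pop2)]
    pi_between = sum(between) / len(between)
    if pi_between == 0:
        raise ValueError("no differences between populations; F_ST is undefined")
    return (pi_between - pi_within) / pi_between


# Toy data: two sequences per population, four sites each.
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTTT", "TTTA"]
print(round(fst_pairwise(pop_a, pop_b), 3))  # 0.714
```

In this toy example \pi_\text{Within} is 1 and \pi_\text{Between} is 3.5, which gives 2.5 / 3.5.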
Interpretation
This comparison of genetic variability within and between populations is frequently used in applied population genetics. The values range from 0 to 1. A zero value implies complete panmixis; that is, that the two populations are interbreeding freely. A value of one implies that all genetic variation is explained by the population structure, and that the two populations do not share any genetic diversity.
For idealized models such as Wright's finite island model, F_{ST} can be used to estimate migration rates. Under that model, the migration rate is

\hat{M} \approx \frac{1}{2}\left(\frac{1}{F_{ST}} - 1\right).
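The island-model relation \hat{M} \approx \frac{1}{2}(1/F_{ST} - 1) is straightforward to evaluate. The sketch below uses a hypothetical function name and an illustrative input value; the result is only meaningful to the extent that the idealized model fits the populations studied.

```python
def migration_estimate(fst):
    """Migration estimate M ~ (1/F_ST - 1) / 2 under the finite island model."""
    if not 0 < fst <= 1:
        raise ValueError("F_ST must lie in (0, 1] for this approximation")
    return 0.5 * (1 / fst - 1)


# Illustrative value only: an F_ST of 0.2 corresponds to M of about 2.
print(migration_estimate(0.2))
```

The estimate grows without bound as F_{ST} approaches zero, so small errors in a low F_{ST} value translate into large errors in \hat{M}.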
The interpretation of F_{ST} can be difficult when the data analyzed are highly polymorphic. In this case, the probability of identity by descent is very low and F_{ST} can have an arbitrarily low upper bound, which might lead to misinterpretation of the data. Also, strictly speaking, F_{ST} is not a genetic distance, as it does not satisfy the triangle inequality. As a consequence, new tools for measuring genetic differentiation continue to be developed.
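The low upper bound for highly polymorphic markers can be made concrete with a small numerical sketch. It relies on an assumption not stated in this article: F_{ST} is computed in its heterozygosity form, (H_T - H_S) / H_T, where H_S and H_T are the expected heterozygosities within subpopulations and in the pooled population. The scenario (two equally sized subpopulations, each carrying k equally frequent alleles that occur nowhere else) and the function names are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def fst_private_alleles(k):
    """Heterozygosity-based F_ST when two subpopulations share no alleles.

    Each subpopulation has k private alleles at frequency 1/k, so the pooled
    population has 2k alleles at frequency 1/(2k).
    """
    h_s = heterozygosity([1 / k] * k)
    h_t = heterozygosity([1 / (2 * k)] * (2 * k))
    return (h_t - h_s) / h_t


for k in (1, 2, 10):
    print(k, round(fst_private_alleles(k), 4))
```

Under these assumptions the value equals 1 / (2k - 1): it is 1 when each subpopulation is fixed for its own allele, but falls to about 0.05 for k = 10 even though the two subpopulations share no alleles at all.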
F_{ST} in humans
The International HapMap Project estimated F_{ST} for three human populations using SNP data. Across the autosomes, F_{ST} was estimated to be 0.12. The significance of this F_{ST} value in humans is contentious. As an F_{ST} of zero indicates no divergence between populations, whereas an F_{ST} of one indicates complete isolation of populations, anthropologists often cite Lewontin's 1972 work, which arrived at a similar value and interpreted it as meaning that there is little biological difference between human races.^{[5]} On the other hand, while an F_{ST} value of 0.12 is lower than that found between populations of many other species, Henry Harpending argued that this value implies that, on a world scale, "kinship between two individuals of the same human population is equivalent to kinship between grandparent and grandchild or between half siblings".^{[6]}
Autosomal genetic distances based on SNPs
Intercontinental autosomal genetic distances based on SNPs^{[7]}

| | Europe (CEU) | Sub-Saharan Africa (Yoruba) | East Asia (Japanese) |
|---|---|---|---|
| Sub-Saharan Africa (Yoruba) | 0.153 | | |
| East Asia (Japanese) | 0.111 | 0.190 | |
| East Asia (Chinese) | 0.110 | 0.192 | 0.007 |

Intra-European/Mediterranean autosomal genetic distances based on SNPs^{[7]}^{[8]}

| | Italians | Palestinians | Swedish | Finns | Spanish | Germans | Russians | French | Greeks |
|---|---|---|---|---|---|---|---|---|---|
| Palestinians | 0.0064 | | | | | | | | |
| Swedish | 0.0064–0.0090 | 0.0191 | | | | | | | |
| Finns | 0.0130–0.0230 | | 0.0050–0.0110 | | | | | | |
| Spanish | 0.0010–0.0050 | 0.0101 | 0.0040–0.0055 | 0.0110–0.0170 | | | | | |
| Germans | 0.0029–0.0080 | 0.0136 | 0.0007–0.0010 | 0.0060–0.0130 | 0.0015–0.0030 | | | | |
| Russians | 0.0088–0.0120 | 0.0202 | 0.0030–0.0036 | 0.0060–0.0120 | 0.0070–0.0079 | 0.0030–0.0037 | | | |
| French | 0.0030–0.0050 | | 0.0020 | 0.0080–0.0150 | 0.0010 | 0.0010 | 0.0050 | | |
| Greeks | 0.0000 | 0.0057 | 0.0084 | | 0.0035 | 0.0039 | 0.0108 | | |

Programs for calculating F_{ST}

Arlequin

Fstat

SMOGD^{[9]}

diveRsity

Microsatellite Analyzer (MSA)

VCFtools
Modules for calculating F_{ST}
References

^ Holsinger, Kent E.; Bruce S. Weir (2009). "Genetics in geographically structured populations: defining, estimating and interpreting FST". Nat Rev Genet 10 (9): 639–650.

^ Richard Durrett (12 August 2008). Probability Models for DNA Sequence Evolution. Springer.

^ Hudson, RR.; Slatkin, M.; Maddison, WP. (Oct 1992). "Estimation of Levels of Gene Flow from DNA Sequence Data". Genetics 132 (2): 583–9.

^ Weir, B. S.; Cockerham, C. Clark (1984). "Estimating F-Statistics for the Analysis of Population Structure". Evolution 38 (6): 1358.

^

^ Harpending, Henry (2002-11-01). "Kinship and Population Subdivision".

^ Nelis, Mari; et al. (2009-05-08). Fleischer, Robert C., ed. "Genetic Structure of Europeans: A View from the North–East". PLoS ONE 4 (5): e5472. See table.

^ Tian, Chao; et al. (November 2009). "European Population Genetic Substructure: Further Definition of Ancestry Informative Markers for Distinguishing among Diverse European Ethnic Groups". Molecular Medicine 15 (11–12): 371–383. See table.

^ Crawford, Nicholas G. (2010). "smogd: software for the measurement of genetic diversity".
Further reading

Evolution and the Genetics of Populations Volume 2: the Theory of Gene Frequencies, pg 294–295, S. Wright, Univ. of Chicago Press, Chicago, 1969

A haplotype map of the human genome, The International HapMap Consortium, Nature 2005
External links

BioPerl: Bio::PopGen::PopStats