
Genetic Variation and Population Structure in Native Americans

Figure 7

Unsupervised Analysis of Native American Population Structure

The colored plots at the left show the estimated population structure of Native Americans, obtained using STRUCTURE. The number of clusters in a given plot is indicated by the value of K on the right side of the figure. Next to the K = 7 plot, the population names and the major language stocks of the populations are also displayed. The left-to-right order of the individuals is the same in all plots.
The diagram on the right summarizes the outcomes of 100 replicate STRUCTURE runs for each of several values of K. Each row represents a value of K, and within each row, each box represents a clustering solution that appeared at least 12 times in 100 replicates (see Methods). The number of appearances of a solution is listed above the box, and the boxes are arrayed from left to right in decreasing order of the frequencies of the solutions to which they correspond. The DISTRUCT plot shown on the left corresponds to the leftmost box on the right side of the figure. An approximate description of the clusters is located inside the box, with each row in the box representing a different cluster.
The numbers 1, 2, and 3 are used to refer to the green cluster in the K = 2 DISTRUCT plot, the blue cluster in the K = 2 DISTRUCT plot, and the yellow cluster in the K = 9 DISTRUCT plot, respectively. The following population abbreviations are also used: A, Ache; Arh, Arhuaco; Cab, Cabecar; Chip, Chipewyan; E, Embera; G, Guaymi; K, Karitiana; Kog, Kogi; P, Pima; S, Surui; T, Ticuna (both Ticuna groups combined); W, Waunana. Clusters are indicated using set notation; for example {A} represents a cluster containing Ache only, and 2\{A,S} represents a cluster that corresponds to cluster 2 (the blue cluster for K = 2), excluding Ache and Surui. An asterisk indicates approximately 50% membership of a population in a cluster.
A line is drawn from a box representing a solution with K clusters to a box representing a solution with K+1 clusters if the solution with K+1 clusters refines the solution with K clusters—that is, if all of the clusters in the solution with K+1 clusters subdivide the clusters in the solution with K clusters. In case of ties for the highest-frequency solution (K = 4 and K = 5), boxes are oriented in order to avoid the crossing of lines between them.
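
The refinement rule used to draw lines between boxes can be illustrated with a minimal sketch. It assumes each clustering solution is reduced to hard sets of population labels, which is a simplification, since STRUCTURE reports fractional memberships. The function name `refines` and the toy groupings are hypothetical and are not taken from the figure.

```python
from typing import FrozenSet, Set

Cluster = FrozenSet[str]


def refines(finer: Set[Cluster], coarser: Set[Cluster]) -> bool:
    """Return True if every cluster in `finer` is contained in some cluster
    of `coarser`, i.e. the finer solution only subdivides the coarser one."""
    return all(any(f <= c for c in coarser) for f in finer)


# Toy groupings built from the caption's abbreviations (illustrative only,
# not the clusters actually shown in the figure).
k2 = {frozenset({"A", "S", "K", "T"}), frozenset({"P", "Chip"})}
k3 = {frozenset({"A", "S"}), frozenset({"K", "T"}), frozenset({"P", "Chip"})}
k3_alt = {frozenset({"A", "P"}), frozenset({"S", "K", "T"}), frozenset({"Chip"})}

print(refines(k3, k2))      # a line would be drawn between these two boxes
print(refines(k3_alt, k2))  # {A, P} straddles two K = 2 clusters, so no line
```

Under this simplification, a line connects two boxes in adjacent rows exactly when `refines` returns True for the pair.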

doi: https://doi.org/10.1371/journal.pgen.0030185.g007