See also Pearson Height Dataset and Anthropometric Dataset
Francis Galton, a cousin of Charles Darwin, studied the relationship between parent heights and the heights of their offspring. From his original article on regression, cited below: “My data consisted of the heights of 930 [sic] adult children and of their respective parentages, 205 in number. In every case I transmuted the female statures to their corresponding male equivalents and used them in their transmuted form… The factor I used was 1.08, which is equivalent to adding a little less than one-twelfth to each female height. It differs a very little from the factors employed by other anthropologists…”
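The transmutation Galton describes is a single multiplication, sketched below in base R. The helper name transmute_height is hypothetical and not part of any package; the four heights are the first four children of family 1 as shown in the data previews on this page.

```r
# Galton's stated factor for converting female statures to "male equivalents".
galton_factor <- 1.08

# Hypothetical helper (an illustration, not a package function):
# scale female heights by the factor and leave male heights unchanged.
transmute_height <- function(height, is_female, factor = galton_factor) {
  ifelse(is_female, height * factor, height)
}

# First four children of family 1 (heights in inches).
height <- c(73.2, 69.2, 69.0, 69.0)
child  <- c("Son", "Daughter", "Daughter", "Daughter")

transmuted <- transmute_height(height, child == "Daughter")
print(round(transmuted, 2))

# "A little less than one-twelfth": the added fraction 0.08 is below 1/12.
print(galton_factor - 1 < 1 / 12)
```

Because the datasets store the female statures in raw form, a step like this would be needed before comparing daughters' heights with sons' heights on Galton's scale.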
The galtonfamiliesmain dataset was created under the direction of Dr. James A. Hanley from Galton’s original paper notebooks. Eight families were left out for illustrative purposes. The “female statures” are in their raw (untransmuted) form. Information about the eight families is found in the galtonfamiliessub dataset. The galtonfamiliesall dataset has all of the families together. The galtonparentheights dataset contains just the heights of the parents.
FamilyID: family identifier, labeled 1 to 204 and 136A, excluding the eight families held out in galtonfamiliessub
Children: number of children in the family
Father: father’s measured height in inches
Mother: mother’s measured height in inches
Child: whether the child was a son or a daughter
Height: child’s measured height in inches

galtonfamiliesall:
Rows: 934
Columns: 6
$ FamilyID <chr> "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "4", "4", "…
$ Children <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6…
$ Father <dbl> 78.5, 78.5, 78.5, 78.5, 75.5, 75.5, 75.5, 75.5, 75.0, 75.0, 7…
$ Mother <dbl> 67.0, 67.0, 67.0, 67.0, 66.5, 66.5, 66.5, 66.5, 64.0, 64.0, 6…
$ Child <chr> "Son", "Daughter", "Daughter", "Daughter", "Son", "Son", "Dau…
$ Height <dbl> 73.2, 69.2, 69.0, 69.0, 73.5, 72.5, 65.5, 65.5, 71.0, 68.0, 7…
# A tibble: 6 × 6
  FamilyID Children Father Mother Child    Height
  <chr>       <dbl>  <dbl>  <dbl> <chr>     <dbl>
1 1               4   78.5   67   Son        73.2
2 1               4   78.5   67   Daughter   69.2
3 1               4   78.5   67   Daughter   69
4 1               4   78.5   67   Daughter   69
5 2               4   75.5   66.5 Son        73.5
6 2               4   75.5   66.5 Son        72.5

galtonfamiliesmain:
Rows: 898
Columns: 6
$ FamilyID <chr> "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "4", "4", "…
$ Children <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6…
$ Father <dbl> 78.5, 78.5, 78.5, 78.5, 75.5, 75.5, 75.5, 75.5, 75.0, 75.0, 7…
$ Mother <dbl> 67.0, 67.0, 67.0, 67.0, 66.5, 66.5, 66.5, 66.5, 64.0, 64.0, 6…
$ Child <chr> "Son", "Daughter", "Daughter", "Daughter", "Son", "Son", "Dau…
$ Height <dbl> 73.2, 69.2, 69.0, 69.0, 73.5, 72.5, 65.5, 65.5, 71.0, 68.0, 7…
# A tibble: 6 × 6
  FamilyID Children Father Mother Child    Height
  <chr>       <dbl>  <dbl>  <dbl> <chr>     <dbl>
1 1               4   78.5   67   Son        73.2
2 1               4   78.5   67   Daughter   69.2
3 1               4   78.5   67   Daughter   69
4 1               4   78.5   67   Daughter   69
5 2               4   75.5   66.5 Son        73.5
6 2               4   75.5   66.5 Son        72.5

FamilyID: family identifier; families 13, 50, 84, 111, 120, 161, 189, and 202
Children: number of children in the family
FatherR: value + 60 = father’s height in inches
MotherR: value + 60 = mother’s height in inches
Child: whether the child was a son or a daughter
HeightR: value + 60 = child’s height in inches

galtonfamiliessub:
Rows: 36
Columns: 6
$ FamilyID <dbl> 13, 13, 50, 50, 84, 84, 84, 84, 84, 111, 120, 120, 120, 120, …
$ Children <dbl> 2, 2, 2, 2, 4, 4, 4, 4, 4, 1, 11, 11, 11, 11, 11, 11, 11, 11,…
$ FatherR <dbl> 13.0, 13.0, 11.0, 11.0, 10.5, 10.5, 10.5, 10.5, 10.5, 9.0, 9.…
$ MotherR <dbl> 7.0, 7.0, 5.4, 5.4, 3.0, 3.0, 3.0, 3.0, 3.0, 3.5, 2.0, 2.0, 2…
$ Child <chr> "Son", "Daughter", "Son", "Daughter", "Son", "Son", "Son", "D…
$ HeightR <dbl> 11.0, 2.0, 13.0, 2.0, 10.0, 8.5, 5.5, 5.5, 3.5, 5.5, 12.0, 10…
# A tibble: 6 × 6
  FamilyID Children FatherR MotherR Child    HeightR
     <dbl>    <dbl>   <dbl>   <dbl> <chr>      <dbl>
1       13        2    13       7   Son         11
2       13        2    13       7   Daughter     2
3       50        2    11       5.4 Son         13
4       50        2    11       5.4 Daughter     2
5       84        4    10.5     3   Son         10
6       84        4    10.5     3   Son          8.5
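
The value + 60 coding described above can be undone with a single addition per column. The base R sketch below rebuilds the six previewed rows by hand and decodes them to inches. The output column names Father, Mother, and Height, and the conversion of FamilyID to character, are assumptions chosen to mirror the main dataset, not a documented step.

```r
# First six rows of galtonfamiliessub, copied from the preview above.
sub_head <- data.frame(
  FamilyID = c(13, 13, 50, 50, 84, 84),
  Children = c(2, 2, 2, 2, 4, 4),
  FatherR  = c(13, 13, 11, 11, 10.5, 10.5),
  MotherR  = c(7, 7, 5.4, 5.4, 3, 3),
  Child    = c("Son", "Daughter", "Son", "Daughter", "Son", "Son"),
  HeightR  = c(11, 2, 13, 2, 10, 8.5)
)

offset <- 60  # inches to add back to each recorded value

# Decode the three height columns; the other columns carry over unchanged.
decoded <- data.frame(
  FamilyID = as.character(sub_head$FamilyID),  # assumption: match the <chr> type of the main dataset
  Children = sub_head$Children,
  Father   = sub_head$FatherR + offset,
  Mother   = sub_head$MotherR + offset,
  Child    = sub_head$Child,
  Height   = sub_head$HeightR + offset
)

print(decoded)
```

Under these assumptions, the father in family 13 decodes to 73 inches and his son to 71 inches.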

FamilyID: family identifier, labeled 1 to 205 (205 = family 136A)
Father: father’s height in inches
Mother: mother’s height in inches

galtonparentheights:
Rows: 205
Columns: 3
$ FamilyID <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ Father <dbl> 78.5, 75.5, 75.0, 75.0, 75.0, 74.0, 74.0, 74.0, 74.5, 74.0, 7…
$ Mother <dbl> 67.0, 66.5, 64.0, 64.0, 58.5, 68.0, 68.0, 66.5, 66.0, 65.5, 6…
# A tibble: 6 × 3
  FamilyID Father Mother
     <dbl>  <dbl>  <dbl>
1        1   78.5   67
2        2   75.5   66.5
3        3   75     64
4        4   75     64
5        5   75     58.5
6        6   74     68
Galton’s family data on human stature
Galton, Francis. (1886). Regression towards mediocrity in hereditary stature. The Journal of the Anthropological Institute of Great Britain and Ireland, 15, pp. 246-263.