History of statistics
The history of statistics can be said to start around 1749, although the interpretation of the word statistics has changed over time. In early times, the meaning was restricted to information about states. This was later extended to include collections of information of all types, and later still to include the analysis and interpretation of such data. In modern terms, "statistics" means both sets of collected information, as in national accounts and temperature records, and analytical work which requires statistical inference.
A number of statistical concepts have had an important impact on a wide range of sciences. These include the design of experiments and approaches to statistical inference such as Bayesian inference, each of which can be considered to have its own sequence in the development of the ideas underlying modern statistics.
By the 18th century, the term "statistics" designated the systematic collection of demographic and economic data by states. In the early 19th century, the meaning of "statistics" broadened to include the discipline concerned with the collection, summary, and analysis of data. Today statistics is widely employed in government, business, and all the sciences. Electronic computers have expedited statistical computation, and have allowed statisticians to develop "computer-intensive" methods.
The term "mathematical statistics" designates the mathematical theories of probability and statistical inference, which are used in statistical practice. The relation between statistics and probability theory developed rather late, however. In the 19th century, statistics increasingly used probability theory, whose initial results were found in the 17th and 18th centuries, particularly in the analysis of games of chance (gambling). By 1800, astronomy used probability models and statistical theories, particularly the method of least squares, which was invented by Legendre and Gauss. Early probability theory and statistics were systematized and extended by Laplace; following Laplace, probability and statistics have been in continual development. In the 19th century, statistical reasoning and probability models were used by social scientists to advance the new sciences of experimental psychology and sociology, and by physical scientists in thermodynamics and statistical mechanics. The development of statistical reasoning was closely associated with the development of inductive logic and the scientific method.
Statistics can be regarded as not a field of mathematics but an autonomous mathematical science, like computer science and operations research. Unlike mathematics, statistics had its origins in public administration. It is used in demography and economics. With its emphasis on learning from data and making best predictions, statistics has a considerable overlap with decision science and microeconomics. With its concerns with data, statistics has overlap with information science and computer science.
The term statistics is ultimately derived from the New Latin statisticum collegium ("council of state") and the Italian word statista ("statesman" or "politician"). The German Statistik, first introduced by Gottfried Achenwall (1749), originally designated the analysis of data about the state, signifying the "science of state" (then called political arithmetic in English). It acquired the meaning of the collection and classification of data generally in the early 19th century. It was introduced into English in 1791 by Sir John Sinclair when he published the first of 21 volumes titled Statistical Account of Scotland.1
Thus, the original principal purpose of Statistik was data to be used by governmental and (often centralized) administrative bodies. The collection of data about states and localities continues, largely through national and international statistical services. In particular, censuses provide frequently updated information about the population.
The first book to have 'statistics' in its title was "Contributions to Vital Statistics" by Francis GP Neison, actuary to the Medical Invalid and General Life Office (1st ed., 1845; 2nd ed., 1846; 3rd ed., 1857).
The use of statistical methods dates back at least to the 5th century BCE. The historian Thucydides in his History of the Peloponnesian War 2 describes how the Athenians calculated the height of the wall of Plataea by counting the number of bricks in an unplastered section of the wall sufficiently near them to be able to count them. The count was repeated several times by a number of soldiers. The most frequent value (in modern terminology, the mode) so determined was taken to be the most likely value of the number of bricks. Multiplying this value by the height of the bricks used in the wall allowed the Athenians to determine the height of the ladders necessary to scale the walls.
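The procedure Thucydides describes amounts to taking the mode of repeated counts and scaling it by a known unit height. A minimal Python sketch follows; the individual counts and the brick height are invented purely for illustration and do not come from Thucydides.

```python
from collections import Counter


def most_frequent(counts):
    """Return the value that occurs most often (the mode) in a list of counts."""
    value, _frequency = Counter(counts).most_common(1)[0]
    return value


# Hypothetical tallies of brick courses reported by individual soldiers.
soldier_counts = [42, 43, 42, 41, 42, 44, 42, 43]
# Assumed height of one course of bricks in metres (illustrative only).
brick_height_m = 0.25

courses = most_frequent(soldier_counts)
ladder_height_m = courses * brick_height_m
print(f"modal count: {courses} courses, ladder height: {ladder_height_m} m")
```

Using the mode rather than the mean means that a few miscounts do not shift the result, provided most soldiers agree on the same number.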
In the Indian epic - the Mahabharata (Book 3: The Story of Nala) - King Rtuparna estimated the number of fruit and leaves (2095 fruit and 50,000,000 - five crores - leaves) on two great branches of a Vibhitaka tree by counting them on a single twig. This number was then multiplied by the number of twigs on the branches. This estimate was later checked and found to be very close to the actual number. With knowledge of this method Nala was subsequently able to regain his kingdom.
The earliest known writing on statistics is found in a 9th-century book entitled "Manuscript on Deciphering Cryptographic Messages", written by Al-Kindi (801–873 CE). In his book, Al-Kindi gave a detailed description of how to use statistics and frequency analysis to decipher encrypted messages; this has been described as the birth of both statistics and cryptanalysis.34
The Trial of the Pyx is a test of the purity of the coinage of the Royal Mint which has been held on a regular basis since the 12th century. The Trial itself is based on statistical sampling methods. After minting a series of coins (originally from ten pounds of silver), a single coin was placed in the Pyx, a box in Westminster Abbey. After a given period (now once a year), the coins are removed and weighed. A sample of coins removed from the box is then tested for purity.
The Nuova Cronica, a 14th-century history of Florence by the Florentine banker and official Giovanni Villani, includes much statistical information on population, ordinances, commerce and trade, education, and religious facilities and has been described as the first introduction of statistics as a positive element in history,5 though neither the term nor the concept of statistics as a specific field yet existed. But this was proven to be incorrect after the rediscovery of Al-Kindi's book on frequency analysis.34
The arithmetic mean, although a concept known to the Greeks, was not generalised to more than two values until the 16th century. The introduction of decimal fractions by Simon Stevin in 1585 seems likely to have facilitated these calculations. This method was first adopted in astronomy by Tycho Brahe, who was attempting to reduce the errors in his estimates of the locations of various celestial bodies.
The idea of the median originated in Edward Wright's book on navigation (Certaine Errors in Navigation) in 1599 in a section concerning the determination of location with a compass. Wright felt that this value was the most likely to be the correct value in a series of observations.
John Graunt in his book Natural and Political Observations Made upon the Bills of Mortality estimated the population of London in 1662 from parish records. He knew that there were around 13,000 funerals per year in London and that three people died per eleven families per year. He estimated from the parish records that the average family size was 8 and calculated that the population of London was about 384,000. Laplace in 1802 estimated the population of France with a similar method.
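Graunt's arithmetic can be retraced in a few lines. The sketch below uses only the figures quoted above; the step of rounding the family count to the nearest thousand is an assumption made here to show how the quoted total of about 384,000 can arise, since the unrounded product is closer to 381,000.

```python
# Figures quoted in the text for Graunt's estimate of London's population.
funerals_per_year = 13_000
deaths_per_group = 3       # deaths per year ...
families_per_group = 11    # ... among every eleven families
family_size = 8

# Number of families implied by the death rate per family.
families = funerals_per_year * families_per_group / deaths_per_group
population = families * family_size

# ASSUMPTION: round the family count to the nearest thousand before multiplying.
families_rounded = round(families, -3)
population_rounded = families_rounded * family_size

print(f"families: {families:,.0f}, population: {population:,.0f}")
print(f"with rounded families: {population_rounded:,.0f}")
```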
The mathematical methods of statistics emerged from probability theory, which can be dated to the correspondence between Pierre de Fermat and Blaise Pascal (1654). Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713) and Abraham de Moivre's The Doctrine of Chances (1718) treated the subject as a branch of mathematics. In his book Bernoulli introduced the idea of representing complete certainty as one and probability as a number between zero and one.
Galileo struggled with the problem of errors in observations and had vaguely formulated the principle that the most likely values of the unknowns would be those that made the errors in all the equations reasonably small. The formal study of theory of errors may be traced back to Roger Cotes' Opera Miscellanea (posthumous, 1722). Tobias Mayer, in his study of the libration of the moon (Kosmographische Nachrichten, Nuremberg, 1750), invented the first formal method for estimating the unknown quantities by generalizing the averaging of observations under identical circumstances to the averaging of groups of similar equations.
The first example of what later became known as the normal curve was studied by Abraham de Moivre, who plotted this curve on November 12, 1733.6 De Moivre was studying the number of heads that occurred when a 'fair' coin was tossed.
A memoir, An attempt to show the advantage arising by taking the mean of a number of observations in practical astronomy, prepared by Thomas Simpson in 1755 (printed 1756), first applied the theory to the discussion of errors of observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous errors are discussed and a probability curve is given. Simpson discussed several possible distributions of error. He first considered the uniform distribution and then the discrete symmetric triangular distribution followed by the continuous symmetric triangle distribution.
Ruđer Bošković in 1755, based on his work on the shape of the earth, proposed in his book De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani gradus a PP. Maire et Boscovich that the true value of a series of observations would be that which minimises the sum of absolute errors. In modern terminology this value is the median.
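The link between the sum of absolute errors and the median can be checked numerically. The following is a small sketch with made-up observations (including one gross outlier); it searches a grid of candidate values by brute force and is not a reconstruction of Bošković's own procedure.

```python
from statistics import mean, median


def sum_abs_errors(candidate, observations):
    """Sum of absolute deviations of the observations from a candidate value."""
    return sum(abs(x - candidate) for x in observations)


# Hypothetical measurements, one of them a gross outlier.
observations = [2.1, 2.4, 2.2, 9.0, 2.3]

# Brute-force search over a grid of candidate values from 0.00 to 10.00.
candidates = [i / 100 for i in range(1001)]
best = min(candidates, key=lambda c: sum_abs_errors(c, observations))

print(f"grid minimiser: {best}")
print(f"median: {median(observations)}, mean: {mean(observations)}")
```

On this data the grid minimiser coincides with the median, while the mean is pulled towards the outlier and gives a larger sum of absolute errors.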
with -1 ≤ x ≤ 1.
Pierre-Simon Laplace (1774) made the first attempt to deduce a rule for the combination of observations from the principles of the theory of probabilities. He represented the law of probability of errors by a curve and deduced a formula for the mean of three observations.
Laplace in 1774 noted that the frequency of an error could be expressed as an exponential function of its magnitude once its sign was disregarded.78 This distribution is now known as the Laplace distribution.
Lagrange proposed a parabolic distribution of errors in 1776:
$$f(x) = \frac{3}{4}\left(1 - x^2\right)$$
with -1 ≤ x ≤ 1.
Laplace in 1778 published his second law of errors wherein he noted that the frequency of an error was proportional to the exponential of the negative of the square of its magnitude. This was subsequently rediscovered by Gauss (possibly in 1795) and is now best known as the normal distribution, which is of central importance in statistics.9 This distribution was first referred to as the normal distribution by Peirce in 1873, who was studying measurement errors when an object was dropped onto a wooden base.10 He chose the term normal because of its frequent occurrence in naturally occurring variables.
Lagrange also suggested in 1781 two other distributions for errors: a cosine distribution
$$f(x) = \frac{\pi}{4}\cos\left(\frac{\pi x}{2}\right)$$
with -1 ≤ x ≤ 1, and a logarithmic distribution
$$f(x) = \frac{1}{2}\ln\frac{1}{|x|}$$
with -1 ≤ x ≤ 1, where |x| is the absolute value of x.
Laplace gave (1781) a formula for the law of facility of error (a term due to Joseph Louis Lagrange, 1774), but one which led to unmanageable equations. Daniel Bernoulli (1778) introduced the principle of the maximum product of the probabilities of a system of concurrent errors.
In 1786 William Playfair invented the line chart and bar chart for economic data which he published in his Commercial and Political Atlas. This was followed in 1795 by his invention of the pie chart and circle chart which he used to display the evolution of England's imports and exports. These latter charts came to general attention when he published examples in his Statistical Breviary in 1801.
In 1802 Laplace estimated the population of France to be 28,328,612.11 He calculated this figure using the number of births in the previous year and census data for three communities. The census data of these communities showed that they had 2,037,615 persons and that the number of births was 71,866. Assuming that these samples were representative of France, Laplace produced his estimate for the entire population.
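The calculation is an early ratio estimate: the number of inhabitants per birth observed in the sample is multiplied by the number of births in the whole country. The sketch below uses the sample figures quoted above. The national number of births is not given in the text, so `national_births` is a round placeholder chosen for illustration, and the result is not expected to reproduce the quoted total exactly.

```python
# Sample figures quoted in the text.
sample_population = 2_037_615
sample_births = 71_866

# Estimated number of inhabitants per annual birth in the sample.
persons_per_birth = sample_population / sample_births

# ASSUMPTION: a round illustrative value; the text does not state
# the national number of births used in the original calculation.
national_births = 1_000_000

estimated_population = round(persons_per_birth * national_births)
print(f"persons per birth: {persons_per_birth:.3f}")
print(f"estimated population: {estimated_population:,}")
```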
The method of least squares, which was used to minimize errors in data measurement, was published independently by Adrien-Marie Legendre (1805), Robert Adrain (1808), and Carl Friedrich Gauss (1809). Gauss had used the method in his famous 1801 prediction of the location of the dwarf planet Ceres. The observations that Gauss based his calculations on were made by the Italian monk Piazzi. Further proofs were given by Laplace (1810, 1812), Gauss (1823), Ivory (1825, 1826), Hagen (1837), Bessel (1838), Donkin (1844, 1856), Herschel (1850), Crofton (1870), and Thiele (1880, 1889).
The term probable error (der wahrscheinliche Fehler), the median deviation from the mean, was introduced in 1815 by the German astronomer Friedrich Wilhelm Bessel.
Antoine Augustin Cournot in 1843 was the first to use the term median (valeur médiane) for the value that divides a probability distribution into two equal halves.
Other contributors to the theory of errors were Ellis (1844), De Morgan (1864), Glaisher (1872), and Giovanni Schiaparelli (1875). Peters's (1856) formula for the "probable error" of a single observation was widely used and inspired early robust statistics (resistant to outliers: see Peirce's criterion).
In the 19th century authors on statistical theory included Laplace, S. Lacroix (1816), Littrow (1833), Dedekind (1860), Helmert (1872), Laurent (1873), Liagre, Didion, De Morgan, Boole, Edgeworth,12 and K. Pearson.13
Gustav Theodor Fechner used the median (Centralwerth) in sociological and psychological phenomena.14 It had earlier been used only in astronomy and related fields. Francis Galton used the English term median for the first time in 1881 having earlier used the terms middle-most value in 1869 and the medium in 1880.15
Adolphe Quetelet (1796–1874), another important founder of statistics, introduced the notion of the "average man" (l'homme moyen) as a means of understanding complex social phenomena such as crime rates, marriage rates, and suicide rates.16
The first tests of the normal distribution were invented by the German statistician Wilhelm Lexis in the 1870s. The only data sets available to him that he was able to show were normally distributed were birth rates.
Francis Galton in 1907 submitted a paper to Nature on the usefulness of the median.18 He examined the accuracy of 787 guesses of the weight of an ox at a country fair. The actual weight was 1198 pounds; the median guess was 1207. The guesses were markedly non-normally distributed.
The Norwegian Anders Nicolai Kiær introduced the concept of stratified sampling in 1895.19 Arthur Lyon Bowley introduced random sampling in 1906.20 Jerzy Neyman in 1934 showed that stratified random sampling was in general a better method of estimation than purposive (quota) sampling.21
The 5% level of significance appears to have been introduced by R.A. Fisher in 1925.22 Fisher stated that deviations exceeding twice the standard deviation are regarded as significant. Before this, deviations exceeding three times the probable error were considered significant. For a symmetrical distribution the probable error is half the interquartile range. The upper quartile of a standard normal distribution is approximately 0.6745, so its probable error is approximately 2/3 of a standard deviation, and three probable errors are roughly two standard deviations. It appears that Fisher's 5% criterion was rooted in previous practice.
In 1929 Wilson and Hilferty re-examined Peirce's data from 1873 and discovered that they were not actually normally distributed.23
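The relation between the two criteria can be checked with the standard normal distribution. This is a short sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the calculation assumes normally distributed deviations.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probable error: the upper quartile of the standard normal distribution.
probable_error = std_normal.inv_cdf(0.75)
three_probable_errors = 3 * probable_error


def two_sided_tail(z):
    """Probability that a standard normal deviate exceeds z in absolute value."""
    return 2 * (1 - std_normal.cdf(z))


p_two_sd = two_sided_tail(2.0)
p_three_pe = two_sided_tail(three_probable_errors)

print(f"probable error: {probable_error:.4f} standard deviations")
print(f"three probable errors: {three_probable_errors:.3f} standard deviations")
print(f"two-sided tail beyond 2 SD: {p_two_sd:.4f}")
print(f"two-sided tail beyond 3 PE: {p_three_pe:.4f}")
```

Both tail probabilities come out a little under 5%, which is consistent with the older three-probable-error rule and the two-standard-deviation rule marking nearly the same threshold.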
See Ian Hacking's The Emergence of Probability24 and James Franklin's The Science of Conjecture: Evidence and Probability Before Pascal25 for histories of the early development of the concept of mathematical probability. In the modern era, the work of Kolmogorov has been instrumental in formulating the fundamental model of Probability Theory, which is used throughout much of statistics.26
In 1747, while serving as surgeon on HMS Salisbury, James Lind carried out a controlled experiment to develop a cure for scurvy.27 In this study his subjects' cases "were as similar as I could have them"; that is, he provided strict entry requirements to reduce extraneous variation. The men were paired, which provided blocking. From a modern perspective, the main thing that is missing is randomized allocation of subjects to treatments.
James Lind is today often described as a one-factor-at-a-time experimenter. One-factor-at-a-time (OFAT) experimentation reached its zenith with Thomas Edison's "trial and error" methods.28
A theory of statistical inference was developed by Charles S. Peirce in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883), two publications that emphasized the importance of randomization-based inference in statistics. In another study, Peirce randomly assigned volunteers to a blinded, repeated-measures design to evaluate their ability to discriminate weights.29303132 Peirce's experiment inspired other researchers in psychology and education, who developed a research tradition of randomized experiments in laboratories and specialized textbooks in the 1800s.29303132 Peirce also contributed the first English-language publication on an optimal design for regression models in 1876.33 A pioneering optimal design for polynomial regression was suggested by Gergonne in 1815. In 1918 Kirstine Smith published optimal designs for polynomials of degree six (and less).34
The use of a sequence of experiments, where the design of each may depend on the results of previous experiments, including the possible decision to stop experimenting, was pioneered35 by Abraham Wald in the context of sequential tests of statistical hypotheses.36 Surveys are available of optimal sequential designs,37 and of adaptive designs.38 One specific type of sequential design is the "two-armed bandit", generalized to the multi-armed bandit, on which early work was done by Herbert Robbins in 1952.39
The term "design of experiments" (DOE) derives from early statistical work performed by Sir Ronald Fisher. He was described by Anders Hald as "a genius who almost single-handedly created the foundations for modern statistical science."40 Fisher initiated the principles of design of experiments and elaborated on his studies of "analysis of variance". Perhaps even more important, Fisher began his systematic approach to the analysis of real data as the springboard for the development of new statistical methods. He began to pay particular attention to the labour involved in the necessary computations performed by hand, and developed methods that were as practical as they were founded in rigour. In 1925, this work culminated in the publication of his first book, Statistical Methods for Research Workers.41 This went into many editions and translations in later years, and became a standard reference work for scientists in many disciplines.42
A methodology for designing experiments was proposed by Ronald A. Fisher, in his innovative book The Design of Experiments (1935) which also became a standard.43444546 As an example, he described how to test the hypothesis that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed in the cup. While this sounds like a frivolous application, it allowed him to illustrate the most important ideas of experimental design: see Lady tasting tea.
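The logic of the tea-tasting test can be sketched numerically. The code below assumes the design usually attributed to Fisher's book (eight cups, four prepared each way, with the taster told to pick the four milk-first cups); those numbers are not stated in the text above and are supplied here as an assumption. Under pure guessing, the number of correct picks follows a hypergeometric distribution.

```python
from math import comb


def p_at_least(correct, cups=8, milk_first=4):
    """Chance of naming at least `correct` milk-first cups by pure guessing.

    The taster picks `milk_first` cups out of `cups`; the number of correct
    picks then follows a hypergeometric distribution.
    """
    other = cups - milk_first
    total = comb(cups, milk_first)
    favourable = sum(
        comb(milk_first, j) * comb(other, milk_first - j)
        for j in range(correct, milk_first + 1)
    )
    return favourable / total


p_all_correct = p_at_least(4)      # 1 / 70
p_three_or_more = p_at_least(3)    # 17 / 70
print(f"all four correct: {p_all_correct:.4f}")
print(f"three or more correct: {p_three_or_more:.4f}")
```

Only a perfect score is rare enough under guessing to fall below the 5% level; three or more correct would happen by chance about a quarter of the time.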
Agricultural science advances served to meet the combination of larger city populations and fewer farms. But for crop scientists to take due account of widely differing geographical growing climates and needs, it was important to differentiate local growing conditions. To extrapolate experiments on local crops to a national scale, they had to extend crop sample testing economically to overall populations. As statistical methods advanced (primarily the efficacy of designed experiments instead of one-factor-at-a-time experimentation), representative factorial design of experiments began to enable the meaningful extension, by inference, of experimental sampling results to the population as a whole. But it was hard to decide how representative the chosen crop sample was. Factorial design methodology showed how to estimate and correct for any random variation within the sample and also in the data collection procedures.
Charles S. Peirce (1839–1914) formulated frequentist theories of estimation and hypothesis-testing in 1877–1878 and 1883, in which he introduced "confidence". Peirce also introduced blinded, controlled randomized experiments with a repeated measures design.47 Peirce invented an optimal design for experiments on gravity.
The term Bayesian refers to Thomas Bayes (1702–1761), who proved a special case of what is now called Bayes' theorem. However, it was Pierre-Simon Laplace (1749–1827) who introduced a general version of the theorem and applied it to celestial mechanics, medical statistics, reliability, and jurisprudence.48 When insufficient knowledge was available to specify an informed prior, Laplace used uniform priors, according to his "principle of insufficient reason".4849 Laplace assumed uniform priors for mathematical simplicity rather than for philosophical reasons.48 Laplace also introduced primitive versions of conjugate priors and the theorem of von Mises and Bernstein, according to which the posteriors corresponding to initially differing priors ultimately agree, as the number of observations increases.50 This early Bayesian inference, which used uniform priors following Laplace's principle of insufficient reason, was called "inverse probability" (because it infers backwards from observations to parameters, or from effects to causes51).
After the 1920s, inverse probability was largely supplanted by a collection of methods that were developed by Ronald A. Fisher, Jerzy Neyman and Egon Pearson. Their methods came to be called frequentist statistics.51 Fisher rejected the Bayesian view, writing that "the theory of inverse probability is founded upon an error, and must be wholly rejected".52 At the end of his life, however, Fisher expressed greater respect for the essay of Bayes, which Fisher believed to have anticipated his own, fiducial approach to probability; Fisher still maintained that Laplace's views on probability were "fallacious rubbish".52 Neyman started out as a "quasi-Bayesian", but subsequently developed confidence intervals (a key method in frequentist statistics) because "the whole theory would look nicer if it were built from the start without reference to Bayesianism and priors".53 The word Bayesian appeared in the 1930s, and by the 1960s it became the term preferred by those dissatisfied with the limitations of frequentist statistics.5154
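The claim that posteriors from different priors come to agree can be illustrated with a simple success/failure model. The sketch below uses the modern Beta-binomial formulation rather than Laplace's own notation; the data (70% successes) and the second prior, Beta(2, 8), are hypothetical choices made for illustration. With the uniform prior Beta(1, 1), the posterior mean reduces to (successes + 1) / (n + 2).

```python
def posterior_mean(successes, failures, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a success probability under a Beta(prior_a, prior_b) prior.

    Beta(1, 1) is the uniform prior; the posterior is
    Beta(prior_a + successes, prior_b + failures).
    """
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)


gaps = []
for n in (10, 100, 10_000):
    successes = 7 * n // 10          # hypothetical data: 70% successes
    failures = n - successes
    uniform = posterior_mean(successes, failures)              # Beta(1, 1) prior
    sceptical = posterior_mean(successes, failures, 2.0, 8.0)  # Beta(2, 8) prior
    gaps.append(abs(uniform - sceptical))
    print(f"n={n}: uniform prior {uniform:.4f}, Beta(2, 8) prior {sceptical:.4f}")
```

The gap between the two posterior means shrinks as the number of observations grows, which is the behaviour the paragraph above describes.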
In the 20th century, the ideas of Laplace were further developed in two different directions, giving rise to objective and subjective currents in Bayesian practice. In the objectivist stream, the statistical analysis depends on only the model assumed and the data analysed.55 No subjective decisions need to be involved. In contrast, "subjectivist" statisticians deny the possibility of fully objective analysis for the general case.
In the further development of Laplace's ideas, subjective ideas predate objectivist positions. The idea that 'probability' should be interpreted as 'subjective degree of belief in a proposition' was proposed, for example, by John Maynard Keynes in the early 1920s. This idea was taken further by Bruno de Finetti in Italy (Fondamenti Logici del Ragionamento Probabilistico, 1930) and Frank Ramsey in Cambridge (The Foundations of Mathematics, 1931).56 The approach was devised to solve problems with the frequentist definition of probability but also with the earlier, objectivist approach of Laplace.55 The subjective Bayesian methods were further developed and popularized in the 1950s by L.J. Savage.
Objective Bayesian inference was further developed by Harold Jeffreys, whose seminal book "Theory of probability" first appeared in 1939. In 1957, Edwin Jaynes promoted the concept of maximum entropy for constructing priors, which is an important principle in the formulation of objective methods, mainly for discrete problems. In 1965, Dennis Lindley's 2-volume work "Introduction to Probability and Statistics from a Bayesian Viewpoint" brought Bayesian methods to a wide audience. In 1979, José-Miguel Bernardo introduced reference analysis,55 which offers a generally applicable framework for objective analysis. Other well-known proponents of Bayesian probability theory include I.J. Good, B.O. Koopman, Howard Raiffa, Robert Schlaifer and Alan Turing.
In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of Markov chain Monte Carlo methods, which removed many of the computational problems, and an increasing interest in nonstandard, complex applications.57 Despite the growth of Bayesian research, most undergraduate teaching is still based on frequentist statistics.58 Nonetheless, Bayesian methods are widely accepted and used, for example in the field of machine learning.59
- Ball, Philip (2004). Critical Mass. Farrar, Straus and Giroux. p. 53. ISBN 0-374-53041-6.
- Thucydides (1985). History of the Peloponnesian War. New York: Penguin Books, Ltd. p. 204.
- Singh, Simon (2000). The code book : the science of secrecy from ancient Egypt to quantum cryptography (1st Anchor Books ed.). New York: Anchor Books. ISBN 0-385-49532-3.
- Ibrahim A. Al-Kadi "The origins of cryptology: The Arab contributions", Cryptologia, 16(2) (April 1992) pp. 97–126.
- Villani, Giovanni. Encyclopædia Britannica. Encyclopædia Britannica 2006 Ultimate Reference Suite DVD. Retrieved on 2008-03-04.
- de Moivre, A. (1738) The doctrine of chances. Woodfall
- Laplace, P-S. (1774). "Mémoire sur la probabilité des causes par les évènements". Mémoires de l'Académie Royale des Sciences Présentés par Divers Savants, 6, 621–656
- Wilson, Edwin Bidwell (1923) "First and second laws of error", Journal of the American Statistical Association, 18 (143), 841-851 JSTOR 2965467
- Havil J (2003) Gamma: Exploring Euler's Constant. Princeton, NJ: Princeton University Press, p. 157
- Peirce CS (1873) Theory of errors of observations. Report of the Superintendent US Coast Survey, Washington, Government Printing Office. Appendix no. 21: 200-224
- Cochran W.G. (1978) "Laplace’s ratio estimators". pp 3-10. In David H.A., (ed). Contributions to Survey Sampling and Applied Statistics: papers in honor of H. O. Hartley. Academic Press, New York ISBN 122047508,
- (Stigler 1986, Chapter 9: The Next Generation: Edgeworth)
- Stigler (1986, Chapter 10: Pearson and Yule)
- Keynes, JM (1921) A treatise on probability. Pt II Ch XVII §5 (p 201)
- Galton F (1881) Report of the Anthropometric Committee pp 245-260. Report of the 51st Meeting of the British Association for the Advancement of Science
- Stigler (1986, Chapter 5: Quetelet's Two Attempts)
- Galton F (1877) "Typical laws of heredity". Nature 15: 492-553
- Galton F (1907) "One Vote, One Value". Nature 75: 414
- Bellhouse DR (1988) A brief history of random sampling methods. Handbook of statistics. Vol 6 pp 1-14 Elsevier
- Bowley AL (1906) Address to the Economic Science and Statistics Section of the British Association for the Advancement of Science. J Roy Stat Soc 69: 548-557
- Neyman, J (1934) On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97 (4) 557-625 JSTOR 2342192
- Fisher RA (1925) Statistical methods for research workers, Edinburgh: Oliver & Boyd
- Wilson EB, Hilferty MM (1929) Note on C.S. Peirce’s experimental discussion of the law of Errors. Proc Nat Acad Sci USA, 15(2) 120-125
- Hacking, Ian (2006). The emergence of probability : a philosophical study of early ideas about probability, induction and statistical inference. Cambridge New York: Cambridge University Press. ISBN 9780521685573.
- Franklin, James (2001). The science of conjecture : evidence and probability before Pascal. Baltimore: Johns Hopkins University Press. ISBN 9780801871092.
- (Salsburg 2001, Chapter 14: The Mozart of Mathematics, pp 137-150)
- Dunn, Peter (January 1997). "James Lind (1716-94) of Edinburgh and the treatment of scurvy". Archive of Disease in Childhood Foetal Neonatal (United Kingdom: British Medical Journal Publishing Group) 76 (1): 64–65. doi:10.1136/fn.76.1.F64. PMC 1720613. PMID 9059193. Retrieved 2009-01-17.
- Douglas C. Montgomery, Scott M. Kowalski (2009). Design and Analysis of Experiments (7th Edition), John Wiley & Sons. ISBN 978-0-470-12866-4 (pp: 7,9-11)
- Charles Sanders Peirce and Joseph Jastrow (1885). "On Small Differences in Sensation". Memoirs of the National Academy of Sciences 3: 73–83.
- Hacking, Ian (September 1988). "Telepathy: Origins of Randomization in Experimental Design". Isis 79 (A Special Issue on Artifact and Experiment): 427–451. doi:10.1086/354775. JSTOR 234674. MR 1013489.
- Stephen M. Stigler (November 1992). "A Historical View of Statistical Concepts in Psychology and Educational Research". American Journal of Education 101 (1): 60–70. doi:10.1086/444032.
- Trudy Dehue (December 1997). "Deception, Efficiency, and Random Groups: Psychology and the Gradual Origination of the Random Group Design". Isis 88 (4): 653–673. doi:10.1086/383850. PMID 9519574.
- Peirce, C. S. (1876). "Note on the Theory of the Economy of Research". Coast Survey Report: 197–201., actually published 1879, NOAA PDF Eprint.
Reprinted in Collected Papers 7, paragraphs 139–157, also in Writings 4, pp. 72–78, and in Peirce, C.S. (July–August 1967). "Note on the Theory of the Economy of Research". Operations Research 15 (4): 643–648. doi:10.1287/opre.15.4.643. JSTOR 168276.
- Smith, Kirstine (1918). "On the Standard Deviations of Adjusted and Interpolated Values of an Observed Polynomial Function and its Constants and the Guidance they give Towards a Proper Choice of the Distribution of Observations". Biometrika 12 (1/2): 1–85. JSTOR 2331929.
- Johnson, N.L. (1961). "Sequential analysis: a survey." Journal of the Royal Statistical Society, Series A. Vol. 124 (3), 372–411. (pages 375–376)
- Wald, A. (1945) "Sequential Tests of Statistical Hypotheses", Annals of Mathematical Statistics, 16 (2), 117–186.
- Chernoff, H. (1972) Sequential Analysis and Optimal Design, SIAM Monograph. ISBN 978-0898710069
- Zacks, S. (1996) "Adaptive Designs for Parametric Models". In: Ghosh, S. and Rao, C. R., (Eds) (1996). "Design and Analysis of Experiments," Handbook of Statistics, Volume 13. North-Holland. ISBN 0-444-82061-2. (pages 151–180)
- Robbins, H. (1952). "Some Aspects of the Sequential Design of Experiments". Bulletin of the American Mathematical Society 58 (5): 527–535. doi:10.1090/S0002-9904-1952-09620-8.
- Hald, Anders (1998) A History of Mathematical Statistics. New York: Wiley.
- Box, Joan Fisher (1978) R. A. Fisher: The Life of a Scientist, Wiley. ISBN 0-471-09300-9 (pp 93–166)
- Edwards, A.W.F. (2005). "R. A. Fisher, Statistical Methods for Research Workers, 1925". In Grattan-Guinness, Ivor. Landmark writings in Western mathematics 1640-1940. Amsterdam Boston: Elsevier. ISBN 9780444508713.
- Stanley, J. C. (1966). "The Influence of Fisher's "The Design of Experiments" on Educational Research Thirty Years Later". American Educational Research Journal 3 (3): 223. doi:10.3102/00028312003003223.
- Box, JF (February 1980). "R. A. Fisher and the Design of Experiments, 1922-1926". The American Statistician 34 (1): 1–7. doi:10.2307/2682986. JSTOR 2682986.
- Yates, Frank (June 1964). "Sir Ronald Fisher and the Design of Experiments". Biometrics 20 (2): 307–321. doi:10.2307/2528399. JSTOR 2528399.
- Stanley, Julian C. (1966). "The Influence of Fisher's "The Design of Experiments" on Educational Research Thirty Years Later". American Educational Research Journal 3 (3): 223–229. JSTOR 1161806.
- Hacking, Ian (September 1988). "Telepathy: Origins of Randomization in Experimental Design". Isis 79 (A Special Issue on Artifact and Experiment): 427–451. doi:10.1086/354775. JSTOR 234674. MR 1013489.
- Stigler (1986, Chapter 3: Inverse Probability)
- Hald (1998)
- Lucien Le Cam (1986) Asymptotic Methods in Statistical Decision Theory: Pages 336 and 618–621 (von Mises and Bernstein).
- Stephen E. Fienberg (2006) "When did Bayesian Inference become 'Bayesian'?", Bayesian Analysis, 1 (1), 1–40. See page 5.
- Aldrich, J. (2008) "R. A. Fisher on Bayes and Bayes' Theorem", Bayesian Analysis, 3 (1), 161–170
- Neyman, J. (1977). "Frequentist probability and frequentist statistics". Synthese 36 (1): 97–131. doi:10.1007/BF00485695.
- Jeff Miller, "Earliest Known Uses of Some of the Words of Mathematics (B)"
- Bernardo, JM. (2005). "Reference analysis". Handbook of Statistics 25: 17–90. doi:10.1016/S0169-7161(05)25002-2. ISBN 9780444515391.
- Gillies, D. (2000), Philosophical Theories of Probability. Routledge. ISBN 0-415-18276-X pp 50–1
- Wolpert, RL. (2004) "A conversation with James O. Berger", Statistical Science, 19, 205–218. doi:10.1214/088342304000000053. MR 2082155
- Bernardo, J. M. (2006). "A Bayesian Mathematical Statistics Primer". Proceedings of the Seventh International Conference on Teaching Statistics [CDROM]. Salvador (Bahia), Brazil: International Association for Statistical Education.
- Bishop, C.M. (2007) Pattern Recognition and Machine Learning. Springer ISBN 978-0387310732
- Freedman, D. (1999). "From association to causation: Some remarks on the history of statistics". Statistical Science 14 (3): 243–258. doi:10.1214/ss/1009212409. (Revised version, 2002)
- Hald, Anders (2003). A History of Probability and Statistics and Their Applications before 1750. Hoboken, NJ: Wiley. ISBN 0-471-47129-1.
- Hald, Anders (1998). A History of Mathematical Statistics from 1750 to 1930. New York: Wiley. ISBN 0-471-17912-4.
- Kotz, S., Johnson, N.L. (1992, 1992, 1997). Breakthroughs in Statistics, Vols I, II, III. Springer. ISBN 0-387-94037-5, ISBN 0-387-94039-1, ISBN 0-387-94989-5
- Pearson, Egon (1978). The History of Statistics in the 17th and 18th Centuries against the changing background of intellectual, scientific and religious thought (Lectures by Karl Pearson given at University College London during the academic sessions 1921–1933). New York: MacMillan Publishing Co., Inc. p. 744. ISBN 0-02-850120-9.
- Salsburg, David (2001). The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. ISBN 0-7167-4106-7
- Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty before 1900. Belknap Press/Harvard University Press. ISBN 0-674-40341-X.
- Stigler, Stephen M. (1999) Statistics on the Table: The History of Statistical Concepts and Methods. Harvard University Press. ISBN 0-674-83601-4
- David, H. A. (1995). "First (?) Occurrence of Common Terms in Mathematical Statistics". The American Statistician 49 (2): 121–133. doi:10.2307/2684625. JSTOR 2684625.
- JEHPS: Recent publications in the history of probability and statistics
- Electronic Journ@l for History of Probability and Statistics/Journ@l Electronique d'Histoire des Probabilités et de la Statistique
- Figures from the History of Probability and Statistics (Univ. of Southampton)
- Materials for the History of Statistics (Univ. of York)
- Probability and Statistics on the Earliest Uses Pages (Univ. of Southampton)
- Earliest Uses of Symbols in Probability and Statistics on Earliest Uses of Various Mathematical Symbols