Archives of Business Research – Vol. 9, No. 6
Publication Date: June 25, 2021
DOI:10.14738/abr.96.10313. Samsa, G. (2021). The Efficient Market Hypothesis is Usually Assessed Indirectly: What Happens If a Direct Approach is Used
Instead? Archives of Business Research, 9(6). 45-50.
Services for Science and Education – United Kingdom
The Efficient Market Hypothesis is Usually Assessed Indirectly:
What Happens If a Direct Approach is Used Instead?
Greg Samsa PhD
Professor, Department of Biostatistics and Bioinformatics
Duke University 11084 Hock Plaza, Durham NC 27510 USA
ABSTRACT
A vast literature on putative market inefficiencies compares the results of an
investment strategy which takes advantage of the putative inefficiency against a
null hypothesis generated by the efficient market hypothesis (EMH). Even if
negative, such studies do not provide direct evidence in favor of the EMH. To
directly assess a key component of the EMH, namely that stock returns lack
memory, we created 30-year portfolios by sampling annual market returns from
1926-2019 with replacement, and then compared the results with the historical
record of actual 30-year returns. Although centered on the correct amount, the
EMH-based 30-year returns were notably more variable than the historical 30-year
returns. One possible explanation is that market returns regress toward their mean
in the long term. This demonstrates that while the EMH should be taken seriously,
it need not always be taken literally.
Keywords: efficient market hypothesis, index fund, market returns, regression toward
the mean
INTRODUCTION
The efficient market hypothesis (EMH) need not be literally true, under all circumstances, in
order to be an extraordinarily powerful construct. Departures from the EMH are typically
tested using the following algorithm. First, a putative departure from market efficiency is
described. For example, the EMH assumes that previous stock returns provide no information
about subsequent ones (i.e., that "returns lack memory"). One hypothesis about a departure
from the EMH could be: "The psychological underpinnings of behavioral finance suggest that
investors overreact, thus inducing momentum in the short term and regression toward the
mean in the long term, and in the long term this regression toward the mean will cause
currently-poorly-performing stocks to outperform others". Next, an investment strategy is
derived from that hypothesis -- for example, "in anticipation of regression toward the mean,
buy stocks whose previous returns have been poor". Then, this investment strategy is applied
multiple times to a historical database, and a distribution of annual returns generated. Finally,
these annual returns are compared with risk-adjusted benchmarks.
This paradigm has generated a vast literature, the consensus of which is that departures from
the EMH, if real, are subtle (e.g., [1-16]). Whether these departures remain after appropriate
risk adjustment is a source of debate, as is the question of whether investors can actually profit
from the strategies being analyzed.
Here, we frame the question differently. Instead of asking whether departures from the EMH
are consistent with the historical record, we ask whether the EMH itself produces predictions
which are consistent with that record. Specifically, we assume that the EMH is true, and thus
that stock returns lack memory. We simulate 30-year returns under this assumption by
randomly sampling with replacement from annual returns during the period 1926-2019, and
then compare the results against the actual 30-year returns. The specific context pertains to
saving for retirement.
METHODS
We assume that a young investor is saving for retirement, with a 30-year investment horizon.
For concreteness, we assume that they either make a single one-time contribution of $10,000
or make annual contributions of $1,000. The investor purchases a stock index fund, whose
returns correspond to market returns from 1926-2019 [17]. We then describe the expected
distribution of T, the terminal value of the investment, and R, the annual compounding rate of
return, under two conditions.
First, we use the actual historical record. More specifically, we calculate annualized returns
over 10-year, 20-year and 30-year intervals. For 10-year intervals, we calculate the annualized
return for the period 1926-1935, then calculate the annualized return for the period 1927-
1936, and eventually proceed to calculate the annualized return for the period 2010-2019.
Twenty-year and 30-year returns are calculated in similar fashion. For each cohort (i.e., 85
cohorts for 10-year returns, 75 cohorts for 20-year returns, 65 cohorts for 30-year returns) we
calculate T and R. Finally, we present descriptive statistics (e.g., mean, standard deviation,
percentiles).
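The rolling-cohort calculation described above can be sketched as follows. This is a minimal illustration rather than the code used for the study: the function names `cohort_count` and `rolling_annualized_returns` are hypothetical, and the input is assumed to be a list of annual returns expressed as decimals (e.g., 0.021 for 2.1%) and ordered by year.

```python
from math import prod


def cohort_count(first_year, last_year, horizon):
    """Number of overlapping cohorts of `horizon` years within the record."""
    return (last_year - first_year + 1) - horizon + 1


def rolling_annualized_returns(annual_returns, horizon):
    """Annual compounding rate R for every overlapping `horizon`-year cohort.

    `annual_returns` holds decimal returns ordered by year (an assumed format).
    """
    rates = []
    for start in range(len(annual_returns) - horizon + 1):
        window = annual_returns[start:start + horizon]
        growth = prod(1.0 + r for r in window)  # terminal value per $1 invested
        rates.append(growth ** (1.0 / horizon) - 1.0)
    return rates


if __name__ == "__main__":
    # The 94 annual returns from 1926-2019 yield 85, 75 and 65 cohorts.
    for horizon in (10, 20, 30):
        print(horizon, cohort_count(1926, 2019, horizon))
```

For a one-time contribution of $10,000, the terminal value T of a cohort is 10,000 times the `growth` factor computed inside the loop.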
Second, under the EMH-based assumption that stock returns lack memory, for each of 30 years
we randomly select a return from the historical record by sampling with replacement. Table 1
illustrates the logic for a single iteration of the simulation, assuming that the investor makes a
one-time contribution of $10,000 at the start of year 1. A return is randomly selected from the
historical record -- for example, the year 2011 is selected and the value of 2.1% is obtained. The
starting value of $10,000 is then multiplied by the return of 1.021 to obtain a final value of
$10,210, which also becomes the starting value for year 2. Because the sampling is performed
with replacement, the return associated with the year 2011 might (or might not) appear in
subsequent years. Year 2 begins with a starting value of $10,210; a return of 1.3%,
associated with the year 1994, is randomly selected, and the resulting value becomes $10,210*1.013
= $10,343, which also becomes the starting value for year 3. The process continues for 30 years,
and the final value T is recorded. We describe the sampling distributions of T and R, generated
from 10,000 iterations of the simulation, with 10,000 intended to be a large enough number to
estimate these distributions with sufficient accuracy.
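A minimal sketch of the simulation logic follows, under stated assumptions. The list `PLACEHOLDER_RETURNS` is hypothetical and is not the 1926-2019 record: only the values 0.021 and 0.013 come from the illustration in Table 1, and the remainder are invented so that the code runs. The function names, the fixed random seed and the nearest-rank percentile rule are likewise choices made for illustration.

```python
import math
import random

# HYPOTHETICAL placeholder returns, NOT the 1926-2019 record used in the study.
# Only 0.021 and 0.013 appear in the text (Table 1); the rest are made up.
PLACEHOLDER_RETURNS = [0.021, 0.013, -0.08, 0.25, 0.12, -0.30, 0.18, 0.07]


def simulate_terminal_value(annual_returns, rng, years=30, initial=10_000.0):
    """One iteration: draw `years` returns with replacement and compound them."""
    value = initial
    for _ in range(years):
        value *= 1.0 + rng.choice(annual_returns)
    return value


def simulate_distribution(annual_returns, iterations=10_000, years=30,
                          initial=10_000.0, seed=0):
    """Sorted terminal values T from repeated iterations (the seed is an assumption)."""
    rng = random.Random(seed)
    return sorted(simulate_terminal_value(annual_returns, rng, years, initial)
                  for _ in range(iterations))


def annualized_rate(terminal, initial, years):
    """Annual compounding rate R implied by a terminal value T."""
    return (terminal / initial) ** (1.0 / years) - 1.0


def percentile(sorted_values, p):
    """Nearest-rank percentile; other definitions would be equally reasonable."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


if __name__ == "__main__":
    t_values = simulate_distribution(PLACEHOLDER_RETURNS)
    for p in (5, 50, 95):
        t = percentile(t_values, p)
        print(f"P{p}: T = {t:,.0f}, R = {annualized_rate(t, 10_000.0, 30):.1%}")
```

Drawing with `rng.choice` on each of the 30 years is what encodes the assumption that returns lack memory: every year's draw is independent of the draws before it.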
For the investor's planning purposes, although the entire distribution of T is of interest, in the
spirit of conservatism they would pay particular attention to its lower percentiles. For example,
they could select the 5th or 10th percentile of T, and then compare this result with their
investment goal. If the value of this lower percentile is 50% of this goal, then the contribution
toward the retirement fund should be doubled.
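The planning rule above rests on T being proportional to the amount contributed, so that doubling every contribution doubles every percentile of T. A short sketch of the resulting arithmetic follows; `contribution_multiplier` is a hypothetical helper name, and the dollar figures in the example are illustrative only.

```python
def contribution_multiplier(goal, lower_percentile_value):
    """Factor by which contributions must be scaled so that the chosen lower
    percentile of T reaches the goal.

    Relies on T being proportional to the contribution, which holds when
    returns do not depend on the amount invested.
    """
    if lower_percentile_value <= 0:
        raise ValueError("percentile value must be positive")
    return goal / lower_percentile_value


# Illustrative figures only: a goal of $300,000 against a lower percentile of $150,000.
print(contribution_multiplier(300_000, 150_000))  # 2.0
```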
Table 1: Illustration of simulation logic
Year  Starting value ($)  Random year  Return  Increase ($)  Final value ($)
1     10,000              2011         2.1%    210           10,210
2     10,210              1994         1.3%    133           10,343
...
30                                                           T
RESULTS
Table 2 presents percentile values for the two scenarios. First considering the historical data,
all 10-year returns were positive, with the exception of cohorts beginning in 1929 and 1930
(i.e., the Great Depression), and in 1999 and 2000 (i.e., the dot-com meltdown). These
exceptional cohorts began with a crash, but had regained most of their value by year 10. This
is consistent with the meme in the popular financial press that "over a 10-year period you are
unlikely to lose money in stocks". Twenty- and 30-year cohorts were uniformly profitable, and
by the end of year 30 the median value of T was almost $220,000, serving to illustrate the
benefits of compounding over a sufficiently long period of time.
Considering 30-year returns, the variability in T is notable. In particular, the worst-case
scenario (from the 1929-1958 cohort) yields a value of T which is only about half of the median. Even
so, the 5th percentile of T is approximately $150,000, and so an investment strategy which accounts
for the possibility of relatively poor (but not disastrous) luck would only have to recognize that
a one-time contribution of $10,000 might generate $150,000 rather than $220,000 and plan
accordingly -- for example, by increasing contributions by a factor of 220,000/150,000 (approximately 1.47).
Historical data are often summarized as in Table 3. Every 30-year interval generated an annual
compounding rate of return in the range of 8.5% (i.e., for a portfolio starting in 1929) to 13.7%.
This is consistent with the meme that "over a sufficiently long investment horizon the investor
can plan on an annual compounding rate of return of approximately 10%". Although accurate
in a literal sense, such a presentation fails to highlight three points. First, small differences
in R can lead to large differences in T. For example, a one-time contribution of $10,000 grows over 30 years
to $132,677 at a 9% annual compounding rate, versus $174,494 at a 10% annual compounding rate.
Second, although those portfolios which began with a crash did "recover",
they did not quite achieve the annual compounding rate of return of portfolios with a more
fortunate beginning. Finally, these are nominal rates of return, and do not consider the impact
of inflation.
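The sensitivity of T to small differences in R can be checked directly from the compounding formula T = C(1 + R)^n. The sketch below assumes a one-time contribution C of $10,000 and a fixed annual rate; `terminal_value` is a hypothetical helper name.

```python
def terminal_value(contribution, annual_rate, years):
    """Terminal value T of a one-time contribution compounding at a fixed rate R."""
    return contribution * (1.0 + annual_rate) ** years


for rate in (0.09, 0.10):
    print(f"R = {rate:.0%}: T = ${terminal_value(10_000, rate, 30):,.0f}")
# R = 9%: T = $132,677
# R = 10%: T = $174,494
```

A difference of one percentage point in R thus changes T by roughly $42,000 over 30 years.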
Table 2: Percentile values ($) at 10-year intervals, one-time contribution of $10,000
Scenario    Years  P1       P5       P10      P25      P50      P75      P90      P95        P99
Historical  10     8,699    10,019   13,332   19,568   25,056   40,214   49,982   53,282     62,234
Historical  20     18,442   29,837   37,009   47,652   85,592   125,647  166,773  217,624    268,359
Historical  30     114,776  149,954  167,211  184,505  219,785  319,785  414,387  440,839    475,507
EMH         10     5,730    9,567    12,354   18,159   27,430   39,991   55,574   67,527     93,195
EMH         20     8,466    16,217   22,718   40,800   72,716   127,547  207,685  274,410    462,094
EMH         30     14,632   31,182   48,232   92,872   188,426  378,933  700,837  1,003,251  1,900,035
Table 3: Historical annual compounding rate of return (%) at 10-year intervals
Years  Mean  Standard deviation  Minimum  Maximum
10     10.4  5.5                 -1.4     20.1
20     10.9  3.4                 3.1      17.9
30     11.1  1.3                 8.5      13.7
Based on the historical record, the memes about investing in stocks for retirement are accurate
in their usual form (i.e., over 30 years it is reasonable to expect an annual compounding
rate of return of not much less than 10%, which should lead to a sizeable retirement fund,
barring disaster). Nevertheless, the distinction between an investment which returns
approximately 10% per year on average and an annuity which guarantees an exact 10% return
ought to be more prominent, and a margin of safety should be added in order to account for
this distinction.
Table 2 also presents percentile values under the EMH assumption that stock returns lack
memory. Although the median values of T are of the same order of magnitude as before, the
amount of variability under EMH assumptions is dramatically greater than in the historical record.
This phenomenon is not caused by the order of returns (for a one-time contribution, a given set of
returns generates the same value of T regardless of their order), but instead by the
random selection of, for example, more or fewer crash years than in the actual historical record.
Table 4 presents similar information as Table 2, now with annual contributions of $1,000. Here,
the order of returns does matter, yet the basic result that the variability is notably higher in the
EMH group continues to hold.
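The role of the order of returns in the two contribution schemes can be illustrated with a small sketch. The three returns below are hypothetical, the function names are invented for illustration, and annual contributions are assumed to be made at the start of each year, a timing the text does not specify.

```python
def lump_sum_terminal(returns, initial=10_000.0):
    """Terminal value of a one-time contribution made at the start of year 1."""
    value = initial
    for r in returns:
        value *= 1.0 + r
    return value


def annual_contribution_terminal(returns, contribution=1_000.0):
    """Terminal value with equal annual contributions.

    ASSUMPTION: each contribution is made at the start of the year.
    """
    value = 0.0
    for r in returns:
        value = (value + contribution) * (1.0 + r)
    return value


crash_first = [-0.30, 0.10, 0.20]  # hypothetical returns, crash in year 1
crash_last = crash_first[::-1]     # same returns, crash in the final year

# One-time contribution: approximately 9,240 in both orders.
print(lump_sum_terminal(crash_first), lump_sum_terminal(crash_last))
# Annual contributions: approximately 3,444 versus 2,394.
print(annual_contribution_terminal(crash_first),
      annual_contribution_terminal(crash_last))
```

With a one-time contribution the terminal value is a product of the same factors in either order. With annual contributions a late crash is more damaging, because it applies to a larger accumulated balance.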
Table 4: Percentile values ($) at 10-year intervals, annual contributions of $1,000
Scenario    Years  P1      P5       P10      P25      P50      P75      P90      P95      P99
Historical  10     6,933   10,075   11,306   13,487   18,054   24,418   26,456   28,311   32,731
Historical  20     19,731  27,957   36,780   48,178   67,729   107,438  132,458  145,697  155,291
Historical  30     80,095  118,662  138,267  173,378  267,707  323,288  396,808  433,473  519,693
EMH         10     6,836   9,390    11,023   14,122   18,589   24,285   30,457   34,382   43,456
EMH         20     16,248  24,561   30,601   45,316   69,396   104,502  151,187  186,872  284,194
EMH         30     31,792  54,057   73,152   117,865  200,867  345,854  568,665  767,512  1,315,120
DISCUSSION
Although "one can't prove a negative", the EMH is often assessed indirectly, through failed attempts to refute it. In
other words, researchers, especially those from the specialty of behavioral finance, try to
generate counter-examples where, for example, stock returns demonstrate memory or are
otherwise predictable. To the degree that these demonstrations fail, they also serve to
indirectly support the EMH. Such demonstrations serve to "support" rather than "prove" the
EMH, in the same sense that a statistically non-significant result doesn't "prove" the null
hypothesis. Indeed, it can be argued that low statistical power is built into the design of
many of these studies [4], which suggests an additional caveat to interpretation. Moreover, it
should be recognized that the research in question is primarily about putative market
inefficiencies rather than the EMH, and that the role of the EMH is mostly limited to the
generation of a null hypothesis.
With this in mind, if the goal is to test the EMH, then a direct way to do so would be to use it to
predict what pattern of data the historical record should show, and then assess how closely the
actual record matches this prediction. To accomplish this, the key idea is that the EMH
effectively assumes that stock returns lack memory, and a lack of memory is functionally
equivalent to sampling from the historical returns with replacement. In fact, the historical
record of 30-year returns differs from the EMH-based predictions, not because the average
returns are "wrong", but because the level of variability is notably smaller than the EMH predicts. One
possible explanation for this phenomenon is that actual market returns possess memory in the
sense that better than average performance leads to stocks being overpriced relative to their
earnings, which then increases the likelihood of poorer returns going forward. In other words:
regression toward the mean on a long-term investment horizon, with the economic rationale
being the tendency for earnings to be relatively stable at the level of the economy over long
periods of time, and for stock price levels to move toward equilibrium with these earnings.
For a straw man, the EMH is remarkably resilient. Nothing in this demonstration should affect
its role in providing the generic null hypothesis for studies which are assessing putative
market inefficiencies. Perhaps what this demonstration suggests is that while the EMH should
be taken seriously, it need not always be taken literally.