
Archives of Business Research – Vol. 9, No. 6

Publication Date: June 25, 2021

DOI:10.14738/abr.96.10313. Samsa, G. (2021). The Efficient Market Hypothesis is Usually Assessed Indirectly: What Happens If a Direct Approach is Used

Instead? Archives of Business Research, 9(6). 45-50.

Services for Science and Education – United Kingdom

The Efficient Market Hypothesis is Usually Assessed Indirectly:

What Happens If a Direct Approach is Used Instead?

Greg Samsa PhD

Professor, Department of Biostatistics and Bioinformatics

Duke University 11084 Hock Plaza, Durham NC 27510 USA

ABSTRACT

A vast literature on putative market inefficiencies compares the results of an

investment strategy which takes advantage of the putative inefficiency against a

null hypothesis generated by the efficient market hypothesis (EMH). Even if

negative, such studies do not provide direct evidence in favor of the EMH. To

directly assess a key component of the EMH, namely that stock returns lack

memory, we created 30-year portfolios by sampling annual market returns from

1926-2019 with replacement, and then compared the results with the historical

record of actual 30-year returns. Although centered on the correct amount, the

EMH-based 30-year returns were notably more variable than the historical 30-year

returns. One possible explanation is that market returns regress toward their mean

in the long term. This demonstrates that while the EMH should be taken seriously,

it need not always be taken literally.

Keywords: efficient market hypothesis, index fund, market returns, regression toward

the mean

INTRODUCTION

The efficient market hypothesis (EMH) need not be literally true, under all circumstances, in

order to be an extraordinarily powerful construct. Departures from the EMH are typically

tested using the following algorithm. First, a putative departure from market efficiency is

described. For example, the EMH assumes that previous stock returns provide no information

about subsequent ones (i.e., that "returns lack memory"). One hypothesis about a departure

from the EMH could be: "The psychological underpinnings of behavioral finance suggest that

investors overreact, thus inducing momentum in the short term and regression toward the

mean in the long term, and in the long term this regression toward the mean will cause

currently-poorly-performing stocks to outperform others". Next, an investment strategy is

derived from that hypothesis -- for example, "in anticipation of regression toward the mean,

buy stocks whose previous returns have been poor". Then, this investment strategy is applied

multiple times to a historical database, and a distribution of annual returns generated. Finally,

these annual returns are compared with risk-adjusted benchmarks.

This paradigm has generated a vast literature, the consensus of which is that departures from

the EMH, if real, are subtle (e.g., [1-16]). Whether these departures remain after appropriate

risk adjustment is a source of debate, as is the question of whether investors can actually profit

from the strategies being analyzed.


Here, we frame the question differently. Instead of asking whether departures from the EMH

are consistent with the historical record, we ask whether the EMH itself produces predictions

which are consistent with that record. Specifically, we assume that the EMH is true, and thus

that stock returns lack memory. We simulate 30-year returns under this assumption by

randomly sampling with replacement from annual returns during the period 1926-2019, and

then compare the results against the actual 30-year returns. The specific context pertains to

saving for retirement.

METHODS

We assume that a young investor is saving for retirement, with a 30-year investment horizon.

For concreteness, we assume that they either make a single one-time contribution of $10,000

or make annual contributions of $1,000. The investor purchases a stock index fund, whose

returns correspond to market returns from 1926-2019 [17]. We then describe the expected

distribution of T, the terminal value of the investment, and R, the annual compounding rate of

return, under two conditions.

First, we use the actual historical record. More specifically, we calculate annualized returns

over 10-year, 20-year and 30-year intervals. For 10-year intervals, we calculate the annualized

return for the period 1926-1935, then calculate the annualized return for the period 1927-

1936, and eventually proceed to calculate the annualized return for the period 2010-2019.

Twenty-year and 30-year returns are calculated in similar fashion. For each cohort (i.e., 85

cohorts for 10-year returns, 75 cohorts for 20-year returns, 65 cohorts for 30-year returns) we

calculate T and R. Finally, we present descriptive statistics (e.g., mean, standard deviation,

percentiles).
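The cohort calculation described above can be sketched as follows. This is a minimal illustration rather than the code used for the study: the function name `rolling_cohorts` is hypothetical, and the placeholder series of constant 10% returns merely stands in for the 94 annual market returns from 1926-2019, which are not reproduced here.

```python
from math import prod

def rolling_cohorts(returns, horizon, contribution=10_000.0):
    """Return (T, R) for every overlapping window of `horizon` consecutive years.

    `returns` holds annual returns as fractions (0.021 means 2.1%).
    T is the terminal value of a one-time contribution; R is the annual
    compounding rate of return over the window.
    """
    results = []
    for start in range(len(returns) - horizon + 1):
        window = returns[start:start + horizon]
        growth = prod(1.0 + r for r in window)
        terminal = contribution * growth              # T
        annualized = growth ** (1.0 / horizon) - 1.0  # R
        results.append((terminal, annualized))
    return results

# Placeholder data (an assumption): 94 years, one entry per year 1926-2019.
placeholder = [0.10] * 94
for horizon in (10, 20, 30):
    # Prints 85, 75 and 65 cohorts, matching the counts given in the text.
    print(horizon, len(rolling_cohorts(placeholder, horizon)))
```

With 94 annual returns, a window of n years yields 94 - n + 1 overlapping cohorts, which is where the counts of 85, 75 and 65 come from.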

Second, under the EMH-based assumption that stock returns lack memory, for each of 30 years

we randomly select a return from the historical record by sampling with replacement. Table 1

illustrates the logic for a single iteration of the simulation, assuming that the investor makes a

one-time contribution of $10,000 at the start of year 1. A return is randomly selected from the

historical record -- for example, the year 2011 is selected and the value of 2.1% is obtained. The

starting value of $10,000 is then multiplied by the return of 1.021 to obtain a final value of

$10,210, which also becomes the starting value for year 2. Because the sampling is performed

with replacement, the return associated with the year 2011 might (or might not) appear in

subsequent years. Year 2 begins with a starting value of $10,210; a return of 1.3%,

associated with the year 1994, is randomly selected, and the resulting value becomes $10,210*1.013

= $10,343, which also becomes the starting value for year 3. The process continues for 30 years,

and the final value T is recorded. We describe the sampling distributions of T and R, generated

from 10,000 iterations of the simulation, with 10,000 intended to be a large enough number to

estimate these distributions with sufficient accuracy.
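The simulation just described can be sketched as below. This is an illustrative implementation under stated assumptions, not the original code: the function names, the fixed seed and the short list `placeholder_returns` are all hypothetical, and the real input would be the 94 annual market returns from 1926-2019.

```python
import random
import statistics

def simulate_terminal_values(returns, years=30, contribution=10_000.0,
                             iterations=10_000, seed=0):
    """Simulate T for a one-time contribution when returns lack memory.

    Each iteration draws `years` annual returns with replacement from
    `returns` (fractions, so 0.021 means 2.1%) and compounds them.
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    terminals = []
    for _ in range(iterations):
        value = contribution
        for r in rng.choices(returns, k=years):  # sampling with replacement
            value *= 1.0 + r
        terminals.append(value)
    return terminals

def annualized_rate(terminal, contribution=10_000.0, years=30):
    """R, the annual compounding rate of return implied by T."""
    return (terminal / contribution) ** (1.0 / years) - 1.0

# Placeholder returns (an assumption), not the historical record.
placeholder_returns = [0.021, 0.013, -0.10, 0.25, 0.12, -0.05, 0.30, 0.08]
terminals = simulate_terminal_values(placeholder_returns)

# quantiles(n=100) gives 99 cut points; index k-1 approximates percentile k.
cuts = statistics.quantiles(terminals, n=100)
print("P5, P50, P95 of T:", round(cuts[4]), round(cuts[49]), round(cuts[94]))
```

The percentiles printed here depend entirely on the placeholder data and should not be compared with the tables in this paper.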

For the investor's planning purposes, although the entire distribution of T is of interest, in the

spirit of conservatism they would pay particular attention to its lower percentiles. For example,

they could select the 5th or 10th percentile of T, and then compare this result with their

investment goal. If the value of this lower percentile is 50% of this goal, then the contribution

toward the retirement fund should be doubled.
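The planning rule above amounts to a simple ratio, sketched here under the assumption that T scales in proportion to the amount contributed (which holds when the sequence of returns is held fixed). The function name `contribution_multiplier` and the goal of $300,000 are hypothetical.

```python
def contribution_multiplier(goal, lower_percentile_value):
    """Factor by which to scale contributions so that the chosen lower
    percentile of T reaches the investment goal."""
    if lower_percentile_value <= 0:
        raise ValueError("lower_percentile_value must be positive")
    return goal / lower_percentile_value

# A lower percentile worth 50% of the goal implies doubling the contribution.
print(contribution_multiplier(goal=300_000, lower_percentile_value=150_000))  # 2.0
```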


Table 1: Illustration of simulation logic

Year  Starting value ($)  Random year  Return  Increase ($)  Final value ($)
1                 10,000         2011    2.1%           210           10,210
2                 10,210         1994    1.3%           133           10,343
...
30                                                                         T

RESULTS

Table 2 presents percentile values for the two scenarios. First considering the historical data,

all 10-year returns were positive, with the exception of cohorts beginning in 1929 and 1930

(i.e., the Great Depression), and in 1999 and 2000 (i.e., the dot-com meltdown). These

exceptional cohorts began with a crash, but had regained most of their value by year 10. This

is consistent with the meme in the popular financial press that "over a 10-year period you are

unlikely to lose money in stocks". Twenty- and 30-year cohorts were uniformly profitable, and

by the end of year 30 the median value of T was almost $220,000, serving to illustrate the

benefits of compounding over a sufficiently long period of time.

Considering 30-year returns, the variability in T is notable. In particular, the worst-case

scenario (from the 1929-1958 cohort) yields a value of T which is only half of the median. Even

so, the 5th percentile of T is approximately $150,000, and so an investment strategy which accounts

for the possibility of relatively poor (but not disastrous) luck would only have to recognize that

a one-time contribution of $10,000 might generate $150,000 rather than $220,000 and plan

accordingly -- for example, by increasing contributions by a factor of 220,000/150,000.

Historical data are often summarized as in Table 3. Every 30-year interval generated an annual

compounding rate of return in the range of 8.5% (i.e., for a portfolio starting in 1929) to 13.7%.

This is consistent with the meme that "over a sufficiently long investment horizon the investor

can plan on an annual compounding rate of return of approximately 10%". Although accurate

in a literal sense, such a presentation fails to highlight three points. First, small differences

in R can lead to large differences in T. For example, for a one-time contribution of $10,000, the

terminal value at 30 years is $132,677 at a 9% annual compounding rate versus $174,494 at a 10%

annual compounding rate. Second, although those portfolios which began with a crash did "recover",

they did not quite achieve the annual compounding rate of return of portfolios with a more

fortunate beginning. Finally, these are nominal rates of return, and do not consider the impact

of inflation.
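The first of these points can be checked with a few lines of arithmetic. The sketch below assumes a one-time contribution of $10,000 compounded annually, so that T = 10,000 * (1 + R)^30; the function name `terminal_value` is hypothetical.

```python
def terminal_value(contribution, rate, years):
    """Terminal value of a one-time contribution compounded annually."""
    return contribution * (1.0 + rate) ** years

for rate in (0.09, 0.10):
    print(f"{rate:.0%}: ${terminal_value(10_000, rate, 30):,.0f}")
# 9%: $132,677
# 10%: $174,494
```

A difference of one percentage point in R thus changes T by roughly $42,000, or about a third of the smaller value.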


Table 2: Percentile values ($) at 10-year intervals, one-time contribution of $10,000

Scenario   Years         P1         P5        P10        P25        P50        P75        P90        P95        P99
Historical    10      8,699     10,019     13,332     19,568     25,056     40,214     49,982     53,282     62,234
Historical    20     18,442     29,837     37,009     47,652     85,592    125,647    166,773    217,624    268,359
Historical    30    114,776    149,954    167,211    184,505    219,785    319,785    414,387    440,839    475,507
EMH           10      5,730      9,567     12,354     18,159     27,430     39,991     55,574     67,527     93,195
EMH           20      8,466     16,217     22,718     40,800     72,716    127,547    207,685    274,410    462,094
EMH           30     14,632     31,182     48,232     92,872    188,426    378,933    700,837  1,003,251  1,900,035

Table 3: Historical annual compounding rate of return (%) at 10-year intervals

Years  Mean  Standard deviation  Minimum  Maximum
10     10.4                 5.5     -1.4     20.1
20     10.9                 3.4      3.1     17.9
30     11.1                 1.3      8.5     13.7

Based on the historical record, the memes about investing in stocks for retirement are accurate

in their usual form (i.e., over 30 years it is reasonable to expect an annual compounding rate of

return of not much less than 10%, which should lead to a sizeable retirement fund, barring

disaster). However, the distinction between an investment which returns approximately 10% per

year on average and an annuity which guarantees an exact 10% return ought to be more

prominent, and a margin of safety should be added in order to account for this distinction.

Table 2 also presents percentile values under the EMH assumption that stock returns lack

memory. Although the median values of T are of the same order of magnitude as before, the

amount of variability under EMH assumptions is dramatically greater than the historical record.

This phenomenon is not caused by the order of returns (i.e., since the same value of T is

generated from a given set of returns which only differ in their order), but instead by the

random selection of, for example, more or fewer crash years than in the actual historical record.

Table 4 presents similar information as Table 2, now with annual contributions of $1,000. Here,

the order of returns does matter, yet the basic result that the variability is notably higher in the

EMH group continues to hold.
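The role of the order of returns can be illustrated with a two-year toy example. This is a sketch under assumptions: the returns of +30% and -20% are invented for illustration, and annual contributions are assumed to be made at the start of each year, a timing which the text above does not specify.

```python
def lump_sum(returns, contribution=10_000.0):
    """T for a one-time contribution made at the start of year 1."""
    value = contribution
    for r in returns:
        value *= 1.0 + r
    return value

def annual_contributions(returns, contribution=1_000.0):
    """T when the same amount is contributed at the start of every year
    (assumed timing) and then earns that year's return."""
    value = 0.0
    for r in returns:
        value = (value + contribution) * (1.0 + r)
    return value

good_then_bad = [0.30, -0.20]  # hypothetical returns
bad_then_good = [-0.20, 0.30]  # same returns, reversed order

# One-time contribution: order is irrelevant because multiplication commutes.
print(round(lump_sum(good_then_bad)), round(lump_sum(bad_then_good)))  # 10400 10400

# Annual contributions: later returns apply to more money, so order matters.
print(round(annual_contributions(good_then_bad)),
      round(annual_contributions(bad_then_good)))  # 1840 2340
```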


Table 4: Percentile values ($) at 10-year intervals, annual contributions of $1,000

Scenario   Years         P1         P5        P10        P25        P50        P75        P90        P95        P99
Historical    10      6,933     10,075     11,306     13,487     18,054     24,418     26,456     28,311     32,731
Historical    20     19,731     27,957     36,780     48,178     67,729    107,438    132,458    145,697    155,291
Historical    30     80,095    118,662    138,267    173,378    267,707    323,288    396,808    433,473    519,693
EMH           10      6,836      9,390     11,023     14,122     18,589     24,285     30,457     34,382     43,456
EMH           20     16,248     24,561     30,601     45,316     69,396    104,502    151,187    186,872    284,194
EMH           30     31,792     54,057     73,152    117,865    200,867    345,854    568,665    767,512  1,315,120

DISCUSSION

Although "one can't prove a negative", the EMH is often indirectly assessed in this fashion. In

other words, researchers, especially those from the specialty of behavioral finance, try to

generate counter-examples where, for example, stock returns demonstrate memory or are

otherwise predictable. To the degree that these demonstrations fail, they also serve to

indirectly support the EMH. Such demonstrations serve to "support" rather than "prove" the

EMH, in the same sense that a statistically non-significant result doesn't "prove" the null

hypothesis and, indeed, it can be argued that low statistical power is built into the design of

many of these studies [4], thus suggesting an additional caveat to interpretation. Moreover, it

should be recognized that the research in question is primarily about putative market

inefficiencies rather than the EMH, and that the role of the EMH is mostly limited to the

generation of a null hypothesis.

With this in mind, if the goal is to test the EMH, then a direct way to do so would be to use it to

predict what pattern of data the historical record should show, and then assess how closely the

actual record matches this prediction. To accomplish this, the key idea is that the EMH

effectively assumes that stock returns lack memory, and a lack of memory is functionally

equivalent to sampling from the historical returns with replacement. In fact, the historical

record of 30-year returns differs from the EMH-based predictions, not because the average

returns are "wrong", but because its level of variability is notably smaller than the EMH predicts. One

possible explanation for this phenomenon is that actual market returns possess memory in the

sense that better than average performance leads to stocks being overpriced relative to their

earnings, which then increases the likelihood of poorer returns going forward. In other words:

regression toward the mean on a long-term investment horizon, with the economic rationale

being the tendency for earnings to be relatively stable at the level of the economy over long

periods of time, and for stock price levels to move toward equilibrium with these earnings.

For a straw man, the EMH is remarkably resilient. Nothing in this demonstration should affect

its role in providing the generic null hypothesis for studies which are assessing putative

market inefficiencies. Perhaps what this demonstration suggests is that while the EMH should

be taken seriously, it need not always be taken literally.