3.3.4a False Discoveries of the Elusive Alpha
The term “alpha” represents the difference between the return on an investment and the return that could have been achieved in an index with identical risk exposure; it is used to quantify a fund manager’s skill. A recent study by Laurent Barras, Olivier Scaillet, and Russ Wermers investigates the presence of true alpha in the results of 2,076 open-end domestic equity mutual funds for the thirty-two years from January 1975 to December 2006. The study, “False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas,” uses t-statistic hypothesis testing to compare funds’ relative performance and applies a False Discovery Rate test to limit the false positive and false negative results that commonly plague statistical analysis. Unlike many previous studies of mutual fund performance, this method distinguishes fund results based on luck from those based on skill.
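To see why false positives matter, here is a minimal Python sketch. It is not the study's actual method: it simply simulates funds whose true alpha is zero and counts how many nonetheless post a t-stat of 2 or more. The fund count (2,076) and the 32 years come from the study described above; the use of annual observations, normally distributed alphas, the 6% standard deviation, and the sample standard deviation are illustrative assumptions, and the names `t_stat` and `count_lucky_funds` are hypothetical.

```python
import random
import statistics


def t_stat(returns):
    """t-stat of the mean: average * sqrt(n) / sample standard deviation."""
    n = len(returns)
    return statistics.mean(returns) * n ** 0.5 / statistics.stdev(returns)


def count_lucky_funds(n_funds=2076, n_years=32, sd=0.06, seed=0):
    """Count zero-alpha funds whose t-stat still reaches 2 by chance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    lucky = 0
    for _ in range(n_funds):
        # Assumption: yearly alphas are independent draws centered on zero,
        # so any apparent outperformance is pure luck.
        alphas = [rng.gauss(0.0, sd) for _ in range(n_years)]
        if t_stat(alphas) >= 2.0:
            lucky += 1
    return lucky


lucky = count_lucky_funds()
print(f"{lucky} of 2076 zero-skill funds show a t-stat of 2 or more")
```

Under these assumptions, something on the order of 2% to 3% of zero-skill funds would be expected to clear the threshold by chance alone, which is the kind of result a false discovery test is designed to account for.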
In a July 2008 New York Times article titled “The Prescient Are Few,” journalist Mark Hulbert digs into the results of the landmark study and its implications as described by Prof. Russ Wermers, who headed up the study. “The number of funds that have beaten the market over their entire histories is so small that the False Discovery Rate test can’t eliminate the possibility that the few that did were merely false positives,” says Prof. Wermers, or as Hulbert puts it, “just lucky.”
Figure 3-6B
A study of the Morningstar Direct database reached the same conclusions: virtually no evidence of stock picking skill was found. A multivariable regression analysis of historical returns was conducted to determine whether a fund manager has skill or, to put it in academic speak, has reliably delivered alpha. The three variables used were the Fama-French risk factors of market, size and value. This analysis reveals the extent to which the returns can be replicated with a combination of index funds, as well as the value added or subtracted by the manager (i.e., alpha).
One way to test the claim that a manager can beat a market is to see whether we have enough years of performance data for the result to be statistically significant. The statistical test called Student’s t-test was introduced in 1908 by William Sealy Gosset, who published under the pseudonym “Student” while working for the Guinness brewery in Dublin, Ireland, where he used it to evaluate the quality of the brewery’s ingredients. The t-test can be used to determine if a series of historical returns is reliably superior to a risk-equivalent benchmark. This can determine whether alpha (any return over the benchmark return) is due to luck or skill. A t-stat of 2 or higher indicates that we are at least 95% confident that the manager actually earned a return higher than his benchmark due to skill, with up to a 5% chance that it was due to luck. In Figure 3-6B-i, the t-test is applied to U.S.
equity funds in six different style classifications over a ten-year period. Of the 614 mutual funds that were compared to their risk-appropriate benchmarks, only 80 had positive excess returns. Of those 80, only one (0.16% of the 614) had a t-stat greater than or equal to 2 (signifying skill). But when the time period for that one fund was extended back to its November 1991 inception, the t-stat dropped below 2, indicating that the apparent skill evaporated.
Figure 3-6B-i
Only one fund (NFJ Allianz Small Cap Value) had a statistically significant positive alpha (t-statistic greater than 2), and when this fund was analyzed over its entire period since inception, the alpha was no longer statistically significant. The chart below shows the excess return of NFJ Allianz Small Cap Value relative to the Russell 2000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 170 years of similar returns to conclude the presence of skill.
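The three-factor regression described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis actually run on the Morningstar data: the monthly returns are synthetic, generated with a true alpha of zero, and the helper names `solve` and `three_factor_alpha` are hypothetical. The intercept of the fit is the alpha, and dividing it by its standard error gives the t-stat.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def three_factor_alpha(fund, market, size, value):
    """Return (alpha, t_stat) from a least-squares fit on the three factors."""
    rows = [[1.0, m, s, v] for m, s, v in zip(market, size, value)]
    k = 4  # intercept (alpha) plus three factor loadings
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, fund)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * x for c, x in zip(coef, r)) for r, y in zip(rows, fund)]
    sigma2 = sum(e * e for e in resid) / (len(fund) - k)
    # First entry of the first column of (X'X)^-1 scales the alpha variance.
    inv_00 = solve(xtx, [1.0, 0.0, 0.0, 0.0])[0]
    alpha = coef[0]
    return alpha, alpha / (sigma2 * inv_00) ** 0.5


# Synthetic monthly data (assumption): a fund with no true alpha.
rng = random.Random(0)
n = 120
market = [rng.gauss(0.006, 0.045) for _ in range(n)]
size = [rng.gauss(0.002, 0.030) for _ in range(n)]
value = [rng.gauss(0.003, 0.030) for _ in range(n)]
fund = [1.0 * m + 0.3 * s + 0.2 * v + rng.gauss(0.0, 0.02)
        for m, s, v in zip(market, size, value)]

alpha, t = three_factor_alpha(fund, market, size, value)
print(f"alpha = {alpha:.4%} per month, t-stat = {t:.2f}")
```

With real data, the factor series would be the market, size, and value premiums and `fund` would be the fund's return in excess of the risk-free rate; a t-stat below 2 on the intercept means the alpha cannot be distinguished from luck at the 95% level.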
Figure 3-6B-ii
Another way to view this data is to draw a line that separates statistical significance on an Alpha versus Standard Deviation of Alpha scatter plot. Funds that fall above the line indicate a 95% chance that their managers may be skillful. As seen above, after extending the period for the only possibly skillful manager, the probability of skill went down the drain.
Figure 3-6C-i
Bill Miller of Legg Mason Capital Management holds the distinction of being the only manager to have ever beaten the S&P 500 index for fifteen consecutive years (1991 to 2005). Unfortunately, his returns after 2005 fell short of the S&P 500, so those of his investors who put their money in after he became well-known discovered the meaning of disappointment. The chart below shows how the Legg Mason Capital Management Value Trust fared against the Russell 1000 Index (Morningstar’s designated benchmark) on a calendar year basis from inception through 2010. From the average alpha and variability of the alpha, we see that we need 269 years of similar returns to anoint Mr. Miller as having stock picking skill.
Figure 3-6C-ii
Two funds that have recently received attention from the financial media are the Yacktman Fund and the Yacktman Focused Fund, both managed by Donald and Stephen Yacktman. The chart below shows the excess return of Yacktman Focused relative to the Russell 1000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 105 years of similar returns to conclude the presence of skill. Well, 105 is certainly better than 269.
Figure 3-6C-iii
For the Yacktman Fund vs. the Russell 1000 Value Index, the average alpha was -1.10%, so no number of years would be enough to conclude the existence of skill.
3.3.4b Morningstar’s Manager of the Year: Luck or Skill?
Unfortunately, investors are constantly bombarded with advertisements, market commentaries, and screaming magazine covers telling them what they should do with their money. Contributing to all the clamor and din is Morningstar’s annual announcement of its awards for “Fund Manager of the Year.” As usual, investors are best served by not paying it any attention.
In order to determine whether being named “Fund Manager of the Year” engenders a valid expectation of higher returns for the fund’s investors, Index Fund Advisors ran a statistical test (the t-test) on sixteen domestic equity mutual funds that received this Morningstar recognition to determine whether each fund’s outperformance was truly attributable to skill (95% or higher probability) or could be explained as luck. For each fund, the performance from the manager’s inception date (or the inception date of Morningstar’s benchmark in two cases) through year-end 2011 was evaluated against the benchmark designated for the fund by Morningstar. The charts below show each fund’s alpha (the difference in returns between the fund and the benchmark) on a year-by-year basis.
Only one of the sixteen funds (about 6%) met the requirement of the statistical test that would suggest ruling out luck as the explanation for the outperformance at a 95% confidence level. Before you get too excited, however, please note that this fund belongs to the small growth category, which has the lowest expected return per unit of risk of all the equity style boxes. Among the sixteen funds, the median number of years needed to conclude the presence of skill over luck was 72 years. Five of the funds showed a high enough degree of volatility in their returns (relative to their benchmarks) as to require a minimum of 100 years.
Even when there is a statistical indication of skill in a manager’s performance, it is often confined to a single time period and does not persist beyond it. A perfect example of this is Bill Miller of the Legg Mason Value Trust, who carries the distinction of being the only mutual fund manager to have beaten the S&P 500 for fifteen consecutive years. Viewing the fifteen-year winning period alone indicates over a 99% probability of true skill, but if we broaden the scope of analysis to his entire tenure, we no longer can statistically conclude the presence of skill over luck.
Figures 3-6D1 through 3-6D16
In an article at TheStreet.com titled “Using 'Alpha' to Pick the Best Mutual Funds,” Stan Luxenburg identified three funds that he thought had a significant alpha. He stated, "Among the top performers are Weitz Partners III Opportunity Investor (WPOIX), which has an alpha of 9.68, Sequoia (SEQUX) with 6.25, and Hennessy Focus 30 (HFTFX) with 4.34." Let's take a look at how many years of data you would need before you would attribute their record to skill instead of luck.
Figures 3-6D17 through 3-6D20
3.3.4c Peter Lynch and Warren Buffett: Luck or Skill?
Here is the famous Fidelity Magellan Fund. Peter Lynch managed the fund from 1977 to 1990, during which time its assets grew from $20 million to $14 billion, just in time to experience more losing years than winning years. More importantly, Lynch reportedly beat the Morningstar-specified benchmark for 14 years. However, as you can see, starting in 1991, 13 of 20 years resulted in a negative alpha, and only 4 of those 20 years had respectable alphas. The $14 billion of investors' assets did not fare well. Even with this incredible record, a track record of 20.7 years is required to be 95% confident of skill. Did you get 21 years of data from your active manager?
Figure 3-6D21
We also took a look at Warren Buffett's (Berkshire Hathaway's) alpha since 1980, relative to the Russell Large Value Index (Russell 1000 Value). One interesting note is the excess returns of 30%, 40%, 35% (twice) and 60% (twice) in the periods prior to 1999. None of these huge excess returns over a benchmark are repeated after 1999. Warren's skill may have been priced into Berkshire Hathaway since that time.
Figure 3-6D21a
3.3.4d Calculations for t-Statistics
In calculating the t-stat, the first step is to determine the excess returns the manager earned above an appropriate benchmark. Then we determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need to support the manager’s claims. Of the 80 fund managers who had positive excess returns, the average excess return was 0.84% and the standard deviation was 5.64%. To estimate the years needed for statistical significance, you can find the intersection of the average excess return (about 0.8%) and standard deviation (about 5.6%) in the chart below (see data box for point estimates). Then follow the line out, and you can see that 180 years of returns data are needed to establish skill as the reason for the higher returns. The calculator below the chart provides the exact number of years needed. Obviously, no manager has ever managed a fund for 180 years; therefore, we are unable to accept any of these managers’ claims. Alas, managers are mere mortals.
Figure 3-6D22 - Three Factors of Performance Measurements
The figure below shows the formula to calculate the number of years needed for a t-stat of 2. We first determine the excess return over a benchmark (the alpha), then determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need (sample size) to support the manager’s claim of skill.
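The sample-size calculation can be sketched in a few lines of Python. This assumes the formula in the figure is simply the t-stat definition (average × √years / standard deviation) solved for the number of years; the function name `years_needed` is hypothetical.

```python
import math


def years_needed(mean_alpha, sd_alpha, t_target=2.0):
    """Years of similar returns needed for the t-stat to reach t_target.

    Rearranges t = mean * sqrt(years) / sd into years = (t * sd / mean) ** 2.
    A zero or negative average alpha can never reach a positive t-stat,
    so infinity is returned in that case.
    """
    if mean_alpha <= 0:
        return math.inf
    return (t_target * sd_alpha / mean_alpha) ** 2


# Average excess return 0.84% and standard deviation 5.64%, as in the text.
print(round(years_needed(0.84, 5.64)))  # prints 180
```

The inputs only need to share the same units (both in percent here), and a negative average alpha, such as the -1.10% reported for the Yacktman Fund, returns infinity: no track record is long enough.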
Figure 3-6D23 - Sample Size Calculator for Active Manager Alphas
As you see in the calculator above, the t-stat is held at 2. Understanding why a t-stat of 2 or more is considered statistically significant is important. However, it is even more vital simply to grasp why bigger t-stats mean the value is more “reliably” different from zero. To begin with, refer to the following equation defining a t-stat:
t-stat = (average × √N) / standard deviation
where N is the number of observations. Decomposing the elements of this equation can demonstrate what leads to bigger t-stats and help instill the intuition behind why a bigger t-stat implies that the observed value is less likely to have a true value of zero.
“Average” is the average of all observations in the sample. This parameter is in the numerator, so as the average increases, so does the t-stat. To illustrate, consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Both have the same number of observations and the same standard deviation. But series A has an average of 1.5 and series B has an average of 9.5. As the average increases, so does the t-stat, meaning it is less likely the true average from series B is actually zero. The intuition here is that a mean further from zero makes it less likely that the true value is in fact zero.
“√N” is the square root of the number of observations. This parameter is also in the numerator, so as the number of observations increases, the t-stat does as well. Consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 1, 2, 1, 2
Both have the same average of 1.5 and the same standard deviation of 0.5, but series A has 20 observations and series B only has 4. As the number of observations increases, so does the t-stat, and the observed average becomes more reliable. In this example, series A has a t-stat of 13.4 and series B has a t-stat of 6 due to the difference in the number of observations.
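The t-stats quoted for these two series can be reproduced with the short Python sketch below. It assumes the population standard deviation (dividing by N), since that is what yields the 0.5 figure used in the text; the function name `t_stat` is hypothetical.

```python
import math
import statistics


def t_stat(observations):
    """t-stat = average * sqrt(N) / standard deviation.

    Assumption: the population standard deviation (pstdev) is used,
    because it matches the 0.5 quoted for these series.
    """
    n = len(observations)
    average = statistics.mean(observations)
    sd = statistics.pstdev(observations)
    return average * math.sqrt(n) / sd


series_a = [1, 2] * 10  # 20 observations, average 1.5
series_b = [1, 2] * 2   # 4 observations, average 1.5

print(round(t_stat(series_a), 1))  # prints 13.4
print(round(t_stat(series_b), 1))  # prints 6.0
```

Many statistical packages default to the sample standard deviation (dividing by N - 1), which would give slightly lower t-stats for the same data.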
This means series A is more reliably different from zero than series B. The intuition here is that a larger number of observations results in more reliability.
“Standard deviation” is a measure of how much the individual observations in the sample vary from the average. This parameter is in the denominator, so as the standard deviation decreases, the t-stat increases. Consider the two data series below:
Series A: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Series B: 18, 0, -18, 32, 10, -20, 40, 15, 8, 10
Both have the same 9.5 average and the same number of observations, but series A has much less volatility and a lower standard deviation than series B. As the standard deviation increases, the t-stat decreases, so the average from series B is less reliably different from zero than the same average from series A. Said differently, there is a greater likelihood the 9.5 average from series B happened by chance due to the volatility of the data series. The intuition here is that a more volatile data series results in a mean that is less reliably different from zero.
Here is a calculator to determine the t-stat. Don't trust an alpha or average return without one.
The Fama and French risk premiums are good examples of the use of the t-stat. Based on the long-term data, there has been an excess return for exposure to these risk factors, referred to as the US Equity Premium (total market return minus the risk-free return on 30-day T-Bills), the US Value Premium (high book-to-market minus low book-to-market), and the US Size Premium (small companies minus big companies). An important consideration for investors is the likelihood that these risk “premiums” are actually zero (i.e., there is no premium) despite a historical mean that is positive. As discussed, the starting point is calculating a t-stat for each return series as outlined in Table 1 below.
The t-stats in Table 1 are all considered statistically significant (i.e., greater than 2), and we can almost be 99% sure that all three risk premiums are positive, with only the SMB t-stat being marginally lower than the required 2.6 for that level of significance. Figure 3-6D24 Figure 3-6D25 Figure 3-6D26 Figure 3-6D27 Figure 3-6D28 All three data series have the same number of observations, so differences in their t-stats will be a function of different means and standard deviations, as illustrated in Table 2 below. As you can see, the equity premium is the most reliable (i.e., different from zero) despite having the highest volatility because it has a significantly higher mean to go with it. Conversely, the size premium is less reliable than the value premium despite having nearly the same volatility because it has a lower historical mean. In “Challenge to Judgment,” Paul Samuelson dismisses investors who claim they can find benchmark-beating managers by saying, “They always claim that they know a man, a bank, or a fund that does do better. Alas, anecdotes are not science. And once Wharton School dissertations seek to quantify the performers, these have a tendency to evaporate into thin air—or, at least, into statistically insignificant t-statistics.” Although a few managers will occasionally appear to have reliably delivered alpha, IFA cautions investors that the fact that there are so many managers virtually guarantees that there will be some who appear to have demonstrated true skill. Unfortunately, the number of such managers is no higher than what we would have if all of them were monkeys throwing darts at the Wall Street Journal. Two studies that elegantly address this point are:
Rob Silverblatt of U.S. News and World Report spoke with Eugene Fama about the implications of the “Luck versus Skill in the Cross Section of Mutual Fund Alpha Estimates” study conducted by Fama of the University of Chicago and Kenneth French of Dartmouth, which casts serious doubt on managers’ ability to generate alpha. Here is his interview:
Figure 3-6E
3.3.5 A Stock Picker's Defeat
Even professional stock pickers can fall hard. Bill Miller, chief investment officer of Legg Mason Capital Management and portfolio manager of the Legg Mason Capital Management Value Trust and Value Equity Strategy, lost his Midas touch after a long stretch of beating the S&P 500. On November 17, 2011, the company announced that Miller will be stepping down effective April 30, 2012. A former Morningstar “Fund Manager of the Decade,” Miller seemed to glitter throughout the 90’s only to have his sparkle go dim towards the end of the following decade. His fund grew from $750 million in 1990 to more than $20 billion in 2006. As of November 16, 2011, total assets are down to $2.8 billion.
His Legg Mason Value Trust Fund (LMVTX) is portrayed in Figures 3-A, 3-B and 3-C, showing the risk and return results of his fund for three different time periods, compared to various indexes and index portfolios: Figure 3-A for the decade of the 90s through 2000; Figure 3-B for the ten years from 2001 to 2010; and Figure 3-C for the 28 years and 8 months since the inception of the LMVTX fund.
Figure 3A Figure 3B Figure 3C
Figure 3-B shows just how hard the mathematics hit Miller. Although his so-called “streak” showed him outperforming the S&P 500 for a 10-year period, Miller’s subsequent 10-year returns from 2001 to 2010 pale in comparison to the indexes and index portfolios shown. Miller’s outperformance and subsequent underperformance were the result of his excessively risky bets on concentrated investments among highly correlated stocks.
While equity index portfolios invest across many asset classes and in as many as 12,000 companies in 40 different countries, Miller’s strategy was to “place big bets on stocks other investors feared,” according to a Wall Street Journal article, “The Stock Picker’s Defeat.” According to the December 2008 article, “Mr. Miller was in his element [a year ago] when troubles in the housing market began infecting financial markets. Working from his well-worn playbook, he snapped up American International Group Inc., Wachovia Corp., Bear Stearns Cos. and Freddie Mac. As the shares continued to fall, he argued that investors were overreacting. He kept buying.”
The article continued, “What he saw as an opportunity turned into the biggest market crash since the Great Depression. Many Value Trust holdings were more or less wiped out. After 15 years of placing savvy bets against the herd, Mr. Miller had been trampled by it.” Miller stated, “The thing I didn’t do, from Day One, was properly assess the severity of this liquidity crisis... I was naïve… Every decision to buy anything has been wrong…It’s been awful.” Not only did the assets themselves plummet, but investors bailed on the fund, pushing its assets down from its apex of $21 billion to around $4.2 billion.
As a final point in the story of Bill Miller, here once again is the alpha chart for the Legg Mason Value Trust. The 401 years needed to justify statistically significant alpha tells the whole story.
Figure 3D
This is a lesson for long-term investors who pick fund managers who they believe are skilled in stock picking. In this case, the manager is leaving the fund after a roller coaster 30-year career. It might be a good idea to put a warning on the Legg Mason Value Trust prospectus reminding investors that luck is not a reliable source of returns in the future, maybe something along the lines of the health warning on a package of cigarettes.
See this article for more Lessons from Bill Miller: Don't concentrate, don't style drift, and nobody can beat a risk adjusted market over long periods. Invest right, sit tight. Also see the Quote of the Week #45.
3.3.6 Stock Pickers and Coin Flippers
The attempt
to predict the outcome of a coin toss is a futile endeavor. Unless
the coin is rigged, the only way to make a correct prediction is to
guess blindly. Unfortunately, it is with the same disregard for investors’
financial health that the financial institutions and media perpetuate
the false idea that some people have a gift or method for predicting
future stock price gyrations.