The Unreliability Of Human Judgment

Human decision-making is heavily influenced by individual preferences, which makes it unreliable in most situations. We tend to project our own biased opinions onto other people, and that can degrade the quality of our judgment. A statistical approach to decision-making, which requires little, if any, subjectivity, is far more robust and reliable.

Back in the late 1990s, a struggling author and divorced mother on welfare was trying to publish her first book, a story about an orphan boy wizard. It was rejected by 12 publishers, and her agent warned her that she would “never make money writing children’s books.” This prediction would prove to be spectacularly wrong. As it turned out, 13 was her lucky number: a small London publishing house reluctantly took a chance and agreed to print it. That book, Harry Potter and the Philosopher’s Stone (or Sorcerer’s Stone for the American version), went on to sell over 100 million copies, making it one of the best-selling books in history. And that author, J.K. Rowling, would eventually write six more books in the Harry Potter series, which collectively sold over 450 million copies and were adapted into a blockbuster film franchise. Not only did J.K. Rowling make money writing children’s books, they made her rich.

Stories like this are not uncommon.
A publisher turned down George Orwell’s legendary novel, Animal Farm, explaining it was “impossible to sell animal stories in the U.S.A.” Decca Records turned down a contract with the Beatles, saying “We don’t like their sound, and guitar music is on the way out.” Walt Disney was fired by a newspaper editor because he “lacked imagination and had no good ideas.” Oprah Winfrey got fired from a job as a news reporter because “she couldn’t separate her emotions from her stories.” Arnold Schwarzenegger was told he’d never be a movie star because “his body, name, and accent were all too weird.”

These success stories should make us question the reliability of human judgment. How could a dozen experienced publishers deem the manuscript for the first Harry Potter book unworthy of publication? Why couldn’t a large recording company, whose job it was to seek out talented musicians, recognize the potential of the Beatles? What took Hollywood so long to recognize the star potential of Arnold Schwarzenegger?

The answer is simple: human judgment is influenced by individual preferences, making it an unreliable predictor of future outcomes. Suppose a publisher reviewing the original Harry Potter manuscript happened to dislike stories about magic. That bias against magic would largely determine whether the book got published. But just because one individual, or even a small group of individuals, dislikes a book doesn’t mean the book won’t become a best-seller. We should never project our own subjective opinions onto others, because doing so can adversely affect our judgment and decision making.

This is something I learned in high school, when a friend and I once turned in identical essays. Luckily for us, not only did our overworked teacher not notice, she gave my essay a 95 (an A) and my friend’s essay an 82 (a B).
Perhaps she just liked me more, which subconsciously influenced her grading decision (that’s what I told my friend anyway). Or maybe she was in an unusually good mood when she graded my paper. As crazy as it sounds, the second explanation could in fact be true. It’s been shown that even judges, who are trained to be objective, rule more favorably after lunch breaks (because food puts them in a good mood).

The inherent subjectivity involved in grading can be quite problematic, since a student’s future depends on such imprecise measurements. In one study, for example, researchers collected 120 term papers and had each paper scored independently by eight faculty members. The resulting grades sometimes varied by two or more letter grades; on average, they differed by nearly one letter grade. Given that the average opinion is typically more accurate than most of the individual estimates (i.e., the “wisdom of crowds”), the best solution here would be to average the eight independent scores for each paper to derive a more objective overall grade.

I once recommended that Seeking Alpha implement something similar. The current editing process is highly subjective. It’s unrealistic to think that an editor, who’s as naturally biased as the publishers that rejected Harry Potter, can distinguish so finely between articles as to tag one as, say, an “Editors’ Pick” and another as standard (“Regular” or “Premium”). But having multiple editors independently review and grade the quality of each article, and then averaging their individual opinions, would eliminate much of the subjectivity inherent in the editing process.

Another subjective measurement that receives more credence than it deserves is the rating of wines. My favorite example is the rating of the 1999 vintage of the Mitchelton Blackwood Park Riesling.
One wine rating publication gave it five stars out of five and named it “The Wine of the Year,” while another rated it at the bottom of all wines it reviewed, deeming it “the worst vintage of the decade.” This discrepancy is to be expected, of course, given that wine ratings are based on the unreliable, subjective taste perceptions of wine tasters. In one series of experiments, judges at wine competitions were given the same wine at different times throughout the day; the results showed that judges are wildly inconsistent in their evaluations. A wine rated 90 out of 100 on one tasting would often be rated 85 or 95 on the next. This inconsistency explains why a wine that won a gold medal in one competition had a high probability of winning nothing in others; in fact, the medals seemed to be spread around at random. This should make you think twice before you next purchase an expensive bottle of wine.

So far we’ve seen how flawed and unreliable subjective measurements can be. The best way to fix this problem is to take a more objective, statistical approach to measurement. A well-known example of this is Moneyball, a true story about a low-budget baseball team that leveraged statistics, rather than the subjective beliefs of baseball insiders, to identify players whose skills were being undervalued by other teams. This statistical approach to player selection revolutionized the game, and has since been adopted in other sports as well.

Credit card companies have also learned to appreciate the power of simple statistical measurements. In the past, human judgment was the primary factor used to evaluate a borrower’s creditworthiness. Not only was this a slow process, it was also very subjective and created a lot of variability in the results. But then a statistical formula known as a “credit score” came along and put a solid number on how risky you were to lend to.
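
The averaging remedy suggested earlier for term papers is about the simplest statistical measurement there is, and a minimal Python sketch makes it concrete. The eight scores below are invented for illustration (they are not data from the study), and the helper name is hypothetical.

```python
from statistics import mean

def consensus_score(scores):
    """Average several independent scores for one paper (hypothetical helper)."""
    if not scores:
        raise ValueError("at least one score is required")
    return mean(scores)

# Invented scores from eight independent graders for a single paper.
paper_scores = [95, 82, 88, 79, 91, 85, 90, 78]

spread = max(paper_scores) - min(paper_scores)  # disagreement between graders
overall = consensus_score(paper_scores)         # the averaged grade

print(f"spread between individual graders: {spread} points")
print(f"averaged grade: {overall}")
```

Any single grader in this made-up sample could have put the paper anywhere across the spread, while the average sits in the middle and does not depend on which grader the student happened to draw.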
The investment business is another area where statistics has gained a strong foothold over the past couple of decades. Investors who employ statistical trading methods are usually called “quants.” The world’s most successful quantitative hedge fund is Renaissance Technologies, which uses elaborate algorithms to identify and profit from inefficiencies in various highly liquid instruments around the world. But investors don’t need to be as sophisticated as Renaissance to reap benefits from quantitative investing; even very simple statistical models can work quite well. One of the best known is the “Magic Formula,” a model that ranks stocks on just two variables: return on capital (a measure of quality) and earnings yield (a measure of cheapness). Researchers have conducted a number of studies on the Magic Formula and found it to be a market beater, both domestically and abroad. Even a simpler one-variable model, which uses only the cheapness metric, has been shown to beat the market over the long run.

Quant-style value investing works because, unlike a more traditional approach to stock selection, it doesn’t attempt to calculate a company’s “intrinsic value” by forecasting its long-term financial performance. Instead, it systematically buys the cheapest, and often most hated, stocks based purely on historical data (a very contrarian approach).

Another problem with the concept of intrinsic value is that there’s absolutely nothing “intrinsic” about it. It’s not an objective measure at all. It depends entirely on the person doing the valuation, just as the quality of wine depends on the person doing the tasting. This is largely because risk preferences vary from person to person, and even in the same person from time to time. This was discovered by neuroscientists studying professional traders.
They found that fluctuating hormone levels, such as testosterone and cortisol, can wildly alter a trader’s appetite for risk. And since these shifting risk preferences directly affect discount rates, which determine the present (or intrinsic) value of stocks, intrinsic value isn’t static; it’s in constant flux.

Traditional stock picking is flawed in other ways as well. Even the mere act of owning a stock, particularly one you’ve spent considerable time researching, can create an emotional attachment, leading you to value it more than you would if you didn’t own it. Inheriting a stock can create a similar attachment. A friend of mine once inherited a large number of shares in General Motors (NYSE: GM). When I advised him to sell some shares and diversify the proceeds, he said he couldn’t bring himself to part with his grandfather’s gift. Unfortunately for him, this “gift” became worthless a year later when the company filed for bankruptcy.

This irrational tendency to overvalue something just because we own it is called the “endowment effect.” In residential real estate sales, for instance, there is, on average, a 12% gap between what the owner asks and what the average buyer is willing to pay (in a bad market the gap exceeds 30%!). This is because owners truly believe their homes are worth more. Perhaps they’ve lived there for a long time and have many happy memories associated with the house. The buyer, on the other hand, is more likely to care about things like the black mold growing on the ceiling. It’s difficult for us to see that the person on the other side of the transaction, buyer or seller, isn’t seeing the world as we see it; value is largely subjective.

As explained throughout this article, most of our decisions, both big and small, are guided by our subjective emotions and perspectives. This is usually an automatic cognitive process.
Psychologists call it “System 1” thinking, which is fast, instinctive, and nearly effortless. The opposite is “System 2” thinking, which is slow, deliberate, and effortful. Our brains tend to be lazy, always looking for the easiest way out, so System 1 guides the majority of our day-to-day decisions. And most of the time it’s actually quite effective. For instance, have you ever driven home without remembering the exact details of the trip? That’s your System 1 at work.

But sometimes this type of fast thinking can lead to poor decisions. Consider this famous example: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost? The most common answer, and the one suggested by our System 1, is 10¢. But the correct answer is 5¢: if the ball cost 10¢, the bat would cost $1.10 and the pair $1.20. It requires the slow, effortful thinking associated with System 2 to get it right. Most people simply don’t want to think that hard, so they give the first answer that comes to mind.

These same mistakes occur in every domain. In sports, for instance, decisions worth millions of dollars are made on the basis of a coach’s hunch or a scout’s gut feeling. This explains why there’s a long history of so-called “promising” athletes who never realized their full potential. Moneyball showed us that traditional scouting often focused more on the “eye test” (i.e., whether someone “looked” like a major leaguer) than on a more objective, statistical analysis of player potential. I myself was twice offered a full athletic scholarship to play football in college. The funny thing is that I had never even played the sport before. The recruiters and coaches, fooled by their quick-thinking System 1, just assumed that I’d be a good football player because of my size and athleticism. I respectfully declined these generous offers (definitely not worth the injuries).
In short, reducing subjectivity is a desirable goal for decision makers of all kinds, from entrepreneurs to investors to individuals dealing with their day-to-day personal problems. This isn’t to say that individual subjectivity is always a bad thing. There are some situations, mate selection being one of them, where it can be quite useful; beauty is, after all, in the eye of the beholder, subjective and difficult to quantify. However, in most other situations, especially ones with a financial component, subjectivity tends to cause more harm than good. In those cases, the best way to minimize the probability of being wrong is to leverage the power of a more objective, statistical way of thinking.

Disclosure: I/we have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.

Backtesting – A Cautionary Example

My previous article detailed backtest results for the ETFReplay.com portfolio. Aggregate, risk-adjusted results since 2004 were impressive when compared to a 60/40 Vanguard mutual fund. However, results over the past 2-3 years lagged the benchmark.

The test below was conducted using Portfolio123 (“P123”). It uses a ranking system similar to the ETFReplay 6/3/3 system but has a few seemingly “minor” differences:

- The P123 test begins with a similar basket of ETFs; the only difference is that it ranks 15 ETFs instead of 14, with the PowerShares DB Agriculture ETF (NYSEARCA: DBA) as the extra ETF.
- The starting date for the P123 test is 12/10/03, which differs from the ETFReplay start date of 1/1/2004.
- The P123 system rebalances every 4 weeks instead of at the end of each month.
- The ETFReplay test assumes equal holdings each month (i.e., rebalancing back to equal weight each month at no cost), while P123 lets positions run, so holdings may become unbalanced over time.
- The P123 test uses the next day’s closing price of each ETF as the transaction price, whereas the ETFReplay system uses the closing price on the same day each ETF is ranked.
- Finally, and perhaps most importantly, the P123 test accounts for slippage with each transaction, which reduces returns. The slippage for each transaction is calculated based on the average trading volume for each ETF. This is a conservative method for calculating ETF slippage.
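
To see why the slippage assumption can matter so much, consider a minimal Python sketch. Every number in it is a hypothetical assumption chosen for illustration (a 1% gross return per 4-week period and 0.25% lost to slippage at each rebalance); none comes from the P123 or ETFReplay tests. It also assumes the whole portfolio turns over at every rebalance, and it does not reproduce P123’s volume-based slippage calculation.

```python
def annualized_return(period_return, slippage_per_rebalance, periods_per_year=13):
    """Compound a per-period return after a per-rebalance slippage drag.

    Both rates are fractions (0.01 means 1%). periods_per_year=13
    corresponds to rebalancing every 4 weeks. The drag is applied to the
    whole portfolio each period, i.e. full turnover is assumed.
    """
    net_growth = (1 + period_return) * (1 - slippage_per_rebalance)
    return net_growth ** periods_per_year - 1

# Hypothetical inputs, for illustration only.
gross = annualized_return(0.01, 0.0)     # zero slippage
net = annualized_return(0.01, 0.0025)    # 0.25% lost at each rebalance

print(f"annualized return, zero slippage: {gross:.2%}")
print(f"annualized return, with slippage: {net:.2%}")
print(f"annual drag from slippage:        {gross - net:.2%}")
```

Because the drag is paid at every rebalance, it compounds just as returns do, so a cost that looks negligible per trade can remove a meaningful share of the annual result.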
After accounting for these differences, we see the P123 test shows significantly lower results (as an aside, the benchmark for this test was the SPDR S&P 500 Trust ETF (SPY)):

Tables and charts courtesy of Portfolio123

However, if we assume zero slippage, results improve dramatically. Total and annualized returns are significantly higher, yet we still see different returns and risk metrics than in the ETFReplay test. This can be attributed to a slightly different pool of ETFs and different rebalancing dates and methodology.

The point of this exercise is not to disparage backtests or historical results. Rather, it shows the importance of considering trading costs, as well as how changes in test parameters can impact results. Focus on making your tests robust. Run them through multiple time frames with different assumptions, and be mindful of data-mining. Finally, be conscious of trading costs and fees! Many brokers now offer commission-free ETFs, but taxes and trading slippage can take a big bite out of returns.

Disclosures: None

How To Select Securities

In earlier steps, you should have defined top-level asset classes that lack correlation with one another, along with a timeless strategy. Then, you should have selected underlying sectors that strategically boost returns and take advantage of leading indicators. Lastly, you determined a system to categorize future holdings to implement your investment strategy.

The next step is to select the specific securities you will purchase to implement this strategy. We call this list of securities our “Buy List,” because it is what we would ideally purchase if a client came to us all in cash. We regularly review our Buy List, as we are always looking for even better securities to implement our investment strategy. We use three criteria to judge securities.

First, the investment should be diversified within its sector. Diversification is the means of achieving a rebalancing bonus, a boost in returns and decrease in volatility due to moving between low-correlation stocks. This bonus is highest when the correlation is lowest, but even correlated assets, which often move in sync over the long run, experience different volatility and returns over any given year. For this reason, your Buy List investment selections should be as diversified as possible within the targeted sector.

Such diversification is not easily achieved with individual company stocks. To use U.S. large-cap stocks as an example, you would need over 60 individual stocks to achieve only 86% of the available diversification. Diversification via individual stocks is even more difficult for small- and mid-cap stocks and impossible to accomplish for foreign stock categories using U.S. stock exchanges. For this reason, your Buy List should consist of funds, either exchange-traded funds (ETFs) or mutual funds, rather than individual securities.

There are two ways to gauge how well diversified a fund is within its category. First, favor funds with a large number of holdings.
There is no such thing as “over-diversification.” A fund with 200 small-cap value stocks is better diversified than one with only 75. Second, favor funds where the top ten holdings represent a smaller percentage of the fund. A small-cap value fund whose top ten holdings represent 75% of the fund is less diversified than one whose top ten holdings represent only 25% of the fund.

Second, the expense ratio should be low. Expense ratio is the most important selection criterion. Morningstar did a study to see which was a better indicator of a fund’s future returns: a low expense ratio or more Morningstar stars. They determined that a low expense ratio was the better indicator. Low expense ratios help you earn more when markets go up and lose less when they go down. Over time, this can significantly affect the value of your portfolio.

The return of an index fund is simply the return of the index, plus or minus tracking error, minus fund expenses. This is why a low expense ratio gives you the best chance of higher returns. Currently the average asset-weighted expense ratio for a stock mutual fund is 0.74%. This cost has been dropping over the past decade, as funds have had to lower costs in order to compete for market share.

In most cases, a good investment advisor can significantly reduce the cost of the funds used to diversify your portfolio. We regularly build portfolios with low expense ratios, and many of the revisions we make to our Buy List are moves toward lower-cost funds. Our online gone fishing portfolio has an expense ratio of just 0.32%.

Finally, we look for low trading costs. Trading costs, like expense ratios, can hurt returns. While expense ratios hurt fund returns, trading costs hurt the rebalancing bonus by putting a drag on moving in and out of investments. Trading costs are tied to your custodian.
For this reason, selection of your custodian is extremely important to your investment philosophy. At Schwab, there are four different types of investments, each with its own type of trading cost. Most stock and ETF trades are made at Schwab’s $8.95 per trade brokerage fee. There are some ETFs on the Schwab platform for which they waive the brokerage fee and allow you to trade at no cost. This can be deceptive. Some of the no-transaction-fee ETFs are Schwab funds with higher expense ratios or larger trading spreads, either of which could be more costly than a trading fee. For mutual funds, there is a fee between $25 and $50, depending on how much is being traded and how you have negotiated fees for your clients.

Although you should hunt for low trading costs, you should also adapt your purchasing habits to the cost. First, because of compound interest, larger amounts of money can overcome a small trading cost faster than smaller amounts can. So, if you have a trading cost, assess how long different initial investments take to earn back the fee. We do this by measuring the trading cost as a percentage of the purchase amount. If you are purchasing $30,000 of an investment, a $30 trading fee is only 0.1%, which could be earned back in a week. However, if you are investing only $600 in the same investment, the same trading fee would be 5% of your investment, which may take a year to earn back.

Second, we use a simple technique we call “Rocks and Sand” to keep total expenses low without losing the flexibility of diversifying and rebalancing. Rocks are investments with higher trading costs but lower expense ratios. They carry a fee when you purchase them as well as a fee when you break them apart, but they are cheaper to hold once purchased because of their lower expense ratio. Meanwhile, Sand has a lower trading cost but a higher expense ratio. Sand is easy to move from one investment to another but is not ideal for long-term holding.
Our transaction-fee ETFs or mutual funds are Rocks, while no-transaction-fee funds are Sand. We fill each asset class with Rocks. Then, when smaller monthly deposits come in, we fill in the asset class with Sand around the Rocks. When a significant amount of Sand collects in an asset class, we sell the Sand to purchase a Rock in its place.

In this manner, you can identify ideal securities for your Buy List as well as trade them efficiently. Although it is not easily appraised, we believe a curated list of funds is extremely valuable. Our investment committee meets regularly to reevaluate and adjust our Buy List. Even if all you use to select funds is expense ratio, the value might be as high as 0.42%. We can only imagine the value of fund selection also based on strategic fit, diversification, index followed, and trading costs. Your investment strategy is critically important, but the implementation requires wise fund selection.
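
The earlier rule of thumb, measuring a trading fee as a percentage of the purchase and asking how long it takes to earn back, can be sketched in a few lines of Python. The 5% annual return is an assumption made here for illustration (no expected return is stated above), and the function names are hypothetical. Under that assumption the $30 fee on $30,000 is recovered in roughly a week, and the same fee on $600 takes a year.

```python
import math

def fee_fraction(fee, purchase_amount):
    """Trading fee as a fraction of the purchase amount."""
    return fee / purchase_amount

def years_to_recover(fee, purchase_amount, annual_return):
    """Years of compounding needed for gains to equal the fee.

    Solves purchase_amount * ((1 + annual_return) ** t - 1) == fee for t.
    annual_return is an assumed figure supplied by the caller.
    """
    return math.log(1 + fee / purchase_amount) / math.log(1 + annual_return)

ASSUMED_ANNUAL_RETURN = 0.05  # illustrative assumption, not from the article

for amount in (30_000, 600):
    pct = fee_fraction(30, amount)
    years = years_to_recover(30, amount, ASSUMED_ANNUAL_RETURN)
    print(f"${amount:,}: fee is {pct:.1%} of the purchase, "
          f"about {years * 365:.0f} days to earn back")
```

The same comparison is one way to decide when a pile of Sand is large enough to justify paying the fee on a Rock.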