The accuracy of cross-country valuation by multiples using comparables – the cultural aspect


INTRODUCTION
Assume that all stocks are generally priced at their market value. A simple method of challenging the prices is the use of multiples from other companies in the industry (averages), from financial analysts' reports as seen in the market, or as described in many textbooks. The intuition is that (almost) similar companies with regard to size, timing and uncertainty of expected future cash flows should be priced (almost) the same, and should thus have (almost) identical multiples, such as P/E, P/B, etc. Multiple valuation using comparables is more an art than a science, since it often offers rules of thumb: for example, if a stock trades at the bottom end, i.e. its multiples are relatively small, then the stock is likely to be of good value. However, if the actual sizes of multiples are due to specific cross-country bias factors, such as differences in countries' business climate, different accounting regimes, or differences in culture, then this advice can be misleading. At the very least, the multiples should probably be adjusted for such bias factors before any buying or selling decision is made. When we trust the market, we find that the market is (almost) always right, and therefore the average differences between a company's observable multiple and one calculated on the basis of other companies' multiples should be small. Hofstede [16, 17] documents that people are organised, do things, and think differently in different countries, as shown in his well-known cultural indicators. Based on this framework of cultural differences, Gray [15] and later Radebaugh et al. [27] map the indicators into accounting values by transforming the cultural differences into accounting constructs based on how the cultural indicators are hypothesised to affect accounting practice and systems. This paper's main contribution is to challenge our understanding of valuation based on multiples across countries.
The selection of comparables and target from the same country leads to small differences, all other things being equal, whereas the use of a global database for the selection of comparables creates larger differences. The need to handle cultural and country-oriented differences as well as differences in accounting practice (different accounting regimes) is obvious, since doing so leads to smaller differences than the use of industry as the only selection criterion. Consequently, we see more precision when we extend the classic approach in Alford [2] to include these considerations in our approach to multiple valuation, and this increases the method's usefulness.
The rest of the paper is structured in the following way: In section 2 we present the motivation and literature review for the study, and we present our expectations and research design. In section 3 we describe the data collection procedure and we present descriptive statistics. In section 4 we present our results and discuss some implications. Finally, we conclude the paper in section 5.

multiples for valuation purposes, Plenborg and Pemental [25] concluded that there are three methods that could be used for the peer group selection. Like Alford [2], the peer group selection could be based on industry classification, since companies that operate in the same industry are likely to show similar risk and growth characteristics. Or, like Bhojraj and Lee [4], the peer group selection could be based on similar economic valuation fundamentals, such as profitability, growth and risk. Or, like Lee et al. [20], the peer group selection could be based on search traffic patterns on websites, since frequently co-searched companies are likely to be economically similar. Dittmann and Weiner [11] observe that much of the empirical research on multiple valuation is focused on the accuracy of valuation techniques and on the statistical measure for averaging the multiples of comparable companies. Agrrawal et al. [1] show that the use of the harmonic mean for price-earnings ratios improves company valuation estimates compared to the use of geometric or arithmetic equal or weighted average estimations of mean or median. This result is in line with the findings of Dittmann and Weiner [11], who find that the harmonic mean leads to more accurate valuations than does the arithmetic mean. Cooper and Cordeiro [7] point to an "extensive academic interest" in equity valuation based on multiples that has arisen "only recently". Much of the empirical research on valuation using multiples deals with the relative performance of multiples.
Baker and Ruback [3] analyse the performance of multiples based on EBITDA, EBIT and Sales. Their results indicate that industry-adjusted EBITDA performs better than EBIT and Sales. Liu et al. [21] study the performance of a comprehensive list of value drivers and find the general ranking of multiples to be as follows: 1. Forward earnings measures; 2. Historical earnings measures; 3. Cash flow measures and book value of equity; and 4. Sales. Earnings thus perform best in their study, giving more accurate results than any of the other measures. However, the minimum standard deviation of the valuation error in the study by Liu et al. [21] was 28.3%, which is a relatively large valuation error for a study where the best value driver was chosen from a set of 17 potentially useful multiples to minimise the valuation error.
Concerning the accuracy of valuation using multiples, Cooper and Lambertides [8] find that a small part of the error is due to inaccurate matching of observable characteristics of target companies, whereas a larger part is caused by unobservable, but persistent differences in the characteristics of target and comparable companies, creating "a limit to the accuracy".
As a means to increase the accuracy of equity valuation using multiples, Yoo [35] examines whether a linear combination of several simple multiples improves the valuation accuracy compared to the use of the simple multiple valuation technique. The study was conducted using a US data sample consisting of data from 3,246 companies between 1984 and 1999. However, the findings imply that the simple multiple valuation that only uses forward earnings multiples is not very accurate, whereas historical multiples may carry more useful information. Cooper and Cordeiro [7] show that the relative accuracy of a valuation based on comparables does not vary much across industries, since the addition of more companies only adds noise and not precision. The attribute that increases the relative accuracy the most is the actual similarity of the comparable companies. Dittmann and Maug [10] investigate whether the choice of error measure, percentage or logarithmic errors, affects valuation biases using different averaging methods and different multiples. Bhojraj et al. [5] operationalised the "industry" parameter by comparing four broadly available industry classification schemes, and they found that the Global Industry Classification Standard (GICS) was significantly better at explaining stock return co-movements and cross-sectional variations in valuation multiples, whereas the Fama and French [14] algorithm often used by academics proved to be the second best in the study. In Serra and Favero [31] as well as in Dittmann and Weiner [11], the country factor is challenged. Serra and Favero [31] question whether there are specific considerations to address when selecting cross-border comparables by studying whether mean industry multiples are similar in Brazil and in the US. They find significant variability across companies within the same industry in each country and between the two countries, which makes the use of industry multiples harder to justify.
In their study across European equity markets on valuation accuracy, Schreiner and Spremann [29] find that using different types of multiples generally approximates market values reasonably well across Europe as expressed by the use of the Dow Jones STOXX 600 companies. According to Dittmann and Weiner [11], the introduction of the euro in 1999 seems to have had no effect on valuation errors of European companies, although their sample period is too short to give a final answer to this question.
In their study, Young and Zeng [36] challenge accounting regime as an explanatory factor by examining the link between enhanced accounting comparability and the valuation performance of pricing multiples using the Bhojraj and Lee [4] method. They find that comparable peers selected across 15 European Union countries delivered better valuation performance, measured as pricing accuracy, a fact that can at least partly be attributed to the increased accounting comparability caused by the mandatory use of IFRS since 2005. Additionally, Young and Zeng [36] mention culture as a potential explanatory factor for the observed differences in performance. Violet [34] is one of the first to suggest that accounting is not culture-free. He argues that accounting should be seen as a 'sociotechnological activity' involving interaction between both human and non-human resources, and consequently culture is often a very important factor when comparing available financial accounting information. Later, Nobes [22] finds that culture is one of the main factors causing accounting differences across countries as well as across companies. According to Nobes [22], the legal system and the capital market are two distinct and quite country-specific institutional elements that are influenced by culture and hence strongly affect how the accounting regime is developed in a country. Thus, different accounting regimes, practices, approaches, and audit behaviour related to the production of this information should be considered as reflecting culture, even though this is not the only factor. Gray [15] identifies four widely recognised accounting values, Professionalism, Uniformity, Conservatism, and Secrecy, and establishes a linkage between these values and the four cultural dimensions proposed by Hofstede at the time: Power distance, Individualism, Masculinity, and Uncertainty avoidance.
Later, when Hofstede [17] had added his fifth cultural dimension, Long-term orientation, Radebaugh et al. [27] extended Gray's original description of the four accounting values by making references to Hofstede's five cultural dimensions as follows (see footnote 1):
• Professionalism (PROF) entails a preference for individual professional judgement and maintenance of professional self-regulation as opposed to compliance with prescriptive legal requirements and statutory control.
• Uniformity (UNIF) entails a preference for uniform accounting practices between companies and for consistent use of such practices over time as opposed to flexibility in accordance with the perceived circumstances of the individual companies.
• Conservatism (CONS) entails a preference for a cautious approach to measurement to cope with the uncertainty of future events as opposed to a more optimistic and risk-taking approach.
• Secrecy (SECR) entails a preference for confidentiality and restrictions on disclosure of information about the business such that information is only disclosed to those who are closely involved with its management and financing as opposed to a more transparent, open, and publicly accountable approach.
The four relationships are presented below in Table 1. It is important to emphasise that all concepts are only verbal descriptions where the focus is on links and mutual influence on each other, while the absolute relations are not considered.  Gray [15] and Radebaugh et al. [27] use the terms strong, less strong, and weak to describe the relationships between the cultural dimensions and the accounting values as shown in Table 1.
To facilitate a weighted combination of the multiple elements that comprise an accounting value, we translate these terms into weights of four, two, and one for strong, less strong, and weak, respectively, for our adjustment for cultural influence. Hereby, when the accounting values are calculated, a relationship described as strong carries twice the effect of a relationship described as less strong, and similarly a relationship described as less strong carries twice the effect in our weighting method as a relationship described as weak.

1 [...] rather than feminine values of relationships, nurturing and caring. Uncertainty avoidance refers to the degree to which individuals feel uncomfortable with ambiguity and uncertainty. Long-term orientation refers to the preference for encouraging people to focus on future rewards, thrift and endurance.
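The weighting of four, two, and one can be illustrated with a minimal sketch. Only the three weights come from the text; the relation list below is a hypothetical placeholder and not the content of Table 1, and the signed, weight-normalised average is an assumption about how the elements might be combined.

```python
# Weights for the verbal relationship strengths, as stated in the text.
STRENGTH_WEIGHT = {"strong": 4, "less strong": 2, "weak": 1}

# HYPOTHETICAL relations for one accounting value: (Hofstede dimension,
# direction, strength). The actual entries belong to Table 1 and are
# not reproduced here.
EXAMPLE_RELATIONS = [
    ("IDV", +1, "strong"),
    ("UAI", -1, "less strong"),
    ("PDI", -1, "weak"),
]


def accounting_value(hofstede_scores, relations):
    """Signed, weight-normalised combination of Hofstede scores (assumed form)."""
    total_weight = sum(STRENGTH_WEIGHT[strength] for _, _, strength in relations)
    weighted_sum = sum(
        sign * STRENGTH_WEIGHT[strength] * hofstede_scores[dim]
        for dim, sign, strength in relations
    )
    return weighted_sum / total_weight


# Illustrative (made-up) country scores.
example_scores = {"IDV": 70, "UAI": 35, "PDI": 14}
example_value = accounting_value(example_scores, EXAMPLE_RELATIONS)
```

With these made-up inputs a strong relation moves the score twice as much as a less strong one, which in turn moves it twice as much as a weak one.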
Since Gray presented his accounting value constructs in 1988, several contributions in the literature have attempted to extend, test and refine the relations in Table 1 in order to understand the influence of culture on accounting. Perera [24] was probably the first to do so. He provides additional discussion of the claimed relationships in the Gray constructs, and uses both Hofstede's cultural indicators and Gray's accounting value dimensions to explain apparent differences between the accounting practices adopted in continental European countries and those adopted in Anglo-American countries. Since then, many studies ( However, studies showing mixed or non-positive results have also been presented, see for example Doupnik and Tsakumis [13], which indicates that the validity of cultural dimension theories needs further testing. As Joannides et al. [18] suggest, though, the critiques of the Hofstede-Gray setting can be questioned as well.
Alford [2] and a number of other studies on multiple-based valuation only use data from one country. Once the analysis is extended to include several countries, the multiple-based target company valuation and the peer group selection become more complicated, as companies from different countries may differ simply due to differences in culture, religion, legal, political or accounting regimes, and economic conditions, all of which could lead to less accurate valuations. To our knowledge, no one has so far tried to incorporate or quantify this effect.
We believe that even though a country factor could probably capture many of these differences, it makes more sense to quantify the cultural contribution of the countries more directly, through our practical interpretation of Gray, together with country-specific economic conditions. The latter can be quantified by using the overall business climate in every country as presented in the officially published notions on country "performance" in the yearly Global Competitiveness Report from the World Economic Forum (WEF; see Schwab [30]). Furthermore, differences in accounting regimes are often mentioned as central in studies that involve international comparability and use accounting numbers, for which reason this factor is also addressed explicitly.
For our study, we use a large global sample of companies as the basis from which to select the peer group companies. As a consequence, if the selections of peer group companies are made without any restrictions, we expect that we might face some of the abovementioned potentially disturbing factors. Our expectations can be explicitly formulated as follows:
• Restrictions on peer group selection:
  • Same accounting regime as the target should increase average accuracy.
  • Same country oriented economic competitiveness factor (WEF) as the target should increase average accuracy.
  • Same culture as the target should increase average accuracy.
• Phrased alternatively, the latter actually means:
  • Low precision is probably due to cultural differences among the target's peer group companies.
To challenge our expectations, we use five commonly used measures of classic company multiples: Price-to-Earnings (P/E), Price-to-Book (P/B), Price-to-Sales (P/S), Enterprise-Value-to-Sales (EV/S), and Enterprise-Value-to-EBITDA (EV/EB). However, since we intend to use the harmonic mean, we use the equivalent formulation of inverse multiples averaged by the arithmetic method, in accordance with for example Agrrawal et al. [1]. As our main measure of precision we use the Mean Absolute Error, since this well-known metric is more robust to outliers than, say, squared measures.
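The equivalence between the harmonic mean of a multiple and the arithmetic mean of its inverse can be checked with a short sketch. The function name and the peer values are illustrative assumptions, not taken from the study.

```python
from statistics import harmonic_mean, mean


def harmonic_via_inverse(multiples):
    """Harmonic mean of multiples, computed as the reciprocal of the
    arithmetic mean of the inverse multiples (e.g. E/P instead of P/E)."""
    inverse_multiples = [1.0 / m for m in multiples]
    return 1.0 / mean(inverse_multiples)


# Illustrative P/E values for three hypothetical peers.
peer_pe = [10.0, 20.0, 40.0]
via_inverse = harmonic_via_inverse(peer_pe)
direct = harmonic_mean(peer_pe)
arithmetic = mean(peer_pe)
```

For positive multiples the harmonic mean never exceeds the arithmetic mean, so a single very high peer multiple has less influence on the peer group estimate.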
According to classic studies such as Alford [2], an appropriate way to analyse the usefulness of multiple-based valuation is to calculate a target company's peer group multiple and multiply it by an appropriate accounting variable to reach a calculated value that can be compared to the target company's observable value four months after the balance sheet date. As shown in e.g. Plenborg and Pemental [25], there are many factors to be considered when developing the analysis design. To ensure comparability with earlier studies, we take Alford [2] and later studies following him as our starting point and use a quite similar research design for the formation of a target company peer group. We determine an estimated value for each multiple for each target company by using the harmonic mean of the peer group companies. When we compare these estimates to the target company's actual stock market value four months after the balance sheet date, we obtain a value for the error magnitude.
Hereafter the mean absolute prediction error (MAE) for all the companies per multiple is calculated, and the size of MAE gives valuable information on the accuracy for each multiple using different ways of forming peer groups. Afterwards, we compare and evaluate the precision of different calculations.
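A minimal sketch of the valuation and error steps follows. Scaling the absolute error by the actual price is an assumption made here, in the spirit of percentage errors; all function names and numbers are hypothetical.

```python
from statistics import mean


def predicted_price(peer_inverse_multiples, target_driver):
    """Estimate the target's price from its value driver (e.g. earnings per
    share) and the arithmetic mean of the peers' inverse multiples (e.g. E/P)."""
    return target_driver / mean(peer_inverse_multiples)


def absolute_prediction_error(actual, predicted):
    """Absolute error scaled by the actual price (assumed scaling)."""
    return abs(actual - predicted) / actual


def mean_absolute_error(actual_predicted_pairs):
    """Average the absolute prediction errors over all target companies."""
    return mean(absolute_prediction_error(a, p) for a, p in actual_predicted_pairs)


# Hypothetical target: peers trade at E/P of 5% and 10%, target EPS is 3.0.
estimate = predicted_price([0.05, 0.10], 3.0)
sample_mae = mean_absolute_error([(50.0, 40.0), (100.0, 110.0)])
```

The actual price in such a calculation would be the one observed four months after the balance sheet date, as described above.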
First, we use our global dataset when we follow the peer group selection methodology in the classic US-data based study by Alford [2], and several later studies:
• Market: We select all non-target companies in the sample.
• Industry: We select all non-target companies in the sample from the same Fama-French [14] industry as the target.
• Total Assets: We select the ten companies in the sample that have a Total Assets value closest to that of the target company.
• ROE: We select the ten companies in the sample that have a Return on Equity value closest to that of the target company.
• Industry and Total Assets: We now focus on the companies in the sample from the same industry as the target and select the ten companies with Total Assets values closest to that of the target company.
• Industry and Return on Equity: We again focus on the companies from the same industry as the target and select the ten companies with Return on Equity values closest to that of the target company.
• ROE + Total Assets: We now apply a two-step selection procedure. First, we select 2.44% of the total sample, i.e. the 412 companies that have Return on Equity values closest to that of the target company. Second, we select 2.44% of these 412 companies, i.e. the ten companies that have Total Assets values closest to that of the target company.
• Total Assets + ROE: We again apply a two-step selection procedure. First, we select 2.44% of the total sample, i.e. the 412 companies that have Total Assets values closest to that of the target company. Second, we select 2.44% of these 412 companies, i.e. the ten companies that have Return on Equity values closest to that of the target company.
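The selection procedures above can be sketched as follows. This is a minimal illustration, assuming each company is a dictionary with hypothetical keys `id`, `industry`, `total_assets` and `roe`; tie-breaking and rounding rules are assumptions, since the text does not specify them.

```python
def closest_peers(target, candidates, key, n=10):
    """The n non-target companies whose `key` value is closest to the target's."""
    others = [c for c in candidates if c["id"] != target["id"]]
    return sorted(others, key=lambda c: abs(c[key] - target[key]))[:n]


def industry_then_closest(target, candidates, key, n=10):
    """Restrict to the target's industry, then pick the n closest on `key`."""
    same_industry = [c for c in candidates if c["industry"] == target["industry"]]
    return closest_peers(target, same_industry, key, n)


def two_step(target, candidates, first_key, second_key, share=0.0244):
    """Two-step selection: keep a share of the sample closest on `first_key`,
    then the same share of that subset closest on `second_key`.
    With 16,898 companies and share=0.0244 this gives 412, then 10."""
    n_first = max(1, round(share * len(candidates)))
    first = closest_peers(target, candidates, first_key, n_first)
    n_second = max(1, round(share * len(first)))
    return closest_peers(target, first, second_key, n_second)


# Tiny made-up sample: ten companies, target is company 4.
companies = [
    {"id": i, "industry": "A" if i % 2 == 0 else "B",
     "total_assets": 10.0 * i, "roe": 8 - i}
    for i in range(10)
]
target = companies[4]
by_size = closest_peers(target, companies, "total_assets", n=2)
by_industry_size = industry_then_closest(target, companies, "total_assets", n=2)
roe_then_size = two_step(target, companies, "roe", "total_assets", share=0.6)
```

In this sketch "ROE + Total Assets" corresponds to `two_step(target, sample, "roe", "total_assets")` and "Total Assets + ROE" to the same call with the two keys swapped.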
As the next step, we sequentially present detailed analyses introducing a) accounting regime (AR); b) country oriented economic competitiveness factor (WEF); and c) culture. To enable us to introduce accounting regime as a means for further and relevant specification for market and industry, we add the following two peer group selection prescriptions:
• AR + Market: All non-target companies in the sample having the same accounting regime as the target company.
• AR + Industry: All non-target companies in the sample from the same industry as the target and having the same accounting regime as the target company.
Our expectation is that the introduction of accounting regimes will improve the average precision for each target company's valuation compared to a purely market-based peer group selection procedure, and compared to an industry-based peer group selection procedure.
Since our starting point is a global perspective, we introduce a country based economic competitiveness factor (WEF in [30]) as a means for a further and relevant specification for Total Assets and Return on Equity, and thus, we add the following two peer group selection prescriptions:
• WEF + Total Assets: We first select all companies in the sample with the same WEF as the target company and then select the ten companies that have Total Assets values closest to that of the target company.
• WEF + ROE: We first select all companies in the sample with the same WEF as the target company and then select the ten companies that have Return on Equity values closest to that of the target company.
In order to introduce culture, proxied by Gray's accounting values, as a means for a further and relevant specification for Total Assets and ROE, we add the following peer group selection prescriptions:
• Culture + Total Assets: We select the ten companies with the same culture as the target company that have Total Assets values closest to that of the target company. We do this for each of the culture values: PROF; UNIF; CONS; and SECR.
• Culture + ROE: We select the ten companies with the same culture as the target company that have Return on Equity values closest to that of the target company. We do this for each of the culture values: PROF; UNIF; CONS; and SECR.
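The restricted prescriptions (AR, WEF, and culture) share one pattern: filter on a country-level attribute, then rank on a company-level variable. A minimal sketch follows, assuming "same" means equality of the attribute value; the text does not define it further, and the field names are hypothetical.

```python
def restricted_closest(target, candidates, same_key, rank_key, n=10):
    """Keep non-target companies sharing the target's `same_key` value
    (e.g. accounting regime, WEF score, or one accounting value), then
    return up to n of them closest to the target on `rank_key`."""
    pool = [
        c for c in candidates
        if c["id"] != target["id"] and c[same_key] == target[same_key]
    ]
    pool.sort(key=lambda c: abs(c[rank_key] - target[rank_key]))
    return pool[:n]


# Made-up sample: companies 0-5 report under one regime, 6-9 under another.
sample = [
    {"id": i, "regime": "IFRS" if i < 6 else "Other", "total_assets": float(i)}
    for i in range(10)
]
peers = restricted_closest(sample[2], sample, "regime", "total_assets", n=3)
```

If fewer than n companies share the attribute, the function simply returns all of them; how the study handles such small groups is not stated.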
As the third step we evaluate the results. First, we evaluate the five different multiples based on the ranking of their MAE (their precision) and place four of them in two contrasting groups: on the one hand the two most precise multiples (high precision: the two levered multiples Earnings-to-Price and Book-to-Price), and on the other hand the two most imprecise multiples (low precision: the two unlevered multiples Sales-to-Enterprise Value and EBITDA-to-Enterprise Value).
Second, for the multiples in each precision group (high and low), the four Gray accounting values are evaluated by contrasting the differences in precision between the best ten per cent and the worst ten per cent. Since we expect low precision to be due to cultural differences, we expect culture to be a key explanatory factor for the imprecision in the 90th percentile, i.e. the most imprecise companies. When we compare the precision of the 10th and the 90th percentile, the expectation is that the relation between culture and error is much stronger for the 90th percentile than for the 10th percentile.
And third, as a sensitivity analysis, we perform regression analyses for both the 10th percentile and the 90th percentile to verify the expectation that the different cultural accounting values contribute systematically to the imprecision. To ensure that we uncover the effect of culture, we rank all the companies by their absolute prediction error and divide the resulting list into three parts: the ten per cent having the smallest absolute prediction errors; the ten per cent having the largest absolute prediction errors; and the rest. Hereby we can address our expectations further by contrasting the group of companies showing the smallest absolute prediction errors with the group showing the largest, using OLS regression of the absolute prediction errors on central culture variables in several different combinations.
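The ranking, the three-way split, and a regression on one culture variable can be sketched as below. This is a simplified illustration with a single regressor; the study combines several culture variables, and the field names and data here are made up.

```python
def split_by_error(records, share=0.10):
    """Rank by absolute prediction error and split into the best `share`,
    the middle group, and the worst `share`."""
    ranked = sorted(records, key=lambda r: r["abs_error"])
    k = max(1, int(len(ranked) * share))
    return ranked[:k], ranked[k:-k], ranked[-k:]


def ols_slope_intercept(x, y):
    """Closed-form OLS for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up records: absolute error and one hypothetical culture score.
records = [{"abs_error": i / 100, "culture": float(i % 5)} for i in range(1, 21)]
best, middle, worst = split_by_error(records)

# Regression of error on the culture score, shown on toy numbers.
slope, intercept = ols_slope_intercept([1, 2, 3, 4], [3, 5, 7, 9])
```

Running the regression separately on `best` and `worst` and comparing the slopes mirrors the contrast between the 10th and 90th percentile groups described above.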

DATA SELECTION AND DESCRIPTIVE STATISTICS
From the ORBIS-database we selected all listed non-financial and non-insurance companies during May 2016 and deleted all companies that had negative earnings and/or equity to ensure that all companies with negative multiples are disregarded. This procedure left us with more than 17 thousand public accounts from 112 different countries from all over the world, and from 44 different industries using the Fama-French classification (Fama and French [14]).
Based on available information from PricewaterhouseCoopers' contemporaneous publication [26] on IFRS adoption by country, we categorised each country's accounting regime according to whether IFRS is mandatory, permitted or not allowed for listed companies in the country. For the mandatory regime, we also considered the "version" of IFRS that is referred to, which leaves us with four accounting regimes: (i) Mandatory IFRS (as prescribed by the IASB); (ii) Mandatory EU (as published by the IASB and accepted by the EU); (iii) Permitted (as prescribed by the IASB); and (iv) Disallowed (since another GAAP is prescribed).
Since the Hofstede cultural indicators (PDI, IDV, MAS, UAI, and LTO) unfortunately are not available for all countries in the world, we deselected the companies from those quite few countries where we could not find calculated Hofstede indicators (via his website, April 2016). Companies with incomplete datasets were deleted, and the remaining result was 16,898 valid companies from 112 different countries. The descriptive variables for the five multiples that involve an accounting and a market price element are presented in Table 2 above, and the numbers show some variations. Also, the statistics for a number of other relevant variables for some of the peer group selection procedures are shown, and risk, size and leverage are presented. We see the natural logarithm of Total Assets as a size variable, and as in Alford [2], it is introduced as a surrogate for risk, while Return on Equity (ROE) is introduced as a surrogate for growth. Financial Leverage (FLEV) is often used as a risk-oriented control variable, but here it primarily serves as the key link between the three levered and the two unlevered multiples. That all negative multiples are disregarded becomes obvious when we look at the numbers in the table, since the descriptive statistics show that all multiples are positively skewed, which can also be seen from the fact that the means are larger than the medians in all cases.

FINDINGS AND IMPLICATIONS
In Table 3 we present the results, i.e. average mean absolute errors for our chosen five multiples resulting from different starting point selection methods as presented. The results give an indication of the variation in the sample and the apparent differences in precision levels as a consequence of the individual selection methods.
The absolute prediction errors presented in Table 3 measure accuracy, and the precision measures computed in this study allow its results to be compared with the corresponding results of other empirical studies that use percentage errors and harmonic means. Table 3 is, in fact, by and large a replication of Alford [2], but on a global sample, and not only for the Price-to-Earnings multiple. The results show that Alford's original Price-Earnings multiple and US-based empirical results hold for our sample, since our results for the Earnings-to-Price measure are quite comparable to those of Alford. Qualitatively we see the same results for the other two levered multiples, Book-to-Price and Sales-to-Price, whereas the results for the two unlevered multiples, Sales-to-Enterprise Value and EBITDA-to-Enterprise Value, are quite different, since using the market as the basis for peer group formation seems to be the best in this case and shows the smallest prediction errors.

The test statistics in the first part of Panel A in
The non-parametric tests (Friedman tests), as introduced in Alford [2] and later studies, are applied in order to compare the precision of the different peer group selection procedures, exemplified in Panel B and Panel C. A positive sign generally indicates that the row is more accurate than the column. The signs and the magnitudes of the pairwise t-statistics are similar to those in the other tables. In the bottom half of Table 3, Panel B reports the average t-statistics using the nonparametric test for the Earnings-to-Price multiple. The magnitude of the t-statistics indicates that for the Earnings-to-Price multiple, the peer group selection procedure "ROE + Total Assets" gives the most accurate predictions on average. For any significance level above 0.00031, the "ROE + Total Assets" peer group selection method performs better than "Market", which gives the second-most accurate predictions on average. Panel C reports the average t-statistics using the nonparametric test for the EBITDA-to-Enterprise Value multiple. The magnitude of the t-statistics indicates that for this multiple, the peer group selection "Market" gives the most accurate predictions on average. The t-statistic for the comparison with "Industry" is -1.48809, which indicates that at significance levels below 0.06837, "Market" is preferable. And where the peer group selection procedure "ROE + Total Assets" was the best one for the Earnings-to-Price multiple, this procedure is worse than "Market" for all significance levels above 0.00703 for the EBITDA-to-Enterprise Value multiple. This clearly indicates that large differences exist across different multiples, and from Panel A there seems to be a clear difference in the mean absolute error sizes for the unlevered multiples compared to the levered multiples. Table 4 presents the results from the next step in the study.
Table 4 considers whether Accounting Regime, as intuitively expected, is a relevant parameter to consider as regards the prediction of multiples. As seen in the table, the mean absolute errors are slightly reduced for some of the multiples when added to the model using the average of all other companies as the basis for the peer selection multiple generation. However, for the "Market" based peer group selection, the three levered multiples, Earnings-to-Price, Book-to-Price and Sales-to-Price, show improved precision, when the peer group selection is changed to "AR+Market", although the increase is not significant for significance levels below 0.38583, 0.45615 and 0.42898, respectively. For the industry-based peer selection multiples this pattern is not reflected, since the average absolute error increases for most multiples, indicating that the precision is reduced, which is exactly the opposite of what was expected. In Table 5, the results from the next step in the study are presented. The table reveals several findings. First, for two central multiples, Earnings-to-Price and Book-to-Price, the use of Total Assets as well as the use of ROE for peer group selection shows different behaviour. For the Earnings-to-Price multiple, the precision for the peer group where Total Assets is the key selection criteria is smaller than in cases where ROE is the key selection criteria, and vice versa for the Book-to-Price multiple. Second, the introduction of the business climate variable, WEF, leads to a (slightly) higher precision for the Earnings-to-Price multiple as expected, which can be especially pronounced at a significance level of 35% for the Total Assets based peer selection procedure and at a significance level of 17% for the ROE based peer selection procedure. 
But for the Book-to-Price multiple, the opposite is the case, although even less significantly: the significance level is -47% for the Total Assets based peer selection procedure and -48% for the ROE based peer selection procedure. In other words, both results are non-significant at almost all relevant significance levels, and consequently WEF does not contribute to more precision, which is exactly the opposite of what was expected. The business climate variable, WEF, varies from 3.40 to 5.76 across the 112 countries. Table 6 shows an evaluation of the five different multiple measures against each other for the risk-based and the growth-based peer group selections, i.e. Total Assets and ROE, which were the two central measures according to Alford [2] for US data. For our global dataset, the two selection procedures reveal that almost the same multiples perform best (worst) as to accuracy (precision). Panel B shows the t-statistics for comparisons of the multiples.
As the size of the mean absolute error shows, the levered multiples outperform the unlevered multiples in both cases. However, the magnitudes of the t-statistics indicate that the Earnings-to-Price multiple is statistically indistinguishable from the Book-to-Price multiple at reasonable significance levels in both cases. The comparison of the unlevered multiples with the levered multiples based on the values of the t-statistics leads to the conclusion that the unlevered multiples are less accurate. For the Total Assets based peer selection procedure and the ROE based peer selection procedure, the Book-to-Price and the Earnings-to-Price multiples obtain the best accuracy according to our accuracy measure, MAE. However, based on the t-statistics it is difficult to conclude which one is best, since the Book-to-Price multiple is best for the ROE based peer selection procedure while the Earnings-to-Price multiple is best for the Total Assets based peer selection procedure, and at the same time the magnitudes of the t-statistics show that the precision of the two multiples is at equal levels for all reasonably chosen significance levels.
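The accuracy measure used in these comparisons can be illustrated with a minimal sketch. It assumes that each company's multiple is predicted as the plain average of the multiples of the other companies in its peer group, and that the error is the unscaled absolute difference; the exact error definition used in the study is given earlier in the paper and may be scaled differently. The function names, the group labels and the numbers below are purely hypothetical.

```python
from statistics import mean


def peer_predictions(multiples, groups):
    """Predict each company's multiple as the mean of its peers' multiples.

    A peer is any *other* company with the same group label (leave-one-out).
    Returns None for a company that has no peers.
    """
    predictions = []
    for i, group in enumerate(groups):
        peers = [
            m
            for j, (m, g) in enumerate(zip(multiples, groups))
            if g == group and j != i
        ]
        predictions.append(mean(peers) if peers else None)
    return predictions


def mean_absolute_error(actual, predicted):
    """Average absolute prediction error over companies that have a prediction."""
    errors = [abs(a - p) for a, p in zip(actual, predicted) if p is not None]
    return mean(errors)


# Hypothetical Earnings-to-Price multiples and peer-group labels.
ep_multiples = [0.05, 0.07, 0.06, 0.10, 0.12]
peer_groups = ["A", "A", "A", "B", "B"]

predicted = peer_predictions(ep_multiples, peer_groups)
mae = mean_absolute_error(ep_multiples, predicted)
print(f"MAE = {mae:.4f}")
```

Under these assumptions, a peer group selection procedure with a lower MAE for a given multiple would be the more accurate one; changing the group labels corresponds to changing the selection procedure.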
The ranking enables us to select two subgroups: the high precision group multiples, Earnings-to-Price and Book-to-Price, and the low precision group multiples, Sales-to-Enterprise Value and EBITDA-to-Enterprise Value. The differences within these two groups, between Earnings-to-Price and Book-to-Price, and between Sales-to-Enterprise Value and EBITDA-to-Enterprise Value, respectively, are quite insignificant, showing critical t-statistics at 0.269 to 0.495. In contrast, the differences between the multiples across the two groups are remarkable, in particular for ROE, which shows critical t-statistics below 0.0001, while Total Assets shows critical values between 0.0012 and 0.0123. As the results show, there is no doubt that we have two quite different groups, for which reason it makes perfect sense to deal with the two groups separately in the following.
Tables 7a (for the high precision group) and 7b (for the low precision group) present the results from the next step in the study. Table 7a reveals evidence supporting the cultural influence on the two high precision multiples. In both panels, the results show that introducing culture (accounting values) into the peer group selection generates slightly more precise multiples than not using culture in some cases, and vice versa in other cases.
The positive relation is clear for the ROE peer group selection procedure, but only for the accounting value "PROF" for the Earnings-to-Price multiple for the Total Assets peer group selection procedure. In the table we have "compressed" the t-statistics for the two multiples, Earnings-to-Price and Book-to-Price, showing the Book-to-Price statistics in the upper right corner and the Earnings-to-Price statistics in the lower left corner. The tables are to be read in the same way as the previous tables, meaning that e.g. Book-to-Price for the "PROF+TA" based peer group selection method performs worse than the "Total Assets" based peer group selection method with a t-statistic of 0.30653, which corresponds to a significance level of 38.0%. However, the t-statistics reported are generally so small, and the critical significance levels therefore so large, that it does not make sense to state that culture measured as accounting values has a significant influence on the precision. In other words, there is only a small effect, and due to the high critical significance levels there is hardly any real need for introducing culture in the selection procedure for this high precision multiples group. As was the case in Table 7a, Table 7b reveals evidence related to the cultural impact on the multiples, but unlike Table 7a, the focus is now on the two low precision multiples. The low precision multiples are actually more interesting than the high precision multiples, since our expectation is that the higher the imprecision, the larger the impact of the cultural aspect (accounting values).
Comparing Table 7b to Table 7a gives further evidence supporting our expectation. Compared with the high precision multiples in the former table, the cultural effect is much more significant for the low precision multiples, pointing in the direction that at least the accounting values conservatism (CONS) and secrecy (SECR) govern the observable precision of the multiples. For example, reading Table 7b in the same way as Table 7a for Sales-to-Enterprise Value, the relation between the ROE based peer group selection method and the "SECR+ROE" based peer group selection method shows a t-statistic of 1.99668, which tells us that the peer group selection procedure including SECR is better than the ROE (only) based peer group selection procedure at a critical significance level of 0.0229 (2.3%). In general, the levels for the EBITDA-to-Enterprise Value multiple are not quite as clear as for the Sales-to-Enterprise Value multiple, but a t-statistic of 1.04089 gives a critical significance level of 0.14897 (14.9%) for the similar example, i.e. the relation between the ROE based peer group selection method and the "SECR+ROE" based peer group selection method.
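The critical significance levels quoted for Table 7b appear to be consistent with one-sided tail probabilities of the standard normal distribution. This is an inference from the reported numbers, not a statement of the procedure actually used in the study. Under that assumption, the mapping from a t-statistic to its critical significance level can be sketched as follows (the function name is hypothetical):

```python
import math


def one_sided_level(t_stat: float) -> float:
    """Tail probability of a standard normal beyond |t_stat| (one-sided).

    Assumption: the reported statistics are treated as approximately
    standard normal, which is reasonable only for large samples.
    """
    return 0.5 * math.erfc(abs(t_stat) / math.sqrt(2.0))


# t-statistics quoted in the text for the "SECR+ROE" versus ROE comparison.
for t in (1.99668, 1.04089):
    print(f"t = {t:.5f} -> critical significance level = {one_sided_level(t):.5f}")
```

Read this way, the larger the absolute t-statistic, the smaller the significance level at which one peer group selection procedure can be said to outperform the other.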
To challenge the observations based on the low precision multiples in Table 7b in particular, we carried out some sensitivity analyses and present the results in Table 8. As discussed above, one central reason for imprecision might be attributed to a cultural factor, and the differences in precision between the low precision and the high precision multiples that became clear in Table 7b confirm this expectation. In Table 8 we present results from the complete set of comparable regression analyses of the absolute errors as a function of the cultural accounting values for both the high and low precision multiples groups:

Absolute Error = f(cultural accounting value) + error

In order to carry the analyses a step further, we also rank the absolute errors based on multiples and peer group selection procedure, since, as mentioned, we expect that the more imprecise the multiples, the larger the general effect from the cultural accounting value. In Table 8 we contrast collected results for the ten per cent of the sample (1,690 companies) with the smallest absolute errors with the ten per cent of the sample (1,690 companies) with the largest absolute errors. As can be seen in Table 8, Panel A, for the high precision multiple Earnings-to-Price using the peer group selection procedure including "PROF", for instance, there is a striking difference in the cultural causality for the absolute errors between the 10th and 90th percentiles, which can be seen by comparing F = 0.005 (.943) to F = 5.061 (.025). For the other three accounting values for the Earnings-to-Price multiple, the differences are even more striking. There generally seems to be no relation to culture at the 10th percentile (the most precise observations), whereas a clear relation is found at the 90th percentile (the most imprecise observations).
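As an illustration of the kind of regression summarised in Table 8, the sketch below regresses absolute errors on a single cultural accounting value and reports the F-statistic of the fit. It assumes that f is linear with one regressor, which the text does not state explicitly. The data are hypothetical, and the p-value uses a large-sample approximation (an F-statistic with one numerator degree of freedom treated as chi-square with one degree of freedom), not necessarily the calculation used in the study.

```python
import math


def simple_ols_f(x, y):
    """F-statistic of a one-regressor OLS fit y = a + b*x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares
    ssr = (sxy / sxx) * sxy                     # explained sum of squares
    sse = sst - ssr                             # residual sum of squares
    return ssr / (sse / (n - 2))


def large_sample_p(f_stat):
    """Approximate p-value for F(1, n-2) when n is large (chi-square, 1 df)."""
    return math.erfc(math.sqrt(f_stat / 2.0))


# Hypothetical accounting-value scores and absolute errors for five companies.
accounting_value = [1, 2, 3, 4, 5]
absolute_error = [2, 4, 5, 4, 5]
print(f"F = {simple_ols_f(accounting_value, absolute_error):.3f}")

# The approximation applied to the two F-values quoted for Earnings-to-Price / PROF.
for f_value in (0.005, 5.061):
    print(f"F = {f_value} -> approximate p = {large_sample_p(f_value):.3f}")
```

With subsamples of 1,690 companies, this approximation gives p-values close to the .943 and .025 reported next to the two F-values, which suggests, but does not confirm, that the reported regressions use a single explanatory variable.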
The results reveal that culture as such is clearly more likely to be an important factor in explaining the difference for the Earnings-to-Price, the Sales-to-Enterprise Value, and the EBITDA-to-Enterprise Value multiples, while for the Book-to-Price multiple, no differences are really to be seen apart from the accounting value "UNIF".
The effect is less clear for the low precision multiples in Panel B, the unlevered multiples. For that reason, sensitivity analyses were made with the inclusion of Financial Leverage (FLEV), and the results are shown in Panel C in Table 8. Compared to Panel B, i.e. cultural accounting values alone, it seems that the introduction of FLEV leads to generally smaller F-values and less efficient models. Contrary to expectation, financial leverage does not have the anticipated effect, since its introduction leads to less of the variation in precision being explained in all cases. However, in almost all cases, the tendency that culture has a larger effect on the 90th percentile group than on the 10th percentile group remains unchanged, although it is even less clear than for culture alone. There is no doubt, however, that including the cultural aspect, especially Gray's accounting values "CONS" and "SECR", in the peer group selection yields better comparables for the precision than merely using industry, ROE, or Total Assets as the primary selection criteria, and the cultural aspect should therefore be included when cross-country companies are used as comparables.
CONCLUSIONS
Summarising, we find support for our expectation that culture, as expressed in Gray's accounting values, has an impact on the precision for selected peer groups' comparables, while our other expectations regarding Business Climate and Accounting Regime were not supported by our results. Introduction of prescribed Accounting Regimes as part of the selection criteria seems to have no improving effect as to the accuracy of the target's peer group based valuation, which might simply reflect that the IFRS leaves many choices in detailed accounting practice to the user, for which reason our introduction of averages was not able to reflect the expected improvements in the accuracy.
The introduction of the Business Climate to the peer group selection procedure can also be described as a quite limited success. As shown for the two central multiples, Earnings-to-Price and Book-to-Price, the effect of introducing Business Climate (WEF) on the mean absolute errors is small: for Earnings-to-Price a small improvement in accuracy was detected, while Book-to-Price showed a small deterioration in accuracy. Further, the observed changes were so small that they were statistically inconclusive in all cases.
Concerning the cultural effect, the introduction of Gray's accounting values as a way of operationalising cultural effects to help explain differences in accuracy could be taken into consideration when forming peer groups that include companies from different countries. Further, the Gray accounting values, as operationalised and implemented here as part of the peer group selection procedure, seem to capture the imprecision better for the multiples with the lowest precision.