
For each stock independently, the adequacy of calibration was assessed primarily on the basis of the expected annualized yield: the higher the yield, the better the specification is assumed to be. As mentioned above, the accuracy measure is highly suitable


8.4 Parameter Calibration

Stock   Specification     R̄      Learning sample
ADS     fund. and tech.   0.5%   sliding window
ALT     fundamental       1.0%   sliding window
ALV     fundamental       1.0%   sliding window
BAS     fund. and tech.   0.5%   expanding
BAY     fundamental       0.5%   sliding window
⋮
TUI     fund. and tech.   0.5%   expanding

Table 8.3: Calibration results for BNS tree pruning; N/A indicates situations where none of the inputs were able to produce a positive calibration yield

for many academic applications. In trading environments, however, the financial result is what ultimately counts. While a more accurate procedure is, all other things being equal, more likely to produce better financial results, superiority of one procedure over another in terms of the accuracy measure alone does not guarantee superiority in terms of financial returns. A simple example supports this statement: suppose that procedure A shows an 80% accuracy rate, capturing every 1% stock price change correctly but misclassifying two 5% stock price changes, whereas procedure B yields only 60% correct classifications (see Table 8.4). Exhibiting the ’big hit’ ability, however, procedure B classifies all 5% price changes correctly. Which procedure shows better results? For stock picking applications, all other things being equal, procedure B supersedes procedure A since the overall absolute profit is 26% (procedure B) vs. 20% (procedure A), a relative difference of more than 30%. Ideally, high accuracy rates would be combined with a high level of yielded profits.

Stock price change   1%  1%  1%  1%  1%  5%  5%  5%  5%  5%
Procedure A          •   •   •   •   •   •   •   •   ◦   ◦
Procedure B          ◦   ◦   ◦   ◦   •   •   •   •   •   •

Table 8.4: Two classification procedures making either hits (•) or misses (◦) when forecasting sequential stock price changes. Procedure B exhibits a lower hit rate but a superior financial result
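The accuracy-versus-profit comparison of Table 8.4 can be reproduced with a few lines of Python (a sketch; the hit/miss patterns and price changes are taken directly from the table):

```python
# Absolute price changes (in %) and hit/miss patterns from Table 8.4.
changes = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]
proc_a = [True] * 8 + [False] * 2     # hits all 1% moves, misses two 5% moves
proc_b = [False] * 4 + [True] * 6     # misses most 1% moves, hits all 5% moves

def accuracy(hits):
    return sum(hits) / len(hits)

def profit(hits, changes):
    # Only correctly classified moves contribute to the absolute profit.
    return sum(c for hit, c in zip(hits, changes) if hit)

print(accuracy(proc_a), profit(proc_a, changes))   # 0.8 20
print(accuracy(proc_b), profit(proc_b, changes))   # 0.6 26
```

Despite the 20-percentage-point accuracy gap, procedure B's 26% absolute profit exceeds procedure A's 20%.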

8 Historical Simulation of DAX30 Stock Picking

To avoid potentially spurious calibration results, several countermeasures were employed. Define an active classification operation as one yielding the class long or short. First of all, the activity ratio indicator (the percentage of active operations during calibration for a given stock) was used in the following way. The activity ratio had to exceed 40% for a specification to be considered reliable, and among competing specifications (with a similar amount of yielded expected return) those with the highest activity ratio were selected. The threshold of 40% is a subjective but reasonable value: it simply cancels those specifications that are unable to provide at least 40% of active classifications during the calibration period (see Section 8.5), which may happen when either R̄ is set too high or, say, the BNS parameters n̄ and p̄ are set inappropriately.

Additionally, the hit ratio (the proportion of correct active directional forecasts during the calibration) of a reliable specification had to exceed 45%, which roughly corresponds to a similar condition imposed on weak classifier construction in various ensemble methods; see Section 6.3 for more details.

Finally, the representativity ratio (the average proportion of observations belonging to the two active classes in the root nodes of trees during the calibration period) had to exceed 50% to ensure that the joint proportion of the classes long and short is not too low. This constraint is applied primarily to avoid spurious calibration results due to unreasonably high values of R̄.

Therefore, a specification can pass the calibration phase if and only if:

• accumulated profit by the end of the calibration period is positive;

• activity ratio exceeds 40%;

• hit ratio exceeds 45%;

• representativity ratio exceeds 50%.

If two specifications show close results in terms of the accumulated calibration profit, the specification with the higher activity ratio is preferred. If one specification demonstrates a significantly higher level of profit (at least 1.5 times higher) and passes all four necessary calibration conditions, it is preferred to the other specification.
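The four pass conditions and the preference rule can be sketched as follows. This is an illustrative sketch, not the study's actual code: the container and field names are hypothetical, and both specifications passed to the tie-breaker are assumed to have already passed calibration.

```python
from dataclasses import dataclass

@dataclass
class CalibrationStats:
    """Hypothetical container for a specification's calibration statistics."""
    profit: float            # accumulated calibration profit
    activity_ratio: float    # share of active (long/short) classifications
    hit_ratio: float         # share of correct active directional forecasts
    repr_ratio: float        # avg. share of active-class observations in root nodes

def passes_calibration(s: CalibrationStats) -> bool:
    # The four necessary conditions from the text.
    return (s.profit > 0
            and s.activity_ratio > 0.40
            and s.hit_ratio > 0.45
            and s.repr_ratio > 0.50)

def prefer(a: CalibrationStats, b: CalibrationStats) -> CalibrationStats:
    """Tie-breaking between two specifications that both pass calibration."""
    if a.profit >= 1.5 * b.profit:   # significantly higher profit wins outright
        return a
    if b.profit >= 1.5 * a.profit:
        return b
    # Profits are close: prefer the higher activity ratio.
    return a if a.activity_ratio >= b.activity_ratio else b
```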

Figure 8.2: Specification priorities according to the Occam's Razor principle and calibration process


Figure 8.2 illustrates the priority of various specifications that are comprised of either the basic or the extended feature set and have a learning sample of either fixed (sliding window) or expanding size. For each stock individually and for fixed p̄ and n̄, calibration starts with selecting an appropriate value of R̄ for the sliding window learning sample constituted by the basic feature set. If no such R̄ can be found, the more complex learning sample type is employed, and so on. If at the end of calibration no appropriate R̄ can be obtained even for the case of the extended feature set and the expanding learning sample, the analyzed stock is excluded from the portfolio (see Section 8.5).
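The fallback order just described can be sketched as a simple loop. The exact priority sequence comes from Figure 8.2; the order assumed here (basic before extended features, sliding window before expanding sample) and the R̄ grid are illustrative readings, not quotes from the study.

```python
# Candidate values of R-bar, in %; an illustrative grid.
R_GRID = [0.0, 0.5, 1.0]

# One plausible reading of the Figure 8.2 priority order (Occam's Razor:
# simpler configurations are tried first).
PRIORITIES = [
    ("basic", "sliding window"),
    ("basic", "expanding"),
    ("extended", "sliding window"),
    ("extended", "expanding"),
]

def calibrate_stock(evaluate):
    """Calibrate a single stock. `evaluate(features, sample_type, r_bar)` is
    assumed to return calibration statistics if the specification passes all
    four conditions, and None otherwise. Returns the first passing setup, or
    None, in which case the stock is excluded from the portfolio."""
    for features, sample_type in PRIORITIES:
        for r_bar in R_GRID:
            if evaluate(features, sample_type, r_bar) is not None:
                return features, sample_type, r_bar
    return None
```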

In the case when all four configurations from Figure 8.2 have to be passed, calibration is equivalent to analyzing 13 × 4 × 15 = 780 scenarios for each of cost-complexity and BNS pruning (for fixed n̄ and p̄), making it 1560 scenarios altogether. Calibrating n̄ and p̄ by setting up a grid containing at least five points for n̄ (say, 3%, 5%, 10%, 15%, and 20% of the learning sample size) and at least three points for p̄ (say, 75%, 87.5%, and 100%) would increase the number of scenarios fifteenfold, up to 13 × 4 × 15 × 5 × 3 = 11700 scenarios. While that would not be a problem in an industrial setting, computing power is quite limited in the framework of this study. At the same time, an empirical rule of thumb suggesting to set n̄ to 10% of the learning sample size and p̄ = 75% is likely to provide comparable results. The four calibration constraints increase the probability of selecting specifications that are more likely to perform better out of sample and implicitly help to control for appropriate values of n̄ and p̄, although certainly only to a moderate extent.
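The scenario counts above follow from simple multiplication. Only the products are stated in the text; the interpretation of the individual factors in the comments (an R̄ grid, the four Figure 8.2 configurations, a grid over learning sample lengths) is an assumption.

```python
r_points = 13        # assumed: candidate values of R-bar
configurations = 4   # the four feature-set / sample-type combinations of Fig. 8.2
sample_points = 15   # assumed: candidate learning sample lengths

per_pruning = r_points * configurations * sample_points
both_prunings = per_pruning * 2        # BNS plus cost-complexity pruning
with_n_p_grid = per_pruning * 5 * 3    # five n-bar points times three p-bar points

print(per_pruning, both_prunings, with_n_p_grid)   # 780 1560 11700
```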

Stock   Specification     R̄      Learning sample
ADS     fund. and tech.   0.5%   sliding window
ALT     fundamental       0.5%   sliding window
ALV     fundamental       0.5%   sliding window
BAS     fund. and tech.   0.5%   expanding
BAY     fundamental       0.5%   sliding window
⋮

Table 8.5: Calibration results for cost-complexity tree pruning; N/A indicates situations where none of the inputs were able to produce a positive calibration yield

Such calibration was performed independently for the BNS and cost-complexity approaches to tree pruning. Tenfold cross-validation and the 1-SE rule were employed to find optimal cost-complexity trees. In cases when the resulting optimal tree was obviously underparameterized (consisted of a single root node after pruning), the 0-SE rule was employed instead; see Section 5.3 for formal definitions. Table 8.3 and Table 8.5 provide calibration results. Note that in each case that passed the calibration phase, the obtained values of R̄ are nonzero, suggesting that the ’big hit’ ability may be quite an important feature of this and other similar applications where various predictions of the same class may have different economic values. R̄ = 0.5% is obtained in 60% of cases for BNS and in 100% of cases for cost-complexity pruning, or in 75% of cases overall.

Figure 8.3: Overall calibration results: learning sample type distribution
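The 1-SE/0-SE tree selection mentioned above can be sketched generically. This is a simplified illustration of the standard CART selection rules, not the study's actual implementation; candidate trees are assumed to be given with their cross-validated errors and standard errors, ordered by increasing complexity.

```python
def select_tree(candidates):
    """Pick a cost-complexity tree from `candidates`, a list of tuples
    (n_leaves, cv_error, cv_se) sorted by complexity ascending.

    1-SE rule: choose the simplest tree whose CV error lies within one
    standard error of the minimal CV error. If that tree degenerates to a
    single root node, fall back to the 0-SE rule (the plain CV minimum)."""
    best = min(candidates, key=lambda c: c[1])   # 0-SE choice: minimal CV error
    threshold = best[1] + best[2]                # minimal error plus one SE
    one_se = next(c for c in candidates if c[1] <= threshold)
    if one_se[0] <= 1:                           # root-only tree: use 0-SE rule
        return best
    return one_se
```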

Values of R̄ greater than 1% never turned out to be the optimal choice, which may partly be an effect of the four calibration constraints, among other factors.

Figure 8.4: Overall calibration results: input specification type distribution

The sliding window appeared to be a more frequent choice than a learning sample with a dynamically increasing size, both for BNS (70% of cases) and cost-complexity pruning (67% of cases), or in 69% of cases overall. However, one should recall that the sliding window has a higher priority during calibration (refer to Figure 8.2).

The basic feature set was preferred in 70% of cases for BNS and in 67% of cases for cost-complexity pruning, or in 69% of cases overall, but one should again recall the lower calibration priority of the extended feature set.

Overall, in roughly 75% of cases, calibration suggested employing the class assignment rule equivalent to setting R̄ = 0.5%, the sliding window learning sample type, and the basic feature set. However, the distribution of R̄ varies considerably between the two pruning techniques employed.
