
Boosting prediction using SMOTE


Table 5 shows the misclassification errors for random forest before and after implementing the SMOTE algorithm. The results show that the random forest model built on data whose imbalance has been reduced with SMOTE performs slightly better than the model built on the original, highly imbalanced data set. This finding is consistent with previous related studies such as Shrivastava, Jeyanthi and Singh (2020). Reducing the decision space of the majority class while increasing that of the minority class improves prediction.
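The interpolation idea behind SMOTE (Chawla et al., 2002) can be sketched in a few lines. This is a minimal illustration and not the implementation used in the study: the function name `smote`, the toy observations and the choice of `k` are assumptions made for the example only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    if k < 1:
        raise ValueError("need at least two minority observations")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): three crisis observations with two features each.
crisis = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5]]
new_points = smote(crisis, n_synthetic=4, k=2)
```

Each synthetic observation lies on the line segment between an existing crisis observation and one of its nearest crisis neighbours, which is how the minority class gains decision space without exact duplication.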

5.3 Variable Importance

Figures 7 to 22 show the variable importance of the random forest models for the different countries. The results show that the importance of variables varies from country to country. Credit variables such as total loans to the non-financial private sector, mortgage loans to the non-financial private sector, total loans to households and total loans to business emerge as very important in detecting a financial crisis in Australia, Belgium, Denmark, France, Italy, Norway, Switzerland and Portugal. This is in line with the findings of previous studies such as Schularick and Taylor (2012) and Fricke (2017), who concluded that credit growth is key in predicting financial crises.

Rates of return on assets are important in detecting financial crises in the Netherlands, Norway and Portugal. Housing prices are very important in detecting crises in Norway, Australia, Sweden and the USA. This is in line with the findings of Beutel et al. (2019), Kindleberger et al. (2011) and Jordà et al. (2015), who concluded that real estate prices as well as asset prices drive crises, especially if they are debt-financed.

Money prices and interest rates are important in detecting financial crises in Portugal, Spain, the USA and the UK. Similar findings have been made by Sevim et al. (2014). Real economy variables are generally important but appear specifically important in Australia, Belgium, Finland, France, Germany and Switzerland.

The public debt to GDP ratio and government revenue and expenditure are important in Belgium, Italy, Japan, the Netherlands, Sweden, the USA and the UK.

The difference in variable importance across countries points to heterogeneity in crisis-causing factors across countries. Some caution should, however, be taken when interpreting these results, since the variables included in the model differ from country to country depending on availability. Thus, some variables that appear very important for one country may not have been available for another. Table 6 in the appendix shows the variables included in each country model. Generally, in addition to the general real economy variables, credit and monetary variables emerge as very important for detecting a financial crisis 1 to 3 years before its onset.
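One common way to obtain a variable importance ranking is permutation importance: shuffle one variable at a time and record how much the misclassification error rises. The sketch below assumes this approach purely for illustration; the text does not state which importance measure was used, and the names `permutation_importance`, `toy_model` and the toy data are hypothetical.

```python
import random


def error_rate(predict, X, y):
    """Share of observations where the prediction differs from the label."""
    return sum(predict(row) != label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in error when one column is shuffled, for each feature."""
    rng = random.Random(seed)
    baseline = error_rate(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(error_rate(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy data: column 0 stands in for a credit variable, column 1 is noise.
X = [[i % 2, (i * 7) % 5] for i in range(20)]
y = [row[0] for row in X]        # crisis label driven by column 0 only
toy_model = lambda row: row[0]   # stand-in for a fitted classifier
scores = permutation_importance(toy_model, X, y)
```

Here the score for column 1 is exactly zero because the stand-in model never uses it, while column 0 receives a positive score. A variable that is missing from a country's data set cannot receive any score at all, which is why rankings are not directly comparable across country models.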

6 Conclusion

In this study, we have identified variables that are important for detecting that a financial crisis may occur 1 to 3 years before its onset. To do this, we first showed that random forest performs better than our benchmark model, logistic regression, on long historical macroeconomic data.

We have reduced class imbalance in the data, which is a major problem in modelling crises because of the irregular nature of their occurrence. We have shown that the SMOTE technique improves the performance of random forest. Future studies may focus on optimizing machine learning techniques by complementing them with better methods for reducing data imbalance, which remains a problem.

The key finding of the study is that, whereas the variables that are important in detecting that a financial crisis may occur in a country 1 to 3 years before its onset vary from country to country, some similarities are observed. Credit and monetary variables, for instance, emerge as very important in detecting financial crises across a number of countries. Asset and housing prices, in addition to the traditional real economy variables, were also found to be specifically important in particular countries.

7 References

Alessi, L. and Detken, C. (2018). Identifying excessive credit growth and leverage. Journal of Financial Stability, 35, pp.215-225.

Asanović, Ž. (2017). Predicting Systemic Banking Crises Using Early Warning Models: The Case of Montenegro. Journal of Central Banking Theory and Practice, 6(3), pp.157-182.

Aydin, A. and Çalışkan Çavdar, Ş. (2015). Prediction of Financial Crisis with Artificial Neural Network: An Empirical Analysis on Turkey. International Journal of Financial Research, 6. doi:10.5430/ijfr.v6n4p36.

Beutel, J., List, S. and von Schweinitz, G. (2018). An evaluation of early warning models for systemic banking crises: Does machine learning improve predictions?

Beutel, J., List, S. and von Schweinitz, G., 2019. Does machine learning help us predict banking crises? Journal of Financial Stability, 45, p.100693.

Bordo, M., Eichengreen, B., Klingebiel, D. and Martinez-Peria, M., 2001. Is the crisis problem growing more severe? Economic Policy, 16(32), pp.52-82.

Bussiere, M. and Fratzscher, M. (2006). Towards a new early warning system of financial crises. Journal of International Money and Finance, 25(6), pp.953-973.

Breiman, L. (1996). Bagging Predictors. Machine Learning 24 (2), 123–140.

Breiman, L. (2001). Random Forests. Machine Learning 45 (1), 5–32.

Candelon, B., Dumitrescu, E. and Hurlin, C. (2014). Currency crisis early warning systems: Why they should be dynamic. International Journal of Forecasting, 30(4), pp.1016-1029.

Chawla, N., Bowyer, K., Hall, L. and Kegelmeyer, W., 2002. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, pp.321-357.

Coke, R. and Berg, A. (2004). Autocorrelation-Corrected Standard Errors in Panel Probits: An Application to Currency Crisis Prediction. IMF Working Papers, 04. doi:10.5089/9781451845860.001.

Demirgüç-Kunt, A. and Detragiache, E. (2000). Monitoring Banking Sector Fragility: A Multivariate Logit Approach. The World Bank Economic Review, 14(2), pp.287-307.

Demirgüç-Kunt, A. and Detragiache, E. (2005). Cross-Country Empirical Studies of Systemic Bank Distress: A Survey. National Institute Economic Review, 192(1), pp.68-83.

Duca, M. and Peltonen, T. (2013). Assessing systemic risks and predicting systemic events. Journal of Banking & Finance, 37(7), pp.2183-2195.

Fricke, D. (2017). Financial Crisis Prediction: A Model Comparison. Deutsche Bundesbank; University College London; London School of Economics & Political Science (LSE) - Systemic Risk Centre.

Holopainen, M. and Sarlin, P. (2017). Toward robust early-warning models: a horse race, ensembles and model uncertainty. Quantitative Finance, 17(12), pp.1933-1963.

Jordà, Ò., Schularick, M. and Taylor, A. (2011). Financial Crises, Credit Booms, and External Imbalances: 140 Years of Lessons. IMF Economic Review, 59(2), pp.340-378.

Jordà, Ò., Schularick, M. and Taylor, A., 2015. Leveraged bubbles. Journal of Monetary Economics, 76, pp.S1-S20.

Kaminsky, G., Lizondo, S. and Reinhart, C. (1998). Leading Indicators of Currency Crises. International Monetary Fund, 45. doi:10.1596/1813-9450-1852.

Kumar, M., Moorthy, U. and Perraudin, W. (2003). Predicting emerging market currency crashes. Journal of Empirical Finance, 10(4), pp.427-454.

Michie, R., 2012. Charles P. Kindleberger and Robert Z. Aliber, Manias, panics and crashes: a history of financial crises (New York: Palgrave Macmillan, 6th edn., 2011. Pp. viii + 356. 3 tabs. ISBN 9780230365353 Pbk.). The Economic History Review, 65(4), pp.1609-1611.

Neunhoeffer, M. and Sternberg, S. (2018). How Cross-Validation Can Go Wrong and What to Do About It. Political Analysis, 27(1), pp.101-106.

Nicole, M. (2016). Predicting Financial Crises. Wharton Research Scholars. 136.

Olivier, B., Angela, D. (2010). Euro area GDP forecasting using large survey datasets. A random forest approach. Euroindicators working papers

Jordà, Ò., Schularick, M. and Taylor, A. M. (2017). Macrofinancial History and the New Business Cycle Facts. In NBER Macroeconomics Annual 2016, volume 31, edited by Martin Eichenbaum and Jonathan A. Parker. Chicago: University of Chicago Press.

Pattillo, C. and Berg, A. (1998). Are Currency Crises Predictable? a Test. IMF Working Papers, 98(154), p.1.

Rose, A. and Spiegel, M. (2012). Cross-country causes and consequences of the 2008 crisis: Early warning. Japan and the World Economy, 24(1), pp.1-16.

Sevim, C., Oztekin, A., Bali, O., Gumus, S. and Guresen, E. (2014). Developing an early warning system to predict currency crises. European Journal of Operational Research, 237(3), pp.1095-1104.

Schularick, M. and Taylor, A., 2012. Credit Booms Gone Bust: Monetary Policy, Leverage Cycles, and Financial Crises, 1870–2008. American Economic Review, 102(2), pp.1029-1061.

Tanaka, K., Kinkyo, T. and Hamori, S. (2016). Random forests-based early warning system for bank failures. Economics Letters, 148, pp.118-121.

Tudela, M. and Falcetti, E. (2006). Modelling Currency Crises in Emerging Markets: A Dynamic Probit Model with Unobserved Heterogeneity and Autocorrelated Errors. Oxford Bulletin of Economics and Statistics, 68, pp.445-471. doi:10.1111/j.1468-0084.2006.00172.x.

van den Berg, J., Candelon, B. and Urbain, J. (2008). A cautious note on the use of panel models to predict financial crises. Economics Letters, 101(1), pp.80-83.

Shrivastava, S., Jeyanthi, P. and Singh, S., 2020. Failure prediction of Indian Banks using SMOTE, Lasso regression, bagging and boosting. Cogent Economics & Finance, 8(1).

8 Appendix

8.1 Table 1: Summary of the literature review

8.2 Table 2: Crisis years per country, 1870-2008

8.3 Table 3: Variable names and description

8.4 Inspecting stationarity using the Autocorrelation Function (Before de-trending)

Figure 3: The figure shows the ACF plots for the different series. For a stationary series, the autocorrelations are expected to decay quickly as the lag increases

Figure 4: 8.4 continued

8.5 Inspecting stationarity using the Autocorrelation Function (After de-trending)

Figure 5: The figure shows the ACF plots for the different series. The autocorrelations are observed to decay to zero as the lag increases, pointing to stationarity

Figure 6: 8.5 continued
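The stationarity check described in 8.4 and 8.5 can be illustrated with a small sample autocorrelation function. This is a sketch under stated assumptions: the simulated series are hypothetical, and first differencing is used here only as one simple way to remove a trend, which may differ from the de-trending method applied in the study.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    variance = sum(d * d for d in dev)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [sum(dev[t] * dev[t + lag] for t in range(n - lag)) / variance
            for lag in range(max_lag + 1)]


rng = random.Random(0)
# Hypothetical trending series: linear trend plus noise.
trending = [0.5 * t + rng.gauss(0, 1) for t in range(100)]
# First differences, used here as one simple way to remove the trend.
detrended = [b - a for a, b in zip(trending, trending[1:])]

acf_before = acf(trending, max_lag=10)
acf_after = acf(detrended, max_lag=10)
```

For the trending series the autocorrelations stay high across lags, whereas for the differenced series they fall close to zero after the first lag, which is the pattern the ACF plots are inspected for.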

8.6 Table 4: Misclassification error for logistic regression and random forest on significant variables from imbalanced data

8.7 Table 5: Misclassification error for random forest before and after SMOTE
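As a reading aid for the misclassification error tables, the sketch below computes the overall error together with the error within each class. The function name and the toy labels are hypothetical, and whether the tables report class-specific errors is not assumed here; the point is only that, on imbalanced data, a low overall error can coexist with a high error on crisis observations.

```python
def misclassification_errors(y_true, y_pred):
    """Overall error plus the error within each class (1 = crisis, 0 = no crisis)."""
    pairs = list(zip(y_true, y_pred))
    overall = sum(t != p for t, p in pairs) / len(pairs)
    per_class = {}
    for cls in (0, 1):
        preds = [p for t, p in pairs if t == cls]
        # None marks a class that does not occur in y_true
        per_class[cls] = sum(p != cls for p in preds) / len(preds) if preds else None
    return overall, per_class


# Hypothetical imbalanced sample: 18 tranquil years and 2 crisis years.
y_true = [0] * 18 + [1] * 2
y_pred = [0] * 20  # a model that never predicts a crisis
overall, per_class = misclassification_errors(y_true, y_pred)
```

In this toy case the overall error is 0.1 even though every crisis year is missed (crisis-class error of 1.0), which is why reducing class imbalance matters before comparing models.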

8.8 Variable Importance

Figure 7: Variable importance - Australia

Figure 8: Variable importance - Belgium

Figure 9: Variable importance - Denmark

Figure 10: Variable importance - Finland

Figure 11: Variable importance - France

Figure 12: Variable importance - Germany

Figure 13: Variable importance - Italy

Figure 14: Variable importance - Japan

Figure 15: Variable importance - Netherlands

Figure 16: Variable importance - Norway

Figure 17: Variable importance - Portugal

Figure 18: Variable importance - Switzerland

Figure 19: Variable importance - Sweden

Figure 20: Variable importance - Spain

Figure 21: Variable importance - USA

Figure 22: Variable importance - UK

8.9 Table 6: Variables included in each country model

Non-exclusive licence to reproduce thesis and make thesis public

I, Geofrey Wanyama

1. herewith grant the University of Tartu a free permit (non-exclusive licence) to reproduce, for the purpose of preservation, including for adding to the DSpace digital archives until the expiry of the term of copyright,

Early warning system for financial crisis: application of random forest

supervised by Mustafa Hakan Eratalay and Luca Alfieri

2. I grant the University of Tartu a permit to make the work specified in p. 1 available to the public via the web environment of the University of Tartu, including via the DSpace digital archives, under the Creative Commons licence CC BY NC ND 3.0, which allows, by giving appropriate credit to the author, to reproduce, distribute the work and communicate it to the public, and prohibits the creation of derivative works and any commercial use of the work until the expiry of the term of copyright.

3. I am aware of the fact that the author retains the rights specified in p. 1 and 2.

4. I certify that granting the non-exclusive licence does not infringe other persons’ intellectual property rights or rights arising from the personal data protection legislation.

Geofrey Wanyama 25/05/2020
