Accounting for response behavior in longitudinal rating data


CLADAG 2021

BOOK OF ABSTRACTS AND SHORT PAPERS

13th Scientific Meeting of the Classification and Data Analysis Group

Firenze, September 9-11, 2021

edited by

Giovanni C. Porzio Carla Rampichini

Chiara Bocci

FIRENZE UNIVERSITY PRESS

2021


CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS : 13th Scientific Meeting of the Classification and Data Analysis Group Firenze, September 9-11, 2021/ edited by Giovanni C. Porzio, Carla Rampichini, Chiara Bocci. — Firenze : Firenze University Press, 2021.

(Proceedings e report ; 128)

https://www.fupress.com/isbn/9788855183406
ISSN 2704-601X (print)
ISSN 2704-5846 (online)
ISBN 978-88-5518-340-6 (PDF)
ISBN 978-88-5518-341-3 (XML)
DOI 10.36253/978-88-5518-340-6

Graphic design: Alberto Pizarro Fernández, Lettera Meccanica SRLs

Front cover: Illustration of the statue by Giambologna, Appennino (1579-1580) by Anna Gottard

FUP Best Practice in Scholarly Publishing (DOI https://doi.org/10.36253/fup_best_practice)

All publications are submitted to an external refereeing process under the responsibility of the FUP Editorial Board and the Scientific Boards of the series. The works published are evaluated and approved by the Editorial Board of the publishing house, and must be compliant with the Peer review policy, the Open Access, Copyright and Licensing policy and the Publication Ethics and Complaint policy.

Firenze University Press Editorial Board

M. Garzaniti (Editor-in-Chief), M.E. Alberti, F. Vittorio Arrigoni, E. Castellani, F. Ciampi, D. D’Andrea, A. Dolfi, R. Ferrise, A. Lambertini, R. Lanfredini, D. Lippi, G. Mari, A. Mariani, P.M. Mariano, S. Marinai, R. Minuti, P. Nanni, A. Orlandi, I. Palchetti, A. Perulli, G. Pratesi, S. Scaramuzzi, I. Stolzi.

The online digital edition is published in Open Access on www.fupress.com.

Content license: except where otherwise noted, the present work is released under Creative Commons Attribution 4.0 International license (CC BY 4.0: http://creativecommons.org/licenses/by/4.0/legalcode). This license allows you to share any part of the work by any means and format, modify it for any purpose, including commercial, as long as appropriate credit is given to the author, any changes made to the work are indicated and a URL link is provided to the license.

Metadata license: all the metadata are released under the Public Domain Dedication license (CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/legalcode).

© 2021 Author(s)

Published by Firenze University Press
Università degli Studi di Firenze
via Cittadella, 7, 50144 Firenze, Italy

CLAssification and Data Analysis Group (CLADAG)

of the Italian Statistical Society (SIS)

INDEX

Preface

Keynote Speakers

Jean-Michel Loubes
Optimal transport methods for fairness in machine learning

Peter Rousseeuw, Jakob Raymaekers and Mia Hubert
Class maps for visualizing classification results

Robert Tibshirani, Stephen Bates and Trevor Hastie
Understanding cross-validation and prediction error

Cinzia Viroli
Quantile-based classification

Bin Yu
Veridical data science for responsible AI: characterizing V4 neurons through deepTune

Plenary Session

Daniel Diaz
A simple correction for COVID-19 sampling bias

Jeffrey S. Morris
A seat at the table: the key role of biostatistics and data science in the COVID-19 pandemic

Bhramar Mukherjee
Predictions, role of interventions and the crisis of virus in India: a data science call to arms

Danny Pfeffermann
Contributions of Israel’s CBS to rout COVID-19

Invited Papers

Claudio Agostinelli, Giovanni Saraceno and Luca Greco
Robust issues in estimating modes for multivariate torus data

Emanuele Aliverti

ACCOUNTING FOR RESPONSE BEHAVIOR IN LONGITUDINAL RATING DATA

Roberto Colombi¹, Sabrina Giordano² and Maria Kateri³

¹ Department of Management, Information and Production Engineering, University of Bergamo, Italy (e-mail: roberto.colombi@unibg.it)

² Department of Economics, Statistics and Finance “Giovanni Anania”, University of Calabria, Italy (e-mail: sabrina.giordano@unical.it)

³ Institute for Statistics, RWTH Aachen University, Germany (e-mail: maria.kateri@rwth-aachen.de)

ABSTRACT: We present a hidden Markov model for repeated ordinal responses observed on some units at different time occasions. The responses reflect the levels of unobservable latent constructs and can be observed under two latent regimes according to whether the respondents are confident with their preference or take shelter in the extremes/middle points of the rating scale.

KEYWORDS: latent variables; response style; financial capability.

Hidden Markov models with two regimes

Consider one ordinal response observed on $n$ units at $T$ time occasions. Let $Y_{it}$ denote the response of unit $i$, $i \in \mathcal{I} = \{1,\dots,n\}$, at occasion $t$, $t \in \mathcal{T} = \{1,\dots,T\}$, with $Y_{it} \in \mathcal{C} = \{1,\dots,c\}$. The response is assumed to reflect the levels of unobservable latent constructs $L_{it}$, $i \in \mathcal{I}$, $t \in \mathcal{T}$, and can be observed under two different latent regimes: awareness (AWR) and middle or extreme categories response style (EMRS), which are captured by binary latent variables $U_{it}$, $i \in \mathcal{I}$, $t \in \mathcal{T}$. The presence of two regimes is based on the idea that, when required to express their opinion on one item, respondents either identify their true preference with one category on the rating scale or, when in doubt or reluctant to disclose their opinion, take shelter by opting for the extreme or middle categories. This is the case, for example, of patients asked to give a subjective assessment of their health or disability in daily living, or of people required to evaluate their financial capability; all of them can feel confident or reluctant to answer. The proposal is a hidden Markov model (HMM) defined by two components that describe the distribution of the latent variables and the conditional distribution of the response given the latent variables. It generalizes the models by Bartolucci et al. (2012) to a bivariate latent Markov process. Here, we describe the main features of the model proposed by Colombi et al. (2021).
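As an illustration of how the two components combine, the following is a minimal sketch of the standard scaled forward recursion for HMMs, which gives the log-likelihood of one unit's response sequence. It is not taken from the paper: the function name `sequence_loglik`, the toy numbers, and the simplification of a transition matrix that is constant over time are all assumptions made here for illustration.

```python
import numpy as np

def sequence_loglik(init, trans, emis, ys):
    """Log-likelihood of one unit's responses via the scaled forward recursion.

    init  : (S,)   initial probabilities of the joint latent states
    trans : (S, S) transition matrix, rows = previous state (held constant
            over time here; in the model it may vary with covariates)
    emis  : (S, c) emis[s, y] = P(Y = y + 1 | latent state s)
    ys    : observed categories coded 0, ..., c - 1
    """
    alpha = init * emis[:, ys[0]]        # joint prob. of state and first response
    loglik = 0.0
    for y in ys[1:]:
        scale = alpha.sum()              # rescale to avoid numerical underflow
        loglik += np.log(scale)
        alpha = ((alpha / scale) @ trans) * emis[:, y]
    return loglik + np.log(alpha.sum())

# Hypothetical toy values: S = 4 joint latent states, c = 3 response categories.
init = np.array([0.4, 0.1, 0.3, 0.2])
trans = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.1, 0.7, 0.1, 0.1],
                  [0.2, 0.1, 0.6, 0.1],
                  [0.1, 0.2, 0.1, 0.6]])
emis = np.array([[0.6, 0.1, 0.3],
                 [0.3, 0.1, 0.6],
                 [0.2, 0.6, 0.2],
                 [0.1, 0.5, 0.4]])
loglik = sequence_loglik(init, trans, emis, [0, 1, 1, 2])
```

Because units are assumed independent, the sample log-likelihood would be the sum of such terms over units; an EM algorithm would typically pair this forward pass with a backward pass.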

The latent Markov model. For every $i \in \mathcal{I}$, $t \in \mathcal{T}$, the latent construct $L_{it}$ (e.g. health status, financial capability) has a finite discrete state space $S_L = \{1,\dots,k\}$, while the latent binary response style indicator $U_{it}$ has state space $S_U = \{1,2\}$, where 1 and 2 denote the EMRS and AWR states, respectively. The latent variables are independent across units and, for every unit, $\{L_{it},U_{it}\}_{t \in \mathcal{T}}$ is a first-order bivariate Markov process with states $(u,l)$, $u \in S_U$, $l \in S_L$. The initial probabilities ($t=1$) of $\{L_{it},U_{it}\}_{t \in \mathcal{T}}$ are $\pi_{i1}(u,l)$, and $\pi_{it}(u,l \mid \bar u,\bar l)$ are the transition probabilities. The latter are simplified to

$$\pi_{it}(u,l \mid \bar u,\bar l) = \pi^{U|L}_{it}(u \mid l,\bar u)\,\pi^{L}_{it}(l \mid \bar l), \quad t=2,\dots,T,$$

by assuming that $L_{it}$, given its past, does not depend on the past of $U_{it}$, and that the current $U_{it}$ depends on its past and on the contemporaneous latent construct but not on the past of the latent construct. The row vectors $x^{(m)}_{i}$ and $z^{(m)}_{it}$, $m \in \{L,U\}$, stand for the covariates, not necessarily different, influencing the initial and transition probabilities, respectively, of the latent variables. Assuming independence between the latent variables at the first time occasion, the latent model is specified by the following logit models:

A) a baseline logit model for the initial probabilities of the latent construct,

$$\log\frac{\pi^{L}_{i1}(l)}{\pi^{L}_{i1}(1)} = \alpha_{0l} + \alpha_{1l}\,x^{(L)}_{i}, \quad l=2,\dots,k;$$

B) a logit model for the initial probabilities of the response style indicator,

$$\log\frac{\pi^{U}_{i1}(1)}{\pi^{U}_{i1}(2)} = \bar\alpha_{0} + \bar\alpha_{1}\,x^{(U)}_{i};$$

C) baseline logit models for the marginal transition probabilities of the latent construct, with reference category the state $\bar l$ of the previous time point, i.e. for $\bar l \in S_L$,

$$\log\frac{\pi^{L}_{it}(l \mid \bar l)}{\pi^{L}_{it}(\bar l \mid \bar l)} = \beta_{0l\bar l} + \beta_{1l\bar l}\,z^{(L)}_{it}, \quad l \in S_L,\ l \neq \bar l,\ t=2,\dots,T;$$

D) a logit model for the conditional transition probabilities of the response style indicator, for each response style state $\bar u$ of the previous occasion and for each current state $l$ of the latent construct,

$$\log\frac{\pi^{U|L}_{it}(1 \mid l,\bar u)}{\pi^{U|L}_{it}(2 \mid l,\bar u)} = \bar\beta_{0l\bar u} + \bar\beta_{1l\bar u}\,z^{(U)}_{it}, \quad l \in S_L,\ \bar u \in S_U,\ t=2,\dots,T.$$
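The factorization of the joint transition probabilities can be made concrete with a short sketch. It builds the $2k \times 2k$ transition matrix of the pair $(U_{it}, L_{it})$ from intercepts of logit models C and D only. The numbers are the constants (cst) of rows 3 to 8 of Table 1, read here as $\beta_{0l\bar l}$ and $\bar\beta_{0l\bar u}$; treating them as the logits of a household with all covariates at zero assumes 0/1 dummy coding, which the source does not state. The function name and array layout are hypothetical.

```python
import numpy as np

def joint_transition_matrix(beta0, beta0_bar):
    """Joint transition matrix of (U, L) from the intercepts of models C and D.

    beta0[l, lbar]     : log pi^L(l | lbar) / pi^L(lbar | lbar); the diagonal
                         is the reference category and is set to 0
    beta0_bar[l, ubar] : log pi^{U|L}(1 | l, ubar) / pi^{U|L}(2 | l, ubar)
    Rows of the result index the previous state (ubar, lbar), columns the
    current state (u, l), both flattened as u * k + l (0-based).
    """
    k = beta0.shape[0]
    pL = np.empty((k, k))                     # pL[lbar, l] = P(L_t = l | L_{t-1} = lbar)
    for lbar in range(k):
        logits = beta0[:, lbar].copy()
        logits[lbar] = 0.0                    # staying put is the baseline
        w = np.exp(logits - logits.max())
        pL[lbar] = w / w.sum()
    pU1 = 1.0 / (1.0 + np.exp(-beta0_bar))    # P(U_t = EMRS | l, ubar)
    pU = np.stack([pU1, 1.0 - pU1])           # pU[u, l, ubar]
    # pi(u, l | ubar, lbar) = pi^{U|L}(u | l, ubar) * pi^L(l | lbar)
    P = np.einsum("ula,bl->abul", pU, pL)
    return P.reshape(2 * k, 2 * k)

# Intercepts from Table 1 (k = 2); 0-based index 0 is state 1, index 1 is state 2.
beta0 = np.array([[0.0, -11.93],      # l = 1: (unused), beta_{0,1,2}
                  [-0.86, 0.0]])      # l = 2: beta_{0,2,1}, (unused)
beta0_bar = np.array([[1.10, 1.91],   # l = 1: ubar = 1, ubar = 2
                      [-3.36, 1.69]]) # l = 2: ubar = 1, ubar = 2
P = joint_transition_matrix(beta0, beta0_bar)
```

Each row of `P` sums to one by construction, since both factors are proper conditional distributions. With covariates, the same construction would be repeated for every unit and occasion after adding the linear predictors to the intercepts.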

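The observation model. Independence is assumed among units. The conditional probability functions of $Y_{it}$, given the EMRS $(1,l)$ and AWR $(2,l)$ latent states, are both time and subject invariant and are denoted by $f(y \mid l,u)$, $u \in S_U$, $l \in S_L$, $y \in \mathcal{C}$, for $t \in \mathcal{T}$, $i \in \mathcal{I}$. Given the EMRS regime, $f(y \mid l,1)$, $l \in S_L$, is parameterized by the logits

$$\log\frac{f(y \mid l,1)}{f(y-1 \mid l,1)} = \phi_{0l} + \phi_{1l}\,s(y), \quad y=2,\dots,c,$$

where the scores are the known constants

$$s(y) = \frac{c/2 - y}{\sum_{j=1}^{c-1}(j - c/2)^2}, \quad y \in \mathcal{C};$$

$\phi_{0l}$ governs the skewness and $\phi_{1l}$ the U or bell shape. Given the AWR regime, $f(y \mid l,2)$, $l \in S_L$, is parameterized by the logits $\log\frac{f(y \mid l,2)}{f(y-1 \mid l,2)}$, $y=2,\dots,c$, each of which is a free parameter indexed by $(y,l)$.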
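To see how such logits determine a distribution over the rating scale, here is a minimal sketch that turns adjacent-category logits into a probability function. It rests on two readings of the text that are assumptions: the logits are taken as $\log f(y)/f(y-1)$ for $y=2,\dots,c$, and the scores as $s(y) = (c/2 - y)/\sum_{j=1}^{c-1}(j - c/2)^2$. All parameter values and function names are hypothetical and chosen only to produce a U-shaped and a bell-shaped example on a scale with $c=6$ categories.

```python
import numpy as np

def scores(c):
    """Scores s(y) = (c/2 - y) / sum_{j=1}^{c-1} (j - c/2)^2 for y = 1..c
    (formula as read from the text; an assumption of this sketch)."""
    y = np.arange(1, c + 1)
    denom = np.sum((np.arange(1, c) - c / 2) ** 2)
    return (c / 2 - y) / denom

def pmf_from_local_logits(logits):
    """Turn logits log f(y)/f(y-1), y = 2..c, into a pmf on 1..c."""
    log_f = np.concatenate(([0.0], np.cumsum(logits)))  # log f(y) up to a constant
    w = np.exp(log_f - log_f.max())
    return w / w.sum()

def emrs_pmf(phi0, phi1, c):
    """EMRS regime: two parameters drive all c - 1 logits through the scores."""
    s = scores(c)[1:]                                   # s(2), ..., s(c)
    return pmf_from_local_logits(phi0 + phi1 * s)

c = 6
# Hypothetical parameter values, for illustration only.
u_shaped = emrs_pmf(phi0=-1.0, phi1=-10.0, c=c)     # mass on both extremes
bell_shaped = emrs_pmf(phi0=1.0, phi1=10.0, c=c)    # mass on the middle
# AWR regime: one free logit per adjacent pair of categories.
awr = pmf_from_local_logits(np.array([0.5, 1.0, 0.2, -0.8, -1.5]))
```

The EMRS regime is thus restricted to a two-parameter family of shapes, while the AWR regime is saturated: any distribution on the $c$ categories can be reached with $c-1$ free logits.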

Application to Bank of Italy data. We applied the model to the panel data from the Survey on Household Income and Wealth (Bank of Italy), collected every 2 years from 2006 to 2016 on 1109 Italian households. The ordinal response of interest is the perception of the household's financial ability to make ends meet (ve = very easily, e = easily, fe = fairly easily, sd = with some difficulty, d = with difficulty, gd = with great difficulty). The covariates are: G (female, male), J (Jse: self-employed, Jhrs: housekeeper/retired/student, employee), CH (with children, no children), D (with debts, no debts), S (with savings, no savings), E (up to secondary school, over high school), R (not risk averse in managing financial investments, risk averse), with the reference category listed last in each case (set in italics in the original). The minimum BIC corresponds to the model with $k=2$ states, meaning that households can be grouped according to whether they feel financially confident ($l=1$) or deal with financial stress ($l=2$). Fig. 1 allows us to characterize the choices of the respondents in the 4 latent states. Individuals in the financially confident latent state, when in doubt about their perception, are more likely to choose the optimistic extreme points, whereas AWR respondents are more inclined to choose the intermediate ratings. Reluctant households (EMRS) in the latent group that deals with financial stress have the highest probabilities of reporting great difficulties, while AWR respondents in the same group are more likely to report just some difficulties. The behavior in the 4 states is well distinguished, and optimistic/pessimistic choices are mainly due to the EMRS tendency.

Figure 1. Observation probability functions of AWR and EMRS respondents in the two latent states of the perceived financial condition.

Table 1. Estimates (EM algorithm) of the parameters of logit models A, B, C, D.

row  parameters                                 cst       G     Jse    Jhrs      CH       D       S       E       R
1    $(\alpha_{02}, \alpha_{12})$               2.8    0.44   -1.38   -0.75   -0.15    0.02   -1.44   -1.86   -0.35
2    $(\bar\alpha_{0}, \bar\alpha_{1})$       -0.06   -0.03    0.16    0.08   -0.04    0.32    0.63    0.04    0.14
3    $(\beta_{021}, \beta_{121})$             -0.86    1.32    0.27   -0.49   -0.89    0.48   -1.69   -1.16   -0.17
4    $(\beta_{012}, \beta_{112})$            -11.93    0.18   -0.91   -0.21   -0.36   -0.23    8.44    1.38   -8.83
5    $(\bar\beta_{011}, \bar\beta_{111})$      1.10    0.45   -0.29    0.00   -0.20    0.13   -0.79   -0.47   -0.06
6    $(\bar\beta_{021}, \bar\beta_{121})$     -3.36   -0.05    1.09   -0.33    0.45   -0.37    1.97    0.81   -0.37
7    $(\bar\beta_{012}, \bar\beta_{112})$      1.91   -0.07   -0.35   -0.23    0.00   -0.05   -0.19   -0.29   -0.39
8    $(\bar\beta_{022}, \bar\beta_{122})$      1.69   -0.50   -0.34   -0.08    0.10   -0.07    1.80   -0.09   -0.37

cst: constant. Highlighted estimates in the original table: 95% confidence interval does not contain zero.

From the signs of the estimates in row 1 of Table 1, we deduce that at the first occasion women, employees, people without savings, people with high education and risk-averse people are more likely to be in the worse financial status. Further, respondents with savings show a greater propensity to a response style at the beginning of the survey (row 2). From row 3, it seems that, between two consecutive occasions, women move from the financially confident condition ($l=1$) to the worse status ($l=2$) with higher probability, while low-educated households with children and savings are more likely to stay in the previous, more comfortable financial status ($l=1$). Individuals who have savings and a low education move with greater probability from the financially stressed status ($l=2$) to the better condition ($l=1$), while financially stressed households are more likely to remain in the same worse status when they are not risk averse (row 4). From rows 5-6, a change from the EMRS status ($\bar u=1$) to AWR behavior ($u=2$) is more likely for low-educated persons with savings who currently belong to the group of financially confident households, while self-employed and low-educated respondents with savings show a greater probability of remaining in the EMRS status if they were reluctant on the previous occasion ($\bar u=1$) and are currently financially stressed ($l=2$).

Those who are not risk averse and currently feel financially confident are more likely to keep their previous awareness in revealing their financial capability. On the other hand, individuals with savings who are in the financially stressed latent state are more prone to give up the previous AWR behavior and opt for a response style (rows 7-8).
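The sign-based reading of Table 1 can be complemented by converting a logit into a probability. The sketch below does this for logit model B (row 2), using the constant and the coefficient of S. It assumes 0/1 dummy coding with S = 1 for households with savings and all other covariates at their reference categories; the coding is inferred from the interpretation in the text, not stated in the source, and the variable names are hypothetical.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Row 2 of Table 1: log-odds of starting in the EMRS state (u = 1) versus AWR (u = 2).
alpha_bar_cst = -0.06   # constant
alpha_bar_S = 0.63      # coefficient of S (with savings)

# Household in all reference categories (assumed to be coded 0).
p_emrs_reference = inv_logit(alpha_bar_cst)
# Same household, but with savings.
p_emrs_savings = inv_logit(alpha_bar_cst + alpha_bar_S)

print(round(p_emrs_reference, 3), round(p_emrs_savings, 3))  # 0.485 0.639
```

Under these assumptions, having savings raises the estimated initial probability of the EMRS regime from about 0.49 to about 0.64, which matches the direction described for row 2.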

References

BARTOLUCCI, F., FARCOMENI, A., & PENNONI, F. 2012. Latent Markov Models for Longitudinal Data. CRC Press.

COLOMBI, R., GIORDANO, S., & KATERI, M. 2021. Hidden Markov models for longitudinal rating data with dynamic response styles: evidence on household financial capability. Submitted.
