

Socioeconomic Status based on Social Media Data

4.3 The Proposed Model

4.3.2 Coupled Social Media Content Representation Model

In this part, we first present the social media text representation method and the coupled user level attribute representation method. Then, the social media text representation and the platform-based user level attribute representation are aggregated into a vector representation of social media content. Finally, based on the social media content representation, we build a 3-way classifier to assign an SES label to each social media user.

Social Media Text Representation. Long Short-Term Memory (LSTM) [43], a variant of the RNN, is widely adopted for textual data modeling due to its excellent performance on sequence modeling. LSTM is able to capture long-term dependencies in a sequence by introducing a memory cell. To model the semantic representation of social media text while preserving word order, we adopt BiLSTM (Bidirectional LSTM) to encode the social media text in both the forward and backward directions, which increases the amount of input information available to the network compared with a unidirectional LSTM. Besides, to take into account the hierarchical structure of social media text, and inspired by the principle of compositionality [30], we model a social media user's text through a hierarchical structure composed of three levels, i.e., the word level, the microblog level and the user level.

As shown in Figure 4.2, in the word level, we first embed each word in a microblog b_i into a low dimensional semantic space, i.e., each word w_i^j is mapped to its embedding w_i^j ∈ R^d. The word embedding method and its settings will be described in Section 4.5.1. At each step, given


Figure 4.2: The architecture of the proposed model.

an input word embedding w_i^j, the current cell state c_i^j and hidden state h_i^j can be updated from the previous cell state c_i^{j-1} and hidden state h_i^{j-1} as follows:

i = σ(W_i [h_i^{j-1}; w_i^j] + b_i),
f = σ(W_f [h_i^{j-1}; w_i^j] + b_f),
o = σ(W_o [h_i^{j-1}; w_i^j] + b_o),
c̃ = tanh(W_c [h_i^{j-1}; w_i^j] + b_c),
c_i^j = f ⊙ c_i^{j-1} + i ⊙ c̃,
h_i^j = o ⊙ tanh(c_i^j),

where i, f, o denote the gate activations, ⊙ denotes element-wise multiplication, σ is the logistic sigmoid function and W, b are the trainable parameters. Therefore, for a sequence of words {w_i^1, w_i^2, ..., w_i^{l_i}}, the forward LSTM reads the word sequence from w_i^1 to w_i^{l_i} and the backward LSTM reads the word sequence from w_i^{l_i} to w_i^1. We then concatenate the forward hidden state →h_i^j and the backward hidden state ←h_i^j, i.e., h_i^j = [→h_i^j; ←h_i^j], where [·;·] denotes the concatenation operation. In BiLSTM, the hidden state h_i^j encodes the information of the whole sequence centered around w_i^j. As a result, the BiLSTM network receives [w_i^1, w_i^2, ..., w_i^{l_i}] and generates the hidden states [h_i^1, h_i^2, ..., h_i^{l_i}]. We then feed the hidden states to an average pooling layer to obtain the microblog text representation b_i for microblog b_i.
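These update equations and the forward/backward pass can be sketched as follows. This is a minimal illustration with toy, randomly initialized parameters (dimensions d = 8 and h = 16 are arbitrary, and the same parameters are reused for both directions for brevity; a trained BiLSTM would use separate parameters per direction):

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 16  # word-embedding size, hidden size (toy values)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One (W, b) pair per gate, acting on the concatenation [h_prev; w_t].
params = {g: (rng.normal(scale=0.1, size=(h, h + d)), np.zeros(h))
          for g in ("i", "f", "o", "c")}

def lstm_step(w_t, h_prev, c_prev):
    x = np.concatenate([h_prev, w_t])
    i = sigmoid(params["i"][0] @ x + params["i"][1])   # input gate
    f = sigmoid(params["f"][0] @ x + params["f"][1])   # forget gate
    o = sigmoid(params["o"][0] @ x + params["o"][1])   # output gate
    c_tilde = np.tanh(params["c"][0] @ x + params["c"][1])
    c = f * c_prev + i * c_tilde       # element-wise (Hadamard) updates
    h_new = o * np.tanh(c)
    return h_new, c

def lstm(words):
    # Run over a word sequence and return all hidden states.
    h_t, c_t = np.zeros(h), np.zeros(h)
    states = []
    for w_t in words:
        h_t, c_t = lstm_step(w_t, h_t, c_t)
        states.append(h_t)
    return np.stack(states)

words = rng.normal(size=(5, d))        # a 5-word microblog (toy embeddings)
fwd = lstm(words)                      # forward pass
bwd = lstm(words[::-1])[::-1]          # backward pass, re-aligned to word order
h_cat = np.concatenate([fwd, bwd], axis=1)   # h_j = [->h_j ; <-h_j]
b_vec = h_cat.mean(axis=0)             # average pooling -> microblog vector b_i
print(b_vec.shape)                     # (32,)
```

The same encode-then-pool pattern is applied again at the microblog level, with microblog vectors in place of word embeddings.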

Figure 4.3: An overview of coupled user level attribute representation.

In the microblog level, given the microblog representation vectors of a user {b_1, ..., b_n}, we also utilize BiLSTM to encode the microblogs as follows:

→h_i = →LSTM(b_i), (4.5)
←h_i = ←LSTM(b_i), (4.6)

We then concatenate the forward hidden state →h_i and the backward hidden state ←h_i, i.e., h_i = [→h_i; ←h_i]. h_i summarizes the neighboring microblogs around the i-th microblog while still focusing on the i-th microblog. We then feed the hidden states to an average pooling layer to obtain the final social media text representation u_t for user u.
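Putting the two levels together, the hierarchical text encoder composes as sketched below. The BiLSTM-plus-average-pooling encoder is replaced here by a fixed random projection of the mean input, a hypothetical stand-in used only to show how the shapes compose (the dimensions 8, 32 and 64 are arbitrary toy values):

```python
import numpy as np

rng = np.random.default_rng(3)
d2 = 32  # microblog-vector size (2 x word-level hidden size)

def toy_bilstm_avg(seq, out_dim, key):
    # Stand-in for a trained BiLSTM + average pooling: a fixed random
    # projection of the mean input (illustrative only, not a real encoder).
    proj = np.random.default_rng(key).normal(size=(out_dim, seq.shape[1]))
    return proj @ seq.mean(axis=0)

# Word level: each microblog (a sequence of 8-dim word embeddings) -> b_i.
# The same word-level encoder (key=0) is shared across all microblogs.
microblogs = [rng.normal(size=(n_words, 8)) for n_words in (4, 7, 5)]
b = np.stack([toy_bilstm_avg(m, d2, key=0) for m in microblogs])   # n x 32

# Microblog level: the sequence b_1..b_n -> user text representation u_t.
u_t = toy_bilstm_avg(b, 2 * d2, key=1)
print(u_t.shape)   # (64,)
```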

Coupled User Level Attribute Representation. Besides social media text, each social media user generally has platform-based user level attributes. For example, some attributes, like the number of followees, indicate platform impact, while others, like the number of microblogs, indicate platform behaviors. Following previous related work, we assume that these user level attributes can contribute to the representation of social media content for individual SES prediction.

To the best of our knowledge, most previous works only leverage the original user level attributes without considering the relations among attributes. However, as observed in previous work [18, 96], in the real world attributes are more or less coupled via explicit or implicit relationships. Therefore, it is natural to hypothesize that the user level attributes are related to each other in some way. To this end, this work employs a coupled representation method [96] to represent user level attributes, which is able to capture such latent relations among attributes.

To be more specific, as illustrated in Figure 4.3, we consider two kinds of interaction relations among platform-based user level attributes: the intra-coupled interaction within an attribute, which captures the correlations between each attribute and its own powers, and the inter-coupled interaction among different attributes, which captures the correlations between each attribute and the powers of the other attributes.


Firstly, we map the original attribute space to an expanded space for incorporating linear and nonlinear information by means of a power expansion as follows:

{⟨a_1⟩^1, ⟨a_1⟩^2, ..., ⟨a_1⟩^L, ⟨a_2⟩^1, ⟨a_2⟩^2, ..., ⟨a_2⟩^L, ..., ⟨a_m⟩^1, ⟨a_m⟩^2, ..., ⟨a_m⟩^L}, (4.7)

where ⟨a_j⟩^p (1 ≤ p ≤ L, p ∈ Z, 1 ≤ j ≤ m) denotes the p-th power of the corresponding value of attribute a_j.
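The power expansion of Eq. 4.7 can be sketched as below, with a hypothetical attribute matrix of 3 users and 2 attributes (in practice attribute values would typically be normalized first, since high powers of large raw counts overflow quickly):

```python
import numpy as np

# Toy attribute matrix: 3 users x 2 attributes (hypothetical values,
# e.g. number of followees and number of microblogs).
V = np.array([[120.,  45.],
              [ 30., 210.],
              [ 75.,  90.]])
L = 3  # highest power kept in the expansion

# Eq. 4.7: for each attribute a_j, its powers <a_j>^1 ... <a_j>^L,
# grouped attribute by attribute.
expanded = np.concatenate(
    [V[:, [j]] ** p for j in range(V.shape[1]) for p in range(1, L + 1)],
    axis=1)
print(expanded.shape)   # (3, 6): m * L expanded columns per user
```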

Leveraging the power expansion, the intra-coupled interaction within attribute a_j is defined as an L×L matrix M^{Ia}(a_j) that captures the correlations between attribute a_j and its own powers ⟨a_j⟩^p:

M^{Ia}(a_j) = [θ_{pq}(j)]_{L×L}, (4.8)

where θ_{pq}(j) denotes the Pearson's product-moment correlation coefficient between ⟨a_j⟩^p and ⟨a_j⟩^q. Here, we use a revised correlation coefficient that takes into account the p-values for testing the hypothesis of no correlation between attributes, i.e., if the p-value is no less than 0.05, the correlation coefficient is set to 0.

Besides, the inter-coupled interaction between numerical attribute a_j and the other attributes a_k (k ≠ j) is defined as an L × L·(m−1) matrix M^{Ie}(a_j|{a_k}_{k≠j}):

M^{Ie}(a_j|{a_k}_{k≠j}) = [δ_{pq}(j, k_i)]_{L×L·(m−1)}, (4.9)

where δ_{pq}(j, k_i) denotes the Pearson's product-moment correlation coefficient between ⟨a_j⟩^p and ⟨a_{k_i}⟩^q, and {a_k}_{k≠j} = {a_{k_1}, ..., a_{k_{m−1}}} is the set of attributes other than a_j.

For each user object u_i, the attribute value of a_j and its powers are represented as a vector:

z_i^e(a_j) = [⟨v_{ij}⟩^1, ⟨v_{ij}⟩^2, ..., ⟨v_{ij}⟩^L], (4.10)

while the attribute values of the other attributes {a_k}_{k≠j} and their powers are denoted as another vector:

z_i^e({a_k}_{k≠j}) = [⟨v_{ik_1}⟩^1, ⟨v_{ik_1}⟩^2, ..., ⟨v_{ik_1}⟩^L, ..., ⟨v_{ik_{m−1}}⟩^1, ⟨v_{ik_{m−1}}⟩^2, ..., ⟨v_{ik_{m−1}}⟩^L]. (4.11)

Here, the attribute value of user u_i on attribute a_j is v_{ij}. We incorporate the intra-coupled interaction and the inter-coupled interaction into a new coupled attribute representation, a 1×L vector r_i(a_j), for user object u_i on the numerical attribute a_j as follows:

r_i(a_j) = z_i^e(a_j) ⊙ w ⊗ [M^{Ia}(a_j)]^T + z_i^e({a_k}_{k≠j}) ⊙ [w, w, ..., w]_{m−1 copies} ⊗ [M^{Ie}(a_j|{a_k}_{k≠j})]^T, (4.12)

where w = [1/(1!), 1/(2!), ..., 1/(L!)], ⊙ denotes the Hadamard product and ⊗ indicates matrix multiplication. After considering all the m original numerical attributes, we obtain the final coupled user level attribute representation for the user object u_i as follows:

r_i^a = [r_i(a_1), r_i(a_2), ..., r_i(a_m)] ∈ R^{L·m}. (4.13)
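Eqs. 4.8–4.13 can be sketched end to end as follows. This is a minimal illustration on toy random data (sizes n, m, L are arbitrary); the p-value screening of correlation coefficients described above is omitted here for brevity:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n, m, L = 50, 3, 2            # users, attributes, highest power (toy sizes)
V = rng.random((n, m))        # toy attribute values, assumed pre-scaled

# Per-attribute power blocks <a_j>^1 ... <a_j>^L (Eq. 4.7), each n x L.
P = [np.stack([V[:, j] ** p for p in range(1, L + 1)], axis=1) for j in range(m)]

def corr(x, y):
    # Pearson correlation; the p >= 0.05 zeroing rule is omitted in this sketch.
    return np.corrcoef(x, y)[0, 1]

w = np.array([1.0 / math.factorial(p) for p in range(1, L + 1)])

def coupled_rep(i, j):
    # Intra-coupled matrix (Eq. 4.8): a_j's powers against its own powers.
    M_intra = np.array([[corr(P[j][:, p], P[j][:, q]) for q in range(L)]
                        for p in range(L)])                 # L x L
    # Inter-coupled matrix (Eq. 4.9): a_j's powers against the others' powers.
    others = [k for k in range(m) if k != j]
    M_inter = np.array([[corr(P[j][:, p], P[k][:, q])
                         for k in others for q in range(L)]
                        for p in range(L)])                 # L x L(m-1)
    z_self = P[j][i]                                        # Eq. 4.10
    z_others = np.concatenate([P[k][i] for k in others])    # Eq. 4.11
    # Eq. 4.12: Hadamard-weight by w, then multiply by the transposed matrices.
    return (z_self * w) @ M_intra.T + (z_others * np.tile(w, m - 1)) @ M_inter.T

# Eq. 4.13: concatenate over all m attributes for user 0.
r_a = np.concatenate([coupled_rep(0, j) for j in range(m)])
print(r_a.shape)   # (6,) = L * m
```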

Before fusing the user level attributes, to capture the latent relationships between high level features, we map the raw attribute vector r^a to the k-length representation vector u^a via a fully connected network as follows:

u^a = r^a · W^a, (4.14)

where the weight matrix W^a encodes the interaction strength over attributes in the fully-connected layer.

Consequently, in the user level, we aggregate the user level attributes and the social media text into a single representation vector. More specifically, we concatenate the social media text representation and the coupled user level attribute representation to obtain the social media content representation u = [u_t; u_a].

Individual SES Prediction based on Social Media Content. Given the high level representation of social media content, we employ a linear layer and a softmax layer to project the social media content representation u into an SES distribution over C classes as follows:

p_c = softmax(W u + b), (4.15)

where p_c is the predicted probability of SES label c. In this model, the cross-entropy error between the ground truth SES level distribution and the predicted SES level distribution is used as the loss function for optimization during training:

L = −Σ_{u∈U} Σ_{c=1}^{C} p_c^g(u) · log(p_c(u)), (4.16)

where p_c^g denotes the gold probability of SES label c, with the ground truth label being 1 and all others being 0, and U represents the set of training social media users.
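The prediction head of Eqs. 4.15–4.16 can be sketched for a single user as follows, with toy random parameters (dimensions and class count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
dim, C = 12, 3                     # content-vector size, number of SES classes
u = rng.normal(size=dim)           # u = [u_t ; u_a], the fused content vector
W = rng.normal(scale=0.1, size=(C, dim))
b = np.zeros(C)

def softmax(z):
    z = z - z.max()                # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(W @ u + b)             # Eq. 4.15: predicted SES distribution
gold = np.zeros(C); gold[1] = 1.0  # one-hot ground-truth SES level
loss = -np.sum(gold * np.log(p))   # Eq. 4.16, single-user cross-entropy
print(p.shape, round(float(p.sum()), 6))   # (3,) 1.0
```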


Figure 4.4: A demonstration of the user search function in Sina Weibo.