Opinion formation on social media: an empirical approach.

Citation: Fei Xiong and Yun Liu, Chaos: An Interdisciplinary Journal of Nonlinear Science 24, 013130 (2014); doi: 10.1063/1.4866011. View online: http://dx.doi.org/10.1063/1.4866011. Published by AIP Publishing.

CHAOS 24, 013130 (2014)

Fei Xiong1,a) and Yun Liu2

1 School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
2 Key Laboratory of Communication and Information Systems, Beijing Municipal Commission of Education, Beijing Jiaotong University, Beijing 100044, China

(Received 15 September 2013; accepted 5 February 2014; published online 11 March 2014)

Opinion exchange models aim to describe the process of public opinion formation, seeking to uncover the intrinsic mechanism in social systems; however, the model results are seldom empirically justified using large-scale actual data. Online social media provide an abundance of data on opinion interaction, but the question of whether opinion models are suitable for characterizing opinion formation on social media still requires exploration. We collect a large amount of user interaction information from an actual social network, i.e., Twitter, and analyze the dynamic sentiments of users about different topics to investigate realistic opinion evolution. We find two nontrivial results from these data. First, public opinion often evolves to an ordered state in which one opinion predominates, but not to complete consensus. Second, agents are reluctant to change their opinions, and the distribution of the number of individual opinion changes follows a power law. Then, we suggest a model in which agents take external actions to express their internal opinions according to their activity. Conversely, individual actions can influence the activity and opinions of neighbors. The probability that an agent changes its opinion depends nonlinearly on the fraction of opponents who have taken an action. Simulation results show user action patterns and the evolution of public opinion in the model coincide with the empirical data. For different nonlinear parameters, the system may approach different regimes. A large decay in individual activity slows down the dynamics, but causes more ordering in the system. © 2014 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 Unported License. [http://dx.doi.org/10.1063/1.4866011]

Opinion dynamics tries to describe the process of public opinion formation in social systems.
Many opinion models have been presented that explore how local individual behavior affects collective phenomena. However, the results of those models are seldom empirically justified using large-scale actual data, and whether traditional opinion models suitably describe online opinion interactions requires further exploration. We analyze users’ opinions regarding a certain topic using large-scale actual data collected from a well-known social network, i.e., Twitter, and discover two nontrivial results: first, consensus is difficult to achieve in a finite time, and second, users seldom change their opinions, and the number of individual opinion changes decays as a power law. We present a discrete opinion model including agents’ internal opinions and external actions that are determined by agents’ activity. Agents’ activity also evolves during the dynamics. Simulation results show that our model can reproduce properties similar to those of the actual data. We hope theoretical opinion models will be verified by actual data in different social systems so that they can better characterize actual social interactions. In the future, whether opinion models can predict the evolutionary trend of public opinion in actual situations will be investigated. This study will improve the applicability of research on opinion dynamics.

a)

E-mail: [email protected]

1054-1500/2014/24(1)/013130/8

I. INTRODUCTION

In recent years, many opinion models have been presented, simulating opinion interactions from a personal perspective. Given an individual interacting rule, the research on opinion dynamics aims to understand the global complex properties in the area of social science.1,2 Statistical methods are used to explore how the local rules affect the collective behavior of social agents.3 In these models, agents hold one of several possible opinions, corresponding to discrete opinion models,4–7 or the opinions of agents take value from a certain range of real numbers, i.e., continuous opinion models.8,9 Starting from an initial opinion configuration, agents update their opinions according to the interacting rules, and finally the models try to find the formation of public opinion and the conditions of phase transition. Discrete opinion models, such as the Ising model and the Sznajd model,10–12 tend to use the analogy of ferromagnetic spins from the science of solid-state physics. Interacting rules are defined in opinion models, and agents update their states following the pattern of ferromagnetic spins, explaining social systems through the analogy of physical systems. In most models, neighboring influence plays a vital role in individual decisions. The final macroscopic state of the system may be consensus, fragmentation, or polarization. In binary opinion models, the consensus state is often favored as a result of imitation and compromises among neighbors, but polarization may also be observed in some other discrete models, such as the model with discrete vectorial opinions13 and the voter model with three opinions.14 In addition, virtual and actual complex


networks have been used to mediate opinion interaction.15–17 Sociological and psychological features have also been introduced into opinion models, such as memory,18 inertia,19,20 noise,21 and conviction,22 characterizing the way in which these features change individual behavior and the global dynamics in a specific scenario. Although more and more actual factors are being included in opinion models, the question of whether the models can adequately describe the process of opinion formation in real society and explain or even predict social phenomena still requires further exploration. In Ref. 23, the Finland 2003 election data of the voting sets were applied to verify an opinion model, and it was found that the transient opinion profiles produced by the model are in agreement with the real data. In Ref. 24, the authors studied political discussions on an Internet forum. They chose several hundred posts and identified their sentiments. Focusing on the growth and topology of the network, they proved that quarrels and personal conflicts between participants boost the growth of discussions. They identified the final state of discussions, finding that opinion exchanges do not lead to consensus formation and opinions tend to go to extremes. Similar results are found in Ref. 25. These studies24,25 raise the question of whether traditional opinion models suitably describe online opinion interactions, and whether the interacting rules reflect the actual characteristics of human behavior. Except for these few studies, due to the limitations of data acquisition and processing capability, large-scale empirical analysis has seldom been carried out to check the validity of opinion models. The Internet has become one of the most important ways to obtain information. As a popular application service on the Internet, online social media have attracted millions of users. On social media, users interact with others, build relationships, publish posts or replies, and discuss topics. 
Therefore, the growth of social networks is promoted by users’ actions. The process of opinion formation on social media is more complicated than in real society, and information diffuses and evolves more rapidly. For instance, users always discuss issues with others anonymously. They do not know the true names of their neighbors, and they cannot become well acquainted with the personality characteristics of their neighbors. Moreover, users cannot directly see the internal opinions of their neighbors but instead learn about their opinions through the posts they publish. In Refs. 26 and 27, a model with continuous opinions and discrete actions was presented; the findings suggested that after observing the actions of neighbors, agents update their own internal opinions that cannot be noticed by others. This model may be an attempt to interpret the opinion evolution on social media. Social media provide huge amounts of user and topic information. In Refs. 28 and 29, the authors empirically studied the spread of health behavior in an online social network. In Refs. 30 and 31, the dynamics of health behavior sentiments was investigated by real data collected from a large social network. From social media, one can also collect user relationships and posts easily and freely, and analyze the sentiments of the posts to reproduce the opinion evolution process. In this

study, we collect abundant data from a popular social medium, i.e., Twitter, and obtain users’ opinions from posts. Studying the evolution of public opinion, we determine differences in opinion formation compared with traditional opinion models. Based on our findings, we propose a discrete opinion model in which individual opinions are latent and only agents’ actions are accessible to others. Individual actions are induced by their activity, which evolves during the opinion evolution, and individual opinion changes nonlinearly depending on the proportion of opponents. The rest of the paper is structured as follows. Section II carries out an empirical analysis of opinion interaction on Twitter. Section III presents a model with internal opinions and external actions driven by individual activity. Section IV illustrates simulation results and provides a discussion about the model. Concluding remarks are given in Sec. V.

II. EMPIRICAL ANALYSIS OF OPINION FORMATION ON ONLINE SOCIAL MEDIA

Twitter, a microblogging service, has become one of the most popular social media on the Internet. With its huge number of readers, Twitter has demonstrated its strength for information propagation. Twitter is sometimes the source of popular topics and can even cause online emergencies. Therefore, studying public opinion on Twitter can help us to understand the opinion formation process in online social networks. We collected an abundance of data from Twitter through our directed robot, including information about users and their related posts. After several hours of collection from Twitter, 2 348 854 user profiles and approximately 6 million posts from December 2010 to June 2011 were downloaded. The data in each month contain 301 184, 908 976, 967 328, 1 390 116, 1 282 112, 1 354 921, and 725 435 posts, respectively, and contain 164 154, 435 632, 447 910, 611 082, 586 182, 592 893, and 435 813 active users, respectively. Users of Twitter usually publish a post to express their ideas, attitudes, or sentiments toward a social event or product. Users’ posts do not always track the real evolution of their opinions because a user may not publish posts all the time, and the active users who create posts frequently comprise only a small proportion of the population. Even so, we can still analyze all posts belonging to a certain topic to investigate the dynamic trend of public opinion on Twitter. In our sentiment analysis (see the Appendix), these posts may have positive or negative polarity. In this paper, we treat the sentiment of a post as the user’s current opinion, so users can hold one of two possible opinions on a topic. We chose three topics about electronic products (i.e., “iPhone 4,” “Blackberry,” “iPad 2”) and gathered related posts. The three topics contain 102 815, 225 954, and 199 702 posts, respectively.
For the three topics, we calculated the proportion of cumulative positive posts at different times to quantify the opinion dynamics on Twitter; thus, we can observe the change of public opinion over time. In this paper, we consider that the system achieves an ordered state if there exists a clear majority-minority splitting of the two opinions, and the polarization of opinions can



FIG. 1. Time evolution for the proportion of total positive posts on three topics.

also be observed; however, in a disordered state, the densities of the two opinions are equal and no opinion dominates.32 Figure 1 shows the evolution of the proportion of total positive posts on these topics. As shown in Figure 1, the public opinion for each topic fluctuates in the early stage, and then the dynamics slows down markedly after a short time. Ultimately, the public opinion evolves into an ordered state in which one opinion clearly predominates, but a state of complete consensus is difficult to reach. At first, one opinion holds a slight advantage, and this advantage of the initial majority opinion gradually grows during the evolution. On these three topics, the majority of people hold a favorable attitude, and more approvers are attracted by the products “iPhone 4” and “iPad 2.” We also study the evolution of other topics, and find that an ordered state is a common stable state for online dynamic systems. In Ref. 33, the authors proposed a non-consensus opinion model in which agents within a community hold the same opinion but opinions of agents across communities are different. We therefore check the real user network of Twitter, but cannot find a clear community structure with consensus within communities, and the proportion of positive posts for these three topics in any subset of the real network is generally different.

Now, we study the participation level of users in each topic. As shown in Figure 2, the distribution of the number of users’ posts decays as a power law with a long tail. The power exponents of the distribution for “iPhone 4,” “iPad 2,” and “Blackberry” are $\gamma = 2.343 \pm 0.008$, $\gamma = 2.451 \pm 0.004$, and $\gamma = 2.767 \pm 0.011$, respectively. More than 10 000 users only publish one post on a certain topic, but several enthusiasts discuss and comment on the product more than 100 times.
Meanwhile, the heterogeneity in users’ participation in the topic “Blackberry” remains the largest among these topics, implying that users have less interest in this topic. It is also found that the results for the topics “iPhone 4” and “iPad 2” are similar. The reason may be that these two products “iPhone 4” and “iPad 2” are provided by the same corporation, and a lot of users hold positive attitude towards the corporation, as well as its products. Many users participate in both the topics “iPhone 4” and “iPad 2,” and have similar activity on these two topics.
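
As an illustration of how an exponent of this kind can be estimated, the following minimal sketch fits a straight line to the log-log histogram of posts per user by ordinary least squares. It is not the authors' fitting procedure, which the text does not specify; the function name `fit_power_law_exponent` and the synthetic data are hypothetical, and least squares on a log-log histogram is only one of several possible estimators.

```python
import math
from collections import Counter


def fit_power_law_exponent(posts_per_user):
    """Return gamma from a least-squares fit of log n(k) = c - gamma * log k.

    posts_per_user: iterable of positive integers, one entry per user.
    n(k) is the number of users who published exactly k posts.
    """
    histogram = Counter(posts_per_user)
    ks = list(histogram)
    if len(ks) < 2:
        raise ValueError("need at least two distinct post counts")
    xs = [math.log(k) for k in ks]
    ys = [math.log(histogram[k]) for k in ks]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The slope of the fitted line is negative; the exponent is its magnitude.
    return -sxy / sxx


# Synthetic data (an assumption for illustration): n(k) proportional to k^(-2.5).
synthetic = [k for k in range(1, 21) for _ in range(int(1e5 * k ** -2.5))]
gamma_hat = fit_power_law_exponent(synthetic)
```

On real data the sparse tail of the histogram can bias such a fit, so the result should be read as a rough estimate.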

FIG. 2. Distribution for the number of users’ posts on the three topics. The slopes of the straight lines for “iPhone 4,” “iPad 2,” and “Blackberry” are 2.343, 2.451, and 2.767, respectively.

Figure 3 shows the distribution for the number of individual opinion changes in the interaction. Agents’ internal opinions are recorded according to the posts published by them, so we can only consider those agents that publish posts at least twice to track opinion changes. As shown in Figure 3, although users create many posts to express their opinions, these posts have little effect on the opinions of neighbors, and users tend to insist on their original attitude about a product. From the data, we find that although people are less active on the topic “Blackberry,” more users tend to update their opinions. One reason may be that some users do not form a deep impression of the topic until they withdraw from the interaction. It is also found that the distribution for the number of opinion changes approximately follows a power law. The power exponents for “iPhone 4,” “iPad 2,” and “Blackberry” are $\gamma = 2.193 \pm 0.143$, $\gamma = 3.01 \pm 0.116$, and $\gamma = 2.68 \pm 0.174$, respectively. One may think that the distribution of individual opinion changes is caused by the power-law decay of the number of posts.

FIG. 3. Distribution for the number of individual opinion changes. The slopes of the straight line for “iPhone 4,” “iPad 2,” and “Blackberry” are 2.193, 3.01, and 2.68, respectively.
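
The two empirical quantities used in this section, the cumulative proportion of positive posts and the number of opinion changes per user, can be sketched as follows. This is a minimal illustration under stated assumptions: each post is a hypothetical `(user, timestamp, sentiment)` tuple with sentiment +1 or -1, an opinion change is counted whenever two consecutive posts by the same user differ in sentiment, and the function names are not taken from the paper.

```python
from collections import defaultdict


def cumulative_positive_fraction(posts):
    """Proportion of positive posts among all posts published so far, in time order."""
    fractions = []
    positives = 0
    ordered = sorted(posts, key=lambda post: post[1])
    for total, (_, _, sentiment) in enumerate(ordered, start=1):
        if sentiment > 0:
            positives += 1
        fractions.append(positives / total)
    return fractions


def opinion_changes_per_user(posts):
    """Count sentiment flips between consecutive posts of each user.

    Users with fewer than two posts are skipped, since no change can be observed.
    """
    by_user = defaultdict(list)
    for user, timestamp, sentiment in posts:
        by_user[user].append((timestamp, sentiment))
    changes = {}
    for user, history in by_user.items():
        if len(history) < 2:
            continue
        history.sort(key=lambda item: item[0])
        sentiments = [s for _, s in history]
        changes[user] = sum(a != b for a, b in zip(sentiments, sentiments[1:]))
    return changes


# Hypothetical toy data for illustration only.
toy_posts = [
    ("u1", 1, +1), ("u1", 2, -1), ("u1", 3, -1),
    ("u2", 1, +1),
    ("u3", 1, -1), ("u3", 5, +1), ("u3", 2, -1),
]
toy_changes = opinion_changes_per_user(toy_posts)
toy_fractions = cumulative_positive_fraction(toy_posts)
```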



However, as in Figure 4, the number of individual opinion changes does not have a clear relationship with the number of posts.

FIG. 4. The number of individual opinion changes versus the number of posts for each agent.

III. THE MODEL

In the empirical analysis of the Twitter data, we found that users of social media prefer to keep their original opinions. Complete consensus is difficult to reach, and the system often evolves into an ordered state. In fact, even if agents can update their internal opinions, they do not always publish posts to share their ideas with the public and may drop out of the interaction. In consideration of the characteristics of online social interaction, we present the following opinion model. We assume that each agent has two properties, its opinion and its activity. Agents hold one of two possible opinions, a positive opinion $\sigma = +1$ or a negative opinion $\sigma = -1$. It should be noted that agents’ opinions are latent, and that agents are unaware of the internal opinions of their neighbors. If an agent takes an action (e.g., publishes a post on social media) to express its opinion at time t, then its neighbors can notice its choice. The opinion of this agent’s action is the same as its opinion at time t. This means that at the following time, the agent may change its opinion, but if it does not take a further action, neighbors can only know its opinion at time t, and the agent’s current opinion may no longer be consistent with that action. An agent’s decision depends on its activity, which is denoted by s, and more active agents have a higher priority to take an action. An agent’s activity is influenced by its neighbors. If an agent expresses its opinion, the neighbors see its action and are motivated, so that their activity on the topic increases. Now, we will introduce the interacting rule of our model. In the beginning, all agents are frozen, i.e., their activity $s = 0$. Then, m agents are selected at random, and their activity is increased by 1. These agents are initially active agents that start a conversation on a topic. At each time step, the m agents with the highest activity are selected to express their opinions. Any agent that has ever taken an action is considered to be active and attentive to the topic, and thus has the opportunity to update its opinion. Then, agents’ states are changed as follows:

(1) After agents take an action, their activity decays by the proportion $d$ ($d < 1$). Agents have no reason to maintain their activity, and are unwilling to repeatedly publish the same opinion.
(2) Agents’ actions provide a demonstration for their neighbors, and neighbors are likely to be motivated because agents usually imitate others’ behavior. Therefore, after each agent’s action, the activity of its neighbors increases by 1.
(3) After noticing the recent actions, neighbors are aware of the opinions of these agents. In addition to their activity, active neighbors will also update their opinions. The probability that active neighbors will change their opinions depends nonlinearly on the fraction of disagreeing active agents in their local environment, similar to Refs. 34 and 35.

For instance, if agent i publishes its opinion at time t, then its activity $s_i$ decreases by $d s_i$ and all of its neighbors increase their activity by 1. Let j denote one of agent i’s active neighbors; agent j will then update its opinion. Assuming the fraction of agent j’s neighbors taking an action opposite to j is p, the probability of agent j changing its opinion is defined as $p^{\alpha}$ ($\alpha > 0$). It should be mentioned that agent j can only see the external actions of its neighbors, ignoring the inactive agents that have not taken any action. This means that although some agents may disagree with agent j, if these agents do not express their idea to the public, their opinion has no influence on j. From the above model, when $\alpha = 1$, one recovers the voter model, and the average magnetization is conserved. When $\alpha \to 0$, agents change their opinions with nearly the same probability and more randomly. Therefore, the initial discrepancy between the densities of the two opinions will decrease gradually, and the model will evolve almost like a coin-flip. When $\alpha > 1$, agents change opinions only when a large fraction of neighbors disagree, and the prevalence of the initial majority opinion is enhanced.
However, if $\alpha$ is too large, the opinion dynamics will become frozen and the average magnetization remains unchanged. Obviously, in our model, the probability that an agent with fully disagreeing neighbors changes its opinion equals 1, regardless of the parameters. We consider a simplified case, assuming that agents freely update their opinions on a fully connected network, regardless of the activity. We conduct an analysis of the overall opinion. The overall density of opinion $+1$ at time t is denoted by $f(t)$. Agents holding opinion $-1$, with density $1 - f(t)$, face a fraction $f(t)$ of opponents and switch with probability $f(t)^{\alpha}$, and symmetrically for agents holding opinion $+1$. Thus, the evolution of $f(t)$ is given by

$$\frac{\partial f(t)}{\partial t} = \bigl(1 - f(t)\bigr)\, f(t)^{\alpha} - f(t)\, \bigl(1 - f(t)\bigr)^{\alpha}. \qquad (1)$$
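
Equation (1) can also be explored numerically. The sketch below integrates it with a simple forward Euler scheme; the step size, the number of steps, and the function names are assumptions chosen for illustration and are not part of the original analysis.

```python
def mean_field_rate(f, alpha):
    """Right-hand side of Eq. (1): gain of opinion +1 minus loss of opinion +1."""
    return (1.0 - f) * f ** alpha - f * (1.0 - f) ** alpha


def integrate_density(f0, alpha, dt=0.01, steps=10000):
    """Forward Euler integration of Eq. (1), starting from density f0 of opinion +1."""
    f = f0
    for _ in range(steps):
        f += dt * mean_field_rate(f, alpha)
        # Keep the density inside [0, 1] against rounding errors.
        f = min(1.0, max(0.0, f))
    return f


# Long-time densities for three values of the nonlinear parameter.
f_super = integrate_density(0.6, alpha=2.0)   # alpha > 1
f_voter = integrate_density(0.6, alpha=1.0)   # alpha = 1
f_sub = integrate_density(0.6, alpha=0.5)     # alpha < 1
```

With these settings the density moves toward 1 for the first call, stays at its initial value for the second, and relaxes toward 0.5 for the third.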

In particular, when $\alpha = 1$, the variation of $f(t)$ equals zero, thus leading to $f(\infty) = f(0)$. For $\alpha \neq 1$, it is easy to obtain the stationary solution $f(\infty)$ (simply written as $f$) of Eq. (1); that is, $f = 0$, $f = 1$, or $f = 0.5$. We can analyze the stability of these stationary points by introducing small deviations



around these solutions.19 When $\alpha < 1$, the stable solution is $f = 0.5$. When $\alpha > 1$, the stable solution is $f = 1$ or $f = 0$, depending on the initial condition. If $f(0) > 0.5$, then $f = 1$, and vice versa. However, in the model, agents take an action according to their activity, and the withdrawal of agents from the interaction will prevent the complete consensus state.

IV. SIMULATION RESULTS

We are interested in the co-evolution of individual opinions and activity. We will investigate the heterogeneity of the participation level of agents, and how likely an agent will be to change its opinion following those of its neighbors in our model. In the analytical approach, we neglect agents’ activity, so that for a larger $\alpha$, the system achieves the consensus state. Now, we will consider the influence of individual activity on its external actions and study the process of opinion evolution. First, we will use a real social network as an interaction topology, and obtain the statistical characteristics of agents’ activity and opinion distribution. Then, we will construct a virtual network that mediates the dynamics of the model, studying the relationship between the final opinion distribution and the parameters, and how the nonlinear probability of opinion change takes effect. In the simulations, agents’ initial opinions are assigned uniformly with a given ratio between the two opinions. After m agents with the highest activity take an action and their active neighbors update their opinions asynchronously, the time step is increased by 1. Therefore, the parameter m determines the overall time scale, which should not be too large.

A. Statistical characteristics of user behavior and opinion update

We downloaded node information and relationships from Twitter to build a real network. In the network, all neighboring relationships were downloaded, so the network could be treated as a subset of Twitter. The network contains 4286 nodes, and the average degree of the network is 29.38. The network is used to mediate interactions in our model. Here, we concentrate on the macroscopic distribution of agents’ opinions and activity.
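
The interaction rules of Sec. III can be sketched in a few lines of code. The following is one possible reading of the model, not the authors' implementation. Several details are assumptions: a plain random graph stands in for the Twitter network, ties in activity are broken at random, the fraction p compares the last visible actions of active neighbors with the updating agent's current internal opinion, and all function and variable names are hypothetical.

```python
import heapq
import random


def random_graph(n, avg_degree, rng):
    """Adjacency lists of a simple random graph (a stand-in topology)."""
    neighbors = [set() for _ in range(n)]
    edges_needed = n * avg_degree // 2
    while edges_needed > 0:
        i, j = rng.randrange(n), rng.randrange(n)
        if i != j and j not in neighbors[i]:
            neighbors[i].add(j)
            neighbors[j].add(i)
            edges_needed -= 1
    return [sorted(s) for s in neighbors]


def simulate(neighbors, f0=0.6, m=20, d=0.2, alpha=2.0, steps=200, seed=0):
    """Run the activity-driven opinion model and return (opinions, actions, history)."""
    rng = random.Random(seed)
    n = len(neighbors)
    n_pos = round(f0 * n)
    opinion = [1] * n_pos + [-1] * (n - n_pos)   # latent internal opinions
    rng.shuffle(opinion)
    activity = [0.0] * n                         # all agents start frozen
    last_action = [None] * n                     # None means the agent never acted
    n_actions = [0] * n
    for i in rng.sample(range(n), m):            # m random initiators
        activity[i] += 1.0
    positive, total, history = 0, 0, []
    for _ in range(steps):
        # The m most active agents act; ties are broken at random (assumption).
        actors = heapq.nlargest(
            m, range(n), key=lambda i: (activity[i], rng.random()))
        for i in actors:
            last_action[i] = opinion[i]          # the action shows the current opinion
            n_actions[i] += 1
            total += 1
            positive += 1 if opinion[i] == 1 else 0
            activity[i] -= d * activity[i]       # rule (1): activity decay
            for j in neighbors[i]:
                activity[j] += 1.0               # rule (2): neighbors are motivated
                if last_action[j] is None:
                    continue                     # only active neighbors update
                visible = [last_action[k] for k in neighbors[j]
                           if last_action[k] is not None]
                p = sum(a != opinion[j] for a in visible) / len(visible)
                if rng.random() < p ** alpha:    # rule (3): nonlinear opinion change
                    opinion[j] = -opinion[j]
        history.append(positive / total)         # cumulative share of +1 actions
    return opinion, n_actions, history


graph = random_graph(300, 10, random.Random(1))
final_opinion, actions_per_agent, positive_share = simulate(graph, steps=50)
```

The list `positive_share` corresponds to the kind of curve plotted for the proportion of total positive actions, and `actions_per_agent` to the participation level of each agent.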

FIG. 5. Time evolution for the proportion of total positive actions with different initial opinion assignments, $d = 0.2$, $m = 20$, and $\alpha = 2$.


Figure 5 shows the temporal evolution of the proportion of total positive actions from our model. We choose three initial opinion configurations $f(0) = 0.4$, $f(0) = 0.6$, and $f(0) = 0.8$. Obviously, the proportion of action $+1$ changes drastically in the early stage, and after a short time, it tends to level off while moving in the direction of consensus. The density difference between the two opinions is enhanced because more agents turn to the initial majority opinion. The magnetization is not conserved, and the system becomes more ordered. Moreover, if the decay parameter d is too large, the dynamics stabilize more slowly, but the consensus state is still hard to reach within a finite time. In Figure 5, for a large initial density of $f(0) = 0.8$, a few agents still hold the negative opinion and are difficult to persuade even with more time. These phenomena are in accordance with the opinion evolution of real topics. In addition, although some agents with low activity may drop out of the interaction, the dynamic process is not stopped. Active agents continue to publish their opinions, but these two opinions reach a state of balance. Although the system still evolves extremely slowly, approaching the ordered absorbing state, agents’ activity-based behavior prevents the occurrence of complete consensus in a finite time.

As illustrated in Figure 6, the distribution of individual participation level on the real network in our model has almost a power-law scaling, similar to the actual data in Figure 2. Intuitively, with a large activity decay, agents will become increasingly reluctant to take actions. Thus, as the decay parameter d increases, the number of agents that seldom publish their own opinions also increases, leading to a larger absolute value of the power exponent. In addition, when the number of actions is above 30, the corresponding number of agents is scattered in a large range, and does not strictly adhere to the power-law scaling.
We also find that, when the parameter m is small, the distribution of individual participation level is almost independent of m. However, if m becomes extremely large, especially when it approaches the system size although this is unrealistic, the power law

FIG. 6. Distribution of the number of individual actions with various values of d, $f(0) = 0.6$, $m = 20$, and $\alpha = 2$. The slopes of the straight lines for $d = 0.8$, $d = 0.6$, and $d = 0.2$ are 2.2017, 1.675, and 1.246, respectively.



does not exist. In the model, agents with a larger degree are motivated more often and thus have large activity. Therefore, the underlying network still has influence on individual actions. We investigate the network extracted from the social medium, and find that the node degree has a Poisson-like distribution. Furthermore, we also implement simulations on large-scale small-world and scale-free networks with the same average degree $k = 30$ and achieve analogous results in Figure 7. The average degree of the network does not affect the existence of power-law distribution, but the absolute value of the power exponent decreases with the increase in the average degree. We explore the frequency of individual opinion changes in our model. Consistent with Figure 3, opinion changes are recorded according to external actions because only external actions can be observed by the public; i.e., agents that publish their opinions at least twice are taken into consideration. In Figure 8, no extremeness is introduced into our discrete model, but agents driven by activity rarely change their ideas, coinciding with the actual situation in Figure 3. When $\alpha = 1$, the model reduces to the voter dynamics, but agents do not have continual opinion interactions as they do in the voter model due to the influence of individual activity. Increasing the nonlinear parameter $\alpha$, only agents with a large fraction of opponents will reconsider the other opinion, so agents tend to keep their original opinions. We realize that even if agents only update opinions a few times, most agents will eventually adopt the majority opinion.

FIG. 7. Distribution of the number of individual actions in different networks, $N = 50000$, $m = 20$, $d = 0.8$, and $\alpha = 2$. The slope of the straight line is 2.6142.

FIG. 8. Distribution of the number of individual opinion changes, $f(0) = 0.6$, $m = 20$, and $d = 0.8$. The slopes of the straight lines for $\alpha = 1$, $\alpha = 1.5$, and $\alpha = 2$ are 2.011, 2.36, and 3.1769, respectively.

B. Dependence of opinion evolution on parameters

The proportion of action $+1$ does not always reflect the evolutionary process of internal opinions because some active agents may take more actions and express their opinions more frequently. Therefore, we investigate the detailed internal opinion dynamics with different initial conditions, and calculate the final opinion distribution. Scale-free networks are used as an interacting topology with an average degree of 20. The number of agents N is 10 000. The order parameter36 measures the disordering of a system and characterizes the density divergence of two opinions. A larger order parameter means that the majority opinion accounts for a larger proportion of the population. If the order parameter equals 0, the two opinions have the same density. The order parameter is defined as

$$O = \left\langle \frac{1}{N} \left| \sum_{i=1}^{N} \sigma_i \right| \right\rangle, \qquad (2)$$

where $\langle \cdots \rangle$ denotes the configurational average. Figure 9 illustrates the final order parameter in our model with different initial opinion assignments. The opinion evolution process is intrinsically determined by the nonlinear parameter $\alpha$, and is also related to $f(0)$. For $\alpha = 1$, the order parameter changes linearly with $f(0)$, indicating the conserved magnetization; however, consensus cannot be reached in any realization. The order parameter remains unchanged during the evolution, and this result coincides with the mean-field

FIG. 9. Final order parameter as a function of $f(0)$ with different values of $\alpha$ after 1000 time steps, $d = 0.2$. In the left plot, $m = 20$, and in the right plot, $m = 40$. The results are averaged over 100 different realizations.



FIG. 10. Final order parameter with different values of d after 1000 time steps, $m = 20$, and $\alpha = 3$. The plot is an average of 100 different simulations.

analysis. When $\alpha < 1$, the order parameter goes down, implying that the majority opinion has less of an advantage. Agents may change their opinions even when faced with a small fraction of opponents, making the opinion distribution more random. In particular, a much smaller $\alpha$ will reduce the difference of order parameters with different values of $f(0)$. However, as a result of the influence of agents’ activity, the disordered state with zero magnetization is hardly achieved. For $\alpha > 1$, the system evolves into a more ordered state. The density of the initial majority opinion is enlarged, leading to a large order parameter. The system tends to stabilize quickly, and then the dynamics becomes much slower, so that it becomes difficult for individual opinions to converge. We also find that for $\alpha \le 1$, if $f(0)$ is set around 0.5, then the final densities of these two opinions are approximately equal to each other; however, for $\alpha > 1$, one opinion will still predominate because of random fluctuations. In addition, a larger m promotes individual opinion interaction and increases the variation of the order parameter during the dynamics, but the parameter m cannot affect the final essential regime of the system.

Figure 10 shows the final order parameter with different activity decays. Clearly, the activity decay d does not change the essential ordering of the system, but it can influence the relaxation process. The order parameter is always larger than that of the conserved system. Notably, a large decay d slows the dynamics and decelerates individual interactions but gives the system a larger order parameter. The reason is that when the agents that take actions lose their activity more rapidly, other agents have an additional chance to express their own opinions. Therefore, the extent of interactions is broadened, and more agents will reconsider and update their opinions, thereby strengthening the advantage of the majority opinion.
In particular, for d = 0.8, even if f(0) is approximately 0.5, a large order parameter is produced.

V. CONCLUSIONS

To investigate the actual opinion formation process and to verify the validity of opinion models for actual interactions, especially on social media, we obtained an abundance of data from Twitter. We collected more than one million posts and assessed their sentiments, because posts created by users reflect their internal opinions. Then, we analyzed the user behavior patterns and the opinion formation process. We found that public opinion often evolves into an ordered state in which one opinion predominates in the population. Moreover, we found that agents would rather express their opinions than change them. We proposed an opinion model in which agents have discrete internal opinions and external actions. Agents take actions according to their activity, and that activity evolves during the dynamics. Agents’ actions have the same sentiment as their internal opinions and can influence their neighbors, whose opinions depend on the proportion of neighboring opponents. We carried out computer simulations of our model and compared the results with the empirical data. The results show that the distribution of the number of individual actions decays as a power law in different underlying networks and that agents tend to maintain their original opinions. These phenomena agree with the actual situation on social media. For different values of the nonlinear parameter of the opinion update, the system evolves into distinctly different regimes. If the parameter equals 1, then the average magnetization is conserved. If the parameter is above 1, the system achieves a more ordered state in which one opinion holds the majority, in agreement with actual opinion formation in social networks. However, when the parameter is below 1, the ordering of the system decreases. The decay of individual activity decelerates opinion interactions, but a large activity decay may lead to more ordering of individual opinions.
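As a rough illustration of how the nonlinear parameter shapes these regimes, the following sketch assumes a power-law update rule in which an agent facing a fraction x of opposing active neighbors changes its opinion with probability x^a. This functional form, the names `flip_probability` and `order_parameter`, and the use of the absolute mean opinion as the order parameter are assumptions made for illustration only; the exact update rule of the model is defined earlier in the paper and may differ in detail.

```python
from typing import Iterable


def flip_probability(opponent_fraction: float, alpha: float) -> float:
    """Assumed rule: an agent adopts the opposite opinion with probability x**alpha,
    where x is the fraction of active neighbors holding the opposite opinion."""
    if not 0.0 <= opponent_fraction <= 1.0:
        raise ValueError("opponent_fraction must lie in [0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return opponent_fraction ** alpha


def order_parameter(opinions: Iterable[int]) -> float:
    """Absolute mean opinion for opinions coded as +1 / -1 (assumed definition).
    0 means the two opinions are balanced; 1 means complete consensus."""
    values = list(opinions)
    if not values:
        raise ValueError("opinions must not be empty")
    return abs(sum(values)) / len(values)


if __name__ == "__main__":
    # Compare an agent in the local majority (20% opponents)
    # with an agent in the local minority (80% opponents).
    for alpha in (0.5, 1.0, 3.0):
        p_majority_holder = flip_probability(0.2, alpha)
        p_minority_holder = flip_probability(0.8, alpha)
        print(f"alpha={alpha}: majority holder flips with p={p_majority_holder:.3f}, "
              f"minority holder flips with p={p_minority_holder:.3f}")

    # A population of three positive agents and one negative agent.
    print("order parameter:", order_parameter([1, 1, 1, -1]))
```

Under this assumed rule, the case a = 1 is a linear, voter-like update. For a < 1 an agent facing only 20% opposition flips more often than in the linear case, and for a = 3 it flips far less often, which is consistent with the majority opinion being reinforced when the parameter is above 1.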
Although the sentiment classification algorithm we used performs very well on short-text documents, the accuracy of our results can still be improved by implementing new sentiment analysis algorithms in the future. Moreover, in this paper, users’ internal opinions are inferred from the posts they publish; however, some users may not always publish posts to express their opinions, even if their opinions have changed. We cannot track the real-time opinion evolution of agents that are not active enough. Therefore, our results should be treated as a large-scale sample of the real social network. An interesting Twitter application that attracts users to express their feelings towards a given topic should be developed, and the popularity of the application determines how reasonable the data collected by it are. Furthermore, in the future, we hope to build a realistic model capable of predicting the evolutionary trend of public opinion and to use actual data to train the parameters, instead of presetting them at the beginning of the simulations as in our current model.

ACKNOWLEDGMENTS

This work was partially supported by the National Natural Science Foundation of China under Grant Nos. 61172072 and 61271308, the Beijing Natural Science Foundation under Grant No. 4112045, the Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20100009110002, the Beijing Science and Technology Program under Grant No. Z121100000312024, and the Fundamental Research Funds for the Central Universities under Grant No. 2014JBM018. We also thank Dr. Pang Wu for his help with processing the data.

APPENDIX: DATA PROCESSING

Most of the posts downloaded from Twitter were written in English; therefore, we concentrated only on English posts. The language detection tool provided by Cybozu Labs37 was used to eliminate posts in other languages. This tool assigns documents to language categories using a naive Bayes classifier, and its accuracy approaches 99%. Next, lexicon-based constrained symmetric nonnegative matrix factorization (CSNMF) was used to classify the sentiment of users’ posts.38 CSNMF is suitable for sentiment analysis of short texts on online social media and can achieve high accuracy.39 CSNMF was evaluated on labeled short-text documents and performed very well.39 Moreover, from a global view with a large data set, the errors of sentiment analysis in single posts tend to cancel out,40 so that the average sentiment of all posts is reliable. After analyzing the sentiment values of all posts, we deleted the descriptive posts that do not contain users’ attitudes, preferences, or emotions, and kept only the posts that have either positive or negative polarity.
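The filtering steps described above can be sketched as a small pipeline. The language detector and the sentiment classifier are passed in as plain functions; the toy stand-ins used here (`toy_detect_language`, `toy_classify_sentiment`) and the `Post` and `LabeledPost` types are hypothetical placeholders that only make the sketch runnable, and they do not reproduce the Cybozu Labs library or CSNMF.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Post:
    user: str
    text: str


@dataclass
class LabeledPost:
    user: str
    text: str
    polarity: int  # +1 for positive, -1 for negative


def filter_and_label(
    posts: Iterable[Post],
    detect_language: Callable[[str], str],
    classify_sentiment: Callable[[str], int],
) -> List[LabeledPost]:
    """Keep English posts that carry a positive or negative polarity."""
    kept: List[LabeledPost] = []
    for post in posts:
        if detect_language(post.text) != "en":
            continue  # discard posts written in other languages
        polarity = classify_sentiment(post.text)
        if polarity == 0:
            continue  # discard descriptive posts without sentiment
        kept.append(LabeledPost(post.user, post.text, polarity))
    return kept


# Hypothetical stand-ins, used only so that the sketch runs on its own.
def toy_detect_language(text: str) -> str:
    # Placeholder heuristic: treat pure-ASCII text as English.
    return "en" if text.isascii() else "other"


POSITIVE_WORDS = {"love", "great"}
NEGATIVE_WORDS = {"hate", "awful"}


def toy_classify_sentiment(text: str) -> int:
    # Placeholder lexicon count; returns +1, -1, or 0 (no polarity).
    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    return (score > 0) - (score < 0)


if __name__ == "__main__":
    sample = [
        Post("u1", "I love this phone"),
        Post("u2", "J'adore ce téléphone"),
        Post("u3", "The phone was released today"),
        Post("u4", "I hate the battery"),
    ]
    for labeled in filter_and_label(sample, toy_detect_language, toy_classify_sentiment):
        print(labeled.user, labeled.polarity)
```

Passing the two steps as functions keeps the filtering logic independent of the specific tools, so a real language detector or sentiment classifier could be substituted without changing `filter_and_label`.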

1 C. Castellano, S. Fortunato, and V. Loreto, Rev. Mod. Phys. 81, 591 (2009).
2 B. Latane, Am. Psychol. 36, 343 (1981).
3 S. Galam, Sociophysics: A Physicist’s Modeling of Psycho-political Phenomena (Springer, 2012), Vol. 439, p. 297 illus. 261.
4 V. Sood and S. Redner, Phys. Rev. Lett. 94, 178701 (2005).
5 S. Galam, Physica A 274, 132 (1999).
6 S. Galam, Physica A 285, 66 (2000).
7 S. Galam, Y. Gefen, and Y. Shapir, J. Math. Sociol. 9, 1 (1982).
8 G. Weisbuch, G. Deffuant, F. Amblard, and J.-P. Nadal, Complexity 7, 55 (2002).
9 R. Hegselmann and U. Krause, Journal of Artificial Societies and Social Simulation 5, 2 (2002), see http://jasss.soc.surrey.ac.uk/5/3/2.html.
10 K. Sznajd-Weron and J. Sznajd, Int. J. Mod. Phys. C 11, 1157 (2000).
11 K. Sznajd-Weron, M. Tabiszewski, and A. M. Timpanaro, Europhys. Lett. 96, 48002 (2011).
12 S. Galam, Physica A 238, 66 (1997).
13 R. Axelrod, J. Conflict Resolut. 41, 203 (1997).
14 S. Banisch and T. Araújo, Discontinuity, Nonlinearity, and Complexity 2, 57 (2012).
15 B. Kozma and A. Barrat, Phys. Rev. E 77, 016102 (2008).
16 D. Stauffer and M. Sahimi, Eur. Phys. J. B 57, 147 (2007).
17 C. Qian, J. D. Cao, J. Q. Lu, and J. Kurths, Chaos 21, 025116 (2011).
18 H. U. Stark, C. J. Tessone, and F. Schweitzer, Phys. Rev. Lett. 101, 018701 (2008).
19 R. Lambiotte, J. Saramaki, and V. D. Blondel, Phys. Rev. E 79, 046107 (2009).
20 Y. Liu and F. Xiong, Phys. Lett. A 377, 362 (2013).
21 N. G. F. Medeiros and A. T. C. Silva, Phys. Rev. E 73, 046120 (2006).
22 A. C. R. Martins and S. Galam, Phys. Rev. E 87, 042807 (2013).
23 S. Banisch and T. Araujo, Phys. Lett. A 374, 3197 (2010).
24 P. Sobkowicz and A. Sobkowicz, Eur. Phys. J. B 73, 633 (2010).
25 A. Chmiel, P. Sobkowicz, J. Sienkiewicz, G. Paltoglou, K. Buckley, M. Thelwall, and J. A. Holyst, Physica A 390, 2936 (2011).
26 A. C. R. Martins, Int. J. Mod. Phys. C 19, 617 (2008).
27 A. C. R. Martins, Phys. Rev. E 78, 036104 (2008).
28 D. Centola, Science 329, 1194 (2010).
29 D. Centola, Science 334, 1269 (2011).
30 M. Salathe and S. Khandelwal, PLoS. Comput. Biol. 7, e1002199 (2011).
31 M. Salathe, D. Q. Vu, S. Khandelwal, and D. R. Hunter, EPJ Data Sci. 2, 1 (2013).
32 J. Y. Guan, Z. X. Wu, and Y. H. Wang, Phys. Rev. E 76, 042102 (2007).
33 R. Lambiotte, M. Ausloos, and J. A. Hołyst, Phys. Rev. E 75, 030101 (2007).
34 R. Lambiotte and S. Redner, Europhys. Lett. 82, 18007 (2008).
35 F. Schweitzer and L. Behera, Eur. Phys. J. B 67, 301 (2009).
36 N. Crokidakis and C. Anteneodo, Phys. Rev. E 86, 061127 (2012).
37 Cybozu Labs, Language Detection Library for Java, see http://www.slideshare.net/shuyo/language-detection-library-for-java.
38 W. Peng and D. H. Park, in Proceedings of the International AAAI Conference on Weblogs and Social Media (2011), p. 273.
39 L. Nguyen, P. Wu, W. Chan, W. Peng, and J. Zhang, in Proceedings of the Workshop on Issues of Sentiment Discovery and Opinion Mining at 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2012), No. 6.
40 B. O’Connor, R. Balasubramanyan, B. R. Routledge, and N. A. Smith, in Proceedings of the International AAAI Conference on Weblogs and Social Media (2010), p. 122.

