Saturday, November 24, 2012

An SNA Application: A Study of Twitter


Recently, a friend of mine has been doing research on web mining of microblogs (Weibo), trying to derive association rules and marketing recommendations. He said there are few studies on web mining of microblogs, mainly because posts are too short to contain sufficient information, and because the vocabulary and phrasing tend to be casual and informal, which hinders machine learning.

Figure 1. Shannon and Weaver's Model of Communication applied to Twitter Communication Channel

Weibo, the Chinese counterpart of the microblog, plays an increasingly important role in our daily life, just as Twitter does in Western society. I spent some time searching for articles about Twitter and found a study titled An Observational Study of Physical Activity-Related Tweets [1]. It is a PhD dissertation from Columbia University. The specific aims of this observational study of physical activity-related messages (Tweets) from the microblogging social medium Twitter were to determine the overall network structure and major communities among Tweet sources, and to describe Tweet contents. The research team applied web data mining methods, including social network analysis and n-gram-based text mining techniques, to discover network patterns among Tweet sources and the contents of 174,394 Tweets that mentioned at least one of 17 different physical activities.
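To make the n-gram idea concrete, here is a minimal sketch of counting word bigrams across a few short messages. It is only an illustration of the general technique, not the dissertation's actual pipeline; the `ngrams` helper, the tokenising rule, and the sample tweets are all my own assumptions.

```python
from collections import Counter
import re

def ngrams(text, n):
    """Return the word n-grams of a text as tuples (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical tweets, invented for illustration only
tweets = [
    "Going for a run in the park",
    "Morning run in the park was great",
    "Yoga class then a run",
]

# Count every bigram over the whole collection
bigram_counts = Counter(g for t in tweets for g in ngrams(t, 2))

for gram, count in bigram_counts.most_common(4):
    print(" ".join(gram), count)
```

Even on short, informal text, frequently repeated n-grams such as "run in" or "the park" hint at what people are talking about, which is roughly why this kind of counting is a common first step in text mining.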

The primary framework underpinning this study is Shannon and Weaver's mathematical model of communication (Shannon, 1948; Weaver & Shannon, 1963), which Prof. Chan introduced earlier in our class. The social network analysis also uses some indicators we are familiar with to demonstrate that most physical activity Tweet networks are sparse, consisting of many isolates and small groups (on average per network: 2000 Tweet users, density 0.00037, reciprocity 12.5%, total degree centralization 0.0113, link count 970, isolates 743). The analysis yielded graphical representations of Tweet communication network structures and network measures, and identified key actors and communities. Key actors in communities in most of the 17 physical activity networks were predominantly individuals rather than organizations, healthcare providers, or governments.
The study results help advance the methodological breadth of mining social media for health-related purposes, and they are also a good case study for microblog research with other purposes.
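For review, the sketch below computes three of the indicators mentioned above (density, reciprocity, and the number of isolates) for a tiny directed network. The definitions are the common textbook ones and are an assumption on my part; the dissertation may define reciprocity differently (for example per pair of users rather than per link). The function name `network_measures` and the toy data are hypothetical.

```python
def network_measures(nodes, edges):
    """Density, reciprocity and isolates of a directed network.

    Assumed definitions:
      density     = links / (n * (n - 1))
      reciprocity = share of links whose reverse link also exists
      isolates    = nodes with no incoming or outgoing link
    """
    nodes = set(nodes)
    edges = {(u, v) for u, v in edges if u != v}  # drop duplicates and self-loops
    n, m = len(nodes), len(edges)
    density = m / (n * (n - 1)) if n > 1 else 0.0
    reciprocated = sum(1 for (u, v) in edges if (v, u) in edges)
    reciprocity = reciprocated / m if m else 0.0
    connected = {node for edge in edges for node in edge}
    isolates = nodes - connected
    return {"density": density, "reciprocity": reciprocity, "isolates": isolates}

# Toy network: a and b mention each other, a mentions c, c mentions d, e is alone
result = network_measures(
    nodes=["a", "b", "c", "d", "e"],
    edges=[("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")],
)
print(result)
```

In this toy case 4 of the 20 possible links exist, so the density is 0.2. A density as low as 0.00037 with hundreds of isolates therefore describes a network where almost nobody is connected to anybody else.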


[1] Sunmoo Yoon. Application of Social Network Analysis and Text Mining to Characterize Network Structures and Contents of Microblogging Messages: An Observational Study of Physical Activity-Related Tweets. Columbia University, 2011.

Tuesday, November 13, 2012

An Interesting Practice of Social Network Analysis


Since the first lecture, we have been talking about social networks. A social network consists of a set of actors (e.g., people, organisations or Web sites) connected by a set of relationships such as friendship, information exchange, or Web traffic. When one or more relations connect a pair of actors, the pair is said to have a tie.

Figure 1. Social Networks in Renren.com


Figure 2. Mass Social Networks on the Internet

How, then, can we extract something useful and avoid the noise in complicated networks consisting of large numbers of nodes and ties? This is where SNA comes in. Social network analysis (SNA) is a shift from individual-level analysis towards structural analysis. Its characteristics include a focus on relationships between actors and on the effect of the structure on outcomes. Perhaps one of the most interesting features of network analysis is its visual display, called a network diagram (see Figure 1 and Figure 2).

Figure 3. Sociogram of LUO Ling's blog network

I was really impressed by the sociogram of the IEMS5720 blogosphere shown by Prof. Chan in lecture 7. It tells us who is the most influential and who is the most prestigious in the class. To better understand and review the graph theory related to SNA, I also drew a sociogram of my own blog network (see Figure 3, as of 20:00 on 12 Nov 2012). The people in the diagram are those who commented on my blog or received comments from me. The blue node in the middle of the diagram represents me: I received comments from 5 different classmates and gave one-way comments to 6 others. CUI Helei (the red node) received comments from the most people, while the yellow node has only one tie, which means it has little communication with other nodes in this network. However, this does not necessarily mean that the red node is more active than the yellow node in other subgroups of the whole class.
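The reading of the sociogram above boils down to counting ties per node. Here is a minimal sketch, assuming each comment is stored as a (commenter, blog owner) pair; the names are placeholders and not the real classmates in Figure 3.

```python
from collections import Counter

# Each pair is (commenter, blog owner). Placeholder names, not the real data.
comments = {
    ("A", "me"), ("B", "me"),
    ("me", "C"), ("me", "A"),
    ("B", "C"), ("D", "C"),
}

out_degree = Counter(src for src, _ in comments)  # comments given
in_degree = Counter(dst for _, dst in comments)   # comments received
total_degree = out_degree + in_degree             # all ties of a node

# The node that receives comments from the most people (the "red node" role)
most_commented = in_degree.most_common(1)[0][0]

# Nodes with a single tie (the "yellow node" role)
single_tie = sorted(node for node, d in total_degree.items() if d == 1)

print(most_commented, single_tie)
```

In-degree is often used as a rough proxy for prestige and out-degree for activity, but as noted above, these counts only describe this particular sub-network.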

There are many other interesting concepts and theories in the field of SNA besides those discussed above. Let's continue to discover and learn more through practice.

[1] Hanna L. Schneider, Lilli M. Huber. Social Networks: Development, Evaluation and Influence. Nova Science Publishers, 2008.

Monday, November 5, 2012

Collaboration Makes for Better Understanding


In lectures 5 and 6, our topic was Communication and Social Behaviours on the Internet. We talked a lot about groups: ‘What is a Group?’, the Types of Group, the Group Structure, and so on, because the group is a very important concept in the study of ‘Communication and Social Behaviours’.

‘A communication network is formed within a group.’ This is the relationship between communication and groups. Prof. Chan then introduced the five-person communication networks (circle, comcon, wheel, Y, and chain), which remind me of the major types of computer network topology.
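To see how the five structures differ, the sketch below writes each one as a list of two-way links between persons 0 to 4 and computes Freeman's degree centralization, which is 1 when one person holds all the links and 0 when everyone has the same number. The exact link layouts are my own reading of the usual diagrams (in particular, the "wheel" is drawn as one hub with four spokes and the comcon as fully connected), so treat them as assumptions.

```python
from itertools import combinations

N = 5  # five-person networks, persons numbered 0..4

# Undirected links for each structure (layouts assumed from the usual diagrams)
NETWORKS = {
    "circle": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)],
    "comcon": list(combinations(range(N), 2)),  # everyone linked to everyone
    "wheel":  [(0, 1), (0, 2), (0, 3), (0, 4)],  # person 0 is the hub
    "Y":      [(0, 1), (0, 2), (0, 3), (3, 4)],
    "chain":  [(0, 1), (1, 2), (2, 3), (3, 4)],
}

def degree_centralization(edges, n=N):
    """Freeman degree centralization of an undirected network with n nodes."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    top = max(degree)
    # Normalised by the maximum possible sum, reached by a star: (n - 1)(n - 2)
    return sum(top - d for d in degree) / ((n - 1) * (n - 2))

scores = {name: degree_centralization(edges) for name, edges in NETWORKS.items()}
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:7s} {score:.2f}")
```

Under these assumptions the wheel is the most centralized structure, followed by the Y and the chain, while the circle and the comcon are fully decentralized. This matches the star, bus, ring and mesh intuition from computer networks.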

Things got more interesting in Lecture 6, where we worked on a case study, Social Cloud Computing: A Vision for Socially Motivated Resource Sharing. After reading the case material, we were asked to answer two questions individually. We were then required to go through the same questions within our own group on Google Docs. The answers from both activities are shown in Figures 1 and 2. It is easy to see that the group answers are more complete and better developed.

Figure 1. Answers in Individual Work


Figure 2. Answers in Group Work

Now, let me state the definition of epistemic aims. They are goals related to finding things out, understanding them, and forming beliefs. In individual work, I could rely only on myself, and my first task was to find the answers and form the knowledge that I thought was right. In group work, however, explanation and understanding became epistemic aims as well, because I needed to sell my ideas to my teammates and decide whether to adopt other people’s opinions.

Epistemic cognition differs between the two types of activity. First, the epistemic aims are not the same, as I explained in the previous paragraph. Second, the sources and justification of knowledge are much stronger in group work than in individual work. In other words, the processes for achieving epistemic aims are more reliable. For example, in the group work, every teammate wrote down the questions they could not solve individually, and we were surprised to find important and interesting points that we had overlooked before. We tried to give the best answers by working together. One of the group members used his personal experience to help the whole group understand a relatively new concept in the case material. This is the so-called “Knowledge Externalization (From Tacit Knowledge to Explicit Knowledge)” in the theory of the Spiral of Knowledge Creation.

All in all, social networking facilitates interpersonal processes as well as community and institutional processes, and we truly achieve more through collaboration.