Figure 1. Example of Change Detection
Social network change detection (SNCD) is a process of monitoring social networks to determine when significant changes to their organizational structure occur and what caused them. The approach combines analytical techniques from social network analysis with those from statistical process control (SPC). In application, statistical process control charts are applied to observable network measures: by taking measures of a network over time, a control chart can signal when a significant change occurs in the network.[1] SNCD may offer executives and military analysts a tool to operate inside the normal decision cycle.
SNCD does not predict change; rather, it detects quickly that a change has occurred and makes an inference about the actual time of change. For example, before a terrorist organization commits an attack, its social network will change as the organization plans and resources the attack. SNCD may allow an analyst to detect this change prior to the successful completion of the attack.[2]
Background
There has been a recent increase in temporal social network data. Unobtrusive tools now exist to extract network data from e-mail servers, from news media, and from written documents within an organization. This allows an analyst to construct multiple network observations of an organization, whether daily, weekly, yearly, or at any other temporal resolution. With the increased availability of observed instances of social networks over time, improved methods of detecting meaningful change are needed. Simply looking for obvious, drastic changes may be insufficient for many applications.[1]
However, methods of change detection in social networks are limited. Hamming distance (Hamming, 1950) is often used in binary networks to measure the distance between two networks. Euclidean distance is similarly used for weighted networks (Wasserman and Faust, 1994). While these methods may be effective at quantifying a difference between static networks, they lack an underlying statistical distribution. This prevents an analyst from distinguishing a statistically significant change from normal, spurious fluctuations in the network. SNCD significantly improves on previous attempts to detect organizational change over time by introducing a statistically sound probability space and uniformly more powerful detection methods.
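The two classical distance measures mentioned above can be sketched directly on adjacency matrices. This is a minimal illustration, not the SNCD method itself; the two small example networks are hypothetical.

```python
def hamming_distance(a, b):
    """Number of edge slots that differ between two binary networks."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
        if x != y
    )

def euclidean_distance(a, b):
    """Euclidean distance between two weighted networks."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    ) ** 0.5

# hypothetical 3-node undirected binary networks
net1 = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
net2 = [[0, 1, 1],
        [1, 0, 1],
        [1, 1, 0]]

print(hamming_distance(net1, net2))  # 2 (the new 0-2 tie counted in both directions)
```

As the text notes, these distances quantify difference but carry no statistical distribution, so no threshold on them separates significant change from noise.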
History
SNCD was initially proposed by Major Ian McCulloh, an assistant professor in the U.S. Military Academy's Network Science Center, in 2006. Since then, SNCD has been presented at a variety of venues, from NetSci 2007 in New York City to the International Network for Social Network Analysis annual conference in 2008 and the Military Operations Research Society working group on emerging threats and social networks in 2008.
McCulloh and Carley completed a project in 2008, supported in part by the U.S. Army because of its relevance to terrorism detection. In the project report, McCulloh illustrates the detection methodology on several data sets, including e-mail communications among graduate students and perceived connections among members of al-Qaeda based on open-source data. Results of the project indicate that the approach can detect change even with the high levels of uncertainty inherent in these data.
In 2009, McCulloh further developed the idea, applying the methodology to detect changes in dynamic social networks. The new approach was demonstrated in multi-agent simulation as well as on eight different real-world data sets.
Applications
To provide an estimate of when a change occurred, the CUSUM procedure is used to demonstrate SNCD on two data sets. The optimality constant k is set to 0.5, corresponding to a shift of one standard deviation, and the decision interval h is set to 3.5, corresponding to a 1% false-positive rate. The data, gathered from a survey and from text collected on the internet, are well-established data sets in the social network literature. Their features are listed below:
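The one-sided CUSUM procedure with these parameter choices (k = 0.5, h = 3.5) can be sketched as follows. The inputs are standardized network measures (z-scores); the series of values below is hypothetical, not one of the article's data sets.

```python
def cusum_upper(z_scores, k=0.5, h=3.5):
    """One-sided (upper) CUSUM on standardized observations.

    Returns the running CUSUM statistics and the index of the first
    period at which the chart signals (None if it never exceeds the
    decision interval h).
    """
    c = 0.0
    stats, signal = [], None
    for t, z in enumerate(z_scores):
        c = max(0.0, c + z - k)  # accumulate evidence of an upward shift
        stats.append(c)
        if signal is None and c > h:
            signal = t
    return stats, signal

# hypothetical standardized betweenness values over 10 periods,
# drifting upward in the later periods
z = [0.1, -0.3, 0.2, 0.0, 0.4, 1.2, 1.5, 1.8, 2.0, 2.2]
stats, signal = cusum_upper(z)
print(signal)  # first period at which the chart exceeds h
```

Because this chart only accumulates positive deviations, detecting decreases requires a mirrored chart on the negated observations, as the Fraternity example below discusses.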
Comparison of Real World Data[2]

| Data set   | No. of Nodes | Time Periods | Method of Collection | Type of Relation | Design | Known Change |
|------------|--------------|--------------|----------------------|------------------|--------|--------------|
| Fraternity | 17           | 15           | Survey               | Ranking          | Fixed  | Yes          |
| Al-Qaeda   | 62–260       | 17           | Text                 | Rating           | Free   | Yes          |
Newcomb Fraternity
This data set was gathered from an experiment conducted by Theodore Newcomb (1961) on 17 incoming transfer students at the University of Michigan. The participants, who had no prior acquaintance, were housed together in a fraternity house and asked to rank each other from 1 to 16 by preference, where 1 indicated the person they felt most comfortable with. Data were collected weekly for 15 weeks, except for the 9th week. David Krackhardt (1998) dichotomized the network data by assigning a link for preference ratings of 1-8 and no link for ratings of 9-16. To determine typical behavior, the mean and standard deviation of the density, average betweenness, and average closeness were estimated from the first five networks. The CUSUM statistic was then calculated for all time periods.
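Krackhardt's dichotomization step described above can be sketched directly: preference ranks 1-8 become a link, ranks 9-16 do not. The small ranking matrix below is a hypothetical stand-in for the 17-node Newcomb data (0 on the diagonal marks the absence of a self-ranking).

```python
def dichotomize(rank_matrix, cutoff=8):
    """Convert a matrix of preference ranks into a binary network:
    a link (1) where the rank is between 1 and cutoff, else no link (0)."""
    return [
        [1 if 1 <= r <= cutoff else 0 for r in row]
        for row in rank_matrix
    ]

# hypothetical 3-person ranking matrix (ranks drawn from a 1-16 scale)
ranks = [[0, 1, 9],
         [2, 0, 12],
         [3, 15, 0]]
print(dichotomize(ranks))  # [[0, 1, 0], [1, 0, 0], [1, 0, 0]]
```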
The approach successfully detected significant events in the Fraternity network data.[2] For each social network measure monitored, two control charts must be run, because a one-sided CUSUM detects either increases or decreases in a measure, not both. The betweenness measure is chosen to signal changes here because the closeness measure behaves similarly to betweenness and the density measure is not effective for a fixed network.
In this illustration, the control chart for average betweenness signals at time period 13 that a change may have occurred in the social network of the fraternity members. The most likely time that the change actually occurred is time period 8, the last time period at which the CUSUM statistic C was equal to 0. This time point was the week before a mid-semester break, so it is plausible that social relationships changed over the break as participants possibly vacationed together. Although details of the group are not completely known, the approach still worked well in detecting network change.
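The change-time estimate described above — the last period at which the CUSUM statistic equalled zero before the signal — can be sketched as a simple backward scan. The statistics below are hypothetical values, not the Fraternity chart itself.

```python
def estimate_change_time(stats, signal):
    """Most likely change time for a CUSUM chart that signalled at
    index `signal`: the last period at or before the signal where the
    statistic was zero."""
    for t in range(signal, -1, -1):
        if stats[t] == 0.0:
            return t
    return 0  # the statistic was never zero: change may predate the chart

# hypothetical CUSUM path that signals at index 7
stats = [0.0, 0.0, 0.3, 0.0, 0.7, 1.7, 3.0, 4.5]
print(estimate_change_time(stats, signal=7))  # 3
```

In the Fraternity data this backward scan is what places the estimated change at period 8, five periods before the chart actually signals at period 13.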
Al-Qaeda
The Center for Computational Analysis of Social and Organizational Systems (CASOS) at Carnegie Mellon University created snapshots of the annual communication between members of the al-Qaeda organization from its founding in 1988 until 2004 from open-source data.[3] This open-source data set provides a limited network: who initiated communication with whom is unknown, and the completeness of the network is uncertain.
The betweenness, closeness, and density measures increased from 1988 until 1994, and then leveled off. This could be explained by the varying quality of intelligence gathering on al-Qaeda and by rapid changes, such as development and reorganization, within the organization. Therefore, the CUSUM control chart was applied to the data from 1994 to 2004.
With more nodes observed over a longer time span, major events in al-Qaeda's history should be detectable in this data set. Changes in the average-betweenness CUSUM statistic can be used to identify the point in time when the organization changed and began to plan attacks. According to the statistic, the most likely time that the change occurred is 1997.[2] Examining the events occurring within the al-Qaeda network and in the external environment in 1997 can give a better understanding of the cause of the change.
Caution is always warranted with data collected retrospectively and with data likely to be incomplete. Even so, by applying SNCD to data collected within organizations, an analyst may be able to warn of actions planned by a group or organization.
Sensitivity to Risk of False Positive
Sensitivity to the risk of false positives is an important consideration in detecting change in longitudinal network data. False positives occur when a change detection procedure indicates that a change may have occurred when in fact there is no change.[2] There is a trade-off between false positives and rapid detection, and the balance is determined by the decision interval of the change detection procedure. It is important to determine a desired false-positive risk first, and then monitor longitudinal networks for change.
Specifically, the change detection procedure may miss real changes when a very low false-positive risk is set, and it will signal changes more rapidly when the risk is set to a higher value. The analyst should carefully consider this trade-off when using SNCD.[2]
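The dependence of the false-positive rate on the decision interval h can be explored with a small Monte Carlo sketch: run the upper CUSUM on in-control (standard normal) data and count how often it signals by mistake. The run lengths, replication count, and seed below are hypothetical settings for illustration.

```python
import random

def cusum_signals(z_scores, k=0.5, h=3.5):
    """True if the one-sided CUSUM exceeds h anywhere in the series."""
    c = 0.0
    for z in z_scores:
        c = max(0.0, c + z - k)
        if c > h:
            return True
    return False

def false_positive_rate(h, periods=15, runs=2000, seed=1):
    """Fraction of in-control series (no real change) that still signal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        z = [rng.gauss(0.0, 1.0) for _ in range(periods)]
        if cusum_signals(z, h=h):
            hits += 1
    return hits / runs

# a wider decision interval lowers the false-positive rate,
# at the cost of slower detection of real changes
for h in (2.5, 3.5, 4.5):
    print(h, false_positive_rate(h))
```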
Limitations
The use of the cumulative sum (CUSUM) procedure has the following limitations:
1. The method is limited to normally distributed network measures, and a period of dynamic equilibrium must be assumed to estimate the parameters of the control chart.[1]
2. As it is a statistical approach, it highly depends on the data that is used to do the analysis. Limitations on the data will make it difficult to determine the validity of the results.[1]
3. Because network measures are assumed to be normally distributed, research on their actual distributions is needed. Preliminary work on these distributions suggests that the assumption of normality does not hold for small networks, extremely sparse networks, and certain metrics.[4] If the network measures are not normally distributed, the false-alarm probability will increase; a different control chart must then be used or a new approach to the problem developed.
4. Findings are limited to modeling and detecting changes, but not the causes of the change.[2]
As social network change detection is a relatively new concept and few applications of statistical process control methods have been conducted, further limitations of the approach cannot yet be determined. Future research will provide much greater insight into the limitations of this approach.
Future work
It is important that future work examine the errors associated with this technique, both false positives and false negatives. Because of the dependence on the data obtained, future work should also consider the sensitivity of the approach to missing information, and to the reasons why information is missing. To rectify those shortcomings, future work should focus on near-complete data sets with high resolution, meaning data that cover the communication network with little or no missing information for a large contiguous period. As noted above under Limitations, if the network measures are not normally distributed, a different control chart must be used or a new approach to the problem developed; finding a more general solution will require further work.[1]
It may also be possible to extend change detection to node level measures. Again the distributional assumptions would need to be verified. Node level change detection may help further isolate change in an organization by monitoring the behavior of key individuals, without the noise introduced by less influential agents. More work in this area will be beneficial.[2]
It would also be helpful to examine the sensitivity of the optimality constant k and the control limit values of the CUSUM control chart for network-measure change detection. Using further Monte Carlo simulations, a researcher could determine which parameter values are best for detecting certain types of changes, such as sudden large changes or slow creeping shifts. The use of control charts for comparing models and observations should also be studied to see what specific conclusions can be obtained.[1]
Other features could also increase the usability and utility of these techniques, including auto-identification and visualization of critical features, and improved data extraction and fusion techniques.[5]
References
- Baller, D., McCulloh, I., Carley, K.M., and Johnson, A.N. (2008). Specific Communication Network Measure Distribution Estimation. Sunbelt XXVIII, the annual conference of the International Network for Social Network Analysis, Saint Petersburg, FL, 24 January 2008.
- Carley, K. M. (2003). Dynamic network analysis. In P. Pattison (Ed.), Dynamic social network analysis: Workshop summary and papers: 133–145. Washington, D.C.: The National Academies Press.
- McCulloh, I., Garcia, G., Tardieu, K., MacGibon, J., Dye, H., Moores, K., Graham, J. M., & Horn, D. B. (2007). IkeNet: Social network analysis of e-mail traffic in the Eisenhower Leadership Development Program (Technical Report No. 1218). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
- McCulloh, I., Lospinoso, J., and Carley, K.M. (2007). Social Network Probability Mechanics. Proceedings of the World Scientific Engineering Academy and Society 12th International Conference on Applied Mathematics, Cairo, Egypt, 29–31 December 2007, pp. 319–325.
- McCulloh, I., Webb, M., Carley, K.M. (2007). Social Network Monitoring of Al-Qaeda. Network Science Report, Vol. 1, pp. 25–30.