An Analysis of Big Data on Health: Critique is Not Optional

The ‘social’ has always been a commercial and scientific resource. In the digital age, competition over which disciplines can claim a justified understanding of this domain has intensified. The social sciences need to defend their subject area in order to preserve it. This paper applies the netnographic approach (Kozinets, 2010), social network analysis, data mining and machine-learning tools to highlight the certainties and uncertainties of Big Data and the Health Industry, in order to start the process of uncovering the social and cultural forces that they are appropriating. What follows is the application of the tools of Big Data analytics to those who conduct Big Data analytics. There are competing discourses surrounding ‘Big Data’ and Health. On the one hand, business, marketing and advertising interests are promoting Big Data as information that no longer requires theory or the scientific methodologies of old. On the other are voices from the academy, in the digital humanities and computational social sciences, that wish to benefit from the volumes of available data. It is these (and other) competing discourses that are the target of this research. This paper argues that those engaged in ‘data without theory’ are generating a relational social mechanism similar to the self-fulfilling prophecies of Merton, the network effects of Coleman and the bandwagon effects of Granovetter (Donati, 2015:66), leaving no room for critique.

Countering the Social Ignorance of ‘Social’ Network Analysis and Data Mining with Ethnography

A Case Study of the Singapore Blogosphere

Steven Eunan McDermott


This thesis questions, on one level, the assertion that the Internet is a force for democratisation in authoritarian regimes (Habermas, 2006) and, at the same time, that it is another means for disseminating propaganda, fear and intimidation (Rodan, 1998). It overcomes the limitations of automated data collection and analysis of blogs by supplementing these techniques with a prolonged period of participant observation and a detailed reading of the textual extracts, allowing meaning to emerge. It analyses the discourses and styles of discourse of the Singapore political blogosphere. Hurst (2006) and Lin and Sundaram et al. (2007) described the same blogosphere as isolated from the global blogosphere and clearly demarcated, with no central topic. Countering the social ignorance of such automated data collection and analysis techniques, this study assigns meaning to data gathered from January 2009 to February 2010. The case study highlights the analytic framework, benefits and limitations of combining social network analysis with an anthropological approach to networks. It targets blogs using hyperlink network analysis and measures ‘importance’ with ‘betweenness centrality’ (de Nooy & Mrvar et al., 2005) in order to demarcate the boundaries of the sample of blogs that are archived for semantic and discourse analysis. Beyond a brief introduction to betweenness centrality, and to the merits or otherwise of combining various rankings of blogs such as Google’s PageRank, HITS and Blogrank algorithms, it avoids the algorithm fetishism found in hyperlink data collection and in linguistic analysis of corpora collected from blogs, allowing for culture, identity and agency. It assesses which of White’s (2009) three disciplines and relative valuation orders the Singapore blogosphere adheres to.
The contention raised here is that social network analysis, or rather those elements within it that focus exclusively on algorithms, is in danger of co-option by states and multinational corporations (Wolfe, 2010:3) unless it acknowledges sociocultural forces. The tools of social network analysis and data mining are moved beyond mere description, while avoiding prescription, and at the same time advance their contribution to substantive theoretical questions (Scott, 2010). It is important to this thesis to ensure space for agency, in a field dominated by sociograms, statistics and algorithms, with theory that places persons lacking recognition at its centre. Focusing only on the relational aspects of the interaction and on the individual persons linked (Wolfe, 2010: 3) creates a limited representation of the wider phenomena under study and a narrow awareness of the context in which these networks exist. The focus of this case study is a people governed by one political party, the People’s Action Party, since 1963, together with the government of Singapore. This paper also highlights the use of various software technologies (blogs, IssueCrawler, HTTrack, NetDraw and Leximancer) within an ethnographic approach that counters the social ignorance of automated electronic software. The analysis of the Singaporean blogosphere from 2009 to 2010 provides a descriptive analysis in support of the argument that the non-democratic nature of Singapore society shapes the development of online public spheres.
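
The abstract describes measuring the ‘importance’ of blogs with betweenness centrality over a hyperlink network in order to bound the sample. As an illustration only, the following is a minimal sketch of that measure using Brandes' algorithm on a directed, unweighted graph. The blog names, the link data and the cut-off (any score above zero) are hypothetical assumptions made for this example; the thesis itself relied on tools such as IssueCrawler and NetDraw, and nothing here reproduces its data or its selection rule.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    graph maps each node to the set of nodes it links to
    (directed, unweighted; use sets so links are not counted twice).
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search counting shortest paths from source.
        order = []                        # nodes in order of discovery
        preds = {v: [] for v in nodes}    # predecessors on shortest paths
        sigma = {v: 0 for v in nodes}     # number of shortest paths
        dist = {v: -1 for v in nodes}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, ()):
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Accumulate dependencies from the farthest nodes back to source.
        delta = {v: 0.0 for v in nodes}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical hyperlink data: two blogs link to "hub", which links to two others.
links = {
    "a": {"hub"},
    "b": {"hub"},
    "hub": {"c", "d"},
}

scores = betweenness_centrality(links)

# Assumed selection rule for the sketch: keep blogs that sit on at least
# one shortest path between two other blogs.
sample = sorted(blog for blog, s in scores.items() if s > 0)
print(scores["hub"], sample)
```

In this toy network only "hub" lies on shortest paths between other blogs, so it alone would be kept. As the abstract argues, such a score only demarcates the sample; the meaning of the links still has to come from reading and observation.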