John Howkins is Chairman of Howkins & Associates in London and Shanghai. He has advised global corporations, international organisations, and governments. This year marks twelve years since the publication of his influential book The Creative Economy: How People Make Money from Ideas. Now with a revised edition on the bookshelves, Michael Keane caught up with […]
Over the past decade clusters, bases and zones have become the poster children of China’s cultural and creative industries. Their visibility in tourist brochures gives an impression to visitors that China’s cultural assets are booming and that cities are becoming more liberal. While there is some justification for these impressions, behind the proliferation of clusters […]
China’s rock music industry is often overshadowed by the more commercial genres of pop and instrumental music. Tours by Western rock bands and artists are usually heavily monitored. Authorities scrutinise song lists for unhealthy lyrics that might provoke unrest or contain references to sensitive topics. Despite these barriers to entry, China’s rock scene has developed […]
The European fashion system that dominates how global fashion is produced, distributed, legitimized and consumed can be compared to the ebb and flow of an ocean tide. As this system washes over national markets, cultural assimilation occurs and as the tide retreats it takes with it fresh actors who refresh the system. In Shanghai, dramatic […]
Stepping into the hall of the Gehua Camp Experience Centre, the first thing that caught my eye was a large Chinese character for ‘dream’ on the wall, made from colourful interlocking LEGO plastic bricks. The message is clear. Each child symbolizes a brick to build the future of China. And like LEGO bricks, they are […]
Serangoon Road is a ten-part Australian-Singaporean TV series. A detective noir drama set in the sixties, it takes place at a time when Singapore was breaking away from Malaysia and becoming an independent state. As well as the turbulence occurring in Malaysia, Singapore and Indonesia at the time, the drama also references conflict […]
As part of our recent work investigating the Twitter userbase, we have collected data on accounts registered around the 2011 triad of natural disasters: the Queensland Floods (January), the Christchurch Earthquake (22 February), and the Tōhoku Earthquake & Tsunami (11 March). By […]
The final panel at Digital Methods in Vienna is on Web monitoring, and starts with a paper by Jakob Jünger on Facepager, a tool for gathering data from Facebook. Such data could be scraped directly from the Web pages, or retrieved through the API; Facepager takes the second route, which has specific implications for the kind of data which are available through it.
For example, popular Facebook pages show a general estimate of how many likes they've received (e.g. "700k"), while the API returns an exact number; this needs to be considered in any analysis which examines the actual user experience, of course.
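The gap between displayed and API counts can be made concrete with a toy rounding function. The truncation rules below are an assumption for illustration only, not Facebook's actual (undocumented) display logic:

```python
def displayed_count(exact: int) -> str:
    """Round an exact engagement count to a display-style string.
    The truncation scheme here is a hypothetical stand-in for
    whatever the platform actually shows on its public pages."""
    if exact < 1_000:
        return str(exact)
    if exact < 1_000_000:
        return f"{exact // 1_000}k"
    return f"{exact / 1_000_000:.1f}m"

# Two pages with measurably different engagement look identical to a visitor:
displayed_count(700_449)  # "700k"
displayed_count(700_021)  # "700k"
```

An analysis built on the displayed figures would treat these two pages as equivalent, while the exact API numbers would not.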
Facepager takes a bunch of Facebook page names, and then enables the user to gather all posts or likes as well as a number of other types of data; these can be exported as data files for further processing and analysis. And despite the name, Facepager appears to capture Twitter data as well, and has a generic API interface which can connect with a variety of other services, too. The tool is available under an open source licence.
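The gathering pattern such API-based tools rely on can be sketched as a cursor-pagination loop. The response shape below (a `data` list plus an optional `next_cursor`) is a simplified assumption, not Facebook's actual Graph API schema:

```python
from typing import Callable, Iterator, Optional

def fetch_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Exhaust a cursor-paginated API endpoint.

    `fetch_page` takes a cursor (None for the first page) and returns a
    dict with a "data" list and an optional "next_cursor" key -- a
    simplified stand-in for a real API response."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break

# A stub standing in for real network calls:
_pages = {
    None: {"data": [{"id": 1}, {"id": 2}], "next_cursor": "p2"},
    "p2": {"data": [{"id": 3}]},
}
posts = list(fetch_all(_pages.get))  # [{"id": 1}, {"id": 2}, {"id": 3}]
```

The loop makes the methodological point below tangible: whatever the API's paging and rate limits silently withhold never reaches the researcher's export file.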
As with any automated, API-based data gathering tool, there are some methodological issues here, of course. APIs are often far from transparent, data are thus not necessarily complete, and the indicators for activity on social media platforms which emerge from the data are therefore not necessarily always entirely valid.
On Facebook, some comments may not be delivered, for example, but the reasons for this are far from clear; it may have something to do with the Facebook network of the researcher doing the data gathering, which would be a significant limitation, of course. Similarly, user activity metrics may be affected by automated posting, but such automated posts could be difficult to distinguish from genuine activity. And API functionality might change and isn't always well-documented; the metrics APIs return might change overnight, for example, invalidating the research results.
There are also concerns over research ethics and user privacy, of course. The availability and apparent straightforwardness of data is problematic in its attractiveness for researchers.
Much of the focus of this site in recent years has been on Twitter research, often in collaboration with our various colleagues and friends around the world. We’ve tried our best to help along the development of Twitter research methods […]
The final speaker on this first day of "Compromised Data" is Sidneyeve Matrix, who shifts our focus towards geosocial information as generated by smartphones and other mobile devices. Only 12% of US users as surveyed by the Pew Centre posted Foursquare check-ins in 2013, for example, down from 18% in 2011 - but this may mask a greater take-up of other location-based services, not least the Frequent Locations functionality in iOS7.
There is a continuing trend towards the consumerisation of geodata. Geosocial cultural arrangements are explored through the use of mobile communication patterns, but such analysis is notoriously difficult - not because of a lack of data, but because of the difficulties in assigning meaning to the geolocated information which is available from a variety of platforms.
Some such geo-information originates from self-selecting users, switching geo-services on and off as necessary; other users may have them switched on by default because they are insufficiently familiar with the functionality. (Yet others may even deliberately game these services with incorrect information.) The increase in other locational sensors (e.g. in wearables) adds further geodata into the mix.
One way to understand these trends is through an analysis of newspaper coverage of geo-technologies in recent years. A substantial part of the rhetoric is about issues of privacy and their underlying creepiness, in fact; the coverage of projects such as Please Rob Me and a variety of social discovery apps (often with sexual undertones) highlights some key issues around geolocation.
Application developers often did not see the issues with their work, but as it turns out very few well-informed smartphone users are now prepared to routinely reveal their geolocation through these and other services. This is the case at least for some of the general social media services and apps, but niche markets for location-aware wearables and associated apps are nonetheless developing rapidly: they are predicted to be worth US$18b by 2018.
This is the case for example for Nike+ Fuel, Jawbone, Fitbit, and similar mobile health apps, which often link with geo-enabled apparel and utilise gamification strategies to reward particular activity achievements and social community platforms to create mutual support and/or competition.
Another example is people-finder apps like Life360, which aim to keep family members in virtual contact with each other. They are marketed as supporting personal safety and family connections by providing microsocial, "antisocial" community circles - but here, too, there are privacy and surveillance issues, for example between parents and their adolescent children. The conversations between developers and users around these issues turn out to be fascinating, unsurprisingly.
Collectively, there are millions of members in these niche communities, but they are largely underresearched, in part because no data analytics tools exist or because the data are not available to researchers. These platforms signal a shift from generic social media platforms to more specific, niche environments, and we need to find new approaches for tracking geosocial trends in these niche data publics, and for developing research collaborations with the platform operators.
This would also provide a more accurate and richer story about the current processes for developing and monetising such services.
The next speaker at "Compromised Data" this afternoon is Asta Zelenkauskaite, who notes the increasing interweaving of social and mainstream media; based on the properties of 'big data' it therefore becomes important to explore how users engage with mass media and cross-media contexts. How relevant are 'big data' to the mass communication field?
Traditional media outlets have been mainly focussing on a quasi-passive engagement with media content, while social media now offer a two-way interaction by providing back channel functionality. Mass media content, user-generated content, and user interactions' digital imprints are coming together to shape this cross-media environment.
The first aspect of 'big data' is the sheer size of the data on user activities, of course; a second is the variety of activities, and a third the velocity of engagement. Veracity is also increasingly being highlighted, and in the end it is especially the value of 'big data' which needs to be explored. Questions here are first, value to whom?, and second, value of what?
Value may lie for example in interactivity and interest-based content discovery. Interactivity is increased through 'big data' environments, and this also enhances information discovery processes. Asta uses the example of Italian radio in this context: Italian radio stations emphasise the various social networks which can be used to engage with the radio stations and to discover additional media content - yet she also notes that interaction with such tools is quite marginal, provides only limited content discovery opportunities, and increases source fragmentation.
This raises questions for possible content architecture: the current model for radio stations and mass media in general is one of top-down content access, where the mass media outlet controls the mainstream channel and user-generated channels are used only marginally and exist in the form of various, fragmented, competing spaces (which are also difficult to monetise because of this fragmentation).
An alternative model would provide for interest-based content access which brings together various information items of varying provenance in the same curated navigation space, enabling users to engage with a broader variety of content on their own terms. This requires us to conceptualise content in a fluid way, building on an interest-based information architecture matrix which users can navigate freely regardless of the differences between media outlets included in this matrix.
The value of 'big data' in this context would lie in more interest-based content discovery, increased interactivity, and more content variety across multiple media content streams. For researchers, it would also enable more cross-media analysis that builds on these data. But there is also the problem of potential information overload for users taking advantage of such cross-media systems; of ethical issues with the data veracity in this environment; and of a greater potential for data-enabled surveillance of users.
This is an argument for thinking about the value of 'big data' from a more user-centric perspective, then. Value extraction through user-centric approaches is still in its infancy, and the proprietary nature of the data makes this process even more difficult.
The next speaker at "Compromised Data" is Joanna Redden, whose interest is in government uses of 'big data', especially in Canada. There's a great deal of hype surrounding 'big data' in government at the moment, which needs to be explored from a critical perspective; the data rush has been compared to the gold rush, with similarly utopian claims - here especially around the ability for 'big data' to support decision-making and democratic engagement, and the contribution 'big data'-enabled industries can make to the GDP.
But how are 'big data' actually being used in government contexts? New tools and techniques for the analysis of 'big data' are of course being used in government, but how these affect policy decisions remains unclear. Social media analysis is similarly being used for public policy and service delivery; sentiment analysis is used for some decisions around law enforcement and service delivery, but adoption to date is slow.
For public servants in this field, it has proven difficult to influence political leaders who so far have been successful by relying on their gut feelings rather than hard evidence in their decision-making. Even where 'big data' are being adopted as a support for decisions, however, the quality of the data as well as of the analysis must be questioned; the provenance of the data, the models used for 'big data' analytics, the conceptual models for understanding the findings all remain in their very early stages.
In this new world, data scientists wield considerable power, and their training to exercise this power is limited - they may have the computing and statistics skills, but not necessarily the analytical, social sciences, or diplomatic skills to enable their insights to be effectively incorporated into decision-making processes. Even where 'big data' are being used, the selection of evidence may remain biased and incomplete.
There is also a danger of civil servants dividing the populace into a number of data-circumscribed sub-populations, and of treating these populations as clients or consumers rather than citizens; further, information from sources other than 'big data' may be sidelined in decision-making processes unless clear evidence is also covered in the quantitative data sources themselves.
More generally, there is a 'climate of fear' in the public service in Canada at the moment, and scientists are unable to speak freely for fear of retribution from their political masters and funders. This is wrapped up with neoliberal ideologies in the current Canadian government - government is being reshaped to serve business and industry interests, and long-term measures of societal changes (like the census) are being abolished. This makes the weighting of more specific 'big data' sources (e.g. social media data) in relation to underlying demographic patterns all the more difficult.
With the turn to 'big data' there is a concern over a computational turn in public thinking - what is measured and what is ignored in this will have very substantial impacts on future public policy. We must be concerned about the congruence of data-driven decision-making and neoliberal rationality, extending market values into all aspects of our way of life. In this way, the turn to 'big data' rationalises neoliberal thinking, enabling the quantification of life based on calculated reasoning, a strong emphasis on individualism, a focus on measuring consumption, and the identification of 'useless' subgroups of society.
There is also a potential turn away from causality to mere correlation, where the reasons why certain things are happening are ignored by policies which merely use the levers which emerge out of correlated patterns. We need to be concerned about the ways in which 'big data' models from market research are being integrated into policy-making, and need to query the market principles of the 'big data' business itself.
The final presentation in this "Compromised Data" session is by Mary Francoli and Dan Paré, who focus on the question of engagement and mobilisation in a time of rapidly evolving social media use. One initial observation is that these terms lack definitional clarity - there are some very high-level definitions (e.g. building on UN definitions), but these remain vague; political and civic engagement are conflated, and specific forms of engagement are not necessarily defined in detail.
Simply voting is a form of engagement, for example, but is clearly different from other, more complex forms of political engagement. The literature increasingly links these types of activity with social media (and with the Net more broadly) - and the extent to which such forms of engagement occur, and how they interrelate with forms of offline political engagement, need to be studied in greater detail.
Some studies explore the time spent on engagement, assuming a zero sum relationship between social media engagement and other activities; some focus on the homology between online and offline engagement; some assume that online engagement can involve groups which are often excluded from other forms of participation.
The underlying assumptions behind these studies need to be queried and critiqued; sceptics who examine the former sometimes merely count the time spent, but fail to evaluate the quality of engagement, for example. Some of these counting-based exercises are ultimately very simplistic - counting is fine in itself, but we must also ask what we are really learning from these projects, and how their findings need to be interpreted.
What we are starting to see here is a pattern of overreliance on quantitative and empirical approaches - an empirical muddiness that offers little insight into more complex questions. How do we account instead for nonlinearities in current research, and how do we investigate the complexities of modern life in a social media-enhanced environment?
There is a paradox in assessing democracy - by some criteria, participation has clearly increased; by others, a crisis of participation is looming. Simply equating access to social media with democratisation is highly problematic; it tells us little about how social media platforms contribute to engagement and politicisation, and at worst takes a naive approach to the role of technology. Our model of democracy also needs to be queried - from elite through deliberative democracy to the monitorial citizen.
We also need the tools for situating social media platforms in their wider societal contexts. Dallas Smythe has suggested that technology is a myth: bureaucracy, science, capital, engineering, ideology, and propaganda all shape technology and the myths around it, so technology is always socially constructed rather than existing in a vacuum or proceeding under its own logic. Technologies like social media are inherently political and proceed in highly politicised contexts. Human agency must not be ignored in researching them.
Data-driven research should not be replaced with qualitative work, but qualitative perspectives should be used to disentangle terms such as mobilisation and engagement. Mobilisation and resistance should not be conflated with organisation - social media platforms are very good at facilitating mobilisation, for example, but not necessarily at facilitating formal political organisation. If social media are seen as drivers of change, we are potentially suffering from historical amnesia, as similar claims have also been made about other 'new' media technologies in past decades.
The next paper at the "Compromised Data" symposium is by Jean Burgess and me, and explores the more difficult forms of 'big data' research we're rarely conducting at present because the political economy of data access is weighted against specific approaches - in the specific context of Twitter research. I'll upload the slides and audio for it as soon as possible - for now, consider this a placeholder!
The next speaker at "Compromised Data" today is Carolin Gerlitz, who begins by suggesting that social media data are both standardised and vague at the same time. She notes the German Twitter community which is focussed around favouriting on Twitter: the Favstar sphere sees favourites as a sign of importance and validation, and taking away favourites is therefore a serious affront.
This is an example of how the communicative affordances of social media platforms are being utilised by their users; these standardised activities mark the grammar of action on such platforms, and are specific to the particular platforms. Twitter's grammar has been comparatively stable, while Facebook has modified its available actions on a continuous basis, which destabilised the meaning of such activities.
Standardisation produces comparable and countable numbers; but such actions remain vague as users may ascribe different meanings to them - activities are therefore standardised in form, but vague in meaning. Twitter favourites are an example of this: introduced in 2006, favourites were seen at first as an unwanted step-child of the service, as somewhere between bookmarking and Facebook-style liking. Favourites were difficult to organise or manage, or to systematically explore.
Only when third-party developers developed further favouriting functionality did the utility of favourites increase. Feed reader tools began to use them; Favotter and other bookmark ranking tools began to deploy favourites as a measure of popularity for tweets, showing "tweets of the day" and highlighting the most favourited users.
This also supported the emergence of the Favstar scene, demonstrating the effects of third-party tools on the platform itself. This is a process of de- as well as recontextualisation of platform data; the same process is visible in the popularity of Klout scores, which turn platform data into a new measure of "influence" ("the ability to drive action") across different social media platforms.
This, then, connects the interaction grammars of a variety of social media platforms into one score; they are de- and recontextualised by a proprietary algorithm, and the results of this scoring feed back into the activities of the users who actually care about their Klout scores. Users are reminded of the repercussions of their activities, and may shape their platform engagement strategically around what effects they may have on their Klout score.
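This de- and recontextualisation step can be illustrated with a toy composite score. Klout's actual algorithm is proprietary; the metric names and weights below are invented purely to show how platform-specific numbers collapse into one cross-platform figure:

```python
def composite_score(metrics: dict, weights: dict) -> float:
    """Collapse per-platform activity metrics into a single number.
    A toy weighted mean -- not Klout's proprietary algorithm."""
    total_weight = sum(weights.values())
    return sum(metrics[p] * w for p, w in weights.items()) / total_weight

# Hypothetical inputs: normalised activity levels per platform
score = composite_score(
    {"twitter_retweets": 80.0, "facebook_likes": 40.0},
    {"twitter_retweets": 0.75, "facebook_likes": 0.25},
)  # (80*0.75 + 40*0.25) / 1.0 = 70.0
```

Once the single number exists, the original platform grammars (a retweet, a like) are no longer recoverable from it - which is precisely what makes the score portable into new contexts.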
Klout further offers perks to users with high scores in specific topic activities - it thus incentivises such strategic activities. Users' social media activities thus become valuable to the service's commercial partners. Klout partners also take into account users' Klout scores as a measure of importance, privileging high-ranking users in job interviews, tech support, or a variety of other situations.
The original social media data are thereby turned into new metrics which themselves move into new fourth- and fifth-party contexts and become multivalent. Numbers are standardised yet remain vague; activities such as favouriting become partible: they are still meaningful to the originating users, but also take on new meaning through the repurposing of such data in other contexts. Social media become multivalence machines.
Who can realise these multiple forms of value, then? Empirical engagement with such platforms proceeds largely from sampling, but such meaningful samples are difficult to create given the diverse meaningful practices which take place on such platforms. Most Twitter research draws on topical, non-representative samples, for example, building on a priori assumptions about Twitter use; representative sampling, on the other hand, draws on random or cluster samples which study emergent and variant use practices.
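The contrast between the two sampling strategies can be sketched as follows (the tweet structure and keyword are hypothetical, for illustration only):

```python
import random

tweets = [
    {"text": "Watching the #debate tonight"},
    {"text": "My lunch was great"},
    {"text": "Big night for the #debate"},
    {"text": "New favstar record!"},
]

# Topical sampling: cheap to collect via search/streaming APIs, but built
# on the a priori assumption that relevant activity carries the keyword.
topical = [t for t in tweets if "#debate" in t["text"].lower()]

# Random sampling: can surface emergent, unanticipated practices, but
# requires access to the full stream, which platforms rarely grant.
rng = random.Random(42)  # seeded for reproducibility
randomised = rng.sample(tweets, 2)
```

The second tweet (about lunch) can never appear in the topical sample, however typical it may be of everyday platform use - which is exactly the blind spot of a priori sampling.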
Platform providers thus have good platform-political reasons for bad platform data; research creates new relations between social media data, also participating in the de- and recontextualisation of data.
The final presenter in this AoIR 2013 session (and thus, the final presenter this year!) is Hazel Kwon, whose aim is to better understand the flows of communication on social media during protests. Her frame of research in this is Emergent Norm Theory, whose emphasis is on the rapid and transformative potential of word of mouth on collective behaviours. This is a process of diffusion for a collective identity.
Protests can be understood as collective behaviours. They may be prompted by the circulation of rumours, which are characterised by the informal and improvised circulation of situational information; gradually, key themes and issues are being identified and converted into key messages that define the protest action. They draw on a special type of crowd, the diffuse crowd. But existing theories largely consider such phenomena in the context of physically co-located crowds; translation to social media environments must necessarily develop somewhat different understandings.
Can we, then, distinguish different types of message flows on a temporal basis? Are there geographical patterns? Do different role-takers emerge? Hazel has used a dataset for the Egyptian #Jan25 protests to explore this, including some 4,400 tweets, and saw a gradual decline of improvisation in communicative exchanges, a shift toward verification of information, and eventually the solidification of key shared messages. This is also geographically dispersed - improvisation was more prominent for Egypt-based users; solidification for non-Arab users. Core actors emerged for each of the communicative functions.
But this is only a preliminary study - more research needs to be done on this. This study may provide a glimpse of the emergence of a collective conscious, however.
The next AoIR 2013 paper is by Pieter Verdegem and Evelien D'Heer, who shift our focus to regional elections in Flanders. The role of Twitter in politics has been described from both optimistic and pessimistic perspectives; the Twittersphere has been seen by many to reflect existing social structures. Is there a move from formal and representative politics towards networked politics, though? From broadcasting to convergence logic?
Pieter and Evelien captured all @mentions in the #vk2012 debate, engaging in both content and network analysis. The hashtag was promoted by the public service broadcaster in Flanders, so it provides a useful point of entry into election-related discussions on Twitter and was frequented by politicians, journalists, and ordinary Twitter users. There was a significant spike in tweets on Election Day (14 Oct. 2012), with far less activity on other days - usually around 200 tweets per day. Activity picked up somewhat during the final week of the campaign.
Ordinary users were the biggest contributors - from around 50% in the pre-election phase to 70% in the post-election phase. During the week before Election Day, they contributed some 75% of all tweets. Politicians contributed more than 30% of tweets during the early stages, but were drowned out by other groups' greater activity in later stages.
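The kind of tally behind these figures is straightforward to reproduce. The phase data and account categories below are invented for illustration, not Pieter and Evelien's actual coding:

```python
from collections import Counter

def group_shares(tweets: list) -> dict:
    """Proportion of tweets per account type within one campaign phase."""
    counts = Counter(t["group"] for t in tweets)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Hypothetical coded tweets from a single phase:
phase = [
    {"group": "ordinary"}, {"group": "ordinary"}, {"group": "ordinary"},
    {"group": "politician"}, {"group": "journalist"},
]
shares = group_shares(phase)
# e.g. {"ordinary": 0.6, "politician": 0.2, "journalist": 0.2}
```

Running this per phase (pre-election, final week, post-election) yields exactly the kind of shifting percentages reported above.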
Networks of interaction were highly decentralised, and media and political accounts received significantly more @mentions than ordinary users; this increased even further in the post-election phase. Twitter is largely centred around debates amongst ordinary users, therefore, but traditional political actors remain most visible in these debates. But there may be differences between such objectively measurable network structures and the relations between actors as the participants themselves may perceive them.
The next presenter on this AoIR 2013 panel is Christian Christensen, whose interest is in the minority parties in the US presidential election. He examined the tweets of four minority parties, defined as parties which had enough ballot listings across the states to technically be able to win the election: the Libertarians, the Greens, the Constitution Party, and the Justice Party. This, then, is a study of third party politics - and such parties have traditionally adhered to a polarising and populist style of politics.
In combination, the four parties' candidates had some 129,000 Twitter followers, led by the Libertarians with 100,000. They tweeted only sparingly during the campaign, mostly during the debates and on the day before Election Day. Retweets of their tweets were often centred around a small number of original tweets, and were more or less proportional to their total number of followers.
Key issues raised by these parties were military, security, and human rights issues (drones, veteran affairs, surveillance, and the Bradley/Chelsea Manning case); the failure of the two-party system; and corporate power. This can be seen as enhancing content value in conversational ecologies, by riding waves of Twitter activity (e.g. around the debates) or by jumping on pre-existing hashtags like #election2012 or #debate.
But what is the relationship between offline power and online presence and representation? Is the existing offline support base related to the online resonance for these parties? Can even the small support base for these third parties affect the outcomes of narrow electoral races? Can Twitter be used as a bellwether for more marginal political perspectives, and a space for the discussion of such marginal perspectives which are not afforded significant space in the mainstream media? Do they indicate an undercurrent of dissatisfaction which is not addressed in mainstream media?
Well, with our Twitter and Society book officially launched, I'm now in a final AoIR 2013 session on politics and Twitter. First off, Kevin Driscoll is presenting on the role of Twitter in the US presidential election, noting how much "Twitter's opinion" was used as a yardstick for overall public opinion. There is some slippage here: "Twitter" as the Twitter community, "Twitter" as Twitter, Inc., and "Twitter" as a source of opinion data.
Kevin and his colleagues examined the Twitter activity around the three US presidential debates, following the live Twitter streams as the debates happened and dynamically adding more and more keywords to track on Twitter. They divided these tweets into retweets and original tweets. Some 0.01% of all users accounted for around 25% of all retweeted posts - and these users included politicians, pundits, journalists, comedians, and a variety of other accounts; 62 comic accounts were the source of 4% of all retweets.
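The concentration finding (a tiny fraction of accounts receiving a quarter of all retweets) corresponds to a simple measure, sketched here with an invented distribution:

```python
def top_share(retweets_per_user: list, fraction: float) -> float:
    """Share of all retweets received by the top `fraction` of users,
    ranked by how often each user was retweeted."""
    ranked = sorted(retweets_per_user, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # at least one user
    return sum(ranked[:k]) / sum(ranked)

# Invented distribution: one heavy hitter among ten accounts
counts = [50, 10, 10, 8, 7, 5, 4, 3, 2, 1]
top_share(counts, 0.1)  # the top account alone holds 0.5 of all retweets
```

Applied to the study's full retweet data, the same calculation would show whether the 0.01%/25% concentration holds across all three debates.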
But the line between comedians and pundits is difficult to establish - some of the comedians made valid political points; some pundits attempted to be funny. Some 3.4% of all tweets were just about Big Bird, binders, and bayonets, in fact. There are also some oddities around the retweets - some jokes and other statements are straight plagiarisms of previously retweeted messages; some users appear to repeat their jokes across the debates.
How does this affect the way journalists may ethically use Twitter tracking tools? How can any reliable analysis be built on data which is clearly being gamed by some participants? If comedians imagine Twitter differently from other users, for example, how can this be accounted for? How can our tools more easily draw attention to the "weird" stuff we may find in the data?