The Potential of Social Media Intelligence to Improve People's Lives: Social Media Data for Good

Affiliation

The Governance Lab

Summary

Developed with support from Facebook, this report from the Governance Lab (GovLab) explores the premise that data - and, in particular, the data and analytical expertise held by social media companies - may provide for a new type of intelligence that could help develop solutions to today's challenges. It focuses on data collaboratives, an emerging form of public-private partnership in which actors from different sectors exchange information to create new public value. Such collaborative arrangements, for example between social media companies and humanitarian organisations or civil society actors, can be seen as possible templates for leveraging privately held data towards the attainment of public goals. The report not only captures the successes that data and technology have had in making the world a better place but also highlights some of the challenges the data and technology community must address to generate an even larger impact.

A survey of the landscape conducted by the GovLab identified 6 main categories of activity along a spectrum of openness and collaboration:

  • Data cooperatives or pooling: Corporations and other data-holders group together to create "data pools" with shared data resources;
  • Prizes and challenges: Corporations make data available to qualified applicants who compete to develop new apps or discover innovative uses for the data;
  • Research partnerships: Corporate data are often shared with universities and other academic organisations, giving researchers access to consumer datasets and other sources of data to analyse social trends;
  • Intelligence products: Shared (often aggregated) data are used to build a tool, dashboard, report, app, or another technical device to support a public or humanitarian objective;
  • Trusted intermediaries: Corporations share data with a limited number of known partners for data analysis and modelling, as well as other value chain activities; and
  • Application Programming Interfaces (APIs): APIs allow developers and others to access data for testing, product development, and data analytics. This category includes instances where data users manually scraped social media data as well as instances involving more open APIs; the latter have the advantage of more direct collaboration (e.g., cross-sector knowledge sharing) and more openly accessible data (e.g., minimal time and resources required to access data).

As detailed here, a number of recent examples show how social media data can be leveraged for public good. These include Facebook's sharing of population maps with humanitarian organisations following natural disasters; the prediction of adverse drug reactions through social media data analysis in Spain; and the use of crowdsourced data from Waze by the city of Boston, Massachusetts, United States (US), to improve transportation planning. These examples and 9 additional cases are discussed in the full report. By assessing these examples, the authors identify 5 key value propositions behind the use of social media data for public goals:

  1. Situational awareness and response: Data held by social media companies can help non-governmental organisations (NGOs), humanitarian organisations, and others better understand demographic trends, public sentiment, and the geographic distribution of various phenomena. Case studies included in the report:
    • Facebook Disaster Maps
    • Tracking Anti-Vaccination Sentiment in Eastern European Social Media Networks
    • Facebook Population Density Maps
  2. Knowledge creation and transfer: Widely dispersed datasets can be combined and analysed to create new knowledge, in the process ensuring that those responsible for solving problems have the most useful information at hand. Case studies included in the report:
    • Yelp Dataset Challenge
    • MIT Laboratory for Social Machines' Electome Project
    • LinkedIn's Economic Graph Research Program
  3. Public service design and delivery: Data collaboratives can increase access to previously inaccessible datasets, thereby enabling more accurate modelling of public service design and helping to guide service delivery in a targeted, evidence-based manner. Case studies included in the report:
    • Facebook Future of Business Survey
    • Easing Urban Congestion Using Waze Traffic Data
    • Facebook Insights for Impact Zika
  4. Prediction and forecasting: New predictive capabilities enabled by access to social media datasets can help institutions be more proactive, putting in place mechanisms based on sound evidence that mitigate problems or avert crises before they occur. Case studies included in the report:
    • Tracking the Flu Using Tweets
    • Predicting Floods with Social Media Metatags
    • Predicting Adverse Drug Events by Mining Health Social Media Streams and Forums
  5. Impact assessment and evaluation: Access to social media datasets can help institutions monitor and evaluate the real-world impacts of policies. This helps design better products or services, and enables a process of iteration and constant improvement. Case studies included in the report:
    • Sport England's #ThisGirlCan
    • Using Twitter Data to Analyze Public Sentiment on Fuel Subsidy Policy Reform in El Salvador
    • Using Twitter to Measure Global Engagement on Climate Change

GovLab has identified 7 benefits for social media companies that can follow from data collaboration (the Seven Rationales or 7Rs):

  1. Reciprocity: Corporations share their data and also gain access to other data sources or domain expertise that may be important to their own business decisions.
  2. Research Insights: Analysis enabled by data sharing can uncover new research insights that might be useful to the organisation sharing the data.
  3. Reputation: Sharing data for public good may enhance a firm's public image.
  4. Revenue: Sharing data doesn't always have to be free.
  5. Regulatory compliance: In some regions and situations, data sharing can be required.
  6. Responsibility: Opening up data can also improve the competitive business environment within which the business operates.
  7. Retainment of talent: Working on problems that matter enables companies to recruit or retain data science talent.

Despite the potential of data collaboratives, companies and public organisations often have concerns about sharing data. The report identifies 4 key risks and challenges, and discusses ways to mitigate them.

  • Privacy and security: Sharing information may result in disclosing personally or demographically identifiable information, which may create privacy and/or security violations. Data sharing must not result in any dilution of protections for individuals, many of whom might not even be aware that data about them were collected in the first place.
  • Competitive concerns: Companies are often concerned that sharing data will threaten their commercial interests or affect their competitive advantage. GovLab's research into the field of cross-sector data-sharing suggests that there are often methods of balancing competitive risk with data sharing for public good, such as aggregating data or sharing insights from datasets rather than the raw data.
  • Generalisability, data bias, and quality: Social media data are often gathered from a particular demographic subset, possibly ignoring individuals, often from vulnerable communities, who are unrepresented in private or public datasets.
  • Barriers to a culture of data sharing and collaboration: There is a lack of understanding about the benefits of sharing social media data, and a lack of comfort and familiarity with such strategies. Embedding notions of collaboration, sharing, and mutual benefit in business operations is a challenge social media companies face.
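
As a rough illustration of the mitigation mentioned under competitive concerns (sharing aggregated data rather than raw records), the following minimal Python sketch counts records per group and withholds any group below a minimum size. The function name `aggregate_counts`, the sample records, and the threshold of 10 are all hypothetical and do not come from the report; suppressing small groups reduces disclosure risk but does not by itself guarantee anonymity.

```python
from collections import Counter


def aggregate_counts(records, key, min_count=10):
    """Count records per group and drop groups smaller than min_count.

    Only group-level totals are returned; no individual record leaves
    the function. The default threshold is an illustrative assumption.
    """
    counts = Counter(record[key] for record in records)
    return {group: n for group, n in counts.items() if n >= min_count}


# Hypothetical raw data: one row per user, with a coarse location field.
raw = (
    [{"user_id": i, "district": "North"} for i in range(25)]
    + [{"user_id": 100 + i, "district": "South"} for i in range(12)]
    + [{"user_id": 200 + i, "district": "East"} for i in range(3)]
)

# Only the aggregated view would be shared with a partner.
shared = aggregate_counts(raw, key="district", min_count=10)
print(shared)
```

In this sketch the "East" group is withheld because it contains only 3 records, while the two larger groups are shared as totals with no user identifiers attached.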

GovLab asserts that, to share data responsibly and to ensure that concerns are addressed meaningfully and legitimately, social media corporations and organisations need to include the following 4 elements in their frameworks:

  • Risk and value assessments: Risks - including privacy, ethical, and commercial concerns - exist across the social media data lifecycle, and include: inaccurate, non-representative data entry during collection; insufficient, outdated, or inflexible security provisions during processing; incompatible cultural or institutional norms or expectations during sharing; aggregation or correlation of incomparable datasets during analysis; and controversial or incongruous data usage.
  • Data responsibility principles: Initiatives like the Signal Code and the Handbook on Data Protection in Humanitarian Action are important, but are limited to specific contexts and sectors. More broadly applicable and mutually agreed upon principles could lead to greater uptake and impact.
  • Data responsibility governance processes and functions: Such processes should be transparent and participatory, while being flexible and responsive to different needs and contexts. For example, to accommodate collaborative research using social media data, Facebook designed a review process that involved in-house training, different stages of review, and the application of evaluation criteria to determine whether to go ahead.
  • New methods and technologies that go beyond written policies: Data responsibility decision trees, such as the Center for Democracy & Technology's Digital Decision Tool, can be used to translate principles into a series of questions. Similarly, a transparency report showing with whom data is being shared and toward what public benefit could help allay concerns about government misuse of private-sector data assets.

This discussion leads to a series of recommendations for developing data collaboratives, which are grouped into 4 broad categories.

  • Stewards: Social media corporations should consider themselves, and act as, the standard bearers for a new corporate paradigm of stewardship of data as a public good. Social media companies should pioneer the role or position of Data Stewards within their organisations. Among their other roles, Data Stewards could help develop new coordinating mechanisms to unlock corporations' supply of social media data sets with potential public interest value. Such mechanisms should include: a due process to respond to data requests; a system for filtering or prioritising certain kinds of information; and a method to ensure that the data being released matches public needs and demands.
  • Evidence: A more detailed repository of case studies should be established to document impact and practice. Such a repository, which could build on the foundation offered by the case studies presented in this report, would highlight best practice in value propositions, technical arrangements, and legal frameworks for data collaboratives and give strategies for measuring impact. Companies and organisations involved in data sharing can together help develop better metrics of value and impact and new definitions of success.
  • Methods: Organisations and researchers should mine existing data collaborative experiments for examples of successful practice. Lessons and observations from this can be translated and shared as a toolkit or roadmap for corporations considering sharing data. Also needed is a better understanding of new techniques used by data collaboratives - natural language processing, neural networks, computational social science, network science, sentiment analysis, data-mining, and machine-learning - to analyse and seek insight from data. Many nonprofit and civil society organisations lack the necessary expertise to apply these techniques within data collaboratives. Corporate data sharing initiatives need also to consider sharing their expertise in handling data - for example, through training initiatives, educational programmes, informal peer- and practitioner-mentoring setups, and the development of affordable, user-friendly tools.
  • Data collaboratives movement: Efforts should be made to bring together various actors from the social media data community at dedicated convenings to: share lessons learned; identify pain points; and develop common solutions, procedures, and practices. Corporations could help facilitate such convenings by providing a venue (virtual or physical) where data providers and users can co-create ideas and insights. As the data collaboratives and social media intelligence movement continues to take shape, engagement with the populations whose attributes and behaviours often make up the data held by social media companies will be key. In addition to engaging with other skills, there needs to be broader engagement with other actors whose decisions can play an important role in the success of data collaboratives. Such actors could include policymakers, regulators, and potential funders, both for-profit and philanthropic.
Source

GovLab website, October 4, 2017. Image credit: James Sutton