Internet Research Ethics
There is little research today that is not in some way conducted on, or affected by, the Internet. The Internet, as a field, a tool, and a venue, raises specific and far-reaching ethical issues. Internet research ethics is a subdiscipline that cuts across many disciplines, ranging from the social sciences, arts, and humanities to the medical/biomedical and hard sciences. Extant ethical frameworks, including consequentialism, utilitarianism, deontology, virtue ethics, and feminist ethics, have contributed to the ways in which ethical issues in Internet research are considered and evaluated.
Conceptually and historically, Internet research ethics is related to computer and information ethics and includes such ethical issues as participant knowledge and consent, data privacy, security, confidentiality, and integrity of data, intellectual property issues, and community, disciplinary, and professional standards or norms. Throughout the Internet’s evolution, there has been debate over whether new ethical dilemmas are emerging, or whether the existing dilemmas are similar to dilemmas in other research realms despite technological influence (Elgesem 2002; Walther 2002; Ess & AoIR 2002; Markham & Buchanan 2012). These debates are similar to philosophical debates in computer and information ethics. For example, many years ago, James Moor (1985) asked “what is special about computers” in order to understand what, if anything, is unique ethically. Today, the same question applies to Internet research (Ess & AoIR 2002; King 1996).
Yet, as the Internet has evolved into a more social and communicative tool and venue, the ethical issues have shifted from purely data-driven to more human-centered. “On-ground” or face-to-face analogies, however, may not be applicable to online research. For example, the concept of the public park has been used as a site where researchers might observe others with little ethical controversy, but online, the concepts of public versus private are much more complex. Thus, some scholars suggest that the specificity of Internet research ethics calls for new regulatory and/or professional and disciplinary guidance. For these reasons, human subjects research policy and regulation, along with disciplinary standards, inform this entry, which explores the growing areas of ethical and methodological complexity, including personal identifiability, reputational risk and harm, notions of public space and public text, ownership, and longevity of data as they relate to Internet research. Specifically, the emergence of the social web raises issues around subject or participant recruitment practices, tiered informed consent models, and protection of various expectations and forms of privacy in a world of increasingly diffused and ubiquitous technologies; anonymity and confidentiality of data in spaces where researchers and their subjects may not fully understand the terms and conditions of those venues or tools; challenges to data integrity as research projects can be outsourced or crowdsourced to online labor marketplaces; and jurisdictional issues as more research is processed, stored, and disseminated via cloud computing or in remote server locales, presenting myriad legal complexities given jurisdictional differences in data laws. Further, big data research has grown to become the next discrete phase of Internet research, following social media and social computing.
As a result, researchers using the Internet as a tool for and/or a space of research, together with their research ethics boards (REBs), also known as institutional review boards (IRBs) in the United States or human research ethics committees (HRECs) in other countries such as Australia, have been confronted with a series of new ethical questions: What ethical obligations do researchers have to protect the privacy of subjects engaging in activities in “public” Internet spaces? What are such public spaces? Is there any reasonable expectation of privacy in an era of pervasive and ubiquitous surveillance and data tracking? How is confidentiality or anonymity assured online? How is, and how should, informed consent be obtained online? How should research on minors be conducted, and how do you prove a subject is not a minor? Is deception (pretending to be someone you are not, withholding identifiable information, etc.) online a norm or a harm? How is “harm” possible to someone existing in an online space? How identifiable are individuals in large data sets? Do human subjects protections apply to big data? As more industry-sponsored research takes place, what ethical protections exist outside of current regulatory structures?
A growing number of scholars have explored these and related questions (see, for example, Bromseth 2002; Bruckman 2006; Buchanan 2004; Buchanan & Ess 2008; Johns, Chen & Hall 2003; Kitchin 2003, 2008; King 1996; Mann 2003; Markham & Baym 2008; McKee & Porter 2009; Thorseth 2003; Ess 2016; Zimmer & Kinder-Kurlanda, forthcoming), scholarly associations have drafted ethical guidelines for Internet research (Ess & Association of Internet Researchers 2002; Markham, Buchanan, and AoIR, 2012; Kraut et al. 2004), and non-profit scholarly and scientific agencies such as AAAS (Frankel & Siang 1999) have begun to confront the myriad of ethical concerns that Internet research poses to researchers and research ethics boards (REBs).
- 1. Definitions
- 2. Human Subjects Research
- 3. History and Development of IRE as a Discipline
- 4. Key Ethical Issues in Internet Research
- 5. Research Ethics Boards Guidelines
- Academic Tools
- Other Internet Resources
- Related Entries
The commonly accepted definition of Internet research ethics (IRE) has been used by Buchanan and Ess (2008, 2009), Buchanan (2010), and Ess & Association of Internet Researchers (AoIR) (2002):
IRE is defined as the analysis of ethical issues and application of research ethics principles as they pertain to research conducted on and in the Internet. Internet-based research, broadly defined, is research which utilizes the Internet to collect information through an online tool, such as an online survey; studies about how people use the Internet, e.g., through collecting data and/or examining activities in or on any online environments; and/or, uses of online datasets, databases, or repositories.
These examples were broadened in 2012 by the United States Secretary’s Advisory Committee on Human Research Protections (SACHRP), and included under the umbrella term Internet Research:
- Research studying information that is already available on or via the Internet without direct interaction with human subjects (harvesting, mining, profiling, scraping, observation or recording of otherwise-existing data sets, chat room interactions, blogs, social media postings, etc.)
- Research that uses the Internet as a vehicle for recruiting or interacting, directly or indirectly, with subjects (self-testing websites, survey tools, Amazon Mechanical Turk, etc.)
- Research about the Internet itself and its effects (use patterns or effects of social media, search engines, email, etc.; evolution of privacy issues; information contagion; etc.)
- Research about Internet users: what they do, and how the Internet affects individuals and their behaviors
- Research that utilizes the Internet as an interventional tool, for example, interventions that influence subjects’ behavior
- Others (emerging and cross-platform types of research and methods, including m-research (mobile))
- Recruitment in or through Internet locales or tools, for example social media, push technologies
A critical distinction in the definition of Internet research ethics is that between the Internet as a research tool and the Internet as a research venue. The distinction between tool and venue plays out across disciplinary and methodological orientations. As a tool, Internet research is enabled by search engines, data aggregators, databases, catalogs, and repositories, while venues include such places or locales as conversation applications (IM/chat rooms, for example), MUDs, MOOs, and MMORPGs (forms of role-playing games and virtual worlds), newsgroups, home pages, blogs, micro-blogging (e.g., Twitter), RSS feeds, crowdsourcing applications, or online course software.
Another way of conceptualizing the distinction between tool and venue comes from Kitchin (2008), who distinguishes “engaged web-based research” from “non-intrusive web-based research”: “Non-intrusive analyses refer to techniques of data collection that do not interrupt the naturally occurring state of the site or cybercommunity, or interfere with premanufactured text. Conversely, engaged analyses reach into the site or community and thus engage the participants of the web source” (p. 15). These two constructs give researchers a way of recognizing when consideration of human subject protections might need to occur. McKee and Porter (2009), as well as Banks and Eble (2007), provide guidance on the continuum of human-subjects research, noting a distinction between person-based and text-based research. For example, McKee and Porter provide a range of research variables (public/private, topic sensitivity, degree of interaction, and subject vulnerability) that are useful in determining where on the continuum from text-based to person-based the research falls, and whether or not subjects would need to consent to the research (pp. 87–88).
While conceptually useful for determining human subjects participation, the distinction between tool and venue or engaged versus non-intrusive web-based research is increasingly blurring in the face of social media and their third party applications. Buchanan (2016) has conceptualized three phases or stages of Internet research, and the emergence of social media characterizes the second phase, circa 2006–2014. The concept of social media entails “A group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content” (Kaplan & Haenlein 2010). A “social network site” is a category of websites with profiles, semi-persistent public commentary on the profile, and a traversable publicly articulated social network displayed in relation to the profile.
A key moment that typified and called attention to many of these concerns emerged with the 2014 Facebook Emotional Contagion study. By virtue of agreeing to Facebook’s Terms of Service, did users consent to participation in research activities? Should there have been a debriefing after the experiment? How thoroughly did a university research ethics board review the study? Should industry-sponsored research undergo internal ethics review? In response to the outcry over the Contagion study, OkCupid’s Christian Rudder (2014) defended these sorts of experiments, noting “We noticed recently that people didn’t like it when Facebook ‘experimented’ with their news feed. Even the FTC is getting involved. But guess what, everybody: if you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site. That’s how websites work.”
The phenomenon of the social web forces an ongoing negotiation between researchers and their data sources or human subjects, as seen in the Facebook contagion study and the subsequent reaction to it. Moreover, with the growing use and concentration of mobile devices, the notion of Internet research is expanding, with a movement away from a “place-based” Internet to a dispersed reality. Data collection from mobile devices is on the increase. For example, mobile devices enable synchronous data collection and dissemination from non-place-based environments. Researchers using cloud-enabled applications can send and receive data to and from participants synchronously. The potential of such research, from epidemiological research (Leibovici et al. 2010) to community-based participatory research (Parras et al. 2011), is staggering scientifically, while the concurrent ethical challenges are demanding. Many of these challenges require a careful consideration of traditional notions of human subjects research and how Internet research pushes the boundaries of these notions.
The practical, professional, and theoretical implications of human subjects protections have been covered extensively in scholarly literature, ranging from medical/biomedical to social sciences to computing and technical disciplines (see Beauchamp & Childress 2008; Emanuel et al. 2003; Sieber 1992 and forthcoming; Wright 2006). Relevant protections and regulations continue to receive much attention in the face of research ethics violations (see, for example, Skloot 2010, on Henrietta Lacks; the U.S. Government’s admission and apology to the Guatemalan Government for STD testing in the 1940s; and Gaw & Burns 2011, on how lessons from the past might inform current research ethics and conduct).
The history of human subjects protections (Sparks 2002; see Other Internet Resources) grew out of atrocities such as Nazi human experimentation during World War II, which resulted in the Nuremberg Code in 1947, subsequently followed by the Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects (World Medical Association 1964/2008). In response to the Tuskegee syphilis experiment, an infamous clinical study conducted between 1932 and 1972 in which the U.S. Public Health Service studied the natural progression of untreated syphilis in rural African-American men in Alabama under the guise of providing them with free government health care, the U.S. Department of Health and Human Services put forth a set of basic regulations governing the protection of human subjects (45 C.F.R. § 46), followed by the publication of the “Ethical Principles and Guidelines for the Protection of Human Subjects of Research” by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, known as the Belmont Report (NCPHSBBR 1979). The Belmont Report identifies three fundamental ethical principles for all human subjects research: Respect for Persons, Beneficence, and Justice.
To ensure consistency in human subjects protections across federal agencies in the United States, the Federal Policy for the Protection of Human Subjects, also known as the “Common Rule,” was codified in 1991. Similar regulatory frameworks protecting human subjects have emerged across the world, including those of the Canadian Tri-Council, the Australian Research Council, the European Commission, the Research Council of Norway and its National Committee for Research Ethics in the Social Sciences and Humanities (NESH 2006; NESH 2014), the U.K.’s NHS National Research Ethics Service and the Research Ethics Framework (REF) of the ESRC (Economic and Social Research Council) General Guidelines, and the Forum for Ethical Review Committees in Asia and the Western Pacific (FERCAP).
To date, the various U.S. regulatory agencies bound by the Common Rule have not issued formal guidance on Internet research. Similarly, few regulatory bodies in other countries have changed or redefined their regulations because of, or in light of, Internet research. However, guidelines for researcher and reviewer considerations have begun to emerge globally. Despite regional and cultural differences, Buchanan (2010) has outlined the similarities in the mission, scope, and intentions of REBs globally, predominantly around shared notions of risk and harm, justice, and respect for persons.
While stopping short of regulatory guidance, many research ethics boards are exploring how Internet research complicates the ways in which traditional models of human subjects protections can be, or are, applied. For example, the United States Department of Health and Human Services (DHHS) and the Office for Human Research Protections (OHRP) operate with the following definition of human subjects (45 C.F.R. § 46.102(f) 2009):
Human subject means a living individual about whom an investigator (whether professional or student) conducting research obtains
- data through intervention or interaction with the individual, or
- identifiable private information.
Intervention includes both physical procedures by which data are gathered (for example, venipuncture) and manipulations of the subject or the subject’s environment that are performed for research purposes. Interaction includes communication or interpersonal contact between investigator and subject. Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects (OHRP, 2008).
Two novel notions entering human subjects discourse are “human non-subjects research” and “human harming research.” Brothers and Clayton (2010) propose human non-subjects as a conceptual category, not as a replacement of regulatory language but as another category for research review considerations. Human non-subjects research is emerging in light of technological advancements and research development that uses deidentified information on humans, for example, genetic data, or discrete variables from a data set, specifically in the contexts of tissue banking or deidentified data in repositories that are used for other research beyond when the samples or data were first collected. An individual may have consented to the original research, say, in a clinical trial; in such cases, reconsent may be impossible, yet the samples may still pose reidentification risks to humans. For instance, as data sets are shared, data may be scrubbed to remove all identifiers, or some identifiers may be kept with the data or the data custodian. Rothstein (2010) agrees, with a clear eye to privacy and risk: “The use of deidentified health information and biological specimens in research creates a range of privacy and other risks to individuals and groups. The current regulatory system, under which privacy protections are afforded identifiable information but no protections apply to deidentified information, needs to be revised” (p. 3). O’Rourke (2007) has provided guidance on the use of human specimens and banking, which could be adapted to other forms of data banking.
Carpenter and Dittrich (2011) and Aycock et al. (2012) refer to the notion of “human-harming research” as a variable in human subjects review in Internet, or more specifically, computer science or information/communication technology (ICT) research. Carpenter and Dittrich encourage
Review boards [to] transition from an informed consent driven review to a risk analysis review that addresses potential harms stemming from research in which a researcher does not directly interact with the at-risk individuals….[The] distance between researcher and affected individual indicates that a paradigm shift is necessary in the research arena. We must transition our idea of research protection from “human subjects research” to “human harming research.”
Similarly, Aycock et al. (2012) assert that
Researchers and boards must balance presenting risks related to the specific research with risks related to the technologies in use. With computer security research, major issues around risk arise, for society at large especially. The risk may not seem evident to an individual but in the scope of security research, larger populations may be vulnerable. There is a significant difficulty in quantifying risks and benefits, in the traditional sense of research ethics….An aggregation of surfing behaviors collected by a bot presents greater distance between researcher and respondent than an interview done in a virtual world between avatars. This distance leads us to suggest that computer security research focus less concern around human subjects research in the traditional sense and more concern with human harming research. (italics original)
These two conceptual notions are relevant for considering emergent forms of identities or personally identifiable information (PII) such as avatars, virtual beings, bots, textual and graphical information. Under the Code of Federal Regulations (45 C.F.R. § 46.102(f) 2009), new forms of representation are considered human subjects if PII about living individuals is obtained. PII can be obtained by researchers through scraping data sources, profiles or avatars, or other pieces of data made available by the human “behind the avatar or other representation” (Odwazny & Buchanan 2011). Fairfield agrees: “An avatar, for example, does not merely represent a collection of pixels—it represents the identity of the user” (2012, p. 701).
The multiple disciplines already long engaged in human subjects research (medicine, sociology, anthropology, psychology, communication) have established ethical guidelines intended to assist researchers and those charged with ensuring that research on human subjects follows both legal requirements and ethical practices. But with research involving the Internet—where individuals increasingly share personal information on platforms with porous and shifting boundaries, where both the spread and aggregation of data from disparate sources is increasingly the norm, and where web-based services, and their privacy policies and terms of service statements, morph and evolve rapidly—the ethical frameworks and assumptions traditionally used by researchers and REBs are frequently challenged.
At the time of this revision, the Department of Health and Human Services has been working on a revision to the Common Rule. The Notice of Proposed Rule Making includes revisions to categories of research, consent, and data security, among other changes that affect research in the social-behavioral-educational and biomedical realms.
An extensive body of literature has developed since the 1990s around the use of the Internet for research (Jones 1999; Hunsinger, Klastrup & Allen 2010; Consalvo & Ess 2010), with a growing emphasis on the ethical dimensions of Internet research.
A flurry of Internet research, and explicit concern for the ethical issues concurrently at play in it, began in the mid 1990s. In 1996, Storm King recognized the growing use of the Internet as a venue for research. His work explored the American Psychological Association’s guidelines for human subjects research with emergent forms of email, chat, listservs, and virtual communities. With careful attention to risk and benefit to Internet subjects, King offered a cautionary note:
When a field of study is new, the fine points of ethical considerations involved are undefined. As the field matures and results are compiled, researchers often review earlier studies and become concerned because of the apparent disregard for the human subjects involved. (King 1996, 119)
The 1996 issue of Information Society dedicated to Internet research is considered a watershed moment, and included much seminal research, still of impact and relevance today (Allen 1996; Boehlefeld 1996; Reid 1996).
Sherry Turkle’s 1997 Life on the Screen: Identity in the Age of the Internet called direct attention to the human element of online game environments. Moving squarely towards person-based versus text-based research, Turkle pushed researchers to consider human subjects implications of Internet research. Similarly, Markham’s Life Online: Researching Real Experience in Virtual Space (1998) highlighted methodological complexities of online ethnographic studies, as did Jacobson’s 1999 methodological treatment of Internet research. The “field” of study changed the dynamics of researcher-researched roles, identity, and representation of participants from virtual spaces. Markham’s work in qualitative online research has been influential across disciplines, as research in nursing, psychology, and medicine has found the potential of this paradigm for online research (Flicker et al. 2004; Eysenbach & Till 2001; Seaboldt & Kupier 1997; Sharf 1996).
Then, in 1999, the American Association for the Advancement of Science (AAAS), with a contract from the U.S. Office for Protection from Research Risks (now known as the Office for Human Research Protections), convened a workshop, with the goal of assessing the alignment of traditional research ethics concepts to Internet research. The workshop acknowledged
The vast amount of social and behavioral information potentially available on the Internet has made it a prime target for researchers wishing to study the dynamics of human interactions and their consequences in this virtual medium. Researchers can potentially collect data from widely dispersed populations at relatively low cost and in less time than similar efforts in the physical world. As a result, there has been an increase in the number of Internet studies, ranging from surveys to naturalistic observation. (Frankel & Siang 1999)
In the medical/biomedical contexts, Internet research is growing rapidly. Also in 1999, Gunther Eysenbach wrote the first editorial to the newly formed Journal of Medical Internet Research. There were three driving forces behind the inception of this journal, and Eysenbach calls attention to the growing social and interpersonal aspects of the Internet:
First, Internet protocols are used for clinical information and communication. In the future, Internet technology will be the platform for many telemedical applications. Second, the Internet revolutionizes the gathering, access and dissemination of non-clinical information in medicine: Bibliographic and factual databases are now world-wide accessible via graphical user interfaces, epidemiological and public health information can be gathered using the Internet, and increasingly the Internet is used for interactive medical education applications. Third, the Internet plays an important role for consumer health education, health promotion and teleprevention. (As an aside, it should be emphasized that “health education” on the Internet goes beyond the traditional model of health education, where a medical professional teaches the patient: On the Internet, much “health education” is done “consumer-to-consumer” by means of patient self support groups organizing in cyberspace. These patient-to-patient interchanges are becoming an important part of healthcare and are redefining the traditional model of preventive medicine and health promotion).
With scholarly attention growing and with the 1999 AAAS report calling for action, other professional associations took notice and began drafting statements or guidelines, or addenda to their extant professional standards. For example, the Board of Scientific Affairs (BSA) of the American Psychological Association established an Advisory Group on Conducting Research on the Internet in 2001. The American Counseling Association’s 2005 revision to its Code of Ethics, the Association of Internet Researchers (AoIR) Ethics Working Group Guidelines, and the National Committee for Research Ethics in the Social Sciences and the Humanities Research Ethics Guidelines for Internet Research, among others, have directed researchers and review boards to the ethics of Internet research, with attention to the most common areas of ethical concern (see Other Internet Resources for links).
While many researchers focus on traditional research ethics principles, conceptualizations of Internet research ethics depend on disciplinary perspectives. Some disciplines, notably from the arts and humanities, posit that Internet research is more about context and representation than about “human subjects,” suggesting that there is no intent to engage in research about actual persons, and thus minimal or no harm. The debate has continued since the early 2000s. White (2002) argued against extant regulations that favored or privileged specific ideological, disciplinary and cultural prerogatives, which limit the freedoms and creativity of arts and humanities research. For example, she notes that the AAAS report “confuses physical individuals with constructed materials and human subjects with composite cultural works,” again calling attention to the person versus text divide that has permeated Internet research ethics debates. Another example of disciplinary differences comes from the Oral History Association, which acknowledged the growing use of the Internet as a site for research:
Simply put, oral history collects memories and personal commentaries of historical significance through recorded interviews. An oral history interview generally consists of a well-prepared interviewer questioning an interviewee and recording their exchange in audio or video format. Recordings of the interview are transcribed, summarized, or indexed and then placed in a library or archives. These interviews may be used for research or excerpted in a publication, radio or video documentary, museum exhibition, dramatization or other form of public presentation. Recordings, transcripts, catalogs, photographs and related documentary materials can also be posted on the Internet. (Ritchie 2003, 19)
While the American Historical Association (Jones 2008) has argued that such research be “explicitly exempted” from ethical review board oversight, the use of the Internet could complicate such a stance if such data became available in public settings or available “downstream” with potential, unforeseeable risks to reputation, economic standing, or psychological harm, should identification occur.
Under the concept of text rather than human subjects, Internet research rests on arguments of publication and copyright. Consider the venue of a blog, which does not meet the definition of human subject in 45 C.F.R. § 46.102(f) (2009), as interpreted by most ethical review boards. A researcher need not obtain consent to use text from a blog, as it is generally considered publicly available, textual, published material. This “public park” argument, which has been generally accepted by researchers, is appropriate for some Internet venues and tools, but not all: context, intent, sensitivity of data, and expectations of Internet participants were identified in 2004 by Sveningsson as crucial markers in Internet research ethics considerations.
By the mid 2000s, with three major anthologies published, and a growing literature base, there was ample scholarly literature documenting IRE across disciplines and methodologies, and subsequently, there was anecdotal data emerging from the review boards evaluating such research. In search of empirical data regarding the actual review board processes of Internet research from a human subjects perspective, Buchanan and Ess surveyed over 700 United States ethics review boards, and found that boards were primarily concerned with privacy, data security and confidentiality, and ensuring appropriate informed consent and recruitment procedures (Buchanan and Ess 2009; Buchanan and Hvizdak 2009).
In 2008, the Canadian Tri-Council’s Social Sciences and Humanities Research Ethics Special Working Committee: A Working Committee of the Interagency Advisory Panel on Research Ethics was convened (Blackstone et al. 2008) ; and in 2010, a meeting at the Secretary’s Advisory Committee to the Office for Human Research Protections highlighted Internet research (SACHRP 2010). Such prominent professional organizations as the Public Responsibility in Medicine and Research (PRIM&R) and the American Educational Research Association (AERA) have begun featuring Internet research ethics regularly at their conferences and related publications.
Recently, disciplines not traditionally involved in human subjects research have begun their own explorations of IRE. For example, researchers in computer security are actively examining the tenets of research ethics in CS and ICT (Aycock et al. 2012; Dittrich, Bailey, Dietrich 2011; Carpenter & Dittrich 2011; Buchanan et al. 2011). Notably, the U.S. Federal Register requested comments on “The Menlo Report” in January 2012, which calls for a commitment by computer science researchers to the three principles of respect for persons, beneficence, and justice, while also adding a fourth principle on respect for law and public interest (Homeland Security 2011).
Principles of research ethics dictate that researchers must ensure there are adequate provisions to protect the privacy of subjects and to maintain the confidentiality of any data collected. A violation of privacy or breach of confidentiality presents a risk of serious harm to participants, ranging from the exposure of personal or sensitive information and the divulgence of embarrassing or illegal conduct to the release of data otherwise protected under law.
Research ethics regulations express concern over subject privacy in terms of the level of linkability of data to individuals, and the potential harm disclosure of information could pose. For example, when discussing the possible exemption of certain research from human subject review, federal guidelines require oversight in these circumstances:
(i) information obtained is recorded in such a manner that human subjects can be identified, directly or through identifiers linked to the subjects; and (ii) any disclosure of the human subjects’ responses outside the research could reasonably place the subjects at risk of criminal or civil liability or be damaging to the subjects’ financial standing, employability, or reputation (45 C.F.R. § 46.101(b)(2) 2009).
The protection of privacy and confidentiality is typically achieved through a combination of research tactics and practices, including engaging in data collection under controlled or anonymous environments, the scrubbing of data to remove personally identifiable information (PII), or the use of access restrictions and related data security methods.
Compliance with federal guidelines also rests on the definition of what kind of data are considered PII and therefore trigger special privacy considerations. The National Institutes of Health (NIH), for example, defines PII as follows:
any information about an individual maintained by an agency, including, but not limited to, education, financial transactions, medical history, and criminal or employment history and information which can be used to distinguish or trace an individual’s identity, such as their name, SSN, date and place of birth, mother’s maiden name, biometric records, etc., including any other personal information that is linked or linkable to an individual. (NIH 2008)
Typically, examples of identifying pieces of information have included personal characteristics (such as date of birth, place of birth, mother’s maiden name, gender, sexual orientation, and other distinguishing features and biometrics information, such as height, weight, physical appearance, fingerprints, DNA and retinal scans), unique numbers or identifiers assigned to an individual (such as a name, address, phone number, social security number, driver’s license number, financial account numbers), and descriptions of physical location (GIS/GPS log data, electronic bracelet monitoring information).
Internet research introduces new complications to these longstanding definitions and regulatory frameworks intended to protect subject privacy. For example, researchers are increasingly able to collect detailed data about individuals from sources such as Facebook, Twitter, blogs or public email archives, and these rich data sets can more easily be processed, compared, and combined with other data (and datasets) available online. In various cases, researchers (and sometimes even amateurs) have been able to re-identify individuals by analyzing and comparing such datasets, using data fields as benign as one’s zip code (Sweeney 2002), random Web search queries (Barbaro & Zeller 2006), or movie ratings (Narayanan & Shmatikov 2008) as the vital key for reidentification of a presumed anonymous user. Prior to widespread Internet-based data collection and processing, few would have considered one’s movie ratings or zip code as personally identifiable. Yet, these cases reveal that merely stripping traditional “identifiable” information such as a subject’s name, address, or social security number is no longer sufficient to ensure data remain anonymous (Ohm 2009), and that what counts as “personally identifiable information” must be reconsidered (Schwartz & Solove 2011). This points to the critical distinction between data that are kept confidential and data that are truly anonymous. Increasingly, data are rarely completely anonymous, as researchers have routinely demonstrated they can often reidentify individuals hidden in “anonymized” datasets with ease (Ohm 2009). This reality places new pressure on ensuring datasets are kept, at the least, suitably confidential through both physical and computational security measures.
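The reidentification risk described above can be illustrated with a minimal sketch. It assumes a hypothetical released dataset with names removed and a hypothetical public list that shares three quasi-identifier fields; the records, field names, and helper functions are invented for illustration and do not reproduce the method of any study cited here.

```python
from collections import Counter

# Hypothetical "anonymized" research records: names removed, other fields kept.
released = [
    {"zip": "53211", "birth_year": 1980, "gender": "F", "response": "A"},
    {"zip": "53211", "birth_year": 1980, "gender": "F", "response": "B"},
    {"zip": "53202", "birth_year": 1975, "gender": "M", "response": "C"},
]

# Hypothetical public list (for example, a membership roll) that includes names.
public = [
    {"name": "Pat Doe", "zip": "53202", "birth_year": 1975, "gender": "M"},
    {"name": "Sam Roe", "zip": "53211", "birth_year": 1980, "gender": "F"},
]

# Fields assumed to appear in both sources (the quasi-identifiers).
QUASI = ("zip", "birth_year", "gender")


def key(record):
    """Tuple of quasi-identifier values used to compare records."""
    return tuple(record[f] for f in QUASI)


def smallest_group(records):
    """Size of the smallest set of records sharing the same quasi-identifiers."""
    return min(Counter(key(r) for r in records).values())


def reidentify(released, public):
    """Pair a name with a released record when the quasi-identifiers match
    exactly one released record and exactly one public record."""
    released_counts = Counter(key(r) for r in released)
    public_counts = Counter(key(p) for p in public)
    names = {key(p): p["name"] for p in public}
    return [
        (names[key(r)], r)
        for r in released
        if released_counts[key(r)] == 1 and public_counts.get(key(r)) == 1
    ]
```

In this toy data, one released record is unique on the three fields, so joining it to the public list attaches a name to a response even though no name was ever released. A researcher might run a check like `smallest_group` before sharing a dataset, on the understanding that a larger group size reduces but does not eliminate the risk.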
Similarly, new types of data often collected in Internet research might also be used to identify a subject within a previously-assumed anonymous dataset. For example, Internet researchers might also collect Internet Protocol (IP) addresses when conducting online surveys or analyzing transaction logs. An IP address is a unique identifier that is assigned to every device connected to the Internet; in most cases, individual computers are assigned a unique IP address, while in some cases the address is assigned to a larger node or Internet gateway for a collection of computers. Nearly all websites and Internet service providers store logs that link activity to an IP address and, in many cases, eventually to specific computers or users. Current U.S. law does not hold IP addresses to be personally identifiable information, while other countries and regulatory bodies do. For example, the European Data Privacy Act, at Article 29, holds that IP addresses do constitute PII. Buchanan et al. (2011) note, however, that under the U.S. Civil Rights Act, for the purposes of the HIPAA Act, IP addresses are considered a form of PII (45 C.F.R. § 164.514 2002). There could potentially be a reconsideration by other federal regulatory agencies over IP addresses as PII, and researchers and boards will need to be attentive should such change occur.
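Where IP addresses must be logged at all, a researcher might coarsen or pseudonymize them before storage. The following is a minimal sketch using only the Python standard library; the function names, the prefix lengths (/24 and /48), and the 16-character digest length are assumptions chosen for illustration. Whether the resulting values still count as PII depends on the jurisdiction and on what other data accompany them.

```python
import hashlib
import hmac
import ipaddress


def truncate_ip(addr: str) -> str:
    """Zero the host bits of an address: /24 for IPv4, /48 for IPv6.
    The prefix lengths are illustrative assumptions, not a regulatory standard."""
    ip = ipaddress.ip_address(addr)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{str(ip)}/{prefix}", strict=False)
    return str(network.network_address)


def pseudonymize_ip(addr: str, secret: bytes) -> str:
    """Replace an address with a keyed hash (HMAC-SHA256), shortened to 16 hex
    characters. The same address and key always give the same token, so
    repeat visits can be linked without storing the address itself."""
    canonical = str(ipaddress.ip_address(addr))
    digest = hmac.new(secret, canonical.encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

For example, `truncate_ip("192.0.2.123")` yields `"192.0.2.0"`. A keyed hash is pseudonymous rather than anonymous: anyone holding the secret key can test candidate addresses against the stored tokens, so the key needs the same protection as any other linking file.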
A similar complication emerges when we consider the meaning of “private information” within the context of Internet-based research. Federal regulations define “private information” as:
[A]ny information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (for example, a medical record) (45 C.F.R. § 46.102(f) 2009).
This standard definition of “private information” has two key components. First, private information is that which subjects reasonably expect is not normally monitored or collected. Second, private information is that which subjects reasonably expect is not normally publicly available. Conversely, the definition also suggests the opposite is true: if users cannot reasonably expect data isn’t being observed or recorded, or they cannot expect data isn’t publicly available, then the data does not rise to the level of “private information” requiring particular privacy protections. Researchers and REBs have routinely worked with this definition of “private information” to ensure the protection of subject privacy.
These distinctions take on greater weight, however, when considering the data environments and collection practices common with Internet-based research. Researchers interested in collecting or analyzing online actions of subjects—perhaps through the mining of online server logs, the use of tracking cookies, or the scraping of social media profiles and feeds—could argue that subjects do not have a reasonable expectation that such online activities are not routinely monitored since nearly all online transactions and interactions are routinely logged by websites and service providers. Thus, online data trails might not rise to the level of “private information”. However, numerous studies have indicated that average Internet users have incomplete understandings of how their activities are routinely tracked, and the related privacy practices and policies of the sites they visit (Hoofnagle & King 2009; Milne & Culnan 2004; Tsai et al. 2006). Hudson and Bruckman (2005) conducted empirical research on users’ expectations and understandings of privacy, finding that participants’ expectations of privacy within public chatrooms conflicted with what was actually a very public online space. Rosenberg (2010) examined the public/private distinction in the realm of virtual worlds, suggesting researchers must determine what kind of social norms and relations predominate an online space before making assumptions about the “publicness” of information shared within. Thus, it remains unclear whether Internet users truly understand if and when their online activity is regularly monitored and tracked, and what kind of reasonable expectations truly exist. This ambiguity creates new challenges for researchers and REBs when trying to apply the definition of “private information” to ensure subject privacy is properly addressed (Zimmer 2010).
This complexity in addressing subject privacy in Internet research is further compounded by the rise of social networking as a place for the sharing of information, and a site for research. Users increasingly share more and more personal information on platforms like Facebook, MySpace, or Twitter. For researchers, social media platforms provide a rich resource for study, and much of the content is available to be viewed and downloaded with minimal effort. Since much of the information posted to social media sites is publicly viewable, it thus fails to meet the standard regulatory definition of “private information.” Therefore, researchers attempting to collect and analyze social media postings might not treat the data as requiring any particular privacy considerations. Yet, social media platforms represent a complex environment of social interaction where users are often required to place friends, lovers, colleagues, and minor acquaintances within the same singular category of “friends”, where privacy policies and terms of service are not fully understood (Madejski et al. 2011), and where the technical infrastructures fail to truly support privacy protections (Bonneau & Preibusch 2009) and regularly change with little notice (Stone 2009; Zimmer 2009, Other Internet Resources). As a result, it is difficult to understand with any certainty what a user’s intention was when posting an item onto a social media platform (Acquisti & Gross 2006). It could have been meant to be visible to only a small circle of friends, but the user failed to completely understand how to adjust the privacy settings accordingly. Or, the information might have previously been restricted to only certain friends, but a change in the technical platform suddenly made the data more visible to all.
Ohm (2010) warns that “the utility and privacy of data are linked, and so long as data is useful, even in the slightest, then it is also potentially reidentifiable” (p. 1751). With the rapid growth of Internet-based research, Ohm’s concern becomes even more dire. The traditional definitions of, and approaches to understanding, privacy, anonymity, and precisely what kind of information deserves protection become strained, forcing researchers and REBs to consider more nuanced theories of privacy (Nissenbaum 2009) and approaches to respecting and protecting subject privacy (Markham 2012; Zimmer 2010).
Depending on the type of Internet research being carried out, recruitment of participants may be done in a number of ways. As with any form of research, the population or participants may be selected for specific purposes (e.g., an ethnographic study of a particular group of online game players), or can be selected using a range of sampling techniques (e.g., a convenience sample gleaned from the users of Amazon’s Mechanical Turk crowdsourcing platform). In the U.S. context, a recruitment plan is considered part of the informed consent process, and as such, any recruitment script or posting must be reviewed and approved by an REB prior to posting or beginning solicitation (if the project is human subjects research). Further, selection of participants must be fair, and risks and benefits must be justly distributed. This concept is challenging to apply in Internet contexts, in which populations are often self-selected and can be exclusive, depending on membership and access status, as well as the common disparities of online access based on economic and social variables. Researchers also face recruitment challenges due to online subjects’ potential anonymity, especially as it relates to the frequent use of pseudonyms online, having multiple or alternative identities online, and the general challenges of verifying a subject’s age and demographic information. Moreover, basic ethical principles for approaching and recruiting participants involve protecting their privacy and confidentiality. Internet research can maximize these protections, as an individual may never be known beyond a screen name or avatar existence; conversely, the use of IP addresses, the placement of cookies, and the availability of and access to more information than necessary for the research purposes may minimize the protections of privacy and confidentiality.
Much recruitment is taking place via social media; examples include push technologies, a synchronous approach in which a text or tweet is sent from a researcher to potential participants. Geolocational status through mobile devices and push technology recruitment, in tandem, allow for novel forms of recruitment for such research as clinical trials. Other recruitment methods, based on pull technologies, include direct email, dedicated web pages, YouTube videos, direct solicitation via “stickies” posted on fora or web sites directing participants to a study site, or data aggregation or scraping data for potential recruitment. Regardless of the means used, researchers must follow the terms of the site, from the specific norms and nuances governing a site or locale to the legal issues in terms of service agreements. For example, early pro-anorexia web sites (see Overbeke 2008) were often treated as sensitive spaces deserving special consideration, and researchers were asked to respect the privacy of the participants and not engage in research (Walstrom 2004). In the gaming context, Reynolds and de Zwart (2010) ask:
Has the researcher disclosed the fact that he or she is engaged in research and is observing/interacting with other players for the purposes of gathering research data? How does the research project impact upon the community and general game play? Is the research project permitted under the Terms of Service?
Colvin and Lanigan (2005, 38) suggest researchers
Seek permission from Web site owners and group moderators before posting recruitment announcements. Then, preface the recruitment announcement with a statement that delineates the permission that has been granted, including the contact person and date received. Identify a concluding date (deadline) for the research study and make every effort to remove recruitment postings, which often become embedded within Web site postings.
Barratt and Lenton (2010), among others, agree:
It is critical, therefore, to form partnerships with online community moderators by not only asking their permission to post the request, but eliciting their feedback and support as well.
Mendelson (2007) and Smith and Leigh (1997) note that recruitment notices need to contain more than the typical flyers or advertisements used for newspaper advertisements. Mentioning the approval of moderators is important for establishing authenticity, and so is providing detailed information about the study and how to contact both the researchers and the appropriate research ethics board.
Given the array of techniques possible for recruitment, the concept of “research spam” requires attention. The Council of American Survey Research Organizations warns:
Research Organizations should take steps to limit the number of survey invitations sent to targeted respondents by email solicitations or other methods over the Internet so as to avoid harassment and response bias caused by the repeated recruitment and participation by a given pool (or panel) of data subjects. (CASRO 2011, I.B.3)
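One way a research team might operationalize such a limit is to log invitations per respondent and refuse to send more than a fixed number within a rolling window. The sketch below is illustrative only: the class name, the cap of two invitations, and the 30-day window are assumptions, as the CASRO guidance quoted above does not specify numeric thresholds.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class InvitationLimiter:
    """Caps survey invitations per respondent within a rolling time window."""

    def __init__(self, max_invites=2, window=timedelta(days=30)):
        # Both defaults are assumed values chosen for illustration.
        self.max_invites = max_invites
        self.window = window
        self._sent = defaultdict(list)  # respondent id -> invitation times

    def try_invite(self, respondent_id, now):
        """Record the invitation and return True if the respondent is under
        the cap; otherwise return False and record nothing."""
        recent = [t for t in self._sent[respondent_id] if now - t < self.window]
        allowed = len(recent) < self.max_invites
        if allowed:
            recent.append(now)
        self._sent[respondent_id] = recent
        return allowed
```

A recruitment script would call `try_invite` before each send and skip respondents for whom it returns `False`.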
Ultimately, researchers using Internet recruitment measures must ensure that potential participants are getting enough information in both the recruitment materials and any subsequent consent documents. Researchers must also ensure that recruitment methods do not lead to an individual being identified and, if such identification is possible, must consider whether significant risks are involved.
As the cornerstone of human subjects protections, informed consent means that participants are voluntarily participating in the research with adequate knowledge of relevant risks and benefits. Providing informed consent typically includes the researcher explaining the purpose of the research, the methods being used, the possible outcomes of the research, as well as associated risks or harms that the participants might face. The process involves providing the recipient clear and understandable explanations of these issues in a concise way, providing sufficient opportunity to consider them and enquire about any aspect of the research prior to granting consent, and ensuring the subject has not been coerced into participating. Gaining consent in traditional research is typically done verbally, either in a face-to-face meeting where the researcher reviews the document, through telephone scripts, through mailed documents, fax, or video, and can be obtained with the assistance of an advocate in the case of vulnerable populations. Most importantly, informed consent was built on the ideal of “process” and the verification of understanding, and thus, requires an ongoing communicative relationship between and among researchers and their participants. The emergence of the Internet as both a tool and a venue for research has introduced challenges to this traditional approach to informed consent.
In most regulatory frameworks, there are instances when informed consent might be waived, or the standard processes of obtaining informed consent might be modified, if approved by a research ethics board. Various forms of Internet research require different approaches to the consent process. Some standards have emerged, depending on venue (e.g., an online survey platform versus a virtual world island). However, researchers are encouraged to consider waiver of consent and/or documentation, if appropriate, by using the flexibilities of their extant regulations.
Where consent is required but documentation is waived, a “portal” can be used to provide consent information. For example, a researcher may send an email to the participant with a link to a separate portal or site information page where information on the project is contained. The participant can read the documentation and click on an “I agree” submission. Rosser et al. (2010) recommend using a “chunked” consent document, whereby individuals can read specific sections, agree, and then continue onwards to completion of the consent form, until reaching the study site.
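The logic of a “chunked” consent flow can be sketched as a small state machine in which each section must be agreed to before the next is shown. This is a hypothetical illustration of the general idea, not the implementation described by Rosser et al.; the class and method names are assumptions.

```python
class ChunkedConsent:
    """Presents consent sections one at a time; each must be agreed to
    before the next is shown, and declining any section ends the flow."""

    def __init__(self, sections):
        self.sections = list(sections)
        if not self.sections:
            raise ValueError("at least one consent section is required")
        self.position = 0
        self.declined = False

    def current(self):
        """Text of the section awaiting a response, or None if the flow is over."""
        if self.declined or self.position >= len(self.sections):
            return None
        return self.sections[self.position]

    def respond(self, agrees):
        """Record the participant's response to the current section."""
        if self.current() is None:
            raise ValueError("no section is awaiting a response")
        if agrees:
            self.position += 1
        else:
            self.declined = True

    @property
    def complete(self):
        """True only when every section has been agreed to."""
        return not self.declined and self.position == len(self.sections)
```

A portal built this way would forward the participant to the study site only once `complete` is true.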
In addition to portals, researchers will often make use of consent cards or tokens in virtual worlds; this alleviates concerns that unannounced researcher presence is unacceptable, or, that a researcher’s presence is intrusive to the natural flow and movement of a given locale. Hudson and Bruckman (2004, 2005) highlighted the unique challenges in gaining consent in chat rooms, while Lawson (2004) offers an array of consent possibilities for synchronous computer-mediated communication. There are different practical challenges in the consent process in Internet research, given the fluidity and temporal nature of Internet spaces.
If documentation of consent is required, some researchers have utilized alternatives such as electronic signatures, which can range from a simple electronic check box to acknowledge acceptance of the terms to more robust means of validation using encrypted digital signatures, although the validity of electronic signatures varies by jurisdiction.
Regardless of venue, informed consent documents are undergoing a discursive change. While the basic elements of consent remain intact, researchers must now acknowledge with less certainty specific aspects of their data longevity, risks to privacy, confidentiality and anonymity (see Privacy, above), and access to or ownership of data. Researchers must address and inform participants/subjects about potential risk of data intrusion or misappropriation of data if subsequently made public or available outside of the confines of the original research. Statements should be revised to reflect such realities as cloud storage (see below) and data sharing.
For example, Aycock et al. (2012, p. 141) describe a continuum of security and access statements used in informed consent documents:
- “No others will have access to the data.”
- “Anonymous identifiers will be used during all data collection and analysis and the link to the subject identifiers will be stored in a secure manner.”
- “Data files that contain summaries of chart reviews and surveys will only have study numbers but no data to identify the subject. The key [linking] subject names and these study identifiers will be kept in a locked file.”
- “Electronic data will be stored on a password protected and secure computer that will be kept in a locked office. The software ‘File Vault’ will be used to protect all study data loaded to portable laptops, flash drives or other storage media. This will encode all data… using Advanced Encryption Standard with 128-bit keys (AES-128)”
The use of encryption in the last statement may be necessary in research involving sensitive data, such as medical, sexual, health, or financial information. Barratt and Lenton (2010), in their research on illicit drug use and online forum behaviors, also provide guidance about use of secure transmission and encryption as part of the consent process.
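The practice described in the second and third statements, in which data files carry only study identifiers while the linking key is stored separately, might be sketched as follows. The function names and record layout are hypothetical, and the sketch assumes each participant name is unique. It does not cover encryption or physical storage of the key.

```python
import secrets


def assign_study_ids(names):
    """Build the linking key {study_id: name} using random identifiers.
    The key is meant to be stored apart from the study data."""
    key = {}
    for name in names:
        study_id = secrets.token_hex(4)
        while study_id in key:  # regenerate on the rare collision
            study_id = secrets.token_hex(4)
        key[study_id] = name
    return key


def deidentify(records, key):
    """Replace each record's 'name' field with its study identifier;
    all other fields are passed through unchanged."""
    id_for = {name: study_id for study_id, name in key.items()}
    return [
        {"study_id": id_for[r["name"]],
         **{k: v for k, v in r.items() if k != "name"}}
        for r in records
    ]
```

Removing names in this way keeps data confidential rather than anonymous: anyone with the key can restore identities, and other fields left in the records may still permit reidentification.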
In addition to informing participants about potential risks and employing technological protections, U.S.-based researchers working with sensitive biomedical, behavioral, clinical or other types of research may choose to obtain a Certificate of Confidentiality from the National Institutes of Health, which protects researchers and institutions from being compelled to disclose information that could identify research subjects in civil, criminal, or other legal demands (NIH 2011). However, these certificates do not protect against release of data outside of the U.S. Given the reality of Internet research itself, which inherently spans borders, new models may be in order to ensure confidentiality of data and protections of data. Models of informed consent for traditional international research are fundamentally challenging due to cultural specificity and norms (Annas 2009; Boga et al. 2011; Krogstad et al. 2010); with Internet research, where researchers may be unaware of the specific location of an individual, consent takes on significantly higher demands. While current standards of practice show that consent models stem from the jurisdiction of the researcher and sponsoring research institution, complications arise in the face of age verification, age of majority/consent, reporting of adverse effects or complaints with the research process, and authentication of identity. Various jurisdictional laws around privacy are relevant for the consent process; a useful tool is Forrester’s Data Privacy Heat Map, which relies on in-depth analyses of the data privacy-related laws and cultures of countries around the world, helping researchers design appropriate approaches to privacy and data protection given the particular context (Forrester 2011).
In addition, as more federal agencies and funding bodies across the globe mandate making research data publicly available (e.g., NSF, NIH, Wellcome Trust, Research Councils U.K.), the language used in consent documents will change accordingly to represent this intended longevity of data and opportunities for future, unanticipated use. This is not an entirely new concept nor is it specific to Internet research, but it should be noted that new language is required for consent.
Given the ease with which Internet data can flow between and among Internet venues (a Twitter feed can automatically stream to a Facebook page), and the ways in which access to data can change (early “private” newsgroup conversations were made “publicly searchable” when Google bought DejaNews), reuse and access by others is highly possible. Current data sharing mandates must be considered in the consent process. Alignment between a data sharing policy and an informed consent document is imperative. Both should include provisions for appropriate protection of privacy, confidentiality, security, and intellectual property.
There is general agreement in the U.S. that individual consent is not necessary for researchers to use publicly available data sets, under 45 C.F.R. § 46; recommendations were made by the National Human Research Protections Advisory Committee (NHRPAC) in 2002 regarding publicly available data sets (see Other Internet Resources). Data use or data restriction agreements are commonly used and set the parameters of use for researchers.
The U.K. Data Archive (2012) provides guidance on consent and data sharing:
Restricting access to data should never be seen as the only way to protect confidentiality. Obtaining appropriate informed consent and anonymising data enable most data to be shared:
For confidential data, the Archive, in discussion with the data owner, may impose additional access regulations, which can be:
- needing specific authorisation from the data owner to access data
- placing confidential data under embargo for a given period of time until confidentiality is no longer pertinent
- providing access to approved researchers
- only providing secure access to data by enabling remote analysis of confidential data but excluding the ability to download data
Data sharing made public headlines in 2016 when a Danish researcher released a data set comprised of scraped data from nearly 70,000 users of the OkCupid online dating site. The data set was highly reidentifiable and included potentially sensitive information, including usernames, age, gender, geographic location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site. The researcher claimed the data were public and thus, such sharing and use was unproblematic. Zimmer (2016) was among many privacy and ethics scholars who critiqued this stance.
The researcher did not seek any form of consent or debriefing on the collection and use of the data, nor was there any ethics oversight. Many researchers and ethics boards are, however, starting to mitigate many of these ethical concerns by including blanket statements in their consent processes that alert research participants to such risks. For example,
“I understand that online communications may be at greater risk for hacking, intrusions, and other violations. Despite these possibilities, I consent to participate.”
A more specific example comes from the Canadian context when researchers propose to use specific online survey tools hosted in the United States; REBs commonly recommend the following type of language for use in informed consent documents:
Please note that the online survey is hosted by Company ABC which is a web survey company located in the U.S.A. All responses to the survey will be stored and accessed in the U.S.A. This company is subject to U.S. Laws, in particular, to the U.S. Patriot Act/Domestic Security Enhancement Act that allows authorities access to the records that your responses to the questions will be stored and accessed in the U.S.A. The security and privacy policy for Company ABC can be viewed at http://…/.
4.3.1 Minors and Consent
Internet research poses particular challenges to age verification, assent and consent procedures, and appropriate methodological approaches with minors. Age of consent varies across countries, states, communities, and locales of all sorts. For research conducted or supported by U.S. federal agencies bound by the Common Rule, children are “persons who have not attained the legal age for consent to treatments or procedures involved in the research, under the applicable law of the jurisdiction in which the research will be conducted” (45 C.F.R. § 46.402(a) 2009). Goldfarb (2008) provides an exhaustive discussion of age of majority across the U.S. states, with a special focus on clinical research, noting children must be seven or older to assent to participation (see 45 C.F.R. § 46 Subpart D 2009).
Spriggs, writing from the Australian context, notes that no formal guidance exists on Internet research and minors under the National Statement, and advises:
- Parental consent may be needed when information is potentially identifiable. Identifiable information makes risks to individuals higher and may mean that the safety net of parental consent is preferable.
- There is also a need to consider whether seeking parental consent would make things worse e.g., by putting a young person from a dysfunctional home at risk or result in disclosure to the researcher of additional identifying information about the identity and location of the young person. Parental consent may be “contrary to the best interests” of the child or young person when it offers no protection or makes matters worse. (Spriggs 2010, 30)
To assist with the consent process, age verification measures can be used. These can range from more technical software applications to less formal knowledge checks embedded in an information sheet or consent document. Multiple confirmation points (asking for age, later asking for year of birth, etc.) are practical measures for researchers. Depending on the types, sensitivity, and use of data, researchers and boards will carefully construct the appropriate options for consent, including waiver of consent, waiver of documentation, and/or waiver of parental consent.
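The “multiple confirmation points” approach can be sketched as a consistency check between a stated age and a later-requested year of birth. The function names and the default minimum age of 18 are assumptions for illustration; as noted above, the applicable age of consent varies by jurisdiction, and a check of this kind only detects inconsistent answers rather than verifying identity.

```python
from datetime import date


def age_answers_consistent(stated_age, birth_year, today=None):
    """True when the stated age agrees with the stated year of birth,
    allowing for a birthday that has not yet occurred this year."""
    today = today or date.today()
    implied = today.year - birth_year
    return stated_age in (implied - 1, implied)


def may_proceed(stated_age, birth_year, minimum_age=18, today=None):
    """True when the stated age meets the minimum (an assumed default)
    and the two answers are consistent with each other."""
    return stated_age >= minimum_age and age_answers_consistent(
        stated_age, birth_year, today
    )
```

A survey would ask for age on an early screen and year of birth on a later one, then route inconsistent or under-age responses to an exit page or to an assent and parental consent pathway, as the approved protocol requires.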
Recent developments in cloud computing platforms have led to unique opportunities—and ethical challenges—for researchers. Cloud computing describes the deployment of computing resources via the Internet, providing on-demand, flexible, and scalable computing from remote locations. Examples include web-based email and calendaring services provided by Google or Yahoo, online productivity platforms like Google Docs or Microsoft Office 365, online file storage and sharing platforms like Dropbox or Box.net, and large-scale application development and data processing platforms such as Google Apps, Facebook Developers Platform, and Amazon Web Services.
Alongside businesses and consumers, researchers have begun utilizing cloud computing platforms and services to assist in various tasks, including subject recruitment, data collection and storage, large-scale data processing, as well as communication and collaboration (Allan 2011; Chen et al. 2010; Simmhan et al. 2008; Yogesh et al. 2009).
As reliance on cloud computing increases among researchers, so do the ethical implications. Among the greatest concerns is ensuring data privacy and security with cloud-based services. For researchers sharing datasets online for collaborative processing and analysis, steps must be taken to ensure only authorized personnel have access to the online data, but also that suitable encryption is used for data transfer and storage, and that the cloud service provider maintains sufficient security to prevent breaches. Further, once research data is uploaded to a third-party cloud provider, attention must be paid to the terms of service for the contracted provider to determine what level of access to the data, if any, might be allowed to advertisers, law enforcement, or other external agents.
Alongside the privacy and security concerns, researchers also have an ethical duty of data stewardship, which is further complicated when research data is placed in the cloud for storage or processing. Cloud providers might utilize data centers spread across the globe, meaning research data might be located outside the United States and its legal jurisdiction. Terms of service might grant cloud providers a license to access and use research data for purposes not initially intended or approved of by the subjects involved. Stewardship may require the prompt and complete destruction of research data, a measure complicated if a cloud provider has distributed and backed up the data across multiple locations.
A more distinctive application of cloud computing for research involves the crowdsourcing of data analysis and processing functions, that is, leveraging the thousands of users of various online products and services to complete research-related tasks remotely. Examples include using a distributed network of video game players to assist in solving protein folding problems (Markoff 2010), and leveraging Amazon’s Mechanical Turk crowdsourcing marketplace platform to assist with large-scale data processing and coding functions that cannot be automated (Conley & Tosti-Kharas 2010; Chen et al. 2011). Using cloud-based platforms can raise various critical ethical and methodological issues.
First, new concerns over data privacy and security emerge when research tasks are widely distributed across a global network of users. Researchers must take great care in ensuring sensitive research data isn’t accessible by outsourced labor, and that none of the users providing crowdsourced labor are able to aggregate and store their own copy of the research dataset. Second, crowdsourcing presents ethical concerns over trust and validity of the research process itself. Rather than a local team of research assistants usually under a principal investigator’s supervision and control, crowdsourcing tends to be distributed beyond the direct management or control of the researcher, providing less opportunity to ensure sufficient training for the required tasks. Thus, researchers will need to create additional means of verifying data results to confirm tasks are completed properly and correctly.
Two additional ethical concerns with crowdsourcing involve labor management and authorship. Mechanical Turk workers were not originally intended, first and foremost, to be research subjects. Nevertheless, researchers using Mechanical Turk must ensure that the laborers on the other end of the cloud-based relationship are not being exploited, that they are legally eligible to be working for hire, and that the incentives provided are real, meaningful, and appropriate (Scholz 2008; Williams 2010, Other Internet Resources).
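One common way to verify crowdsourced results is to assign each item to several independent workers and accept a label only when enough of them agree. The sketch below is an illustration under stated assumptions, not a method described in the sources cited here: the data layout, the function name, and the 0.6 agreement threshold are all hypothetical.

```python
from collections import Counter


def aggregate_labels(labels_by_item, min_agreement=0.6):
    """Majority-vote each item and flag low-agreement items for review.

    `labels_by_item` maps an item id to the list of labels that
    independent workers assigned to it (hypothetical data layout).
    Returns (accepted, needs_review): `accepted` maps item id to the
    winning label; `needs_review` lists ids whose most frequent label
    received less than `min_agreement` of the votes.
    """
    accepted = {}
    needs_review = []
    for item_id, labels in labels_by_item.items():
        if not labels:
            # No worker responded, so nothing can be accepted.
            needs_review.append(item_id)
            continue
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= min_agreement:
            accepted[item_id] = top_label
        else:
            needs_review.append(item_id)
    return accepted, needs_review
```

Items that land in the review list would still need to be checked by the research team, so redundancy reduces, but does not remove, the need for direct oversight.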
Finally, at the end of a successful research project utilizing crowdsourcing, a researcher may be confronted with the ethical challenge of how to properly acknowledge the contributions made by (typically anonymous) laborers. Ethical research requires the fair and accurate description of authorship. Disciplines vary as to how to report relative contributions made by collaborators and research assistants, and this dilemma increases when crowdsourcing is used to assist with the research project.
Algorithmic processing is a corollary of big data research, and newfound ethical considerations have emerged. From “algorithmic harms” to “predictive analytics,” the power of today’s algorithms exceeds long-standing privacy beliefs and norms. Specifically, the National Science and Technology Council notes:
“Analytical algorithms” as algorithms for prioritizing, classifying, filtering, and predicting. Their use can create privacy issues when the information used by algorithms is inappropriate or inaccurate, when incorrect decisions occur, when there is no reasonable means of redress, when an individual’s autonomy is directly related to algorithmic scoring, or when the use of predictive algorithms chills desirable behavior or encourages other privacy harms. (NSTC 2016, p. 18, Other Internet Resources).
While the concept of big data is not new, and the term has been in technical discourses since the 1990s, public awareness of and response to big data research is much more recent. Following the rise of social media-based research, Buchanan (2016) has delineated the emergence of “big data”-based research from 2012 to the present, with no signs of an end point.
Big data research is challenging for research ethics boards, often presenting what the computer ethicist James Moor would call “conceptual muddles”: the inability to properly conceptualize the ethical values and dilemmas at play in a new technological context. Subject privacy, for example, is typically protected within the context of research ethics through a combination of various tactics and practices, including engaging in data collection under controlled or anonymous environments, limiting the personal information gathered, scrubbing data to remove or obscure personally identifiable information, and using access restrictions and related data security methods to prevent unauthorized access and use of the research data itself. The nature and understanding of privacy become muddled, however, in the context of big data research, and as a result, ensuring it is respected and protected in this new domain becomes challenging.
For example, the determination of what constitutes “private information” – and thus triggers particular privacy concerns – becomes difficult within the context of big data research. Distinctions within the regulatory definition of “private information” – namely, that it only applies to information which subjects reasonably expect is not normally monitored or collected and not normally publicly available – become less clearly applicable when considering the data environments and collection practices that typify big data research, such as the wholesale scraping of Facebook news feed content or public OKCupid accounts.
When considered through the lens of the regulatory definition of “private information,” social media postings are often considered public, especially when users take no visible, affirmative steps to restrict access. As a result, big data researchers might conclude subjects are not deserving of particular privacy consideration. Yet, the social media platforms frequently used for big data research purposes represent a complex environment of socio-technical interactions, where users often fail to understand fully how their social activities might be regularly monitored, harvested, and shared with third parties, where privacy policies and terms of service are not fully understood and change frequently, and where the technical infrastructures and interfaces are designed to make restricting information flows and protecting one’s privacy difficult.
As noted above, it becomes difficult to confirm a user’s intention when sharing information on a social media platform, and whether users recognize that providing information in a social environment also opens it up for widespread harvesting and use by researchers. This uncertainty in the intent and expectations of users of social media and internet-based platforms – often fueled by the design of the platforms themselves – creates numerous conceptual muddles in our ability to properly alleviate potential privacy concerns in big data research.
The conceptual gaps that exist regarding privacy and the definition of personally identifiable information in the context of big data research inevitably lead to similar gaps regarding when informed consent is necessary. Researchers mining Facebook profile information or public Twitter streams, for example, typically argue that no specific consent is necessary because the information was publicly available. It remains unknown whether users truly understood the technical conditions under which they made information visible on these social media platforms or if they foresaw their data being harvested for research purposes, rather than just appearing onscreen for fleeting glimpses by their friends and followers. In the case of the Facebook emotional contagion experiment (Kramer, Guillory, & Hancock, 2014), the failure to obtain consent was initially rationalized through the notion that the research appeared to have been carried out under Facebook’s extensive terms of service, whose data use policy, while more than 9,000 words long, does make passing mention of “research.” It was later revealed, however, that the data use policy in effect when the experiment was conducted never mentioned “research” at all (Hill, 2014).
The Facebook emotional contagion experiment, discussed above, is just one example in a larger trend of big data research conducted outside of traditional university-based research ethics oversight mechanisms. Nearly all online companies and platforms analyze data and test theories that often rely on data from individual users. Industry-based data research, once limited to marketing-oriented “A/B testing” of benign changes in interface designs or corporate communication messages, now encompasses information about how users behave online, what they click and read, how they move, eat, and sleep, the content they consume online, and even how they move about their homes. Such research produces inferences about individuals’ tastes and preferences, social relations, communications, movements, and work habits. It implies pervasive testing of products and services that are an integral part of intimate daily life, ranging from connected home products to social networks to smart cars. Except in cases where they are partnering with academic institutions, companies typically do not put internal research activities through a formal ethical review process, since results are typically never shared publicly and the perceived impact on users is minimal.
The growth of industry-based big data research, however, presents new risks to individuals’ privacy, on the one hand, and to organizations’ legal compliance, reputation, and brand, on the other hand. When organizations process personal data outside of their original context, individuals may in some cases greatly benefit, but in other cases may be surprised, outraged, or even harmed. Soliciting consent from affected individuals can be impractical: Organizations might collect data indirectly or based on identifiers that do not directly match individuals’ contact details. Moreover, by definition, some non-contextual uses – including the retention of data for longer than envisaged for purposes of a newly emergent use – may be unforeseen at the time of collection. As Crawford and Schultz (2014) note, “how does one give notice and get consent for innumerable and perhaps even yet-to-be-determined queries that one might run that create ‘personal data’?” (p. 108).
With corporations developing vast “living laboratories” for big data research, research ethics has become a critical component of the design and oversight of these activities. For example, in response to the controversy surrounding the emotional contagion experiment, Facebook developed an internal ethical review process that, according to its facilitators, “leverages the company’s organizational structure, creating multiple training opportunities and research review checkpoints in the existing organizational flow” (Jackman & Kanerva, 2016, p. 444). While such efforts are important and laudable, they remain open for improvement. Hoffmann (2016), for example, has criticized Facebook for launching “an ethics review process that innovates on process but tells us little about the ethical values informing their product development.” In short, while Internet companies like Facebook recognize the need to review the ethics of internal research projects, such efforts remain largely perfunctory and meant for easing public concerns, and not necessarily fully in line with the ethical deliberations that take place in academic settings.
While many researchers and review boards across the world work without formal guidance, a number of REBs have developed guidelines for Internet research. The following are examples for researchers preparing for an REB review, or for boards developing their own policies.
- Bard College (New York) Suggestions for Internet Research
- Loyola University Chicago Policy for Online Survey Research Involving Human Participants
- Penn State Guidelines for Computer- and Internet-Based Research Involving Human Participants
- Queen’s University (Canada) Exemption Policy re: Research Ethics Review for Projects Involving Digital Data Collection
- U.K. Data Archive Further Resources
- University of Connecticut Guidance for Computer and Internet-Based Research Involving Human Participants
Additional resources are found in Other Internet Resources below.
- 45 C.F.R. § 46, “Protection of Human Subjects” [available online].
- 45 C.F.R. § 164.514, “Other requirements relating to uses and disclosures of protected health information,” [available online].
- Acquisti, A. and R. Gross, 2006, “Imagined Communities: Awareness, Information Sharing, and Privacy on the Facebook,” Proceedings of the 6th Workshop on Privacy Enhancing Technologies, 4258: pp. 36–58.
- Allan, Rob, 2011, “Cloud and Web 2.0 Services for Supporting Research,” 11/11, (November). [available online].
- Allen, C., 1996, “What’s Wrong with the ‘Golden Rule’? Conundrums of Conducting Ethical Research in Cyberspace,” The Information Society, 12(2): 175–187.
- Annas, George J., 2009, “Globalized Clinical Trials and Informed Consent,” New England Journal of Medicine, 360: 2050–2053.
- Aycock, J., E. Buchanan, S. Dexter, and D. Dittrich, 2012, “Human Subjects, Agents, or Bots” in Current Issues in Ethics and Computer Security Research, G. Danezis, S. Dietrich, and K. Sako (eds.), FC 2011 Workshops, LNCS, 7126, Springer, Heidelberg, pp. 138–145.
- Banks, W. and M. Eble, 2007, “Digital Spaces, Online Environments, and Human Participant Research: Interfacing with Institutional Review Boards,” in Digital Writing Research: Technologies, Methodologies, and Ethical Issues, H. McKee and D. DeVoss (eds.), Cresskill, NJ: Hampton Press, pp. 27–47.
- Barbaro, Michael and Tom Zeller Jr., 2006, “A Face Is Exposed for AOL Searcher No. 4417749,” The New York Times, August 9, 2006, pp. A1.
- Barratt, M. and S. Lenton, 2010, “Beyond Recruitment? Participatory online research with people who use drugs,” International Journal of Internet Research Ethics, 3(1): 69–86.
- Beauchamp, T., and J. Childress, 2008, Principles of Biomedical Ethics, Oxford: Oxford University Press.
- Blackstone, M., L. Given, L. Levy, M. McGinn, P. O’Neill, and T. Palys, 2008, “Extending the Spectrum: The TCPS and Ethical Issues in Internet-Based Research,” Social Sciences and Humanities Research Ethics Special Working Committee: A Working Committee of the Interagency Advisory Panel on Research Ethics (February). [available online]
- Boehlefeld, S.P., 1996, “Doing the Right Thing: Ethical Cyberspace Research,” The Information Society, 12(2): 141–152.
- Boga, M., A. Davies, D. Kamuya, S.M. Kinyanjui, E. Kivaya, et al., 2011, “Strengthening the Informed Consent Process in International Health Research through Community Engagement: The KEMRI-Wellcome Trust Research Programme Experience,” PLoS Med, 8(9): e1001089.
- Bonneau, J. and S. Preibusch, 2009, “The Privacy Jungle: On the Market for Data Protection in Social Networks,” Paper presented at the Eighth Workshop on the Economics of Information Security (WEIS 2009).
- Bromseth, Janne C. H., 2002, “Public Places: Public Activities? Methodological Approaches and Ethical Dilemmas in Research on Computer-mediated Communication Contexts,” in Researching ICTs in Context, A. Morrison (ed.), Oslo: University of Oslo, pp. 33–61. [available online]
- Brothers, K. and E. Clayton, 2010, “Human non-subjects research: privacy and compliance,” American Journal of Bioethics, 10(9): 15–17.
- Bruckman, A., 2006, “Teaching Students to Study Online Communities Ethically,” Journal of Information Ethics, 15(2): 82–98.
- Buchanan, E., 2010, “Internet Research Ethics: Past, Present, Future,” in The Blackwell Handbook of Internet Studies, C. Ess and M. Consalvo, (eds.), Oxford: Oxford University Press.
- ––– (ed.), 2004, Readings in Virtual Research Ethics: Issues and Controversies, Hershey: Idea Group.
- Buchanan, E. and C. Ess, 2006, “Internet Research Ethics at a Critical Juncture,” Journal of Information Ethics, 15(2): 14–17.
- –––, 2008, “Internet Research Ethics: The Field and Its Critical Issues,” in The Handbook of Information and Computer Ethics, H.Tavani and K. E. Himma (eds.), Boston: Wiley.
- –––, 2009, “Internet Research Ethics and the Institutional Review Board: Current Practices and Issues,” Computers and Society, 39(3): 43–49.
- –––, 2016, “Ethics in Digital Research,” in The Handbook of Social Practices and Digital Everyday Worlds, M. Nolden., G. Rebane, M. Schreiter (eds.), Springer.
- Buchanan, E., and E. Hvizdak, 2009, “REBs and Online Surveys: Ethical and Practical Considerations,” Journal of Empirical Research on Human Research Ethics, 4(2): 37–48.
- Buchanan, E., J. Aycock, S. Dexter, D. Dittrich, and E. Hvizdak, 2011, “Computer Science Security Research And Human Subjects: Emerging Considerations For Research Ethics Boards,” Journal of Empirical Research on Human Research Ethics, 6(2): 71–83.
- Carpenter, K., and D. Dittrich, 2011, “Bridging the Distance: Removing the Technology Buffer and Seeking Consistent Ethical Analysis in Computer Security Research,” In 1st International Digital Ethics Symposium. Loyola University Chicago Center for Digital Ethics and Policy.
- [CASRO] Council of American Survey Research, 2011, “CASRO Code of Standards and Ethics for Survey Research,” First adopted 1977 and revised since. [available online]
- Chen, X., G. Wills, L. Gilbert, and D. Bacigalupo, 2010, “Using Cloud for Research: A Technical Review,” JISC Final Report.
- Chen, J., N. Menezes, J. Bradley, and T.A. North, 2011, “Opportunities for Crowdsourcing Research on Amazon Mechanical Turk,” Human Factors, 5(3).
- Colvin, J., and J. Lanigan, 2005, “Ethical Issues and Best Practice for Internet Research,” Scholarship, 97(3): 34–39.
- Conley, C. A., and J. Tosti-Kharas, 2010, “Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk,” Academy of Management Conference.
- Consalvo, M., and C. Ess (eds), 2010, The Blackwell Handbook of Internet Studies, Oxford: Oxford University Press.
- Crawford, K. and J. Schultz, 2014, “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms,” Boston College Law Review, 55(1): 93–128.
- Dittrich, D., M. Bailey, and S. Dietrich, 2011, “Building an Active Computer Security Ethics Community,” Security and Privacy, 9(4): 32–40.
- Elgesem, Dag, 2002, “What is Special about the Ethical Issues in Online Research?” Ethics and Information Technology, 4(3): 195–203. [available online]
- Emanuel, E., R. Crouch, J. Arras, J. Moreno and C. Grady (eds.), 2003, Ethical and Regulatory Aspects of Clinical Research: Readings and Commentary, Johns Hopkins Press.
- Ess, Charles and the Association of Internet Researchers Ethics Working committee, 2002, “Ethical Decision-Making and Internet Research: Recommendations from the AoIR Ethics Working Committee,” Approved by the AoIR, November 27, 2002. [available online]
- Eysenbach, G., 1999, “Welcome to the Journal of Medical Internet Research,” Journal of Medical Internet Research, 1(1): e5. [available online].
- Eysenbach, G. and J. Till, 2001, “Ethics Issues in Qualitative Research on Internet Communities,” British Medical Journal, 323: 1103.
- Fairfield, J., 2012, “Avatar Experimentation: Human Subjects Research in Virtual Worlds,” U.C. Irvine Law Review, 2: 695–772.
- Flicker, S., D. Haans, and H. Skinner, 2004, “Ethical Dilemmas in Research on Internet Communities,” Qualitative Health Research, 14(1): 124–134.
- Forrester Research, 2011, Forrester’s Global Data Protection and Privacy Heatmap. [available online]
- Frankel, Mark S., and Siang, Sanyin, 1999, “Ethical and Legal Aspects of Human Subjects Research in Cyberspace,” A Report of a Workshop, June 10–11, 1999, Washington, DC: American Association for the Advancement of Science. [available online]
- Gaw, A., and A. Burns, 2011, On Moral Grounds: Lessons from the History of Research Ethics, SA Press.
- Gilbert, B., 2009, “Getting to conscionable: Negotiating Virtual World’s End User License Agreements without Getting Externally Regulated,” Journal of International Commercial Law and Technology, 4(4): 238–251.
- Glickman S.W., S. Galhenage, L. McNair, Z. Barber, K. Patel, K.A. Schulman, and J. McHutchison, 2012, “The Potential Influence of Internet-Based Social Networking on the Conduct of Clinical Research Studies,” Journal of Empirical Research on Human Research Ethics, 7(1): 71–80.
- Goldfarb, N., 2008, “Age of Consent for Clinical Research,” Journal of Clinical Best Practices, 4(6) June. [available online]
- Hill, K., 2014, “Facebook Added ‘Research’ To User Agreement 4 Months After Emotion Manipulation Study,” Forbes.com. [available online]
- Hoffmann, A. L., 2016, “Facebook has a New Process for Discussing Ethics. But is It Ethical?” The Guardian. [available online]
- Homeland Security Department, 2011, “Submission for Review and Comment: ‘The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research’,” Federal Register: The Daily Journal of the United States Government. [available online].
- Hoofnagle, C. J. and J. King, 2008, “What Californians Understand About Privacy Online,” Research Report from Samuelson Law Technology & Public Policy Clinic, UC Berkeley Law: Berkeley, CA.
- Hudson, J. M. and A. Bruckman, 2004, “Go Away: Participant Objections to Being Studied and the Ethics of Chatroom Research,” Information Society, 20(2): 127–139.
- –––, 2005, “Using Empirical Data to Reason about Internet Research Ethics,” Proceedings of the 2005 Ninth European Conference on Computer-Supported Cooperative Work. [available online]
- Hunsinger, J., L. Klastrup, and M. Allen (eds), 2010, International Handbook of Internet Research, New York: Springer.
- Illingworth, N., 2001, “The Internet Matters: Exploring the Use of the Internet as a Research Tool,” Sociological Research Online, 6(2). [available online].
- Ingierd, I., and Fossheim, H., (eds.), 2015, Internet Research Ethics, Oslo: Cappelen Damm.
- Jackman, M. and L. Kanerva, 2016, “Evolving the IRB: Building Robust Review for Industry Research,” Washington and Lee Law Review Online, 72(3): 442–457.
- Jacobson, D., 1999, “Doing Research in Cyberspace,” Field Methods, 11(2): 127–145.
- Johns, M., S.L. Chen, and J. Hall (eds.), 2003, Online Social Research: Methods, Issues, and Ethics, New York: Peter Lang.
- Jones, Arnita, 2008, “AHA Statement on IRB’s and Oral History Research,” American Historical Association Activities. [available online]
- Jones, S. (ed.), 1999, Doing Internet research: Critical Issues and Methods for Examining the Net, Thousand Oaks, CA: Sage.
- Kaplan, A. and M. Haenlein, 2010, “Users of the World Unite: The Challenges and Opportunities of Social Media,” Business Horizons, 53(1).
- King, S., 1996, “Researching Internet Communities: Proposed Ethical Guidelines for the Reporting of Results,” The Information Society, 12(2): 119–128.
- Kitchin, Heather A., 2003, “The Tri-Council Policy Statement and Research in Cyberspace: Research Ethics, the Internet, and Revising a ‘Living Document,’” Journal of Academic Ethics, 1: 397–418.
- –––, 2008, Research Ethics and the Internet: Negotiating Canada’s Tri-Council’s Policy, Winnipeg: Fernwood Publishing.
- Kramer, A. D. I., J. E. Guillory, and J. T. Hancock, 2014, “Experimental evidence of massive-scale emotional contagion through social networks,” Proceedings of the National Academy of Sciences, 111(24): 8788–8790. [available online]
- Kraut, R., J. Olson, M. Banaji, A. Bruckman, J. Cohen, and M. Cooper, 2004, “Psychological Research Online: Report of Board of Scientific Affairs’ Advisory Group on the Conduct of Research on the Internet,” American Psychologist, 59(4): 1–13.
- Krogstad, D., S. Diop, A. Diallo, F. Mzayek, J. Keating, O. Koita, and Y. Toure, 2010, “Informed Consent in International Research: The Rationale for Different Approaches,” The American Journal of Tropical Medicine and Hygiene, 83(4): 743–747.
- Lawson, D., 2004, “Blurring the Boundaries: Ethical Considerations for Online Research Using Synchronous CMC Forums,” in Readings in Virtual Research Ethics: Issues and Controversies, E. Buchanan (ed.), Hershey: Idea Group, pp. 80–100.
- Leibovici, D.G., S. Anand, J. Swan, J. Goulding, G. Hobona, L. Bastin, S. Pawlowicz, M. Jackson, and R. James, 2010, “Workflow Issues for Health Mapping ‘Mashups’ of OGC,” University of Nottingham, CGS Technical Report, 2010 DL1.
- Madejski, M., M. Johnson and S. Bellovin, 2011, “The Failure of Online Social Network Privacy Settings,” Columbia Research Report (CUCS-010-11), 1–20.
- Mann, C., 2003, “Generating Data Online: Ethical Concerns and Challenges for the C21 Researcher,” in Applied Ethics in Internet Research, M. Thorseth (ed.), Trondheim, Norway: NTNU University Press, pp. 31–49.
- Markham, A., 1998, Life Online: Researching Real Experience in Virtual Space, Altamira Press.
- –––, 2012, “Fabrication as Ethical Practice: Qualitative Inquiry in Ambiguous Internet Contexts,” Information, Communication & Society, 15(3): 334–353.
- Markham, A. and N. K. Baym (eds.), 2008, Internet Inquiry: Conversations about Method, Thousand Oaks, CA: Sage.
- Markoff, J., 2010, “In a Video Game, Tackling the Complexities of Protein Folding,” New York Times, August 4. [available online]
- McKee, H. A., and J. Porter, 2009, The Ethics of Internet Research: A Rhetorical, Case-based Process, New York: Peter Lang Publishing.
- Mendelson, Cindy, 2007, “Recruiting Participants for Research from Online Communities,” Computers, Informatics, Nursing, 25(6): 317–323.
- Milne, G. R. and M. J. Culnan, 2004, “Strategies for Reducing Online Privacy Risks: Why Consumers Read (or don’t read) Online Privacy Notices,” Journal of Interactive Marketing, 18(3): 15–29.
- Moor, J. H., 1985, “What is Computer Ethics?” Metaphilosophy, 16(4): 266–275.
- Moore, M. and L. Alexander, 2007, “Deontological Ethics,” The Stanford Encyclopedia of Philosophy (Winter 2007 Edition), Edward N. Zalta (ed.). [available online].
- Narayanan, A., and V. Shmatikov, 2008, “Robust de-anonymization of Large Sparse Datasets,” Proceedings of the 29th IEEE Symposium on Security and Privacy, Oakland, CA, May 2008, pp. 111–125. [available online]
- [NCPHSBBR] The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, 1979, “The Belmont Report.” [available online]
- [NESH] The National Committee for Research Ethics in the Social Sciences and the Humanities [Norway], 2006, “Guidelines for Research Ethics in the Social Sciences, Law, and Humanities,” Published September 2006. [available online].
- [NESH] The National Committee for Research Ethics in the Social Sciences and the Humanities [Norway], 2014, “Ethical Guidelines for Internet Research,” Published December 2014. [available online].
- [NIH] National Institutes of Health, 2008, Guide for Identifying Sensitive Information at the NIH, April.
- [NIH] National Institutes of Health, 2011, Certificates of Confidentiality Kiosk. June 28, 2011. [available online]
- Nissenbaum, Helen, 2009, Privacy in Context: Technology, Policy, and the Integrity of Social Life, Stanford, CA: Stanford University Press.
- Odwazny, L. and E. Buchanan, 2011, “Ethical and Regulatory Issues in Internet Research.” Advancing Ethical Research Conference, PRIM&R, MD: National Harbor.
- Ohm, P., 2010, “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization,” UCLA Law Review, 57: 1701–1777.
- [OHRP] U.S. Department of Health and Human Services, 2008, “Office for Human Research Protections,” [available online]
- Oral History Association, 2012, “Oral History,” website, accessed May 15, 2012. [available online]
- O’Rourke, P., 2007, “Report of the Public Responsibility in Medicine and Research (PRIM&R) Human Tissue/Specimen Banking Working Group.” [available online: Part I and Part II]
- Overbeke, G., 2008, “Pro-Anorexia Websites: Content, Impact, and Explanations of Popularity,” Mind Matters: The Wesleyan Journal of Psychology, 3: 49–62. [available online]
- Parras, J., J. Sullivan, R. Guerrero, J. Prochaska, H. Kelley, A. Nolen, and C. Hinojosa, 2011, “Creating Community Networks to Promote Complex Goals such as Environmental Justice,” Panel Presentation from the CEnR Workshop, October 27, 2011.
- Reid, E., 1996, “Informed Consent in the Study of On-line Communities: A Reflection on the Effects of Computer-mediated Social Research,” The Information Society, 12(2): 169–174.
- Reynolds, Ren, and Melissa de Zwart, 2010, “The Duty to ‘Play’: Ethics, EULAs and MMOs,” International Journal of Internet Research Ethics, 3(1): 48–68. [available online (pdf)]
- Ritchie, Donald A., 2003, Doing Oral History: A Practical Guide, New York: Oxford University Press.
- Rosenberg, A., 2010, “Virtual World Research Ethics and the Private/Public Distinction,” International Journal of Internet Research Ethics, 3(1): 23–37.
- Rosser, B.R.S., J.M. Oakes, J. Konstan, S. Hooper, K.J. Horvath, G.P. Danilenko, K.R. Nygaard, and D.J. Smolenski, 2010, “Reducing HIV Risk Behavior of MSM Through Persuasive Computing: Results of the Men’s INTernet Study (MINTS-II),” AIDS, 24(13): 2099–2107.
- Rothstein, M., 2010, “Is Deidentification Sufficient to Protect Health Privacy in Research?” The American Journal of Bioethics, 10(9): 3–11.
- Rudder, C., 2014, “We Experiment on Humans!” OkTrends. [available online]
- [SACHRP] Secretary’s Advisory Committee to the Office for Human Research Protections, 2010, SACHRP July 20–21, 2010 Meeting Presentations, U.S. Department of Health and Human Services. [available online].
- Scholz, Trebor, 2008, “Market Ideology and the Myths of Web 2.0,” First Monday, 13(3) 3 March 2008. [available online]
- Schwartz, P. and D. Solove, 2011, “The PII Problem: Privacy and a New Concept of Personally Identifiable Information,” New York University Law Review, 86: 1814.
- Seaboldt, J.A. and R. Kuiper, 1997, “Comparison of Information Obtained from a Usenet Newsgroup and from Drug Information Centers,” American Journal of Health-System Pharmacy, 54: 1732–1735.
- Sharf, B.F., 1996, “Communicating Breast Cancer on-line: Support and Empowerment on the Internet,” Women Health, 26(1): 65–84.
- Sieber, Joan E., 1992, Planning Ethically Responsible Research: A Guide for Students and Internal Review Boards, Thousand Oaks: Sage.
- –––, 2015, Planning Ethically Responsible Research: A Guide for Students and Internal Review Boards, (second edition) Thousand Oaks: Sage.
- Simmhan, Y., R. Barga, C. van Ingen, E. Lazowska, and A. Szalay, 2008, “On Building Scientific Workflow Systems for Data Management in the Cloud,” Fourth IEEE International Conference on eScience, December 7–12, 2008, Indianapolis, Indiana.
- Skloot, R., 2010, The Immortal Life of Henrietta Lacks, New York: Crown Publishers.
- Smith, M. A., and B. Leigh, 1997, “Virtual Subjects: Using the Internet as an Alternative Source of Subjects and Research Environment,” Behavior Research Methods, Instruments, & Computers, 29: 496–505.
- Spriggs, Merle, 2010, “Understanding Consent in Research Involving Children: The Ethical Issues,” The Royal Children’s Hospital Melbourne, version 4. [available online]
- Stone, Brad, 2009, “Facebook Rolls Out New Privacy Settings,” New York Times, December 9. [available online]
- Sveningsson, M., 2004, “Ethics in Internet Ethnography” in Readings in Virtual Research Ethics: Issues and Controversies, E. Buchanan (ed.), Hershey: Idea Group, pp. 45–61.
- Sweeney, L., 2002, “K-Anonymity: A Model for Protecting Privacy,” International Journal of Uncertainty Fuzziness and Knowledge-Based Systems, 10(5): 557–570.
- Thomas, J., 2004, “Reexamining the Ethics of Internet research: Facing the Challenge of Overzealous Oversight,” in Online Social Research: Methods, Issues, and Ethics, M. D. Johns, S. L. Chen, and J. Hall (eds.), New York: Peter Lang, pp. 187–201.
- Thorseth, M., 2003, “Applied Ethics in Internet Research,” Programme for Applied Ethics, Norwegian University of Science and Technology, Trondheim, Norway.
- Tsai, J., L. Cranor, A. Acquisti, and C. Fong, 2006, “What’s it to you? A Survey of Online Privacy Concerns and Risks,” NET Institute Working Paper No. 06-29.
- Turkle, S., 1997, Life on the Screen: Identity in the Age of the Internet, New York: Touchstone.
- U.K. Data Archive, 2012, Create & Manage Data: Access Control. Accessed May 16, 2012. [available online]
- Walstrom, M., 2004, “Ethics and Engagement in Communication Scholarship: Analyzing Public, Online Support Groups as Researcher/Participant-Experiencer,” in Readings in Virtual Research Ethics: Issues and Controversies, E. Buchanan (ed.), Hershey: Idea Group, pp. 174–202.
- Walther, J., 2002, “Research Ethics in Internet-enabled Research: Human Subjects Issues and Methodological Myopia,” Ethics and Information Technology, 4(3). [available online]
- White, M., 2002, “Representations or People?” Ethics and Information Technology, 4(3): 249–266. [available online]
- World Medical Association, 1964/2008, “Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects,” Adopted by the 18th World Medical Assembly. Amended 1975, 1983, 1989, 1996, 2000, 2002, 2004, 2008. [available online]
- Wright, David R., 2006, “Research Ethics and Computer Science: an Unconsummated Marriage,” SIGDOC ’06: Proceedings of the 24th Annual ACM International Conference on Design of Communication.
- Yogesh, S., C. Van Ingen, G. Subramanian, and J. Li, 2009, “Bridging the Gap between the Cloud and an eScience Application Platform,” Microsoft Research Tech Report MSR-TR-2009-2021. [available online]
- Zimmer, Michael, 2010, “But the Data is Already Public: On the Ethics of Research in Facebook,” Ethics & Information Technology, 12(4): 313–325.
- –––, 2016, “OkCupid Study Reveals the Perils of Big-Data Science,” Wired.com. [available online]
- Zimmer, M., and K. Kinder-Kurlanda (eds.), forthcoming, Internet Research Ethics for the Social Age: New Challenges, Cases, and Contexts, New York: Peter Lang Publishing.
- American Counseling Association: Ethics and Professional Standards, 2005 revision.
- Association of Internet Researchers Ethics Guidelines
- Research Ethics Guidelines for Internet Research, The (Norwegian) National Committee for Research Ethics in the Social Sciences and the Humanities, 2003.
- American Psychological Association: Advisory Group on Conducting Research on the Internet (http://www.apa.org/science/leadership/bsa/internet/)
- Council of European Social Science Data Archives
- Current Issues in Research Ethics: Privacy and Confidentiality.
- Ethical and Legal Aspects of Human Subjects Research in Cyberspace.
- Forrester Privacy and Data Protection by Country.
- IRB Forum
- IJIRE: International Journal of Internet Research Ethics
- International Journal of Internet Science
- Journal of Medical Internet Research
- Recommendations on Public Use Data Files, [NHRPAC] National Human Research Protections Advisory Committee, 2002.
- NSTC (National Science and Technology Council), 2016 (June), “National Privacy Research Strategy”
- Research Ethics Blog
- Secretary’s Advisory Committee to the Office for Human Research Protections (SACHRP), “Considerations and Recommendations Concerning Internet Research and Human Subjects Research Regulations, with Revisions”
- Secretary’s Advisory Committee to the Office for Human Research Protections (SACHRP), “Human Subjects Research Implications of ‘Big Data’ Studies”
- Sparks, Joel, 2002, Timeline of Laws Related to the Protection of Human Subjects, National Institutes of Health.
- The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research
- The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research
- Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans
- Williams, George, 2010, “The Ethics of Amazon’s Mechanical Turk,” ProfHacker Blog, The Chronicle of Higher Education, March 1, 2010. [available online]
- U.S. Department of Health and Human Services: Can an electronic signature be used to document consent or parental permission?
- U.S. Department of Health and Human Services: What are the basic elements of informed consent?
- Zimmer, Michael, 2009, “Facebook’s Privacy Upgrade is a Downgrade for User Privacy,” MichaelZimmer.org blog, December 10, 2009. [available online]