Ethics, Information Security Research, and Institutional Review Boards

Several weeks ago, in “A Question of Ethics”, I asked EC readers whether it would be ethical “to deliberately seek out files containing PII as made available via P2P networks”. I had recently read an academic research paper that did just that, and was left conflicted. Part of me wondered whether a review board would pass such a research proposal, or whether the research in the paper was even submitted for review. Another part told me that the information was made publicly available, so my hand-wringing was unwarranted. In the back of my mind, I knew that as information security researchers increasingly used the methods of the social sciences and psychology, these ethical considerations would trouble me again.
Through Chris Soghoian’s blog post regarding the ethical and legal perils possibly facing the authors of a paper which describes how they monitored Tor traffic, I realized I was not alone. Indeed, in a brief but cogent paper, Simson Garfinkel describes how even seemingly uncontroversial research activities, such as doing a content analysis on the SPAM one has received, could run afoul of existing human research subject review guidelines.
Garfinkel argues that strict application of rules governing research involving human subjects can provide researchers with incentives to actively work against the desired effect of the reviews. He further suggests that

society would be better served with broader exemptions that could be automatically applied by researchers without going to an IRB [Institutional Review Board].

My concern at the moment is with the other side of this. I just read a paper which examined the risks of using various package managers. An intrinsic element of the research behind this paper was setting up a mirror for popular packages under false pretenses. I don’t know if this paper was reviewed by an IRB, and I certainly don’t have the expertise needed to say whether it should have been allowed to move forward if it was. However, the fact that deception was used made me uneasy. Maybe that’s just me, but maybe there are nuances that such research is beginning to expose and that we as an emergent discipline should strive to stay on top of.
[Update: The researchers whose Tor study was examined by Soghoian have posted a portion of a review conducted by the University of Colorado:

Based on our assessment and understanding of the issues involved in your work, our opinion was that by any reasonable standard, the work in question was not classifiable as human subject research, nor did it involve the collection of personally identifying information. While the underlying issues are certainly interesting and complex, our opinion is that in this case, no rules were violated by your not having subjected your proposed work to prior IRG scrutiny. Our analysis was confined to this IRG (HRC) issue.

This conclusion is in line with Richard Johnson's comment below, that this research was not on people, but on network traffic.]

6 thoughts on “Ethics, Information Security Research, and Institutional Review Boards”

  1. Sadly, that particular blog post by Soghoian is almost entirely biased. And it’s rather ham-handed about it.
    (Let me disclose mine. They may seem conflicted to you, but life isn’t simple: I run a tor middleman, and have run tor exit nodes. I’ve run type I and type II anonymous remailers. I’m personally acquainted with tor developers and some of the authors of the paper being discussed. I tap all network traffic at my place of employment, including by inclusion any cleartext from our tor node, as and when deemed necessary for service assurance. I’ve helped give a seminar about wiretap legality with Professor Ohm.)
    I find Professor Ohm’s response to Soghoian telling. It appears Ohm sensed some kind of witch hunt attitude in Soghoian’s approach. It’s probably the same attitude I see pervading Soghoian’s post.
    Soghoian far overplays the logging of tor exit traffic. He buries the fact that the researchers logged only 150 bytes of each session. He similarly buries the source IP address logging. I believe he does this to milk the “they were logging” angle for all he can, rather than presenting it as what it actually was.
    Moving on, Soghoian’s IRB comments are red herrings. I’ve done human subjects research at the University of Colorado, Boulder, and in the computer science department to boot. When doing research on people, the guidelines and reviews are followed religiously. In contrast, this was research on network traffic.
    In fact, this kind of logging of packet headers for research into network traffic goes on as a matter of course. There are at least two groups doing such aggregate logging for network measurement purposes that I know of at all the universities of the researchers who wrote that paper.
    Further, there was no deception involved. Anyone with half a clue (tor users are in this camp) knows that all tor operators must be assumed to be mallory, logging everything they can. In fact, the researchers identified methods for detecting when this is happening in some cases.
    So please don’t mischaracterize the research. Don’t overreach. Don’t witch hunt based on trendy worries and hyperbole.
    Instead, go new school. Do gather real data about use, so designs for anonymity systems can skip hyperbole and zeitgeist in favor of facts.

  2. Garfinkel’s review is with regard to study of human subjects or involving human interaction. I can imagine many ways to do content analysis of spam without invoking those two clauses, or the need for an IRB.

  3. @Davi – I quite agree. In some cases, of course, humans are unavoidably part of the picture, and I am just concerned that we treat them fairly.
    @Richard – Thanks very much for the thoughtful comment. I invoke Soghoian to show how I came to learn of Garfinkel’s paper, and that others (e.g., Soghoian) are discussing this question. My mention should not be taken as agreement with him, since whether I agree or not is not germane to the issue at hand. IANAL, am not now (nor have I ever been :^)) a member of an IRB. I can imagine circumstances in which studying network traffic could get dicey, however, especially when the analysis extends into the application layer (and beyond…). I am not looking to call anyone unethical here, and I acknowledge the danger of coming off that way, which is why I was so circumspect when I wrote the prior blog entry. My sole point is that as we go new school, two things need to be considered – adherence to the ethical protections put in place when people like me (a sociologist by training) abused their positions of trust; and possible adjustment (a la Garfinkel) of those ethical protections to accommodate the new environment which ubiquitous interpersonal connectivity has brought into existence.

  4. Not commenting on the Soghoian post, but about IRBs:
    I hear from my friends in experimental economics and from those doing user studies that the Berkeley IRB (the “Committee on Protection of Human Subjects”) has done a lot in the last few years to improve the interaction between the review board and their research. In particular, they’ve made it possible to obtain approval quickly for many types of user studies. I unfortunately don’t know the details.
    That doesn’t address the issue of whether IRB approval should have been sought for the Tor study, or for live network traffic studies in general, of course. Not to mention some of Simson’s other examples.
    Still, it shows that if you work at it you can make change. Unfortunately, in general, this kind of change may take longer than the horizon of a single project (or even a single grad student). So you need someone to step up and expend energy making this happen; absent such champions, the overhead of dealing with an IRB will tend to push people to do research that has less or zero need for IRB involvement. (or push that research to institutions that have less intrusive IRB oversight.)

  5. I only recently encountered your blog, so I wasn’t here for the earlier discussion. I hope it won’t be considered out of place for me to respond to something said there in addressing the questions raised here.

    “To me it is blatantly obvious that in many cases they will not have been, but is the “presumption of voluntary disclosure” enough to say harvesting them is kosher?”
    Let’s go back to what Dan said. How is this really any different than an entity who does not secure their server properly and allows Google to index and cache the data? As a researcher looking for PII, you’d enter some search string, and you might see a link in Google, but you’d have to open the file to determine if there were PII in it. Would you argue that inspecting or downloading such files is “unethical” if you have reason to suspect that their exposure was unintentional?
    Isn’t it more the case that what you do with the data after you determine what’s in it that might be unethical (cf, my complaints about So how is your scenario really any different? Am I missing something here?

    If you’ll allow me to use Iblis’s own invention, the analogy, what this line of reasoning brought to mind is the fellow up on the hillside with a telescope. In his position you can point the telescope at the window of the young lady who dresses without the blinds drawn. One might presume that she is voluntarily disclosing her disclothing because she could have, but did not, draw the blinds. Yet it is at least as likely that she thought the fence in her back yard and the fact that her nearby neighbors’ views are blocked are sufficient and the notion of someone way up the hill with a telescope never occurred to her.
    Is it unethical only if you film and publish her disrobing? Or is the mere viewing of her nightly ritual suspect? Or pointing the telescope at houses rather than the stars? Is it OK to watch her each night if you are doing so as a cultural anthropologist, but not if you are a peeping Tom?
    It seems to me that a lot of academic ethics focuses on the issues closest to hand: the ethics of publishing, and what you do with the data. As a philosophy and social psych major who hasn’t done research with human subjects in well over 20 years, I tend to think more about the individual rights and the morality of the issue, and it seems to me that there is something at the very least suspect about gathering data when you have substantial reason to believe that the disclosure was unintentional.
    It just occurred to me that I used a philosophically interesting phrase above when I spoke of a “peeping Tom”. Please note that Tom’s sin was that he didn’t avert his gaze when Lady Godiva intentionally rode naked down the street in protest. Her exposure was intentional. Tom was just the only person in town so uncouth as to look. Just a thought.
    To return to this post, I agree with your concerns, Chris. The average person has little understanding of how their personal information is disclosed when they send email, put up a P2P server (which they may not even know was an effect of their running a P2P client) and other such things.
    Little did those of us who wrote our first mail servers back 3 decades or more ago realize what our decisions about protocols would mean in terms of Constitutional law, identity theft and the like all these years later. The disclosure that we are talking about presuming is often the product not of the decisions of the people whose information is disclosed, but of their ignorance and the decisions of nerds many many years before. It makes me, at least, queasy.
