Looking for privacy in the clouds

DURHAM, N.C. -- Millions of Internet users have been enjoying the fun -- and free -- services provided by advertiser-supported online social networks like Facebook. But Landon Cox, a Duke University assistant professor of computer science, worries about the possible downside -- privacy problems.

When people post pictures or political opinions to share with their friends, they're actually turning them over to the owners of the network as well.

"My concern is that they're under the control of a central entity," Cox said. "The social networks currently control all the information that users throw into them. I don't think that's necessarily evil. But it raises some concerns."

For instance, MIT student experimenters have demonstrated the ability to sneak in and download more than 70,000 Facebook profiles. A BBC technology program also showed how such personal information could be stolen.

"A disgruntled employee could leak information about social network users," Cox said. "They could also become attractive targets for hackers and other computer ne'er-do-wells."

Though users may not have caught this when they clicked to accept a site's terms of service, they've largely signed away the rights to their own data by joining an online social network. "These rights commonly include a license to display and distribute all content posted by users in any way the provider sees fit," Cox said.

To delve deeper into these issues and begin the search for alternatives, Cox recently won a $498,000, three-year grant from the National Science Foundation. The funding is part of the federal stimulus package called the American Recovery & Reinvestment Act of 2009 (ARRA). He and two of his graduate students, Amre Shakimov and Dongtao Liu, are collaborating closely with Ramon Caceres at AT&T Labs in Florham Park, N.J., which is also a major supporter.

"What the grant will do is fund research into alternatives for providing social networking services that don't concentrate all this information in a single place," he said. Cox's notion is instead to create what network architects would call a "peer-to-peer" system architecture in which information is spread out. Because the data is distributed rather than held in one place, any individual's information is harder to steal or otherwise exploit.

"The basic idea is that users would control and store their own information and then share it directly with their friends instead of it being mediated through a site like Facebook. And there are some interesting challenges that go along with decomposing something like Facebook into a peer-to-peer system.

"Facebook is a great service because it's highly available and really fast. When you break something into thousands and millions of different pieces instead, you'd want to try to recreate the same availability and performance. That's the research challenge we're going to be looking at over the next three years."

Cox proposed three possible options in a report for the Association for Computing Machinery's Workshop on Online Social Networks in Barcelona in August 2009. In each, users would load their personal information into what is called a "Virtual Individual Server," or VIS.

One option would host each social network user's VIS on his or her own desktop. "But the problem with desktop machines is that they go down all the time," Cox said. "When desktops are shut off they are not available."

An alternative idea is to distribute VISs within redundant "clouds" of servers such as those offered by the Amazon Elastic Compute Cloud. "Amazon will run little computers on your behalf out in their infrastructure," Cox said. "The nice thing about that is the service will never go down. But the problem is that it's very expensive. It costs about $50 a month to have just one server out in the cloud."

A third notion is called "hybrid decentralization." The idea is to keep VISs on desktops when possible but switch to the more costly and reliable cloud distribution option when individual desktops go offline.
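As a rough illustration only, the hybrid rule described above could be sketched in a few lines of Python. The names Host and choose_vis_host are hypothetical and do not come from Cox's report; the sketch simply assumes each host can report whether it is currently online.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A machine that could run a user's VIS (hypothetical model)."""
    name: str
    online: bool


def choose_vis_host(desktop: Host, cloud: Host) -> Host:
    """Prefer the cheap desktop; fall back to the costlier cloud when it is offline."""
    if desktop.online:
        return desktop
    if cloud.online:
        return cloud
    raise RuntimeError("no host is available to serve the VIS")


desktop = Host("desktop", online=True)
cloud = Host("cloud", online=True)

print(choose_vis_host(desktop, cloud).name)  # desktop is up, so it serves the VIS

desktop.online = False  # the user shuts the desktop off
print(choose_vis_host(desktop, cloud).name)  # the cloud copy takes over
```

A real system would also have to detect outages and keep the two copies in sync, which this sketch leaves out.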

"So there are these different tradeoffs," Cox said. "Users can try to put their information in clouds of servers, which are going to be highly available but expensive. Or they could try to store it on their own machines, which would be cheap but subject to service interruptions."

Under his NSF stimulus grant, Cox will be able to pay Shakimov and Liu for three years and fund some of his own work to explore those options. Other AT&T Labs research participants besides Caceres are Alexander Varshavsky and Kevin Li. Amazon is also providing equipment support.

"The research will point in a couple of directions," he said. "Can we get a desktop machine to intelligently switch over to a cloud? Can we reduce the cost by only using a cloud when the desktop is not available?"
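To see why using the cloud only as a backup could cut costs, here is a back-of-the-envelope sketch. It takes the roughly $50 a month figure Cox cites and assumes, purely for illustration, that cloud charges scale with the share of time the cloud server is actually running. The function name and the 75 percent uptime figure are hypothetical.

```python
FULL_TIME_CLOUD_COST = 50.0  # dollars per month, the approximate figure cited by Cox


def hybrid_monthly_cost(desktop_uptime: float,
                        full_cost: float = FULL_TIME_CLOUD_COST) -> float:
    """Estimate the monthly cloud bill when the cloud runs only while the desktop is off.

    Assumes billing is proportional to running time (an illustrative assumption).
    desktop_uptime is the fraction of the month the desktop is online, from 0 to 1.
    """
    if not 0.0 <= desktop_uptime <= 1.0:
        raise ValueError("desktop_uptime must be between 0 and 1")
    return full_cost * (1.0 - desktop_uptime)


# A desktop that is online three-quarters of the time (hypothetical figure)
print(hybrid_monthly_cost(0.75))  # 12.5
```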

Or perhaps the same information can be put in a number of places in the hope that at least one of those computers is always working. "So in addition to serving my own stuff I might ask my friends to serve my stuff as well," Cox said.

"The problem there is that now you're trusting somebody else to serve and store your data. We have some interesting challenges ahead."
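The appeal of asking friends to hold copies can be shown with a small calculation. The sketch below assumes each machine is online independently of the others, which is a simplification (friends in the same time zone may all switch off at night). The function name and the uptime figures are hypothetical.

```python
import math


def chance_at_least_one_online(uptimes: list[float]) -> float:
    """Probability that at least one replica is reachable.

    Each entry is the fraction of time one machine is online. Machines are
    assumed to fail independently (an illustrative assumption).
    """
    all_offline = math.prod(1.0 - p for p in uptimes)
    return 1.0 - all_offline


# My own desktop plus two friends' desktops, each online half the time
print(chance_at_least_one_online([0.5, 0.5, 0.5]))  # 0.875
```

Under these assumptions, three machines that are each available only half the time would together be reachable 87.5 percent of the time, though every extra copy is another party the user has to trust.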

Source: Duke University