
Wednesday, June 22, 2011

Breaking out of the internet filter bubble

Eli Pariser is the former executive director of the liberal activism site, MoveOn.org and co-founder of the international political site Avaaz.org. His new book, The Filter Bubble, examines how web personalization is influencing the content we see online. New Scientist caught up with him to talk about the filters he says are shaping our view of the world, and hear why he thinks it's so important to break out of the bubble.


What is the "filter bubble"?
Increasingly we don't all see the same internet. We see stories and ideas and facts that make it through a membrane of personalised algorithms that surround us on Google, Facebook, Yahoo and many other sites. The filter bubble is the personal unique universe of information that results and that we increasingly live in online.

You stumbled upon the filter bubble when you noticed Facebook friends with differing political views were being phased out of your feed, and people were getting very different results for the same search in Google. What made you think all of this was dangerous, or at least harmful?
I take these Facebook dynamics pretty seriously simply because it's a medium that one in 11 people now use. If at a mass level, people don't hear about ideas that are challenging or only hear about ideas that are likeable - as in, you can easily click the "like" button on them - that has fairly significant consequences. I also still have a foot in the social change campaigning world, and I've seen that a campaign about a woman being stoned to death in Iran doesn't get as many likes as a campaign about something more fuzzy and warm.

Do you think part of the problem is that Facebook is still largely used for entertainment?
It's definitely growing very rapidly as a news source. There was a Pew study that said 30 per cent of people under 30 use social media as a news source. I would be surprised if in 15 years surfing news looks like seeking out a bunch of different particular news agencies and seeing what's on their front page.

We have long relied on content filters - in the form of publications or TV channels we choose. How is the filter bubble different?
First, yes we've always used filters of some sort, but in this case we don't know we are. We think of the internet as this place where we directly connect with information, but in fact there are these intermediaries, Facebook and Google, that are in the middle in just the same way that editors were in 20th century society. This is invisible; we don't even see or acknowledge that a lot of the time there is filtering at work.
The second issue is that it's passive. We're not choosing a particular editorial viewpoint, and because we're not choosing it, we don't have a sense of on what basis information is being sorted. It's hard to know what's being edited out.
And the final point is that it's a unique universe. It's not like reading a magazine where readers are seeing the same set of articles. Your information environment could differ dramatically from your friends and neighbours and colleagues.

You have suggested that the filter bubble deepens the disconnect between our aspirational selves, who put Citizen Kane high on the movie rental queue, and our actual selves, who really just want to watch The Hangover for the fifth time. Is there a danger inherent in that?
The industry lingo for this is explicit versus revealed preferences. Revealed preferences are what your behaviour suggests you want, and explicit preferences are what you're saying you want. Revealed preferences are in vogue as a way of making decisions for people because now we have the data to do that - to say, you only watched five minutes of Citizen Kane and then turned it off for something else.
But when you take away the possibility of making explicit choices, you're really taking away an enormous amount of control. I choose to do things in my long-term interest even when my short-term behaviour would suggest that it's not what I want to do all the time. I think there's danger in pandering to the short-term self.

What you're promoting has been characterized as a form of "algorithmic paternalism" whereby the algorithm decides what's best for us.
What Facebook does when it selects "like" versus "important" or "recommend" as the name of its button is paternalistic, in the sense that it's making a choice about what kinds of information gets to people on Facebook. It's a very self-serving choice for Facebook, because a medium that only shows you things that people like is a pretty good idea for selling advertising. These systems make value judgments and I think we need to hold them to good values as opposed to merely commercial ones. But, that's not to say that you could take values out of the equation entirely.

Your background is in liberal activism. Do you think the reaction to your ideas as algorithmic paternalism has to do with a perception that you're trying to promote your own political views?
If people think that, they misread me. I'm not suggesting we should go back to a moment where editors impose their values on people whether they want it or not. I'm just saying we can do a better job of drawing information from society at large, if we want to. If Facebook did have an "important" button alongside the "like" button, I have real faith that we would start to promote things that had more social relevance. It's all about how you construct the medium. That's not saying that my ideas of what is important would always trump, it's just that someone's ideas of what is important would rather than nobody's.

You've repeatedly made the case for an "important" button on Facebook, or maybe, as you've put it, an "it was a hard slog at first but in the end it changed my life" button. Do you think really what you're asking Facebook to do is grow up?
Yeah. In its most grandiose rhetoric Facebook wants to be a utility, and if it's a utility, it starts to have more social responsibility. I think Facebook is making this transition, in that it's moved extraordinarily quickly from a feisty insurgent that was cute, fun and new, to being central to lots of people's lives. The generous view is that they're just catching up with the amount of responsibility they've all of a sudden taken on.

Your argument has been called "alarmist", and as I'm sure you're aware, a piece in Slate recently suggested that you're giving these algorithms too much credit. What's your response to such criticism?
There are two things. One is that I'm trying to describe a trend, and I'm trying to make the case that it will continue unless we avert it. I'm not suggesting that it's checkmate already.
Second, there was some great research published in a peer-reviewed internet journal just recently which points out that the effects of personalisation on Google are quite significant: 64 per cent of results differed between the users they tested, either in ranking or in the results themselves. That's not a small difference. In fact, in some ways all the results below the first three are mostly irrelevant because people mostly click on the first three results. As Marissa Mayer talked about in an interview, Google actually used to not personalise the first results for precisely this reason. Then, when I called them again, they said, actually we're doing that now. I think that it's moving faster than many people realise.

You offer tips for bursting the filter bubble - deleting cookies, clearing browser history, etc. - but, more broadly, what kind of awareness are you hoping to promote?
I just want people to know that the more you understand how these tools are actually working the more you can use them rather than having them use you.
The other objective here is to highlight the value of the personal data that we're all giving to these companies and to call for more transparency and control when it comes to that data. We're building a whole economy that is premised on the notion that these services are free, but they're really not free. They convert directly into money for these companies, and that should be much more transparent.

Source  New Scientist

Monday, June 20, 2011

Bitcoin value plummets as main exchange is hacked


Bitcoin freefall: the market plummets as large amounts of bitcoins are sold off at rock-bottom prices - bigger circles correspond to larger transactions (Image: Mt. Gox)

Following reports of theft last week, the Bitcoin community suffered another major loss of confidence yesterday when its largest exchange, Mt. Gox, was compromised, causing Bitcoin's value to fall from around $17.50 to just a few cents.

Although the online peer-to-peer currency has no central authority, Mt. Gox has become one of the most important Bitcoin players by allowing people to convert bitcoins to US dollars and back. As Mt. Gox's owner Mark Karpeles explains, the site itself was not hacked, but someone gained access to a computer used by one of Mt. Gox's auditors and stole a read-only copy of its database containing details of over 60,000 accounts.

The attacker then attempted to sell coins from one large account, but was prevented from emptying the account by a $1000 per day withdrawal limit. Even so, the sale caused the value of Bitcoin against the dollar to completely collapse.

Karpeles says that no other accounts were compromised and all trades will be rolled back to before the time of the hack, restoring Bitcoin's value to $17.50. Mt. Gox will also require every user to go through an authentication process to verify ownership of their account, since the leaked database contained encrypted user passwords that if cracked could allow access to other accounts.

While these measures should go some way to restoring faith in Mt. Gox and Bitcoin in general, they are a far cry from Bitcoin's promise of complete decentralisation. Some comments on the Mt. Gox forum are protesting the roll back, but it seems that even digital currency enthusiasts want some form of protection when large amounts of money are at stake. The principles behind the currency itself remain sound, but without a secure way to exchange bitcoins for real-life money, many Bitcoin holders will be looking to cash out.

Source New Scientist

Sunday, June 19, 2011

Gender-spotting tool could have rumbled fake blogger

Software that guesses a writer's gender could have prevented the world being duped into believing a blog that opposed the Syrian government and was striking out for gay rights was written by a young lesbian living in the country.

It turned out the author of the blog, "Gay Girl in Damascus", was a man – something the online gender checker would have picked up on. When New Scientist fed the text of the last blog post into the software, it said that the author was 63.2 per cent likely to be male.
Developed by Na Cheng and colleagues at the Stevens Institute of Technology in Hoboken, New Jersey, the ever-improving software could soon be revealing the gender of online writers – whether they are blogging, emailing, writing on Facebook or tweeting. The team say the software could help protect children from grooming by predators who conceal their gender online.

The fake blog highlights the problem of people masking their identity online. The truth about Amina Abdullah only emerged when the blogger disappeared, supposedly snatched by militiamen.
Online contacts realised that none of them had ever met Amina, and it turned out her blog photo had been stolen from a Facebook page. Then a 40-year-old American, Tom MacMaster, living in Edinburgh, UK, confessed that he had been writing the blog all along.

Gender analysis

To determine the gender of a writer or blogger, Cheng and her colleagues Rajarathnam Chandramouli and Koduvayur Subbalakshmi wrote software that allows users to either upload a text file or paste in a paragraph of 50 words or more for gender analysis.
After a few moments, the program spits out a gender judgement: male, female or neutral. The neutral option points to how much of the text has been stripped of any indication of gender. This is something particularly prevalent in scientific texts, the researchers say.
To write their program, the team first turned to vast tranches of bylined text from a Reuters news archive and the massive email database of the bankrupt energy firm Enron. They trawled these documents for "psycho-linguistic" factors that had been identified by previous research groups, such as specific words and punctuation styles.
In total they found 545 of these factors, says Chandramouli, which they then honed down to 157 gender-significant ones. These included differences in punctuation style or paragraph lengths between men and women.

Other gender-significant factors included the use of words that indicate the mood or sentiment of the author and the degree to which they use "emotionally intensive adverbs and affective adjectives such as really, charming or lovely", which were used more often by women, says Chandramouli. Men were more likely to use the word "I", for example, whereas women used question marks more often.
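For illustration, here is a minimal Python sketch of this kind of psycho-linguistic feature counting. The handful of features below (use of "I", question marks, affective adjectives) are stand-ins taken from the examples mentioned above, not the 157 factors the Stevens team actually settled on.

```python
import re

# Illustrative word list only; the study's actual affective-adjective lexicon is not published here.
AFFECTIVE_WORDS = {"really", "charming", "lovely", "wonderful", "adorable"}

def extract_features(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    total = max(len(words), 1)
    return {
        "first_person_rate": words.count("i") / total,               # use of "I"
        "question_mark_rate": text.count("?") / max(len(sentences), 1),
        "affective_word_rate": sum(w in AFFECTIVE_WORDS for w in words) / total,
        "mean_sentence_length": total / max(len(sentences), 1),      # crude proxy for style
    }

print(extract_features("Really, the garden was lovely. Wasn't it charming?"))
```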

Bayesian algorithms

Finally, the software combined these cues using a Bayesian algorithm, which guesses gender based on the balance of probabilities suggested by the telltale factors. The work will appear in an upcoming edition of the journal Digital Investigation.
It doesn't always work, however. When the software is fed text, its judgement on a male or female writer is only accurate 85 per cent of the time – but that will improve as more people use it. That's because users get the chance to tell the system when it has guessed incorrectly, helping the algorithm learn. The next version will analyse tweets and Facebook updates.
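To show how such cues might be combined, here is a toy naive Bayes sketch in Python. The per-feature likelihoods are invented placeholders, not figures from the study; the point is only the balance-of-probabilities calculation itself.

```python
from math import log, exp

# Hypothetical P(feature present | gender) values, for illustration only.
LIKELIHOODS = {
    "many_question_marks": {"female": 0.60, "male": 0.40},
    "affective_adjectives": {"female": 0.65, "male": 0.35},
    "frequent_first_person": {"female": 0.45, "male": 0.55},
}

def gender_posterior(present_features, prior_female=0.5):
    # Naive Bayes: multiply the prior by each feature's likelihood, assuming
    # the cues are conditionally independent given the writer's gender.
    log_f = log(prior_female)
    log_m = log(1 - prior_female)
    for feat in present_features:
        log_f += log(LIKELIHOODS[feat]["female"])
        log_m += log(LIKELIHOODS[feat]["male"])
    p_f = exp(log_f) / (exp(log_f) + exp(log_m))
    return {"female": p_f, "male": 1 - p_f}

print(gender_posterior(["many_question_marks", "affective_adjectives"]))
```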

Bernie Hogan, a specialist in social network technology at the Oxford Internet Institute in the UK, thinks there is a useful role for such technology. "Being able to provide some extra cues as to the gender of a writer is a good thing – it can only help."
Even a "neutral" decision might indicate that someone is trying to write in a gender voice that does not come naturally to them, he says. "It could be quite telling."

Testing the gender software

What did the gender identifier make of three well-known authors? We fed it some sample text to find out.
V. S. Naipaul, a winner of the Nobel prize for literature, claims he can tell a woman's writing by reading just two paragraphs of text, and controversially thinks female authors are no match for his writing. The software's verdict on this extract from his book The Enigma of Arrival: 88.4 per cent male.
Mary Ann Evans was a female novelist who famously wrote under the male nom de plume George Eliot. The software has the measure of her, though. Its analysis of the writer's gender from the first paragraphs of Middlemarch: 94.6 per cent female.

More than 14,000 of Sarah Palin's emails were released by the state of Alaska last week after a lengthy campaign by various media organisations to obtain access to them. One email from the archive was put through the system, but the software got it wrong: 70.77 per cent male.

Source New Scientist

Saturday, June 18, 2011

New Search Engine Looks for Uplifting News

Semantic search technology aimed at a positive slant advances with a system that can spot optimism in news articles.

Good news, if you haven't noticed, has always been a rare commodity. We all have our ways of coping, but the media's pessimistic proclivity presented a serious problem for Jurriaan Kamp, editor of the San Francisco-based Ode magazine—a must-read for "intelligent optimists"—who was in dire need of an editorial pick-me-up, last year in particular. His bright idea: an algorithm that can sense the tone of daily news and separate the uplifting stories from the Debbie Downers.

Talk about a ripe moment: A Pew survey last month found the number of Americans hearing "mostly bad" news about the economy and other issues is at its highest since the downturn in 2008. That is unlikely to change anytime soon: global obesity rates are climbing, the Middle East is unstable, and campaign 2012 vitriol is only just beginning to spew in the U.S. The problem is not trivial. A handful of studies, including one published in the Clinical Psychology Review in 2010, have linked positive thinking to better health. Another from the Journal of Economic Psychology the year prior found upbeat people can even make more money.

Kamp, realizing he could be a purveyor of optimism in an untapped market, partnered with Federated Media Publishing, a San Francisco–based company that leads the field in search semantics. The aim was to create an automated system for Ode to sort and aggregate news from the world's 60 largest news sources based on solutions, not problems. The system, released last week in public beta testing online and to be formally introduced in the next few months, runs thousands of directives to find a story's context. "It's kind of like playing 20 questions, building an ontology to find either optimism or pessimism," says Tim Musgrove, the chief scientist who designed the broader system, which has been dubbed a "slant engine". Think of the word "hydrogen" paired with "energy" rather than "bomb."

Web semantics developers in recent years have trained computers to classify news topics based on intuitive keywords and recognizable names. But the slant engine dives deeper into algorithmic programming. It starts by classifying a story's topic as either a world problem (disease and poverty, for example) or a social good (health care and education). Then it looks for revealing phrases. "Efforts against" in a story, referring to a world problem, would signal something good. "Setbacks to" a social good, likely bad. Thousands of questions later every story is eventually assigned a score between 0 and 1—above 0.95 fast-tracks the story to Ode’s Web interface, called OdeWire. Below that, a score higher than 0.6 is reviewed by a human. The system is trained to only collect themes that are "meaningfully optimistic," meaning it throws away flash-in-the-pan stories about things like sports or celebrities.
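As a rough illustration of the routing logic described above, the sketch below scores a story between 0 and 1 and sends it to publication, human review or the bin. The cue phrases and the scoring function are simplistic stand-ins; the real slant engine runs thousands of ontology questions.

```python
def route_story(score: float) -> str:
    """Route a story by its optimism score, following the thresholds above."""
    if score > 0.95:
        return "publish"        # fast-tracked to the OdeWire interface
    if score > 0.6:
        return "human_review"   # a person double-checks borderline stories
    return "discard"

def optimism_score(story: str) -> float:
    # Placeholder scoring: count a couple of illustrative cue phrases, e.g.
    # "efforts against" a world problem vs. "setbacks to" a social good.
    positive_cues = ["efforts against", "breakthrough in", "recovery of"]
    negative_cues = ["setbacks to", "collapse of", "failure of"]
    text = story.lower()
    pos = sum(text.count(c) for c in positive_cues)
    neg = sum(text.count(c) for c in negative_cues)
    return pos / (pos + neg) if (pos + neg) else 0.5

story = "New efforts against malaria show a breakthrough in prevention."
print(route_story(optimism_score(story)))
```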

No computer is perfect, of course, and like IBM's Watson that held its own on Jeopardy! earlier this year, Ode’s slant engine continues to improve with time—and with each mistake. During one test, the slant system that runs Ode labeled a story about the FBI being "asleep at the switch" as positive, perhaps thinking it addressed sleep deprivation. Nor is it ideologically neutral: the U.S. losing ground to China is not such bad news to, well, China.

The goal is not to be naive, either—drowning out the gloom to focus on rainbows and unicorns. "Ignoring reality is not what this is about," Kamp says. "It's looking at the same reality, just looking at a different angle." High unemployment is a problem that seems all bad, he says, but if you approach it from a side door—perhaps profiling people who have founded new businesses and learned new skills or industries that have benefited from the downturn—it turns into a story that can inspire others, and maybe even lower the jobless rate faster.

Slant identification may have a big future. Researchers say it could eventually specialize Web content for pockets of consumers and make ads more engaging. Its potential to track attitudes in writing could even help address the age-old lament of how liberal or conservative the mainstream media actually is. Gone, too, could be the journalism axiom of "if it bleeds, it leads". If Ode has its way, solution-based news could become the hot new thing for the overwhelmed and dispirited. Imagine a new newsroom mantra: if it succeeds, it leads.

Sync your desktop and phone with a single photo

I don't know about you, but I am forever pulling up Google Maps on screen, sending them to the printer...and then leaving for an assignment without the printouts. To the rescue comes an MIT artificial intelligence expert and a Google engineer who together developed an app that lets you move onscreen data like maps from your computer screen to a phone - and magically open the mapping program in the exact same state on the mobile device.


To use their forthcoming Deep Shot app, you simply point the phone camera at the PC or Mac screen and click the shutter. "The phone automatically opens up the corresponding application in the corresponding state. The same process can also work in reverse, moving data from the phone to a desktop computer," says MIT.
How? MIT's Tsung-Hsiang Chang and Google's Yang Li have written code that runs on both the computer and the phone. It makes visible onscreen the uniform resource identifier (URI), of which the web link, or uniform resource locator (URL), is a mere subset. Unlike a URL, the URI is the gobbledegook you get when you press the "link" button on a Google Maps or Street View page. This describes all the map data on the page and crucially also scales the data for the screen window.
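To illustrate the general idea of carrying application state in a URI, here is a hypothetical Python sketch. The myapp:// scheme and the parameter names are invented for illustration and are not Deep Shot's actual format.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical example: serialise a map view into a URI and restore it on
# another device. Scheme and parameters are made up for illustration.
def encode_map_state(lat, lon, zoom, layer):
    query = urlencode({"lat": lat, "lon": lon, "zoom": zoom, "layer": layer})
    return f"myapp://maps/view?{query}"

def decode_map_state(uri):
    params = parse_qs(urlparse(uri).query)
    return {k: v[0] for k, v in params.items()}

uri = encode_map_state(51.5074, -0.1278, 14, "satellite")
print(uri)                    # what the phone would read off the screen
print(decode_map_state(uri))  # the state used to reopen the map in the same view
```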

Snap the screen and the URI code is recognised by the phone's app, calling up the mapping program in the very same running state. It's very cool stuff, although not everyone is so impressed. But best of all, when Google decides to release the app, it'll save me a lot of wasted colour printouts.

High Wired: Does Addictive Internet Use Restructure the Brain?

Brain scans hint excessive time online is tied to stark physical changes in the brain.

Kids spend an increasing fraction of their formative years online, and it is a habit they dutifully carry into adulthood. Under the right circumstances, however, a love affair with the Internet may spiral out of control and even become an addiction.

While descriptions of online addiction remain controversial among researchers, a new study cuts through much of the debate and hints that excessive time online can physically rewire the brain.
The work, published June 3 in PLoS ONE, suggests self-assessed Internet addiction, primarily through online multiplayer games, rewires structures deep in the brain. What's more, surface-level brain matter appears to shrink in step with the duration of online addiction.

"I'd be surprised if playing online games for 10 to 12 hours a day didn't change the brain," says neuroscientist Nora Volkow of the National Institute on Drug Abuse, who wasn't involved in the study. "The reason why Internet addiction isn't a widely recognized disorder is a lack of scientific evidence. Studies like this are exactly what is needed to recognize and settle on its diagnostic criteria," she says.

Defining an addiction
Loosely defined, addiction is a disease of the brain that compels someone to obsess over, obtain and abuse something, despite unpleasant health or social effects. And "internet addiction" definitions run the gamut, but most researchers similarly describe it as excessive (even obsessive) Internet use that interferes with the rhythm of daily life.
Yet unlike addictions to substances such as narcotics or nicotine, behavioral addictions to the Internet, food, shopping and even sex are touchy among medical and brain researchers. Only gambling seems destined to make it into the next iteration of the Diagnostic and Statistical Manual of Mental Disorders, or DSM, the internationally recognized bible of things that can go awry with the brain.
Nevertheless, Asian nations are not waiting around for a universal definition of Internet addiction disorder, or IAD.

China is considered by many to be both an epicenter of Internet addiction and a leader in research of the problem. As much as 14 percent of urban youth there—some 24 million kids—fit the bill as Internet addicts, according to the China Youth Internet Association. By comparison, the U.S. may see online addiction rates in urban youth around 5 to 10 percent, say neuroscientists and study co-authors Kai Yuan and Wei Qin of Xidian University in China.
The scope of China's problem may at first seem extraordinary, but not in the context of Chinese culture, says neuroscientist Karen M. von Deneen, also of Xidian University and a study co-author.
Parents and kids face extreme pressure to perform at work and in school, but cheap Internet cafes lurk around the corner on most blocks. Inside, immersive online game realities like World of Warcraft await and allow just about anyone to check out of reality.

"Americans don't have a lot of personal time, but Chinese seem to have even less. They work 12 hours a day, six days a week. They work very, very hard. Sometimes the Internet is their greatest and only escape," according to von Deneen. "In online games you can become a hero, build empires, and submerge yourself in a fantasy. That kind of escapism is what draws young people."
Out of sight of parents, some college kids further cave to online escapism or use gaming to acquire resources in-game and sell them in the real world. In a recent case Chinese prison wardens allegedly forced inmates into the latter practice to convert digital gold into cold, hard cash.

Several studies have linked voluntary and excessive online use to depression, poor school performance, increased irritability and more impulsiveness to go online (confounding addicts' efforts, if they want to at all, to stop pouring excessive time into online games). To study the effects of possible Internet addiction on the brain, researchers began with the Young Diagnostic Questionnaire for Internet addiction.
This self-assessment test, created in 1998 by psychiatrist Kimberly Young of Saint Bonaventure University in New York State, is an unofficial standard among Internet addiction researchers, and it consists of eight yes-or-no questions designed to separate online addicts from those who can manage their Internet use. (Questions range from, "Do you use the Internet as a way of escaping from problems or of relieving an anxious mood?" to "Have you taken the risk of losing a significant relationship, job, educational or career opportunity because of the Internet?".)

The China-based research team picked 18 college-age students who satisfied addict criteria, and these subjects said they spent about 10 hours a day, six days a week playing online games. The researchers also selected 18 healthy controls who spent less than two hours a day online (an unusually low number, says von Deneen). All of the subjects were then plopped into an MRI machine to undergo two types of brain scans.

Brain drain
One set of images focused on gray matter at the brain's wrinkled surface, or cortex, where processing of speech, memory, motor control, emotion, sensory and other information occurs. The research team simplified this data using voxel-based morphometry, or VBM—a technique that breaks the brain into 3-D pixels and permits rigorous statistical comparison of brain tissue density among people.
The researchers discovered that several small regions in online addicts' brains had shrunk, in some cases by as much as 10 to 20 percent. The affected regions included the dorsolateral prefrontal cortex, rostral anterior cingulate cortex, supplementary motor area and parts of the cerebellum.
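The statistical idea behind VBM can be sketched with synthetic data: compare tissue density voxel by voxel between two groups and flag voxels where the difference is significant. The numbers below are random, not the study's scans, and real VBM adds spatial normalisation, smoothing and multiple-comparison corrections.

```python
import numpy as np
from scipy import stats

# Toy voxel-wise group comparison on synthetic "density" data.
rng = np.random.default_rng(0)
shape = (8, 8, 8)                        # a tiny 3-D grid of voxels
controls = rng.normal(1.0, 0.1, (18, *shape))
addicts  = rng.normal(1.0, 0.1, (18, *shape))
addicts[:, 2:4, 2:4, 2:4] *= 0.85        # simulate ~15% shrinkage in one region

t, p = stats.ttest_ind(controls, addicts, axis=0)   # one t-test per voxel
significant = p < 0.001                  # uncorrected threshold, for illustration
print("voxels flagged as reduced:", int(significant.sum()))
```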

What's more, the longer the addiction's duration, the more pronounced the tissue reduction. The study's authors suggest this shrinkage could lead to negative effects, such as reduced inhibition of inappropriate behavior and diminished goal orientation.
But imaging neuroscientist Karl Friston of University College London, who helped pioneer the VBM technique, says gray matter shrinkage is not necessarily a bad thing. "The effect is quite extreme, but it's not surprising when you think of the brain as a muscle," says Friston, who was not involved in the study. "Our brains grow wildly until our early teens, then we start pruning and toning areas to work more efficiently. So these areas may just be relevant to being a good online gamer, and were optimized for that."

(Friston says London taxi drivers provide a telling comparative example of the brain's ability to reshape itself with experience. In a 2006 study, researchers compared taxi drivers' brains with those of bus drivers. The former showed increased gray matter density in their posterior hippocampi—a region linked to maplike spatial navigation and memory. That probably comes as no surprise to London cabbies, who spend years memorizing a labyrinthine system of 25,000 streets, whereas bus drivers have set routes.)
As another crucial part of the new study on Internet addiction, the research team zeroed in on tissue deep in the brain called white matter, which links together its various regions. The scans showed increased white matter density in the right parahippocampal gyrus, a spot also tied to memory formation and retrieval. In another spot called the left posterior limb of the internal capsule, which is linked to cognitive and executive functions, white matter density dropped relative to the rest of the brain.

Disorder under construction
What the changes in both white and gray matter indicate is murky, but the research team has some ideas.
The abnormality in white matter in the right parahippocampal gyrus may make it harder for Internet addicts to temporarily store and retrieve information, if a recent study is correct. Meanwhile, the white matter reduction in the left posterior limb could impair decision-making abilities—including those to trump the desire to stay online and return to the real world. The long-term impacts of these physical brain changes are even less certain.

Rebecca Goldin, a mathematician at George Mason University and director of research for STATS, says the recent study is a big improvement over similar work published in 2009. In this older study a different research group found changes in gray matter in brain regions of Internet addicts. According to Goldin, however, the study lacked reliable controls.

The sample sizes of both studies were small—fewer than 20 experimental subjects each. Yet Friston says the techniques used to analyze brain tissue density in the new study are extremely strict. "It goes against intuition, but you don't need a large sample size. That the results show anything significant at all is very telling," Friston notes.
In the end all of the researchers interviewed by Scientific American emphasized that statistical significance only goes so far in making a case for IAD as a true disorder with discrete effects on the brain. "It's very important that results are confirmed, rather than simply mining data for whatever can be found," Goldin says.

Source Scientific American

Spies can send messages hidden in a Google search

THE peculiar list of search options that Google suggests as you type in a query could be hijacked to let people communicate secretly.

So says Wojciech Mazurczyk at the Warsaw University of Technology in Poland, who specialises in steganography - the art of hiding messages in plain sight.
Mazurczyk and his team dream up new ways in which spies or terrorists might try to communicate undetected, allowing security agencies to develop ways of eavesdropping on them. To avoid arousing suspicion, the method used must be as commonplace as possible, and what could be more ordinary than seeing someone googling in a cyber cafe? It wouldn't warrant a second glance, Mazurczyk told a security conference in Prague, Czech Republic, last month.

So the team turned to Google Suggest to see if it could hide messages. Google Suggest works by listing up to 10 suggestions each time a letter is added to a search term, based on the most popular searches made by other Google users that begin with the same letters. The words offered change as each new letter is added.
Some of the options that appear as your search term takes shape can often seem quite strange: "runescape", "rotten tomatoes" and "rock and chips" for "r", "ro" and "roc" in "rocket", for instance. This is the key to how Mazurczyk's team adds its own search suggestions to the list to encode secret messages.

To do this, the team infects a target computer with malware called StegSuggest. This intercepts the Google Suggest lists exchanged between Google and the infected computer, and adds a different word to the end of each of the 10 suggestions in the list on that particular machine. The added words are chosen from the 4000 most used words in English to make sure they do not appear too outlandish.

The receiver types in a random search term and notes down the additional word in each suggestion. These 10 extra words are then looked up in a "codebook" shared by receiver and sender that contains all 4000 words, which gives each word a 10-bit binary number. The numbers are linked together into a chain which is converted into text using a separate program on the receiver's home PC, revealing the hidden message.
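The receiver-side decoding can be sketched in a few lines of Python. The tiny codebook and bit widths below are simplifications for illustration; the real scheme is described as using a shared 4000-word codebook.

```python
# Demo codebook; the real system's 4000-word book and bit widths are larger.
CODEBOOK = ["the", "time", "house", "plan", "river", "stone", "light", "green"]
BITS_PER_WORD = 3          # enough for this 8-word demo codebook

def words_to_bits(extra_words):
    # Look each extra word up in the codebook and concatenate its binary index.
    return "".join(format(CODEBOOK.index(w), f"0{BITS_PER_WORD}b")
                   for w in extra_words)

def bits_to_text(bits):
    # Read the bit chain as 8-bit ASCII characters, dropping leftover padding.
    chars = [bits[i:i + 8] for i in range(0, len(bits) - 7, 8)]
    return "".join(chr(int(c, 2)) for c in chars)

# Encode "hi" as a word sequence (what the sender slips into Google Suggest),
# then decode it again on the receiver's side.
message_bits = "".join(format(ord(c), "08b") for c in "hi")
padded = message_bits + "0" * (-len(message_bits) % BITS_PER_WORD)
words = [CODEBOOK[int(padded[i:i + BITS_PER_WORD], 2)]
         for i in range(0, len(padded), BITS_PER_WORD)]
print(words)
print(bits_to_text(words_to_bits(words)))   # -> "hi"
```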
However, Ross Anderson, a cryptography and security specialist at the University of Cambridge, thinks there is enough traffic between sender and receiver to alert authorities that something suspicious is afoot - thereby undermining the process.

Source New Scientist

Friday, June 17, 2011

US government sites caught in crossfire of hacker war

The hacker group LulzSec has turned its attention to the US government, following a month-long campaign against various media and video game companies. While previous attacks involved stealing data from poorly-secured servers, including the site of the US Senate, the group has now begun using distributed denial-of-service (DDoS) attacks to take down websites including cia.gov, which went offline last night but is currently accessible.

These kinds of attacks, which flood websites with traffic and cause them to go offline, were also used by the hacking group Anonymous in defence of WikiLeaks. Some reports suggest LulzSec is using the same DDoS tool as Anonymous, the Low Orbit Ion Cannon (LOIC), which allows anyone to join in an attack by running the software. In another twist, LulzSec has also declared war on 4chan, the anarchic message board with strong ties to Anonymous.

Not content with just carrying out online DDoS attacks, LulzSec are also using the phone networks to cause havoc. The group has set up a phone number and encouraged others to call in with hack requests. They then redirect their phone number elsewhere, causing victims including the FBI to receive hundreds of calls - the group claims between five and 20 people were ringing the line every second.

As a "thank you" to their fans, LulzSec has today released a collection over over 62,000 email and password combinations, though the group hasn't stated which websites these details give access to. LulzSec Twitter followers are now reporting that they have used the leaked details to compromise accounts on Facebook, Gmail and World of Warcraft.

LulzSec's string of successful if simple hacks demonstrates that the websites of many companies and organisations just aren't sufficiently secure. While LulzSec says their actions are just intended to amuse, what happens when a group with less benign motives follows in their footsteps? It's time for IT managers everywhere to take another look at their defences, before they fall foul of the next attack.

Source New Scientist

Wednesday, June 8, 2011

Computer Crash Test: Will Your Internet Access Come to a Screeching Halt on June 8?

A 24-hour evaluation will determine whether millions of people worldwide can connect to the new 128-bit Internet protocol address system.
On June 8 Google, Yahoo, Facebook, Comcast and others will turn on IPv6 for 24 hours to see what happens.

Every computer, modem, server and smart phone that connects to the Internet has a unique Internet protocol (IP) address, so users can find it. The address format, known as IPv4, was standardized in 1977 as a 32-bit number, making a then seemingly unlimited 4.3 billion addresses (2^32) available.

They're all used up.

How? Well, for decades the Internet Assigned Numbers Authority has doled out blocks of IPv4 addresses, as needed, to five Regional Internet Registries around the world, which then assign addresses to users one by one. In February, however, the authority gave each registry one final block of 16 million addresses. The regions are burning through them now, and one region—Asia–Pacific—has already hit zero.

Since 1999 the authority has offered blocks of newer IPv6 addresses that are 128 bits long, resulting in an unimaginable 340 undecillion possible addresses (that's 340 followed by 36 zeroes). But until 2008 or so, few organizations bothered to ask their registries for them. Now Internet carriers, Web companies and Internet service providers (ISPs) large and small are sucking up IPv6 addresses for their old and new machines, but in most cases without making the IPv6 addresses live. So a moment of truth has arrived—Google, Yahoo, Facebook, Comcast and others are turning IPv6 on for 24 hours to see what happens. The exact start time for Global IPv6 Day varies depending upon location (in some places, the test actually began on Tuesday). To find out when testing begins in your region, consult the event's website.
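The address-space arithmetic is easy to check with Python's ipaddress module; the example addresses below are the standard documentation ranges, used purely for illustration.

```python
import ipaddress

# Quick check of the address-space figures quoted above.
print(2 ** 32)     # 4294967296 -> the ~4.3 billion IPv4 addresses
print(2 ** 128)    # ~3.4e38    -> roughly 340 undecillion IPv6 addresses

# The same addresses as Python sees them: 32-bit and 128-bit values.
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")
print(v4.version, int(v4).bit_length() <= 32)    # 4 True
print(v6.version, int(v6).bit_length() <= 128)   # 6 True
```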

All but the oldest computers and phones have been configured to handle both schemes, but "home gateways—the DSL modems or cable modems—may not be," says Geoff Huston, chief scientist for the Asia–Pacific regional registry. And the IPv6 option in your computer or phone may not be turned on. In these cases, if you try to access an IPv6 address on June 8, you will either experience a delay of up to 75 seconds, as your system finds its way to the IPv4 address for the site you're trying to reach—or you may just never connect.
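A rough sketch of that dual-stack lookup behaviour, using Python's standard socket module: ask for an IPv6 address first and fall back to IPv4 if none comes back. Real clients also add the connection timeouts and retries that produce the long delays described above; the hostname is just an example.

```python
import socket

def resolve_preferring_ipv6(host, port=80):
    # Try the IPv6 (AAAA) lookup first, then fall back to IPv4 (A).
    try:
        info = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
        return "IPv6", info[0][4][0]
    except socket.gaierror:
        info = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
        return "IPv4", info[0][4][0]

print(resolve_preferring_ipv6("www.google.com"))
```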

"We're hoping that on IPv6 World Day we'll see v6 traffic go up, so we'll have a better idea of how many users are capable of using IPv6," says Timothy Winters, a senior manager at the University of New Hampshire InterOperability Laboratory (UNH-IOL), which offers broadband service–providers and network equipment–makers the facilities and expertise for testing their products. "Right now, when users go to sites like Google or Facebook we're not getting good statistics because users can only get there via IPv4."

On the flip side, testers also expect problems with IPv6 to surface during this daylong stress test. "Some things are going to break, and when that happens it gives us some idea of how many users can't use v6," Winters says. The best evidence for determining just how many people's systems are not IPv6-compatible will come from calls to ISPs complaining of poor service, or none at all. "If Google or Facebook did this testing on their own, you'd have all of these Internet users calling their ISPs, but the ISPs wouldn't know the IPv6 test might be behind their customers' problems," he adds. "[June 8] is a nice day for us to try out v6 and then we can shut it down and fix all of the problems that may arise."

The parties undertaking the test have scattered diagnostics across the Internet, and will be able to see if 10 or 1 or 0.1 percent of users, for example, experience problems. They will also be able to tell if the problem is your machine, their machines, your service provider or some other node in the Internet. Even a small portion of disconnects is a big deal, however; 1 percent of the Net's two billion users represents 20 million people.

As the last few IPv4 addresses are expended, region by region, over the next few years, new machines will be forced to have only IPv6 addresses. Old IPv4-only machines may not be able to find the new IPv6 machines. Internet operators will therefore have to run systems in both formats for at least several transition years, and perhaps longer, which will add cost and could slow access. "At some point, IPv6 will dominate, and everyone will optimize for it," Huston says. "When that will be, I can't say. We're a large, diverse industry, and no one is in charge."

Source Scientific American

Saturday, May 28, 2011

Wolfram Alpha Turns 2: ‘People Just Need What We Are Doing’

Stephen Wolfram, the man behind the computing application Mathematica and the search engine Wolfram Alpha, has a short attention span that’s married to a long-term outlook.



Wolfram Alpha is an online service that computes the answers to queries (e.g., age pyramid for the Philippines or glycogen degradation pathway) rather than searching for those terms showing up on webpages.
When asked what his favorite query is, the particle physicist and MacArthur “genius” award recipient says he’s enamored that Wolfram Alpha can tell you about the plane you just saw flying over your town — in his case “flights visible from Concord, Massachusetts.”
But Wolfram’s no plane-spotter.
“My life consists of watching all the new domains being put into Wolfram Alpha,” Wolfram said. “Whatever thing we just finished is the thing I’m most excited about.”

And you might understand Wolfram’s excitement about being able to know the tail number of a plane overhead once you appreciate that answering that question isn’t easy.
For one, there are a lot of planes in the sky. And two, even if you know which planes are in the sky, radar data is delayed, so Wolfram Alpha has to project a plane’s course. And it’s got to take into account that people can’t actually see planes that are very high in the sky.
While that might sound like Wolfram has a short attention span, he’s also taking the long view, as Wolfram Alpha has just passed its second birthday.

“This is my third big life project,” Wolfram said. “Two is early in the life spectrum.”
Wolfram Alpha’s team is now 200 strong, a mix of programmers, linguistic curators and subject-matter experts.
And their to-do list? It’s decades long.
“If you were to look at our whole to-do list, which is a scary thing to do, to finish it would take 20 years,” Wolfram said. “That doesn’t scare me too much, since I’ve been working on Mathematica for 25 years.”
Wolfram Alpha may have a search box, but it’s doubtful that it’s the default search box for anyone, except perhaps Rain Man.

But traffic to Wolfram Alpha is in the millions of visits per day, according to Wolfram, and the company is “slightly profitable.” That’s in no small part because high school and college students have figured out at least part of what Wolfram Alpha is useful for — whether they are working on trigonometry equations, music theory or economic models.
“That’s not the worst place to have a core base of users, given they grow up,” Wolfram says.
Wolfram says he takes encouragement from looking at the streams of queries that people put into the search box. Those show that people are trying to use Wolfram Alpha for complicated things like comparing the economies of two countries. And there aren’t many tourists who just show up to see a funny Easter egg in the software, or to enter junk queries.
But Wolfram is frustrated a bit that users don’t know the full power of Wolfram Alpha.
“The mental model for when to go to Wolfram Alpha is not fully fleshed out yet,” Wolfram says.
One of the company’s solution for that is to create a wide range of very focused apps, such as its app for computer network administrators, and those for classes, including astronomy, calculus and algebra.
Wolfram Alpha has also partnered with general purpose search engines such as Bing and DuckDuckGo. The key there, according to Wolfram, is figuring out which of the queries into a general search engine would benefit from a calculated answer, not just a list of links. One of the challenges is that searchers are used to getting search results in single digit milliseconds — while Wolfram Alpha takes considerably longer — say 500 milliseconds — because it’s calculating answers.

One way to solve that is to cache some popular precomputed answers, and — for others — to indicate to searchers that they can get more details on Wolfram Alpha.
“We compute it and do the computation in the background, so by the time they show up, it looks like it was there but it wasn’t,” Wolfram said.
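The precomputation trick Wolfram describes can be sketched as a simple cache filled in the background before users arrive. The query and timings below are invented for illustration.

```python
import time

cache = {}

def compute_answer(query):
    time.sleep(0.5)                 # stand-in for a ~500 ms calculation
    return f"computed answer for '{query}'"

def precompute(popular_queries):
    for q in popular_queries:       # done in the background, ahead of demand
        cache[q] = compute_answer(q)

def answer(query):
    # Serve from the cache if possible, otherwise compute and remember it.
    return cache.get(query) or cache.setdefault(query, compute_answer(query))

precompute(["gdp of france vs germany"])
start = time.time()
print(answer("gdp of france vs germany"))
print(f"served in {time.time() - start:.3f}s")   # effectively instant from cache
```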
The long-term challenge for Wolfram Alpha is getting more and more datasets into the system. While the process has gotten smoother, each dataset comes with its own unique complexities — meaning that there’s no cookie-cutter approach that will speed new datasets into the engine.
“Our main conclusion is that there is an irreducible amount of work that requires humans and algorithms,” Wolfram said.

The company is also branching out into datasets that one wouldn’t expect from a high-powered calculator, such as info on sports and pop culture, areas that Wolfram Alpha clearly shied away from at first.
“I thought, ‘Gosh, what can you compute about people?’” Wolfram said. “Well, it turns out there’s a lot you can compute, such as what people were born in this city and who was alive at the same time as other people. In every area there is a lot more to compute than you think.”
He’s now thinking about how you can ingest people’s networks of friends (the so-called social graph), how images can be imported and calculated, and what happens when Wolfram Alpha allows people to upload their own data sets.

What’s also becoming apparent is that there are a lot more places that Wolfram Alpha is turning out to be useful than just the website. Makers of software such as spreadsheets and specialized financial applications are turning to the company’s API ,so that they can include computational functions in portions of their software. That means more-diverse revenue for the company, which surprised Wolfram, because when the company launched, he suspected there were only two or three ways for it to make money.
Now he says it’s looking like there are 15 channels or even more.
“People just need what we are doing. It seems like it is a foundational component in so many places,” Wolfram said. “The big debate internally is which of these channels will be the most lucrative, but I think it is still not at all clear.”

And if you think the word channel makes Wolfram sound like an executive, you’d be right.
“I had thought when I started Wolfram Alpha that that stuff isn’t so interesting, and I was going to hire people to figure that out,” Wolfram said. “That didn’t work out so well.”
“So I decided I should learn it, and it’s actually kind of interesting,” Wolfram said. “Now is a fascinating time of platform turbulence, which we haven’t seen since probably about 20 years ago in the rise of PC workstations.”

Wolfram Alpha is also self-funded, as was Mathematica.
And in typical Wolfram style, that makes him both more conservative and more radical than others.
“For 23 years, Mathematica has been a simple private company,” Wolfram said. “For better or worse, that allows one to do much crazier projects than you can through the traditional VC route.”
But doing crazy things doesn’t extend to adding 300 new employees to try to build even faster, even if there’s not enough revenue to pay their salaries.
“I’ve been lucky enough to run a company that’s been profitable for 23 years, so I developed the habit of doing things that way,” Wolfram said.
That’s a way of doing business, that if you think about it, computes much better than getting tens of millions in funding for an iPhone app.

Source WIRED

Tuesday, May 24, 2011

Smart software cracks sound-based CAPTCHA security

Efforts to make the web more accessible have unwittingly made it less secure, according to computer scientists who have developed software to crack the audio CAPTCHAs used by websites as part of their sign-up process.

You're probably familiar with traditional CAPTCHAs, the obscured words used to verify that a new user is a person rather than a bot, but the image-based security measure is difficult for visually impaired people to use. To help such users websites also offer audio CAPTCHAs, in which a computerised voice reads out letters or digits distorted by noise, but their security hadn't been as extensively studied as the visual versions.
Now, researchers have used software called Decaptcha to crack commercial audio CAPTCHAs used by eBay, Microsoft, Yahoo and others, with success rates from 41 to 89 per cent. The system known as reCAPTCHA - developed by the original inventors of the CAPTCHA and now owned by Google - was more resilient to attack, with only 1.5 per cent of CAPTCHAs broken. Even such a low success rate renders audio CAPTCHAs useless, as an attacker in control of a large botnet of infected computers can easily afford to make 100 attempts for every successfully created account.

Decaptcha uses a number of audio-processing techniques to remove noise and identify the individual digits in an audio CAPTCHA. The software has to be trained for 20 minutes on each type of CAPTCHA and can then solve tens of CAPTCHAs per minute on an ordinary desktop computer.
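One early stage of that kind of audio processing - finding the high-energy stretches of the waveform where spoken digits are likely to sit - can be sketched with NumPy on a synthetic signal. Decaptcha's actual noise removal and digit classifiers are not reproduced here.

```python
import numpy as np

def find_digit_segments(signal, rate, frame_ms=20, threshold=0.1):
    # Compute short-time energy per frame and keep the loud stretches.
    frame = int(rate * frame_ms / 1000)
    n = len(signal) // frame
    energy = np.array([np.mean(signal[i*frame:(i+1)*frame] ** 2) for i in range(n)])
    loud = energy > threshold * energy.max()
    # Collapse consecutive loud frames into (start_s, end_s) segments.
    segments, start = [], None
    for i, flag in enumerate(loud):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start * frame / rate, i * frame / rate))
            start = None
    if start is not None:
        segments.append((start * frame / rate, n * frame / rate))
    return segments

rate = 8000
t = np.arange(rate * 2) / rate                       # two seconds of audio
signal = 0.02 * np.random.randn(len(t))              # background noise
signal[4000:8000] += np.sin(2 * np.pi * 440 * t[4000:8000])   # a "digit"
print(find_digit_segments(signal, rate))             # roughly [(0.5, 1.0)]
```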
The researchers say their techniques leave most modern audio CAPTCHAs unusable, and alternatives must be developed. Decaptcha struggles only with CAPTCHAs that include semantic noise - sounds such as music or vocal tracks that share characteristics with spoken digits. For example, reCAPTCHA uses background conversations to obscure the digits, making it hard for the software to pick them out.

Humans can also find these CAPTCHAs difficult to understand, however, which means reCAPTCHA has a high failure rate. The researchers suggest using music rather than vocal tracks could create CAPTCHAs that are still hard for Decaptcha but easier for humans, because we can tune in to the correct sounds. They presented their work yesterday at the IEEE Symposium on Security and Privacy in Oakland, California.

Source New Scientist

Monday, May 23, 2011

World Record in Ultra-Rapid Data Transmission

Transfer of 700 DVDs in Just One Second – Highest Bit Rate on a Single Laser
The team of Professor Leuthold (right): David Hillerkuß, René Schmogrow, and professors Wolfgang Freude and Christian Koos (from right to left). (Photo: Gabi Zachmann)

Scientists of Karlsruhe Institute of Technology (KIT) have succeeded in encoding data at a rate of 26 terabits per second on a single laser beam, transmitting them over a distance of 50 km, and decoding them successfully. This is the largest data volume ever transported on a laser beam. The process developed by KIT makes it possible to transmit the contents of 700 DVDs in just one second. The renowned journal “Nature Photonics” reports on this success in its latest issue (DOI: 10.1038/NPHOTON.2011.74).

With this experiment, the KIT scientists in the team of Professor Jürg Leuthold beat their own 2010 record in high-speed data transmission, when they exceeded the magic limit of 10 terabits per second, i.e. a data rate of 10,000 billion bits per second. The group's success is due to a new data decoding process. The opto-electronic decoding method first performs a purely optical calculation at the highest data rates in order to break the stream down into smaller bit rates that can then be processed electronically. This initial optical reduction of the bit rates is required because no electronic processing methods are available for a data rate of 26 terabits per second.

Leuthold's team uses so-called orthogonal frequency division multiplexing (OFDM) for the record data encoding. For many years, this process has been used successfully in mobile communications. It is based on mathematical routines (the fast Fourier transform). “The challenge was to increase the process speed not only by a factor of 1000, but by a factor of nearly a million for data processing at 26 terabits per second,” explains Leuthold, who heads the Institutes of Photonics and Quantum Electronics and Microstructure Technology at KIT. “The decisive innovative idea was the optical implementation of the mathematical routine.” Calculation in the optical range turned out to be not only extremely fast, but also highly energy-efficient, because energy is required only for the laser and a few processing steps.
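The OFDM principle itself can be sketched in a few lines of NumPy: data symbols go onto parallel subcarriers via an inverse FFT at the transmitter and are recovered with an FFT at the receiver. This is the textbook digital version on a noiseless channel, not KIT's all-optical implementation.

```python
import numpy as np

n_subcarriers = 64
rng = np.random.default_rng(1)

# Two bits per subcarrier, mapped to QPSK symbols.
bits = rng.integers(0, 2, 2 * n_subcarriers)
symbols = (2 * bits[0::2] - 1) + 1j * (2 * bits[1::2] - 1)

tx_signal = np.fft.ifft(symbols)     # transmitter: subcarriers -> time domain
rx_symbols = np.fft.fft(tx_signal)   # receiver: time domain -> subcarriers

# Demap the received symbols back to bits and compare with what was sent.
recovered = np.stack([(rx_symbols.real > 0).astype(int),
                      (rx_symbols.imag > 0).astype(int)], axis=1).ravel()
print("bit errors:", int(np.sum(recovered != bits)))   # 0 on a noiseless channel
```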

Control of the signal levels: Professor Jürg Leuthold. (Photo: Gabi Zachmann)
“Our result shows that physical limits are not yet exceeded even at extremely high data rates,” Leuthold says, with the constantly growing data volume on the internet in mind. In Leuthold's opinion, the transmission of 26 terabits per second confirms that even extremely high data rates can be handled today while energy consumption is minimized.

“A few years ago, data rates of 26 terabits per second were deemed utopian even for systems with many lasers,” Leuthold adds, “and there would not have been any applications. With 26 terabits per second, it would have been possible to transmit up to 400 million telephone calls at the same time. Nobody needed this at that time. Today, the situation is different.” Video transmissions predominate on the internet and require extremely high bit rates. The need is growing constantly. In communication networks, the first lines with channel data rates of 100 gigabits per second (corresponding to 0.1 terabit per second) have already been put into operation. Research now concentrates on developing systems for transmission lines in the range of 400 gigabits per second to 1 terabit per second. Hence, the Karlsruhe development is ahead of the curve. Companies and scientists from all over Europe were involved in the experimental implementation of ultra-rapid data transmission at KIT. Among them were staff of Agilent and Micram Deutschland, Time-Bandwidth Switzerland, Finisar Israel, and the University of Southampton in Great Britain.


Literature:
26 Tbit s-1 line-rate super-channel transmission utilizing all-optical fast Fourier transform processing. D. Hillerkuss, R. Schmogrow, T. Schellinger, M. Jordan, M. Winter, G. Huber, T. Vallaitis, R. Bonk, P. Kleinow, F. Frey, M. Roeger, S. Koenig, A. Ludwig, A. Marculescu, J. Li, M. Hoh, M. Dreschmann, J. Meyer, S. Ben Ezra, N. Narkiss, B. Nebendahl, F. Parmigiani, P. Petropoulos, B. Resan, A. Oehler, K. Weingarten, T. Ellermeyer, J. Lutz, M. Moeller, M. Huebner, J. Becker, C. Koos, W. Freude, and J. Leuthold. Nature Photonics. DOI: 10.1038/NPHOTON.2011.74

By entering the DOI number, an abstract may be obtained at: http://dx.doi.org/


Source Karlsruhe Institute of Technology

Saturday, May 21, 2011

Aurasma app is augmented reality, augmented



Arriving a quarter hour early for a central London briefing this morning I decided to sit in leafy St James's Square, near Piccadilly, to scan the latest tweets. But before I could do so I spotted a man on a street corner staring intently at an iPad 2 he had trained on the gates to the square.
He showed me what he was watching: onscreen, Marilyn Monroe was apparently dancing in a bright yellow summer dress in the morning sunshine on the edge of the square before us. The guy was beaming at the sheer quality of the augmented reality imagery.


As you may have guessed, this iPad 2 user was the technology entrepreneur I had come to meet: Mike Lynch, co-founder and chief executive of Autonomy, the British software house. I wanted to hear about the augmented reality app his firm has just developed - in part because I couldn't fathom the link between AR and the firm's claim to fame to date: predictive software.

Autonomy has gone from nothing in 1996 to a firm worth £7 billion today by leveraging the theories of an 18th-century English mathematician and cleric called the Reverend Thomas Bayes, who worked out how to calculate the probability that certain variables are associated, whether they are words, behaviours or images. Lynch and his colleagues built their business on a pattern recognition engine called the Intelligent Data Operating Layer (Idol) that uses algorithms based on Bayes' ideas.
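As a toy illustration of the Bayes' rule calculation at the heart of this kind of inference, the sketch below updates the probability of a hypothesis given one observed cue. All the probabilities are invented for illustration and have nothing to do with Idol's real models.

```python
# Bayes' rule: P(H | cue) = P(cue | H) * P(H) / P(cue)
def bayes_update(prior, p_cue_given_h, p_cue_given_not_h):
    evidence = p_cue_given_h * prior + p_cue_given_not_h * (1 - prior)
    return p_cue_given_h * prior / evidence

prior = 0.001                      # hypothetical base rate of the event
posterior = bayes_update(prior, p_cue_given_h=0.9, p_cue_given_not_h=0.05)
print(f"{posterior:.3f}")          # the cue raises the probability to ~0.018
```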

On the London tube, Autonomy's Bayesian algorithms analyse CCTV images to calculate the likelihood that someone will try to commit suicide by jumping under a train - allowing the track current to be turned off and help sought. If you follow Formula 1, Ross Brawn's Mercedes F1 team identifies the potential source of every advantage gained by rival teams by training Autonomy algorithms on post-race video. To prevent fraud or noncompliance with financial laws, workplace emails are analysed to infer risk. And police can use Idol to seek hidden patterns in crime reports.

But that's in the PC world. Now, says Lynch, they want to exploit the awesome and growing power of smartphones like Android and iPhone/iPad. To do this they have written an app they've called Aurasma that allows anybody to associate real world items with online content, which they liken to an aura - hence the name.

"We won't be creating the content but we will be providing the infrastructure - including a 10,000-computer server farm - that allows it to be delivered to the real world," says Lynch.
The idea is that media companies can use Aurasma to relate printed matter - street posters, newspapers, magazines - to compelling video and online content they have made themselves or sourced from TV stations and movie studios. Such use will require payments to Autonomy.

But for the rest of us, the service will be free: you can create your own content you'd like to relate to a place, a building or a park, say. And a social network will be built around this, too, allowing users to follow people whose environmental multimedia content they like.

To make it work, Autonomy's coders have rewritten their Bayesian algorithms for iOS and Android. Because Idol is a robust, probabilistic decision-making system, users do not have to train their phone cameras on a flat, brightly lit subject: printed matter can be bent away from the camera at odd 3D angles and dimly lit, and still be recognised. The probability calculation also ensures that the displayed video stays within the bounds of the item being looked at, so a newspaper photo of David Beckham will pull up video of him playing for LA Galaxy that stays within that photo's frame on the page.
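Autonomy has not published how Aurasma's recognition works, but the general trick - matching features of a known print image against a skewed, dimly lit camera view and estimating where its corners now sit - can be sketched with off-the-shelf computer vision tools. A minimal illustration using OpenCV, with placeholder file names:

# Sketch of perspective-robust print recognition, in the spirit of what an AR
# app must do; this is not Aurasma's algorithm. File names are placeholders.
import cv2
import numpy as np

reference = cv2.imread("newspaper_photo.png", cv2.IMREAD_GRAYSCALE)   # the known print image
camera_frame = cv2.imread("camera_view.png", cv2.IMREAD_GRAYSCALE)    # what the phone sees

orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_cam, des_cam = orb.detectAndCompute(camera_frame, None)

# Match descriptors between the two images and keep the strongest matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_cam), key=lambda m: m.distance)[:50]

# Estimate the homography that maps the flat reference image into the tilted,
# possibly dim camera view, tolerating bad matches via RANSAC.
src = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_cam[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Project the reference image's corners into the camera frame: this is the
# quadrilateral an overlay video would be warped into, keeping it within the
# bounds of the printed photo even at odd angles.
h, w = reference.shape
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
print(cv2.perspectiveTransform(corners, H).reshape(-1, 2))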

Aurasma for print media hits the Apple App Store next week, with a version for TV stations arriving in a month. Augmented reality is a hot field of endeavour, and Autonomy will have its work cut out making a dent in it: major publishers like Carlton Books are already shipping books that use AR, for instance.
But Lynch is unruffled by the task. As I leave his office for the sunshine of St James's Square, it's clear he's particularly proud of having squeezed an 18th-century cleric into our pockets.

Paul Marks

Source New Scientist

Wednesday, May 18, 2011

Taking control of your data into your own hands

THE iPhone secretly tracks your location. Amazon has lost your files in the cloud. Hackers have stolen the details of 100 million customers from Sony. This string of revelations has left many people wondering who they can trust with their data.

Step forward David Wetherall at the University of Washington in Seattle and colleagues, who are developing tools to monitor the data transmission of apps and provide easy-to-understand "privacy revelations" about each one. "There is much value in simply revealing to users how they are being tracked," says Wetherall, who presented the concept at the HotOS conference in Napa, California, this week.

Whenever you sign up to a website or install an app, you are potentially giving the company behind the service access to your personal data - even if you don't realise it. Tech companies take steps to protect and inform their users about data usage: Apple, for example, vets iPhone apps sold through its store, and Google's Android lists the permissions granted to an app prior to installation. But Wetherall's team believes these measures don't go far enough.

The team gives the example of a sound-meter app funded by adverts, which has access to a phone's microphone to monitor sound levels and to the internet to download ads. Any app with these permissions can also record and upload sound without the user's knowledge, they say.

Tools to halt data leaks already exist, such as WhisperMonitor, an Android app released last week that allows users to monitor and prevent outbound traffic. But Wetherall's team wants to predict data leaks before they happen. To do so, they are developing an app that would run in the background on smartphones or browsers and analyse the flow of information, alerting users before an app tries to access data or pass it to other parties. Crowdsourcing user experiences could also help, allowing people who experience a leak to warn others against using an app.
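Wetherall's tool is still at the concept stage and its internals have not been published, so the following is only a toy sketch of what a "privacy revelation" could look like: inspect an app's outbound payload for known personal identifiers and warn the user before anything leaves the device. The app name, destination and identifiers are all hypothetical:

# Toy "privacy revelation": flag outbound payloads that contain personal data.
# A real tool would hook the platform's network stack; this only shows the idea.
PERSONAL_DATA = {
    "device id": "A1B2-C3D4-E5F6",          # hypothetical values to watch for
    "location": "47.6062,-122.3321",
    "email address": "user@example.com",
}

def audit_outbound(app_name, destination, payload):
    """Return human-readable warnings for personal data found in the payload."""
    return [f"{app_name} is sending your {name} to {destination}"
            for name, value in PERSONAL_DATA.items() if value in payload]

# Example: an ad-supported sound-meter app phoning home, as in the scenario above.
for warning in audit_outbound(
        "SoundMeter", "ads.example.net",
        '{"db": 72, "loc": "47.6062,-122.3321", "id": "A1B2-C3D4-E5F6"}'):
    print("PRIVACY ALERT:", warning)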

Unknown data access is just one problem, however. Trusting companies to look after legitimately collected data is also a concern, as shown by the Sony customers who now find themselves at risk of phishing and other types of fraud. Millions more passwords were also put in danger last week when the online password manager service LastPass admitted it had suffered a potential data breach.

With password leaks now a regular occurrence, a switch to biometric "passwords" might be tempting. But a study due to be presented at the IEEE Symposium on Security and Privacy in Oakland, California, later this month suggests this can actually make a system less secure.

Lorie Liebrock and Hugh Wimberly at New Mexico Tech in Socorro asked 96 volunteers to create two user accounts, one secured by just a password, the other by a password and fingerprint reader. They found the passwords chosen for use with the fingerprint reader were 3000 times easier to break, potentially making the overall security of the system lower than simple password use alone.
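To put that factor of 3000 in perspective: read as a 3000-fold reduction in the guessing effort an attacker needs, it corresponds to a loss of about log2(3000) ≈ 11.6 bits of password entropy - a little less than the strength added by two randomly chosen printable ASCII characters. (This is an interpretation of the reported ratio, not a figure from the study.)

import math

entropy_loss_bits = math.log2(3000)   # a 3000-fold drop in guessing effort
print(f"{entropy_loss_bits:.1f} bits lost")             # ~11.6 bits

per_ascii_char = math.log2(95)        # one random printable ASCII character
print(f"{2 * per_ascii_char:.1f} bits for two chars")   # ~13.1 bits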

As these latest leaks illustrate, believing others will keep your data secure can have disastrous consequences.

Source New Scientist

Tuesday, May 3, 2011

Ranking research

How to use social tools to rank the relevance of research papers

A new approach to evaluating research papers exploits social bookmarking tools to extract relevance. Details are reported in the latest issue of the International Journal of Internet Technology and Secured Transactions.
Social bookmarking systems are almost indispensable. Very few of us do not use at least one, whether it's Delicious, Connotea, Trunk.ly, Reddit or any of countless others. For academics and researchers, CiteULike is one of the most popular and has been around since November 2004. CiteULike (http://www.CiteULike.org) allows users to bookmark references but also embeds more conventional bibliographic management. As users of such systems quickly learn, the only way to make them useful for others is to tag your references comprehensively, but selectively.
On the whole, social bookmarking is very useful, but it could be even more so if, rather than relying only on similarity ranking or query-dependent ranking to generate search results, it had a better ranking system.
Researchers in Thailand have now proposed "CiteRank", a combination of a similarity ranking with a static ranking. "Similarity ranking measures the match between a query and a research paper index," they explain, "while a static ranking, or query-independent ranking, measures the quality of a research paper." Siripun Sanguansintukul of Chulalongkorn University in Bangkok and colleagues use a group of factors - the number of groups citing the posted paper, the year of publication, the date the paper was posted, and the priority given to the paper - to determine a static ranking score, which is then combined with the query-dependent similarity measure to give the CiteRank score.
The team tested their new ranking algorithm by asking literature researchers to rate the results it produced when ranking research papers retrieved by a search engine built on a tag-title-abstract (TTA) index. The weighted variant CiteRank 80:20, which combines 80 per cent similarity ranking with 20 per cent static ranking, proved most effective. Many literature researchers preferred more recent or just-posted papers, but they also rated classic papers highly when those emerged in the results because they had been posted across different user groups or communities. Users found good papers on the basis of the priority rating, but the TTA index was still important.
"CiteRank combines static ranking with similarity ranking to enhance the effectiveness of the ranking order," explains Sanguansintukul. "Similarity ranking measures the similarity of the text (query) with the document. Static ranking employed the factors posted on paper. Four factors used are: year of publication, posted time, priority rating and number of groups that contained the posted paper."
"Improving indexing not only enhances the performance of academic paper searches, but also all document searches in general. Future research in the area consists of extending the personalization; creating user profiling and recommender system on research paper searching." the team says. The experimental factors that emerged from the study, can help in the optimization of the algorithm to adjust rankings and to improve search results still further.
"CiteRank: combination similarity and static ranking with research paper searching" in International Journal of Internet Technology and Secured Transactions, 2011, 3, 161-177

Source EurekAlert!

Monday, May 2, 2011

Eli Pariser: Beware online "filter bubbles"

As web companies strive to tailor their services (including news and search results) to our personal tastes, there's a dangerous unintended consequence: We get trapped in a "filter bubble" and don't get exposed to information that could challenge or broaden our worldview. Eli Pariser argues powerfully that this will ultimately prove to be bad for us and bad for democracy.