How (Much) To Trust Wikipedia - CITRIS and the Banatao Institute

by Gordy Slack

Find the demo on Wikipedia Trust at http://wiki-trust.cse.ucsc.edu/

Professor Luca de Alfaro at UC Santa Cruz has developed a tool that gauges the trustworthiness of Wikipedia articles based on their authors' reputations.

A team of engineers led by UC Santa Cruz computer scientist Luca de Alfaro is working on a solution
to Wikipedia's central paradox. The online encyclopedia is among the world's
ten most-visited websites. The English edition alone has more than two million
articles, and about seven percent of all Internet users are said to use the
site daily. Unlike other, more traditional reference works whose authority
stems from the expertise of hand-picked authors and editors, Wikipedia is a
collaborative effort, written by thousands of volunteers, some of whom are
experts in their fields and others who are not. Its openness to contributions
from anyone with access to the Internet is key to Wikipedia's success, but it
is also the site's Achilles' heel.

Mature articles on Wikipedia are, on average, very reliable; some studies have
argued that they compete with the Encyclopedia Britannica for accuracy. But
because anyone can edit Wikipedia, the integrity of an article can easily be compromised
by a malicious vandal or even a well-meaning but mistaken contributor. Users who turn
to the reference seeking an important answer to a specific question, therefore,
can never be sure what kind of authority lies behind any particular piece of
information.

De Alfaro's solution is a software tool that automatically evaluates and
indicates the trust value of each of Wikipedia's billions of words. "We
are trying to provide a simple visual guide that shows how reliable the
information on Wikipedia is," he says.

In late November, a live prototype of de Alfaro's system was launched with the
support of both CITRIS and the Wikimedia Foundation, the non-profit that runs
Wikipedia.

Underlying the program developed by de Alfaro's team is a reputation system that
grades Wikipedia authors. The system assigns, and updates, a numerical
value of reputation to each author. Authors start with a low value of
reputation, and they gain as they make contributions that are preserved by
subsequent authors. Authors can also lose reputation if the
changes they make to Wikipedia articles are undone.
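
The bookkeeping described above can be illustrated with a minimal sketch. It is not de Alfaro's actual algorithm: the starting value, the cap, and the sizes of the gain and loss are hypothetical numbers chosen only to make the example concrete, and the class and method names are invented for illustration.

```python
from dataclasses import dataclass, field

# All four constants are hypothetical; the article gives no actual values.
INITIAL_REPUTATION = 1.0   # authors start with a low reputation
MAX_REPUTATION = 10.0      # assumed upper bound
GAIN = 1.0                 # assumed reward when a contribution is preserved
LOSS = 2.0                 # assumed penalty when a contribution is undone


@dataclass
class ReputationTable:
    """Tracks one numerical reputation value per author."""
    scores: dict = field(default_factory=dict)

    def get(self, author: str) -> float:
        # Unknown authors are treated as newcomers with a low reputation.
        return self.scores.get(author, INITIAL_REPUTATION)

    def record_preserved(self, author: str) -> None:
        # A later author kept this author's contribution: reputation rises.
        self.scores[author] = min(MAX_REPUTATION, self.get(author) + GAIN)

    def record_undone(self, author: str) -> None:
        # A later author undid this author's change: reputation falls.
        self.scores[author] = max(0.0, self.get(author) - LOSS)
```

In this sketch, an author whose edits keep surviving climbs toward the cap, while one whose edits are repeatedly undone drops toward zero.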

"The reputation we compute for authors is a good predictor of future
behavior: authors with high reputations really do tend to make longer-lasting
contributions to Wikipedia," says de Alfaro.

The trustworthiness of Wikipedia text is computed on the basis of the edits it
has received over the course of time – an idea first suggested by a Stanford
group led by Richard Fikes and Deborah McGuinness. De Alfaro's system assigns
to each word a trust value derived from the reputation of the word's author
and from the reputations of all contributors who edited nearby text,
thereby lending it some of their own reputation.

"When people edit text and leave some sentences unchanged, they implicitly
vote for the correctness of the text they have left unchanged," explains
de Alfaro. "We are essentially automating the usual process of text
revision: for each piece of text, we take into account all the people who
revised it, giving more weight to people of higher reputation."
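
One way to picture this implicit voting is the sketch below, which is a simplified illustration and not the project's actual code. It assumes trust and reputation share the same numeric scale, treats every word an editor leaves in place as endorsed (the system described above considers nearby edits, which this sketch does not model), and uses two hypothetical constants, NEW_TEXT_FRACTION and LEND_RATE.

```python
from difflib import SequenceMatcher

# Hypothetical tuning constants, not taken from the article.
NEW_TEXT_FRACTION = 0.5  # new words start at only part of their author's reputation
LEND_RATE = 0.5          # how far an unchanged word moves toward the editor's reputation


def apply_revision(old, new_words, editor_reputation):
    """Return (word, trust) pairs for a new revision.

    old: list of (word, trust) pairs from the previous revision.
    new_words: list of words in the editor's revision.
    """
    old_words = [word for word, _ in old]
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Words left unchanged receive an implicit vote from the editor.
            for word, trust in old[i1:i2]:
                if editor_reputation > trust:
                    trust += LEND_RATE * (editor_reputation - trust)
                result.append((word, trust))
        elif tag in ("replace", "insert"):
            # New words never start at full trust, whatever the editor's reputation.
            for word in new_words[j1:j2]:
                result.append((word, NEW_TEXT_FRACTION * editor_reputation))
        # "delete": removed words simply drop out of the new revision.
    return result
```

With these assumed constants, a high-reputation editor raises the trust of the words they keep, while an edit by a low-reputation editor leaves existing trust untouched and adds low-trust text.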

"At the user level," says de Alfaro, "text on a clear background
is very likely to be accurate, whereas text on an orange background is less
certain. The darker the text highlighting, the less confident you can be about
the quality of the content."

"The things marked in the darkest orange are recent changes by
low-reputation authors. Recent changes by high-reputation authors are marked in
a mid-level of orange. Text gradually fades back to white as high-trust people
review and revise it," says de Alfaro.
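
The shading scheme can be sketched as a simple mapping from a word's trust value to a background color. This is an assumption-laden illustration: the linear fade, the particular shade of orange, and the 0 to 10 trust scale are hypothetical choices, not details taken from the demo.

```python
# Hypothetical endpoints of the color scale (RGB).
ORANGE = (255, 165, 0)    # assumed darkest shade, for the least trusted text
WHITE = (255, 255, 255)   # fully trusted text


def trust_to_background(trust: float, max_trust: float = 10.0) -> str:
    """Map a trust value to a hex background color between orange and white."""
    # Clamp to [0, 1] so out-of-range trust values still give a valid color.
    t = max(0.0, min(1.0, trust / max_trust))
    r, g, b = (round(lo + t * (hi - lo)) for lo, hi in zip(ORANGE, WHITE))
    return f"#{r:02x}{g:02x}{b:02x}"
```

Under these assumptions, a word with no trust gets the full orange background, and the color fades to white as the word's trust approaches the maximum.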

"One of the benefits of our system is that it becomes very difficult to
surreptitiously tamper with Wikipedia text," says de Alfaro.

The text of articles that have been stable for many revisions appears on a
mostly white background; many authors have lent their reputation to it.
Against this white background, any recent modification stands out clearly,
sending a red flag to users.

"Every recent change, no matter how high the reputation of the author, is
still reflected in some degree of orange shading. Nobody can
single-handedly create trusted (white-background) information: full trust, or a
white background, can only be achieved by collaboration and agreement."

"We made the program entirely data driven," says de Alfaro.
"Rankings are not calculated based on human evaluations of authors the way
they are on eBay, where buyers and sellers grade one another after each
transaction." De Alfaro believes that a data-driven approach can be
equally accurate and friendlier to contributors, who are not asked to rate each
other. In particular, de Alfaro hopes that the lack of personal rankings
will avoid ad hominem attacks and
help protect the open and collaborative Wikipedia culture.

The Wikimedia Foundation is still deciding whether to adopt the system as an
optional overlay for its live site. Even if it does, casual users who prefer to
browse without the filter would be able to turn it off and just view the latest
text, says de Alfaro.

There are good ethical and cultural reasons to avoid putting human judgment
into the mix when evaluating Wikipedia's text, but there is also a resource
advantage to applying a mechanical system to the entire site. In a couple of
days of processing time, de Alfaro's program can read and colorize the whole
Wikipedia site. Eventually, if it is adopted by Wikipedia and woven into its
interface, it could evaluate each edit as it is made.

Related sites:

Luca de Alfaro’s research site: http://trust.cse.ucsc.edu/

The demo on Wikipedia Trust is at: http://wiki-trust.cse.ucsc.edu/