2.3 Classification-Based Algorithms

Analogously to reducing ranking to regression, one can also consider reducing ranking to a classification problem. Classification is a kind of supervised learning problem in which the target variable that one is trying to predict is discrete. When formalizing ranking as classification, one regards the relevance degree given to a document as a category label. Here we introduce some representative algorithms in this sub-category.

2.3.1 Binary Classification for Ranking

There have been several works that study the use of a classification model for relevance ranking in information retrieval, such as [4, 9] and [16]. Here we take [16] and [9] as examples to illustrate the basic idea.

SVM-Based Method

Given documents $x = \{x_j\}_{j=1}^{m}$ and their binary relevance judgments $y = \{y_j\}_{j=1}^{m}$ associated with a query $q$, one regards all the relevant documents (i.e., $y_j = +1$) as positive examples and all the irrelevant documents (i.e., $y_j = -1$) as negative examples, and therefore formulates ranking as a binary classification problem.

In particular, Support Vector Machines (SVM) [21, 22] are used to perform the classification task in [16]. The formulation of SVM is as follows when applied to the scenario of ranking (here we suppose linear scoring functions are used, i.e., $f(w, x) = w^T x$),

$$
\begin{aligned}
\min \quad & \frac{1}{2}\|w\|^2 + \lambda \sum_{i=1}^{n} \sum_{j=1}^{m^{(i)}} \xi_j^{(i)} \\
\text{s.t.} \quad & w^T x_j^{(i)} \le -1 + \xi_j^{(i)}, \quad \text{if } y_j^{(i)} = -1, \\
& w^T x_j^{(i)} \ge 1 - \xi_j^{(i)}, \quad \text{if } y_j^{(i)} = +1, \\
& \xi_j^{(i)} \ge 0, \quad j = 1, \dots, m^{(i)},\ i = 1, \dots, n,
\end{aligned}
\tag{2.2}
$$
where $x_j^{(i)}$ and $y_j^{(i)}$ are the $j$th document and its label with regard to query $q_i$, and $m^{(i)}$ is the number of documents associated with query $q_i$.
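The formulation above can be sketched in code. The following is a minimal numpy-only illustration, not the implementation used in [16]: the function names `train_linear_svm` and `rank_documents` are hypothetical, the documents of all queries are pooled into one training set, and plain subgradient descent stands in for the quadratic-programming solver a real SVM package would use.

```python
import numpy as np

def train_linear_svm(X, y, lam=1.0, lr=0.01, epochs=200):
    """Approximately minimize (1/2)||w||^2 + lam * sum_j max(0, 1 - y_j * w^T x_j)
    by subgradient descent (a stand-in for a proper SVM solver).
    X: (m, d) document feature vectors; y: (m,) binary labels in {-1, +1}."""
    m, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)
        # Documents with margin < 1 contribute the subgradient -y_j * x_j
        # of the hinge term; the others contribute nothing.
        active = margins < 1
        grad = w - lam * (y[active, None] * X[active]).sum(axis=0)
        w -= lr * grad
    return w

def rank_documents(w, X):
    """Rank documents by descending score f(w, x) = w^T x."""
    return np.argsort(-(X @ w))
```

Once trained, the same scoring function $f(w, x) = w^T x$ that defines the classifier is reused to sort the documents of a query, which is exactly how the classification view of ranking produces a ranked list.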

The constraints in the above optimization problem correspond to whether each training document $x_j^{(i)}$ is classified into the right binary class. Actually the loss function behind the above optimization problem is a hinge loss defined on each document. For example, if the label $y_j^{(i)}$ is $+1$ and the model output $w^T x_j^{(i)}$ is no less than $+1$, then there is a zero loss for this document. Otherwise, the loss will be $\xi_j^{(i)}$, which is usually called the soft margin term. We plot the curve of the hinge loss in Fig. 2.2 for ease of understanding.
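The per-document hinge loss described above, $\max(0, 1 - y_j^{(i)} \cdot w^T x_j^{(i)})$, can be computed in a few lines; the helper name below is ours, not from the text:

```python
def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for one document with
    binary label y in {-1, +1} and model output score = w^T x."""
    return max(0.0, 1.0 - y * score)

# A correctly classified document outside the margin incurs zero loss;
# one inside the margin (or misclassified) pays the slack xi.
loss_outside = hinge_loss(+1, 1.5)   # zero loss
loss_inside = hinge_loss(+1, 0.3)    # positive slack
```

This matches the cases in the text: a positive document scored at or above $+1$ contributes nothing, while any shortfall from the margin appears as the slack $\xi_j^{(i)}$.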