Learning to Rank
Pradeep Ravikumar
CS 371R
Adapted from Hang Li, MSR Asia, '09
Ranking
- Which do you prefer: Coke, Pepsi, or Sprite?
- Ranking chess players, golf players, etc.

Whither Machine Learning?
- Input: a list of objects (Coke, Pepsi, Sprite)
- A machine learning "magic box"
- Output: a ranking of the objects (Coke > Sprite > Pepsi), learned from "training" data
Ranking
- Offerings (objects): O = {o_1, o_2, ..., o_N}
- Request (query): s
- Ranking function: f(s, o)
- Ranking of offerings: o_{s,1}, o_{s,2}, ..., o_{s,n_s}, ordered by f(s, o)
Ranking via Machine Learning (Learning to Rank)
- "Training" data: m requests, each with its objects already ranked:
  (s_1: o_{1,1}, o_{1,2}, ..., o_{1,n_1}), ..., (s_m: o_{m,1}, o_{m,2}, ..., o_{m,n_m})
- Learning System: learns a ranking function f(s, o) from the training data
- Ranking System: given a new request s and objects O = {o_1, o_2, ..., o_N}, ranks them by f(s, o)
Is learning to rank necessary for IR?
- Has been successfully used in web-document search
- Has been the focus of increasing research at the intersection of machine learning and IR
Example: Machine Translation
- Input: sentence f in the source language
- A generative model produces ranked sentence candidates e_1, e_2, ..., e_1000 in the target language
- A re-ranking model outputs the re-ranked top sentence ẽ in the target language
Example: Collaborative Filtering

        Item1  Item2  Item3  ...
User1     5      4      ?
User2     1      2      2
...
UserM     4      ?      3

- Input: user ratings on some items
- Output: user ratings on the other items (the "?" cells)
- Any user is a query; based on this query, rank the set of items (based on user-item features)
Example: Key Phrase Extraction
- Input: a document
- Output: key phrases in the document
- Pre-processing step: extraction of candidate phrases from the document
- Ranking approach: rank the extracted phrases
Other IR applications of learning to rank
- Question Answering
- Document Summarization
- Opinion Mining
- Sentiment Analysis
- Machine Translation
- Sentence Parsing
- ...
Our focus: ranking documents
- Documents: D = {d_1, d_2, ..., d_N}
- Query: q
- Ranking function: f(q, d)
- Ranking of documents: d_{q,1}, d_{q,2}, ..., d_{q,n_q}, based on relevance, importance, preference

Traditional approach to "learning to rank": Probabilistic Model
- Rank each document by its probability of relevance to the query:
  d_1 ~ P(r | q, d_1), d_2 ~ P(r | q, d_2), ..., d_n ~ P(r | q, d_n)
- Relevance grades: r ∈ {1, ..., R}

Modern approach to learning to rank
- Documents: D = {d_1, d_2, ..., d_N}
- Training data: m queries, each with its documents already ranked:
  (q_1: d_{1,1}, d_{1,2}, ..., d_{1,n_1}), ..., (q_m: d_{m,1}, d_{m,2}, ..., d_{m,n_m})
- Learning System: learns a ranking function f(q, d) from the training data
- Ranking System: given a new query q and documents d_1, d_2, ..., d_N, ranks them by f(q, d)
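As a concrete sketch of this framework, the pipeline below trains a tiny pointwise linear scorer as the "Learning System" and then ranks a new query's documents with it as the "Ranking System". The features, toy data, and helper names (`learn`, `rank`) are illustrative assumptions, not from the slides:

```python
# Minimal learning-to-rank pipeline sketch (illustrative, not the slides' method).
# "Learning System": fit a linear scorer f(x) = w . x on labeled feature vectors.
# "Ranking System": score and sort a new query's documents by f.

def learn(X, y, lr=0.01, epochs=500):
    """Least-squares fit of w via per-example gradient descent (pointwise)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w

def rank(w, docs):
    """Return document ids sorted by descending score f(q, d) = w . x."""
    score = lambda x: sum(wj * xj for wj, xj in zip(w, x))
    return [d for d, _ in sorted(docs.items(), key=lambda kv: -score(kv[1]))]

# Toy training data: feature vectors (e.g., [BM25, PageRank]) with graded labels.
X = [[3.0, 0.5], [1.0, 0.2], [0.2, 0.1]]
y = [2.0, 1.0, 0.0]
w = learn(X, y)

# A new query's documents, ranked by the learned f.
docs = {"docA": [0.1, 0.1], "docB": [2.5, 0.4]}
print(rank(w, docs))  # docB outranks docA
```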
Learning to Rank: Training Process
1. Data Labeling (rank): for each training query q_i, its documents d_{i,1}, ..., d_{i,n_i} are labeled with ranks y_{i,1}, ..., y_{i,n_i}
2. Feature Extraction: each (query, document) pair is mapped to a feature vector x_{i,j}; features are functions of the query AND the document
3. Learning: a ranking function f(x) is learned from the labeled feature vectors

q_1: (d_{1,1}, y_{1,1}) -> (x_{1,1}, y_{1,1}), ..., (d_{1,n_1}, y_{1,n_1}) -> (x_{1,n_1}, y_{1,n_1})
...
q_m: (d_{m,1}, y_{m,1}) -> (x_{m,1}, y_{m,1}), ..., (d_{m,n_m}, y_{m,n_m}) -> (x_{m,n_m}, y_{m,n_m})
Learning to Rank: Testing Process
1. Data Labeling (rank): for a held-out query q_{m+1}, its documents d_{m+1,1}, ..., d_{m+1,n_{m+1}} are labeled with ground-truth ranks y_{m+1,1}, ..., y_{m+1,n_{m+1}}
2. Feature Extraction: each (query, document) pair is mapped to a feature vector x_{m+1,j}
3. Ranking with f(x): the documents are sorted by their scores f(x_{m+1,1}), f(x_{m+1,2}), ..., f(x_{m+1,n_{m+1}})
4. Evaluation: the predicted ranking is compared against the ground-truth ranks
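Steps 3 and 4 can be sketched as follows; the scorer `f` and the toy test documents are hypothetical stand-ins for a learned model and a labeled held-out query:

```python
# Testing-process sketch: apply a (hypothetical, already-learned) scorer f
# to a test query's feature vectors and sort (step 3), then compare the
# predicted order against the ground-truth ranks (step 4).

def f(x):
    # Stand-in for the learned ranking function; here a fixed linear scorer.
    return 2.0 * x[0] + 0.5 * x[1]

# Test query q_{m+1}: (feature vector, ground-truth relevance grade) per doc.
test_docs = {
    "d1": ([0.9, 0.3], 3),   # most relevant
    "d2": ([0.4, 0.8], 2),
    "d3": ([0.1, 0.2], 1),   # least relevant
}

predicted = sorted(test_docs, key=lambda d: -f(test_docs[d][0]))
truth = sorted(test_docs, key=lambda d: -test_docs[d][1])
print("predicted:", predicted)
print("truth:    ", truth)
```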
Issues in learning to rank
- Data Labeling
- Feature Extraction
- Evaluation Measure
- Learning Method (model, loss function, algorithm)
Data Labeling
- E.g., relevance of documents (Doc A, Doc B, Doc C) w.r.t. a query
- Ground truth/supervision: "ranks" of a set of documents/objects
Data Labeling: Supervision takes different forms
- Multiple levels (e.g., relevant, partially relevant, irrelevant)
  ‣ Widely used in IR
- Ordered pairs
  ‣ e.g. ordered pairs between documents (A > B, B > C)
  ‣ e.g. implicit relevance judgments derived from click-through data
- Fully ordered list (i.e. a permutation) of documents
Data Labeling: Implicit Relevance Judgments
- The search system showed the ranking Doc A, Doc B, Doc C, but users often clicked on Doc B
- This yields the ordered pair B > A

Feature Extraction
- Query-document features (e.g., BM25) and document features (e.g., PageRank)
- Each of Doc A, Doc B, Doc C gets a feature vector (BM25, PageRank, ...)
Feature Extraction: Example Features
- BM25 (a classical ranking score function)
- Proximity/distance measures between query and document
- Whether the query occurs exactly in the document
- PageRank
- ...
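The BM25 feature mentioned above can be sketched as below; the `k1` and `b` values are common defaults, and the toy collection is invented for illustration:

```python
import math

def bm25(query_terms, doc_terms, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query, usable as a ranking
    feature. docs: the full collection (list of term lists), needed for
    document frequencies and the average document length."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in docs if t in d)                # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)  # smoothed IDF
        tf = doc_terms.count(t)                            # term frequency
        denom = tf + k1 * (1.0 - b + b * len(doc_terms) / avgdl)
        score += idf * tf * (k1 + 1.0) / denom
    return score

docs = [["learning", "to", "rank"],
        ["rank", "documents", "by", "relevance"],
        ["collaborative", "filtering"]]
print(bm25(["rank"], docs[0], docs))  # matching doc gets a positive score
print(bm25(["rank"], docs[2], docs))  # non-matching doc scores 0.0
```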
Evaluation Measures
- Allow us to quantify how good the predicted ranking is when compared with the ground truth
- Typical evaluation metrics care more about ranking the top results correctly
- Popular measures:
  ‣ NDCG (Normalized Discounted Cumulative Gain)
  ‣ MAP (Mean Average Precision)
  ‣ MRR (Mean Reciprocal Rank)
  ‣ WTA (Winner Takes All)
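As one concrete measure, NDCG can be sketched as follows, using the common `2^rel - 1` gain and `log2` rank discount (the relevance-grade lists are illustrative):

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(ranked_gains):
    """NDCG: DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 1]))  # ideal ordering -> 1.0
print(ndcg([1, 2, 3]))  # worst ordering -> below 1.0
```

The log discount is what makes NDCG care more about the top of the ranking: a misplaced relevant document costs much more at rank 1 than at rank 10.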
Kendall's Tau
- Compares the agreement between two permutations of n items:
  τ = (P − Q) / (n(n−1)/2)
  where P is the number of concordant pairs (ordered the same way in both permutations), Q is the number of discordant pairs, and n(n−1)/2 is the total number of pairs
- τ ∈ [−1, 1]: τ = 1 when the permutations completely agree; τ = −1 when they completely disagree
- Example: comparing the ranking C A B against A B C (n = 3, so 3 pairs): 1 concordant pair (A before B) and 2 discordant pairs, so τ = (1 − 2)/3 = −1/3
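The formula and the example above can be checked with a small sketch (a naive O(n^2) pair count; the helper name `kendall_tau` is hypothetical):

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two permutations of the same items."""
    pos_a = {x: i for i, x in enumerate(rank_a)}
    pos_b = {x: i for i, x in enumerate(rank_b)}
    n = len(rank_a)
    p = q = 0  # concordant / discordant pair counts
    for x, y in combinations(rank_a, 2):
        # Concordant if x and y appear in the same relative order in both.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            p += 1
        else:
            q += 1
    return (p - q) / (n * (n - 1) / 2)

print(kendall_tau(["C", "A", "B"], ["A", "B", "C"]))  # -1/3, as in the example
print(kendall_tau(["A", "B", "C"], ["A", "B", "C"]))  # 1.0, complete agreement
```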
Learning to Rank Methods
- Three major approaches:
  ‣ Pointwise approach
  ‣ Pairwise approach
  ‣ Listwise approach
Pointwise Approach
- Transforms ranking into regression, classification, or ordinal regression
- The query-document group structure is ignored

Pointwise Approach: Learning
- Input: feature vector x
- Output: a real number y = f(x) (regression), a category y = sign(f(x)) (classification), or an ordered category y = threshold(f(x)) (ordinal regression)
- Model: ranking function f(x)
- Loss function: regression loss, classification loss, or ordinal regression loss
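The three output types above can be sketched with one shared scorer; `f` and the threshold values here are illustrative assumptions:

```python
# Pointwise outputs sketch: the same ranking function f(x) feeds regression,
# classification, and ordinal regression (f and the thresholds are made up).

def f(x):
    return sum(x)  # stand-in linear ranking function

def regression_output(x):
    return f(x)                     # real number y = f(x)

def classification_output(x):
    return 1 if f(x) > 0 else -1    # category y = sign(f(x))

def ordinal_output(x, thresholds=(0.5, 1.5)):
    # Ordered category y = threshold(f(x)): count thresholds the score clears,
    # mapping it into the ordered categories 0 < 1 < 2.
    s = f(x)
    return sum(s > t for t in thresholds)

x = [0.7, 0.6]
print(regression_output(x), classification_output(x), ordinal_output(x))
```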
Pairwise Approach
- Transforms ranking into pairwise classification
- The query-document group structure is ignored

Pairwise Approach: Learning
- Input: an ordered feature-vector pair (x_i, x_j)
- Output: a classification of the order of the pair: y_{i,j} = sign(f(x_i) − f(x_j))
- Model: ranking function f(x)
- Loss function: pairwise classification loss

Pairwise Ranking: Converting a document list to document pairs
- A query with Doc A (rank 1), Doc B (rank 2), Doc C (rank 3) yields the ordered pairs A > B, A > C, B > C

Transformed Pairwise Classification Problem (with a scoring model f(x; w))
- Positive examples (+1): the correctly ordered pairs (x_1, x_3), (x_2, x_3), (x_1, x_2)
- Negative examples (−1): the reversed pairs (x_3, x_1), (x_3, x_2), (x_2, x_1)
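The list-to-pairs conversion above can be sketched as follows; the feature vectors are invented, the +1/−1 labels follow the positive/negative examples on the slide, and the helper name `to_pairs` is hypothetical:

```python
from itertools import combinations

def to_pairs(docs):
    """Turn one query's ranked document list into pairwise examples.
    docs: list of (feature_vector, rank) with rank 1 = best.
    Yields ((x_better, x_worse), +1) and the mirrored pair with label -1."""
    examples = []
    for (xi, ri), (xj, rj) in combinations(docs, 2):
        if ri == rj:
            continue  # no preference between equally ranked documents
        better, worse = (xi, xj) if ri < rj else (xj, xi)
        examples.append(((better, worse), +1))
        examples.append(((worse, better), -1))
    return examples

docs = [([1.0, 0.2], 1),   # x1, rank 1 (best)
        ([0.6, 0.1], 2),   # x2, rank 2
        ([0.1, 0.0], 3)]   # x3, rank 3
pairs = to_pairs(docs)
print(len(pairs))  # 3 ordered pairs, each as one positive and one negative
```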
Listwise Approach
- Takes the entire list of documents as the input instance
- The query-document group structure is used
- Straightforwardly represents the learning-to-rank problem
  ‣ Many state-of-the-art methods: ListNet, ListMLE, etc.

Listwise Approach: Learning & Ranking
- Input: feature vectors {x_i}_{i=1}^n
- Output: a permutation of the feature vectors: sort({f(x_i)}_{i=1}^n)
- Model: ranking function f(x)
- Loss function: listwise loss (based on a ranking evaluation measure)
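A listwise loss in the spirit of ListNet's top-one cross entropy can be sketched as below; this is a simplified illustration with invented scores and labels, not the exact algorithm from the ListNet paper:

```python
import math

def softmax(zs):
    """Numerically stable softmax over a list of scores."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def listwise_loss(scores, labels):
    """Cross entropy between the top-one probability distributions induced
    by the ground-truth labels and by the model scores f(x_i)."""
    p_true = softmax(labels)
    p_model = softmax(scores)
    return -sum(pt * math.log(pm) for pt, pm in zip(p_true, p_model))

labels = [3.0, 1.0, 0.0]
good = [2.5, 0.8, 0.1]   # scores ordered like the labels -> lower loss
bad = [0.1, 0.8, 2.5]    # scores in the reverse order -> higher loss
print(listwise_loss(good, labels) < listwise_loss(bad, labels))
```

Because the loss is defined over the whole score list at once, it uses the query-document group structure that the pointwise and pairwise approaches discard.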