Objectives
Semantic Analysis
Word Sense Disambiguation
Text Indexing in IR
Lexical Transfer in MT
Conceptual vector
Reminiscent of Vector Models (Salton, Sowa, LSI)
Applied to pre-selected concepts (not terms)
Concepts are not independent
Propagation
over the morpho-syntactic tree (no surface analysis)

Conceptual vectors
An idea
= a combination of concepts = a vector
The Idea space
= vector space
A concept
= an idea = a vector
= combination of itself + neighborhood
Sense space
= vector space + vector set

Conceptual vectors
Annotations
Help in building vectors
Can take the form of vectors
Set of k basic concepts: example
Larousse thesaurus = 873 concepts
A vector = an 873-tuple
Encoding for each dimension: C = 2^15

Vector construction
Concept vectors
H: thesaurus hierarchy
V(Ci) = <a1, …, aj, …, an>
aj = 1 / 2^Dum(H, i)
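The slide leaves Dum implicit; a minimal sketch, assuming Dum gives, for each dimension j, the distance in the hierarchy H between concept i and concept j, so that components decay geometrically with hierarchy distance. The helper hierarchy_distance and the toy 4-concept space are hypothetical:

```python
import numpy as np

def concept_vector(i, k, hierarchy_distance):
    """Build V(Ci): component a_j = 1 / 2^d, with d the assumed hierarchy
    distance between concept i and the concept of dimension j."""
    return np.array([1.0 / 2 ** hierarchy_distance(i, j) for j in range(k)])

# Toy hierarchy where distance is |i - j| (illustration only).
print(concept_vector(0, 4, lambda i, j: abs(i - j)))  # [1. 0.5 0.25 0.125]
```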

Vector construction
Concept vectors
C: mammals
L4: zoology, mammals, birds, fish, …
L3: animals, plants, living beings
L2: …, time, movement, matter, life, …
L1: society, mankind, the world

Vector construction
Concept vectors
Vector construction
Term vectors
Example: cat
Kernel
c:mammal, c:stroke
v(cat) = v(mammal) + v(stroke)
Augmented with weights
c:mammal, c:stroke, 0.75*c:zoology, 0.75*c:love, …
v(cat) = v(mammal) + v(stroke) + 0.75 v(zoology) + 0.75 v(love) + …
Iteration for neighborhood augmentation
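A hedged sketch of this kernel construction: the term vector is a weighted sum of concept vectors, which the neighborhood iteration then extends. The 3-dimensional concept vectors below are toy values, not the 873-dimensional Larousse ones:

```python
import numpy as np

# Toy 3-dimensional concept vectors (the real space has 873 dimensions).
concepts = {
    "mammal":  np.array([1.0, 0.2, 0.0]),
    "stroke":  np.array([0.0, 1.0, 0.3]),
    "zoology": np.array([0.8, 0.1, 0.1]),
    "love":    np.array([0.1, 0.9, 0.2]),
}

def term_vector(kernel):
    """Weighted sum of concept vectors, e.g. v(cat) = v(mammal) + v(stroke) + ..."""
    return sum(w * concepts[c] for c, w in kernel)

v_cat = term_vector([("mammal", 1.0), ("stroke", 1.0),
                     ("zoology", 0.75), ("love", 0.75)])
```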

Vector construction
Term vectors
Vector space
Basic concepts are not independent
Sense space
= generator space of a real k′ vector space (unknown)
= dim k′ ≤ k
Relative position of points

Conceptual vector distance
Angular distance DA(x, y) = angle(x, y)
0 ≤ DA(x, y) ≤ π
if 0, then collinear: same idea
if π/2, then nothing in common
if π, then DA(x, −x) with −x the anti-idea of x

Conceptual vector distance
Distance = acos(similarity)
DA(x, y) = acos(x·y / (|x| |y|))
DA(x, x) = 0
DA(x, y) = DA(y, x)
DA(x, y) + DA(y, z) ≥ DA(x, z)
DA(0, 0) = 0 and DA(x, 0) = π/2 by definition
DA(ax, by) = DA(x, y) with ab > 0
DA(ax, by) = π − DA(x, y) with ab < 0
DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
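A minimal implementation of DA, including the two conventions from the slide (DA(0,0) = 0 and DA(x,0) = π/2):

```python
import numpy as np

def DA(x, y):
    """Angular distance DA(x, y) = acos(x.y / (|x||y|)), in [0, pi]."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    if nx == 0.0 and ny == 0.0:
        return 0.0               # DA(0, 0) = 0 by definition
    if nx == 0.0 or ny == 0.0:
        return np.pi / 2         # DA(x, 0) = pi/2 by definition
    c = np.dot(x, y) / (nx * ny)
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

x, y = np.array([1.0, 0.0]), np.array([1.0, 1.0])
assert DA(x, x) == 0.0 and abs(DA(x, y) - np.pi / 4) < 1e-9
assert abs(DA(x, -x) - np.pi) < 1e-9   # -x is the "anti-idea" of x
```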

Conceptual vector distance
Example
DA(tit, tit) = 0
DA(tit, passerine) = 0.4
DA(tit, bird) = 0.7
DA(tit, train) = 1.14
DA(tit, insect) = 0.62
tit = a kind of insectivorous passerine …

Conceptual lexicon
Set of (word, vector) pairs = (w, v)*
Monosemy
word
→ 1 meaning
→ 1 vector
(w, v)

Conceptual lexicon
Polyseme building
Polysemy
word
→ n meanings
→ n vectors
{(w, v), (w.1, v.1), …, (w.n, v.n)}

Conceptual lexicon
 Polyseme building
v(w) = Σ v(w.i) = Σ v.i
bank:
bank.1: mound
bank.2: river border, …
bank.3: money institution
bank.4: organ keyboard
…

Conceptual lexicon
 Polyseme building
v(w) = classification(w.i)

Lexical scope
LS(w) = LSt(t(w))
LSt(t(w)) = 1 if t is a leaf
LSt(t(w)) = (LS(t1) + LS(t2)) / (2 − sin(D(t(w)))) otherwise
v(w) = vt(t(w))
vt(t(w)) = v(w) if t is a leaf
vt(t(w)) = LS(t1)·vt(t1) + LS(t2)·vt(t2) otherwise

Vector Statistics
Norm (N)
in [0, 1] · C (C = 2^15 = 32768)
Intensity (I)
I = Norm / C
Usually I = 1
Standard deviation (SD)
SD² = variance
variance = (1/n) Σ (xi − m)², with m the arithmetic mean

Vector Statistics
Variation coefficient  (CV)
CV = SD / mean
No unit; norm-independent
Pseudo conceptual strength
If A is a hypernym of B ⇒ CV(A) > CV(B)
(the converse ⇐ does not hold)
vector "fruit juice" (N)
mean = 527, SD = 973, CV = 1.88
vector "drink" (N)
mean = 443, SD = 1014, CV = 2.28
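These statistics are direct to compute; a sketch with toy component values (C = 2^15 as on the slides):

```python
import numpy as np

C = 2 ** 15  # encoding bound per dimension (32768)

def statistics(v):
    mean, sd = v.mean(), v.std()   # variance = 1/n * sum((xi - mean)^2)
    return {"norm": np.linalg.norm(v),
            "intensity": np.linalg.norm(v) / C,   # I = Norm / C
            "sd": sd,
            "cv": sd / mean}       # variation coefficient: unit-free

print(statistics(np.array([527.0, 100.0, 1500.0, 30.0])))  # toy components
```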

Vector operations
Sum
V = X + Y ⇒ vi = xi + yi
Neutral element: 0
Generalized to n terms: V = Σ Vi
Normalization of the sum: vi ← vi / |V| · C

Vector operations
Term-to-term product
V = X ⊗ Y ⇒ vi = xi · yi
Neutral element: 1
Generalized to n terms: V = Π Vi

Vector operations
Amplification
V = X^n ⇒ vi = sg(xi) · |xi|^n
√V = V^(1/2) and ⁿ√V = V^(1/n)
V ⊗ V = V^2 if ∀i, vi ≥ 0
Normalization of the term-to-term product of n terms:
V = ⁿ√(Π Vi)

Vector operations
Product + sum
V = (X ⊗ Y) + X + Y
Generalized to n terms: V = ⁿ√(Π Vi) + Σ Vi
Simplest query-vector computation in IR

Vector operations
Subtraction
V = X − Y ⇒ vi = xi − yi
Dot subtraction
V = X ∸ Y ⇒ vi = max(xi − yi, 0)
Complement
V = C(X) ⇒ vi = (1 − xi/C) · C
 etc.
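A compact sketch of the component-wise operations from the preceding slides (sum, term-to-term product, amplification, subtraction, complement), with C = 2^15:

```python
import numpy as np

C = 2 ** 15

def vsum(*vs):          return np.sum(vs, axis=0)              # V = sum(Vi)
def normalize(v):       return v / np.linalg.norm(v) * C       # vi / |V| * C
def ttm(*vs):           return np.prod(vs, axis=0)             # term-to-term product
def amplify(v, n):      return np.sign(v) * np.abs(v) ** n     # V^n, sign-preserving
def ttm_norm(*vs):      return amplify(ttm(*vs), 1 / len(vs))  # nth root of the product
def prod_plus_sum(*vs): return ttm_norm(*vs) + vsum(*vs)       # generalized (X (x) Y) + X + Y
def subtract(x, y):     return x - y
def dot_subtract(x, y): return np.maximum(x - y, 0)            # vi = max(xi - yi, 0)
def complement(x):      return (1 - x / C) * C                 # C(X)
```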

Intensity Distance
Intensity of the normalized term-to-term product
0 ≤ I(√(X ⊗ Y)) ≤ 1 if |X| = |Y| = 1
DI(X, Y) = acos(I(√(X ⊗ Y)))
DI(X, X) = 0 and DI(X, 0) = π/2
DI(tit, tit) = 0 (DA = 0)
DI(tit, passerine) = 0.25 (DA = 0.4)
DI(tit, bird) = 0.58 (DA = 0.7)
DI(tit, train) = 0.89 (DA = 1.14)
DI(tit, insect) = 0.50 (DA = 0.62)
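A sketch of DI under the slide's assumption |X| = |Y| = 1 (plus nonnegative components, so the square root is real); note that ‖√(X ⊗ Y)‖² = X·Y, so DI reduces to acos(√(X·Y)):

```python
import numpy as np

def DI(x, y):
    """Intensity distance: acos of the intensity of sqrt(X (x) Y).
    Assumes unit norms and nonnegative components."""
    return float(np.arccos(np.clip(np.linalg.norm(np.sqrt(x * y)), 0.0, 1.0)))

x, y = np.array([0.6, 0.8]), np.array([0.8, 0.6])
print(DI(x, x), DI(x, y))   # 0.0, then acos(sqrt(0.96)) ~= 0.20
```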

Relative synonymy
SynR(A, B, C): C as reference feature
SynR(A, B, C) = DA(A ⊗ C, B ⊗ C)
DA(coal,night) = 0.9
SynR(coal, night, color) = 0.4
SynR(coal, night, black) = 0.35

Relative synonymy
SynR(A, B, C) = SynR(B, A, C)
SynR(A, A, C) = DA(A ⊗ C, A ⊗ C) = 0
SynR(A, B, 0) = DA(0, 0) = 0
SynR(A, 0, C) = π/2
Absolute synonymy: SynA(A, B) = SynR(A, B, 1)
= DA(A ⊗ 1, B ⊗ 1)
= DA(A, B)
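A sketch of relative synonymy, with the DA conventions from the earlier sketch inlined:

```python
import numpy as np

def DA(x, y):
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return np.pi / 2
    return float(np.arccos(np.clip(np.dot(x, y) / (nx * ny), -1.0, 1.0)))

def syn_r(a, b, c):
    """SynR(A, B, C) = DA(A (x) C, B (x) C): C acts as reference feature."""
    return DA(a * c, b * c)

one = np.ones(3)                       # neutral element of the ttm product
a, b = np.array([1.0, 0.2, 0.0]), np.array([0.9, 0.4, 0.1])
assert syn_r(a, b, one) == DA(a, b)    # SynA(A, B) = SynR(A, B, 1)
```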

Subjective synonymy
SynS(A, B, C): C as point of view
SynS(A, B, C) = DA(C − A, C − B)
0 ≤ SynS(A, B, C) ≤ π
Normalization:
0 ≤ asin(sin(SynS(A, B, C))) ≤ π/2

Subjective synonymy
When |C| → ∞, SynS(A, B, C) → 0
SynS(A, B, 0) = DA(−B, −A) = DA(A, B)
SynS(A, A, C) = DA(C − A, C − A) = 0
SynS(A, B, B) = SynS(A, B, A) = 0
SynS(tit, swallow, animal) = 0.3
SynS(tit, swallow, bird) = 0.4
SynS(tit, swallow, passerine) = 1
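A sketch of subjective synonymy and its [0, π/2] normalization; the loop illustrates the limit behavior as |C| grows:

```python
import numpy as np

def DA(x, y):    # angular distance; nonzero vectors assumed here
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def syn_s(a, b, c):
    """SynS(A, B, C) = DA(C - A, C - B): C acts as point of view."""
    return DA(c - a, c - b)

def syn_s_normalized(a, b, c):
    return float(np.arcsin(np.sin(syn_s(a, b, c))))  # folds into [0, pi/2]

# As |C| grows, both difference vectors align and SynS tends to 0.
a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
for s in (1.0, 10.0, 100.0):
    print(syn_s(a, b, s * np.ones(2)))
```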

Semantic analysis
Vectors propagate over the syntactic tree

Semantic analysis
Initialization - attach vectors to nodes

Semantic analysis
Propagation (up)

Semantic analysis
Back propagation (down)
v(Nij) = (v(Nij) ⊗ v(Ni)) + v(Nij)
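A one-line sketch of this down-propagation step, with ⊗ taken component-wise as defined earlier:

```python
import numpy as np

def back_propagate(v_child, v_parent):
    """v(Nij) <- (v(Nij) (x) v(Ni)) + v(Nij): the parent's contextual vector
    re-weights the child's components while keeping the child's own vector."""
    return v_child * v_parent + v_child
```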

Semantic analysis
Sense selection or sorting

Sense selection
Recursive descent
on t(w) as a decision tree
DA(v′, vi)
Stop on a leaf
Stop on an internal node

Vector syntactic schemas
S: NP(ART, N)
→ v(NP) = v(N)
S: NP1(NP2, N)
→ v(NP1) = α·v(NP2) + v(N), with 0 < α < 1
v(sail boat) = 1/2 v(sail) + v(boat)
v(boat sail) = 1/2 v(boat) + v(sail)

Vector syntactic schemas
Not necessarily linear
S: GA(GADV(ADV), ADJ)
→ v(GA) = v(ADJ)^p(ADV)
p(very) = 2
p(mildly) = 1/2
v(very happy) = v(happy)^2
v(mildly happy) = v(happy)^(1/2)
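A sketch of this non-linear adverb schema via sign-preserving amplification (toy adjective vector; the p values are the ones on the slide):

```python
import numpy as np

P = {"very": 2.0, "mildly": 0.5}   # adverb powers

def adjective_group(v_adj, adverb):
    """v(GA) = v(ADJ)^p(ADV), applied component-wise, sign-preserving."""
    return np.sign(v_adj) * np.abs(v_adj) ** P[adverb]

v_happy = np.array([0.9, 0.1, 0.4])
print(adjective_group(v_happy, "very"))     # squaring sharpens the profile
print(adjective_group(v_happy, "mildly"))   # square root flattens it
```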

Iteration & convergence
Iteration with convergence
Local
D(vi, vi+1) ≤ ε for the top node
Global
D(vi, vi+1) ≤ ε for all nodes

Lexicon construction
Manual kernel
Automatic definition analysis
Global infinite loop = learning
Manual adjustments

Application
Machine translation
Lexical transfer
v(source) → v(target)
k-NN search minimizing DA(v(source), v(target))
Submeaning selection
Direct
Transformation matrix

Application
Information Retrieval on Texts
Textual document indexing
Language dependent
Retrieval
Language independent: multilingual
Domain representation
horse ⊂ equitation
Granularity
Documents, paragraphs, etc.

Application
Information Retrieval on Texts
Index = lexicon = (di, vi)*
k-NN search minimizing DA(v(r), v(di))
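A minimal sketch of this retrieval step over a toy two-document index, reusing DA:

```python
import numpy as np

def DA(x, y):
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def knn(v_request, index, k=5):
    """Return the k documents minimizing DA(v(r), v(di))."""
    return sorted(index, key=lambda d: DA(v_request, index[d]))[:k]

index = {"doc1": np.array([1.0, 0.1]), "doc2": np.array([0.1, 1.0])}
print(knn(np.array([0.9, 0.2]), index, k=1))   # ['doc1']
```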

Search engine
Distance adjustments
Min DA(v(r), v(di)) may pose problems
Especially with small documents
Correlation between CV and conceptual richness
Pathological cases
"plane" vs. "plane plane plane plane …"
"inundation" vs. "blood": D = 0.85 (liquid)

Search engine
Distance adjustments
Correction with relative intensity
Query vs. retrieved document (vr and vd)
D = √(DA(vr, vd) · DI(vr, vd))
0 ≤ I(vr, vd) ≤ 1 → 0 ≤ DI(vr, vd) ≤ π/2
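A sketch of the corrected distance (unit-norm, nonnegative vectors assumed for DI, as before):

```python
import numpy as np

def DA(x, y):
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def DI(x, y):   # unit norms and nonnegative components assumed
    return float(np.arccos(np.clip(np.linalg.norm(np.sqrt(x * y)), 0.0, 1.0)))

def corrected_distance(vr, vd):
    """D = sqrt(DA(vr, vd) * DI(vr, vd)): the intensity term penalizes
    matches that are angularly close but conceptually thin."""
    return float(np.sqrt(DA(vr, vd) * DI(vr, vd)))
```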

Conclusion
Approach
statistical (but not probabilistic)
thema (and rhema?)
Combination of
Symbolic methods (AI)
Transformational systems
Similarity
Neural nets
With large dimension (> 50,000?)

Conclusion
Self-evaluation
Vector quality
Tests against corpora
Unknown words
Proper names of persons, products, etc.
Lionel Jospin, Danone, Air France
Automatic learning
Badly handled phenomena?
Negation & lexical functions (Mel'čuk)
