Objectives
Semantic Analysis
Word Sense Disambiguation
Text Indexing in IR
Lexical Transfer in MT
Conceptual vectors
Reminiscent of vector models (Salton, Sowa, LSI)
Applied to pre-selected concepts (not terms)
Concepts are not independent
Propagation
on the morpho-syntactic tree (no surface analysis)
Conceptual vectors
An idea
= a combination of concepts = a vector
The idea space
= vector space
A concept
= an idea = a vector
= combination of itself + neighborhood
Sense space
= vector space + vector set
Conceptual vectors
Annotations
Help building vectors
Can take the form of vectors
Set of k basic concepts: example
Thesaurus Larousse = 873 concepts
A vector = an 873-tuple
Encoding for each dimension: C = 2^15
Vector construction: Concept vectors
H: thesaurus hierarchy
V(Ci): <a1, …, ai, …, an>
aj = 1 / 2^Dum(H, i)
Vector construction: Concept vectors
C: mammals
L4: zoology, mammals, birds, fish, …
L3: animals, plants, living beings
L2: …, time, movement, matter, life, …
L1: the society, the mankind, the world
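A minimal sketch of this construction, assuming a toy hierarchy in place of the 873-concept Larousse thesaurus (concept names and tree shape below are illustrative only): each basis component aj decays as 1/2^d with the hierarchy distance d to the target concept.

```python
import math

# Toy hierarchy (hypothetical fragment, NOT the Larousse thesaurus): node -> parent.
parent = {
    "the world": None,
    "living beings": "the world",
    "zoology": "living beings",
    "mammals": "zoology",
    "birds": "zoology",
}

def ancestors(h, a):
    """Chain from a up to the root, a itself first."""
    chain = []
    while a is not None:
        chain.append(a)
        a = h[a]
    return chain

def tree_distance(h, a, b):
    """Number of edges between a and b through their lowest common ancestor."""
    up_a, up_b = ancestors(h, a), ancestors(h, b)
    for i, node in enumerate(up_a):
        if node in up_b:
            return i + up_b.index(node)
    return math.inf

def concept_vector(h, basis, c):
    """V(c): component a_j = 1 / 2**d(j, c), maximal on c, decaying with distance."""
    return [1.0 / 2 ** tree_distance(h, cj, c) for cj in basis]

basis = list(parent)
v = concept_vector(parent, basis, "mammals")
```

The vector for "mammals" peaks on its own dimension and fades on siblings and ancestors, matching the 1/2^d encoding above.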
Vector construction: Term vectors
Example: cat
Kernel
c:mammal, c:stroke
n(cat) = n(mammal) + n(stroke)
Augmented with weights
c:mammal, c:stroke, 0.75*c:zoology, 0.75*c:love, …
n(cat) = n(mammal) + n(stroke) + 0.75 n(zoology) + 0.75 n(love) + …
Iteration for neighborhood augmentation
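A sketch of the kernel-plus-weighted-neighborhood sum, with toy 4-dimensional concept vectors (all values below are illustrative assumptions, not real lexicon entries):

```python
def weighted_sum(weighted_vectors):
    """n(term) = sum of weight * V(concept) over the kernel concepts."""
    dim = len(weighted_vectors[0][1])
    out = [0.0] * dim
    for w, v in weighted_vectors:
        for i, x in enumerate(v):
            out[i] += w * x
    return out

# Hypothetical concept vectors over a 4-concept basis
# (mammal, stroke, zoology, love); values are made up for the example.
v_mammal  = [1.0, 0.0, 0.5, 0.0]
v_stroke  = [0.0, 1.0, 0.0, 0.4]
v_zoology = [0.5, 0.0, 1.0, 0.0]
v_love    = [0.0, 0.4, 0.0, 1.0]

# Kernel n(mammal) + n(stroke), then augmented with 0.75-weighted neighbours.
cat = weighted_sum([(1.0, v_mammal), (1.0, v_stroke),
                    (0.75, v_zoology), (0.75, v_love)])
```

Iterating the same sum over the neighborhood of the newly added concepts gives the augmentation loop mentioned on the slide.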
Vector construction Term vectors |
Vector space
Basic concepts are not independent
Sense space
= generator space of a real k′ vector space (unknown)
dim k′ ≤ k
Relative position of points
Conceptual vector distance
Angular distance DA(x, y) = angle(x, y)
0 ≤ DA(x, y) ≤ π
if 0 then collinear: same idea
if π/2 then nothing in common
if π then DA(x, -x), with -x the anti-idea of x
Conceptual vector distance
Distance = acos(similarity)
DA(x, y) = acos(x·y / (|x| |y|))
DA(x, x) = 0
DA(x, y) = DA(y, x)
DA(x, y) + DA(y, z) ≥ DA(x, z)
DA(0, 0) = 0 and DA(x, 0) = π/2 by definition
DA(ax, by) = DA(x, y) with ab > 0
DA(ax, by) = π - DA(x, y) with ab < 0
DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
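The definition above, including the two conventions for null vectors, can be sketched directly:

```python
import math

def angular_distance(x, y):
    """DA(x, y) = acos(x.y / (|x| |y|)), with the slide's conventions
    DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding
```

The clamp makes the acos robust to floating-point rounding; scaling either argument by a positive factor leaves the distance unchanged, as stated above.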
Conceptual vector distance
Example
DA(tit, tit) = 0
DA(tit, passerine) = 0.4
DA(tit, bird) = 0.7
DA(tit, train) = 1.14
DA(tit, insect) = 0.62
tit = a kind of insectivorous passerine …
Conceptual lexicon
Set of (word, vector) pairs = (w, n)*
Monosemy
word
→ 1 meaning
→ 1 vector
(w, n)
Conceptual lexicon: Polyseme building
Polysemy
word
→ n meanings
→ n vectors
{(w, n), (w.1, n1), …, (w.n, nn)}
Conceptual lexicon: Polyseme building
n(w) = Σ n(w.i) = Σ n.i
bank:
bank.1: mound
bank.2: river border, …
bank.3: money institution
bank.4: organ keyboard
…
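A sketch of the sum n(w) = Σ n(w.i), with toy 3-dimensional sense vectors (values are illustrative assumptions, not real lexicon data):

```python
def polyseme_vector(sense_vectors):
    """n(w) = sum_i n(w.i): a polysemous word's vector is the sum of its sense vectors."""
    return [sum(components) for components in zip(*sense_vectors)]

# Toy sense vectors for "bank"; the three dimensions are arbitrary.
bank = polyseme_vector([
    [1.0, 0.0, 0.0],  # bank.1: mound
    [0.8, 0.6, 0.0],  # bank.2: river border
    [0.0, 0.0, 1.0],  # bank.3: money institution
])
```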
Conceptual lexicon: Polyseme building
n(w) = classification(w.i)
Lexical scope
LS(w) = LSt(t(w))
LSt(t) = 1 if t is a leaf
LSt(t) = (LS(t1) + LS(t2)) / (2 - sin(D(t1, t2))) otherwise
n(w) = nt(t(w))
nt(t) = n(w) if t is a leaf
nt(t) = LS(t1) nt(t1) + LS(t2) nt(t2) otherwise
Vector statistics
Norm (N)
components in [0, 1] × C (C = 2^15 = 32768)
Intensity (I)
I = Norm / C
Usually I = 1
Standard deviation (SD)
SD^2 = variance
variance = 1/n Σ (xi - m)^2, with m the arithmetic mean
Vector statistics
Variation coefficient (CV)
CV = SD / mean
No unit: norm independent
Pseudo conceptual strength
If A is a hyperonym of B ⇒ CV(A) > CV(B)
(the converse does not hold)
vector "fruit juice" (N)
mean = 527, SD = 973, CV = 1.88
vector "drink" (N)
mean = 443, SD = 1014, CV = 2.28
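A sketch of the variation coefficient, on a toy vector rather than the real "drink"/"fruit juice" vectors; it also checks the norm-independence claimed above (scaling the vector leaves CV unchanged):

```python
import math

def variation_coefficient(v):
    """CV = SD / mean, with variance = 1/n * sum((x_i - m)**2)."""
    m = sum(v) / len(v)
    sd = math.sqrt(sum((x - m) ** 2 for x in v) / len(v))
    return sd / m

# Toy component values; CV has no unit and survives rescaling.
v = [100.0, 250.0, 50.0, 600.0]
cv1 = variation_coefficient(v)
cv2 = variation_coefficient([3 * x for x in v])
```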
Vector operations
Sum
V = X + Y ⇒ vi = xi + yi
Neutral element: 0
Generalized to n terms: V = Σ Vi
Normalization of the sum: vi ← vi / |V| * C
Vector operations
Term-to-term product
V = X ⊗ Y ⇒ vi = xi * yi
Neutral element: 1
Generalized to n terms: V = Π Vi
Vector operations
Amplification
V = X^n ⇒ vi = sg(xi) * |xi|^n
√V = V^(1/2) and n√V = V^(1/n)
V ⊗ V = V^2 if ∀ vi ≥ 0
Normalization of the ttm product to n terms: V = n√(Π Vi)
Vector operations
Product + sum
V = X ⊕ Y = (X ⊗ Y) + X + Y
Generalized to n terms: V = n√(Π Vi) + Σ Vi
Simplest request vector computation in IR
Vector operations
Subtraction
V = X - Y ⇒ vi = xi - yi
Dot subtraction
V = X ⊖ Y ⇒ vi = max(xi - yi, 0)
Complementary
V = C(X) ⇒ vi = (1 - xi/C) * C
etc.
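The componentwise operations of the last slides can be sketched in a few lines (the function names are mine, not the slides'):

```python
import math

def vsum(x, y):        # V = X + Y
    return [a + b for a, b in zip(x, y)]

def ttm(x, y):         # term-to-term product: v_i = x_i * y_i
    return [a * b for a, b in zip(x, y)]

def amplify(x, n):     # V = X^n: v_i = sg(x_i) * |x_i|^n
    return [math.copysign(abs(a) ** n, a) for a in x]

def dot_sub(x, y):     # v_i = max(x_i - y_i, 0)
    return [max(a - b, 0.0) for a, b in zip(x, y)]

def complement(x, c):  # v_i = (1 - x_i/c) * c
    return [(1 - a / c) * c for a in x]
```

With n = 1/2, amplify is the componentwise square root used to normalize the ttm product; the sign handling keeps V^(1/2) defined for negative components.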
Intensity distance
Intensity of the normalized ttm product
0 ≤ I(√(X ⊗ Y)) ≤ 1 if |X| = |Y| = 1
DI(X, Y) = acos(I(√(X ⊗ Y)))
DI(X, X) = 0 and DI(X, 0) = π/2
DI(tit, tit) = 0 (DA = 0)
DI(tit, passerine) = 0.25 (DA = 0.4)
DI(tit, bird) = 0.58 (DA = 0.7)
DI(tit, train) = 0.89 (DA = 1.14)
DI(tit, insect) = 0.50 (DA = 0.62)
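A sketch of DI under the assumption of unit vectors with non-negative components, reading the intensity I as the norm of the componentwise square root √(X ⊗ Y):

```python
import math

def intensity_distance(x, y):
    """DI(X, Y) = acos(I(sqrt(X (x) Y))), assuming unit vectors with
    non-negative components; I is taken as the norm of the componentwise
    square root, i.e. sqrt(sum of x_i * y_i)."""
    i = math.sqrt(sum(math.sqrt(a * b) ** 2 for a, b in zip(x, y)))
    return math.acos(max(0.0, min(1.0, i)))  # clamp against rounding
```

DI(X, X) = 0 and DI(X, 0) = π/2 follow directly, matching the slide.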
Relative synonymy
SynR(A, B, C): C as reference feature
SynR(A, B, C) = DA(A ⊗ C, B ⊗ C)
DA(coal, night) = 0.9
SynR(coal, night, color) = 0.4
SynR(coal, night, black) = 0.35
Relative synonymy
SynR(A, B, C) = SynR(B, A, C)
SynR(A, A, C) = DA(A ⊗ C, A ⊗ C) = 0
SynR(A, B, 0) = DA(0, 0) = 0
SynR(A, 0, C) = π/2
SynA(A, B) = SynR(A, B, 1)
= DA(A ⊗ 1, B ⊗ 1)
= DA(A, B)
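A sketch of SynR on toy 2-dimensional vectors: filtering both arguments through a reference feature C before measuring the angular distance (vectors below are illustrative):

```python
import math

def angular_distance(x, y):
    """DA with the null-vector conventions of the earlier slides."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cos)))

def syn_r(a, b, c):
    """SynR(A, B, C) = DA(A (x) C, B (x) C)."""
    return angular_distance([x * z for x, z in zip(a, c)],
                            [y * z for y, z in zip(b, c)])
```

With C = 1 (all ones) this degenerates to plain DA, i.e. the absolute synonymy SynA of the slide.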
Subjective synonymy
SynS(A, B, C): C as point of view
SynS(A, B, C) = DA(C - A, C - B)
0 ≤ SynS(A, B, C) ≤ π
normalization:
0 ≤ asin(sin(SynS(A, B, C))) ≤ π/2
Subjective synonymy
When |C| → ∞ then SynS(A, B, C) → 0
SynS(A, B, 0) = DA(-A, -B) = DA(A, B)
SynS(A, A, C) = DA(C - A, C - A) = 0
SynS(A, B, B) = SynS(A, B, A) = 0
SynS(tit, swallow, animal) = 0.3
SynS(tit, swallow, bird) = 0.4
SynS(tit, swallow, passerine) = 1
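A sketch of SynS on toy vectors, checking the limit behaviour stated above: from a very distant viewpoint C, any two ideas look alike (all vectors are illustrative):

```python
import math

def angular_distance(x, y):
    """DA with the null-vector conventions of the earlier slides."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cos)))

def syn_s(a, b, c):
    """SynS(A, B, C) = DA(C - A, C - B): A and B as seen from viewpoint C."""
    return angular_distance([z - x for z, x in zip(c, a)],
                            [z - y for z, y in zip(c, b)])
```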
Semantic analysis
Vectors propagate on the syntactic tree
Semantic analysis
Initialization: attach vectors to nodes
Semantic analysis
Propagation (up)
Semantic analysis
Back propagation (down)
n′(Nij) = (n(Nij) ⊗ n(Ni)) + n(Nij)
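The back-propagation step can be sketched componentwise (toy vectors; the function name is mine): the parent's vector re-weights the child's components, boosting those the context supports.

```python
def back_propagate(child, parent_v):
    """n'(N_ij) = (n(N_ij) (x) n(N_i)) + n(N_ij), computed componentwise."""
    return [c * p + c for c, p in zip(child, parent_v)]
```

Components the parent is silent on (p = 0) are kept unchanged rather than erased, since the child's own vector is added back.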
Semantic analysis
Sense selection or sorting
Sense selection
Recursive descent
on t(w) as a decision tree
DA(n′, ni)
Stop on a leaf
Stop on an internal node
Vector syntactic schemas
S: NP(ART, N)
→ n(NP) = n(N)
S: NP1(NP2, N)
→ n(NP1) = a n(NP2) + n(N), 0 < a < 1
n(sail boat) = 1/2 n(sail) + n(boat)
n(boat sail) = 1/2 n(boat) + n(sail)
Vector syntactic schemas
Not necessarily linear
S: GA(GADV(ADV), ADJ)
→ n(GA) = n(ADJ)^p(ADV)
p(very) = 2
p(mildly) = 1/2
n(very happy) = n(happy)^2
n(mildly happy) = n(happy)^(1/2)
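A sketch of the adverb schema via componentwise amplification (the vector for "happy" is a toy assumption): raising to p > 1 sharpens the dominant components, p < 1 flattens the profile.

```python
import math

def amplify(v, p):
    """n(ADJ) ** p(ADV): componentwise power, preserving sign."""
    return [math.copysign(abs(x) ** p, x) for x in v]

happy = [0.9, 0.4, 0.1]            # toy vector, illustrative values
very_happy = amplify(happy, 2)      # p(very) = 2
mildly_happy = amplify(happy, 0.5)  # p(mildly) = 1/2
```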
Iteration & convergence
Iteration until convergence
Local
D(ni, ni+1) ≤ ε for the top node
Global
D(ni, ni+1) ≤ ε for all nodes
Lexicon construction
Manual kernel
Automatic definition analysis
Global infinite loop = learning
Manual adjustments
Application: Machine translation
Lexical transfer
n(source) → n(target)
k-NN search that minimizes DA(n(source), n(target))
Submeaning selection
Direct
Transformation matrix
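The k-NN transfer step can be sketched as a search over a target-language lexicon of (word, vector) pairs, minimizing DA (the lexicon entries and vectors below are toy assumptions):

```python
import math

def angular_distance(x, y):
    """DA with the null-vector conventions of the earlier slides."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cos)))

def knn(query, lexicon, k=1):
    """k nearest (word, vector) entries, by increasing DA(query, vector)."""
    ranked = sorted(lexicon, key=lambda entry: angular_distance(query, entry[1]))
    return [word for word, _ in ranked[:k]]

# Hypothetical French target lexicon with toy 2-dimensional vectors.
lexicon = [("rivage", [0.9, 0.1]), ("banque", [0.1, 0.9])]
```

The same search also serves document retrieval on the next slides, with (document, vector) pairs in place of lexicon entries.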
Application: Information retrieval on texts
Textual document indexing
Language dependent
Retrieval
Language independent: multilingual
Domain representation
horse ∈ "equitation"
Granularity
Document, paragraphs, etc.
Application: Information retrieval on texts
Index = Lexicon = (di, ni)*
k-NN search that minimizes DA(n(r), n(di))
Search engine: Distance adjustments
Min DA(n(r), n(di)) may pose problems
Especially with small documents
Correlation between CV & conceptual richness
Pathological cases
"plane" and "plane plane plane plane …"
"inundation" vs "blood": D = 0.85 (liquid)
Search engine: Distance adjustments
Correction with relative intensity
Request vs retrieved document (nr and nd)
D = √(DA(nr, nd) * DI(nr, nd))
0 ≤ I(nr, nd) ≤ 1 → 0 ≤ DI(nr, nd) ≤ π/2
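A sketch of the corrected distance as the geometric mean of DA and DI, under the same unit-vector reading of DI as before (all vectors toy, function names mine):

```python
import math

def angular_distance(x, y):
    """DA with the null-vector conventions of the earlier slides."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 and ny == 0.0:
        return 0.0
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cos)))

def intensity_distance(x, y):
    """DI for unit vectors with non-negative components (see earlier sketch)."""
    i = math.sqrt(sum(a * b for a, b in zip(x, y)))
    return math.acos(max(0.0, min(1.0, i)))

def adjusted_distance(nr, nd):
    """D = sqrt(DA(nr, nd) * DI(nr, nd)): damps matches that are angularly
    close but weak in shared intensity."""
    return math.sqrt(angular_distance(nr, nd) * intensity_distance(nr, nd))
```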
Conclusion
Approach
statistical (but not probabilistic)
thema (and rhema?)
Combination of
Symbolic methods (AI)
Transformational systems
Similarity
Neural nets
With large dimension (> 50000?)
Conclusion
Self-evaluation
Vector quality
Tests against corpora
Unknown words
Proper nouns of persons, products, etc.
Lionel Jospin, Danone, Air France
Automatic learning
Badly handled phenomena?
Negation & lexical functions (Mel'čuk)