Conceptual Vectors

Operators and Functions

Generality

Vector. A vector V is an n-tuple. Each value is mathematically defined between 0 and 1.

Practically, each component vi is defined between 0 and encoding-size. Note that the choice of the encoding-size is theoretically without consequence, as there is a bijection between any interval of R and [0, 1]. From a practical point of view, however, there is a tradeoff between the cardinal of the discrete set and the representation size. In our prototype, we set the encoding-size to 2^15.

When a vector has a mathematical norm equal to 1, its implementation norm is equal to encoding-size.

n is the dimension of V, i.e. the number of concepts over which it is defined.

Star metaphor. When meaningful, we will use the star metaphor to illustrate our concepts and functions. This metaphor often guided us when naming functions. Basically, the metaphor is the following:

A meaning is a point in the sense space, and a vector is the coordinate set of this meaning.

A star is a known word as found in dictionaries. It is a visible point that "attracts" attention. Its position is defined by its meanings, which are not visible (but can be guessed).

The space contains constellations of words. Some parts of the space are dense with stars, others are quite empty.

The origin point (all coordinates equal to 0) is the origin of the space. Unless otherwise stated, stars are viewed from this point, and as such can be compared by angular distance (but also by Euclidean or other distance functions).

Basic Properties

Norm. The implementation norm of vector V.

Norm(V) = sqrt(v1^2 + … + vn^2).

Mass. The Mass is the mathematical norm of V usually noted |V|.

Mass(V) = Norm(V)/encoding-size

If the mass is equal to 1, then V is normalized. It may be valued otherwise, especially with the term-to-term product. A vector with a high mass "weighs" more in vector operations and tends to attract other meanings.

Dim. The dimension of vector V, i.e. the value of n.

Basic Vector operations

Normalization. V = N(X) is the normalization function of a vector X. The resulting vector V is normalized (its implementation norm equals 1 * encoding-size). Each vi of V is equal to (xi / Norm(X)) * encoding-size, i.e. xi / |X|.
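As a minimal sketch of the two norms and the normalization function, assuming vectors are plain Python lists (the constant and function names are ours, not the prototype's):

```python
import math

ENCODING_SIZE = 2 ** 15  # the encoding-size used in the prototype

def norm(v):
    """Implementation norm: sqrt(v1^2 + ... + vn^2)."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """N(X): scale X so that its implementation norm equals encoding-size."""
    n = norm(v)
    return [x / n * ENCODING_SIZE for x in v]
```

After normalization, the mass Norm(V)/encoding-size of the result is 1.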

Sum. V = X1 + … + Xm is the generalized sum of m vectors X1, … , Xm. Component vi of V is equal to x1,i + … + xm,i. Generally the result V is normalized, so we have:

V = N(X1 + … + Xm)

Product. V = X1 * … * Xm is the generalized term-to-term product of m vectors. Component vi of V is equal to x1,i * … * xm,i. Generally the result V is NOT normalized, as Mass(sqrtm(V)) is generally a good indicator of similarity.

n-Root. V = sqrtn(X) is the nth root applied to each component of X. This is the usual normalizing function applied to the result of a term-to-term product. The resulting mass may be < 1. Component vi of V is equal to the nth root of xi.

n-Power. V = pown(X) is the nth power applied to each component of X, a generalization of the function above. Obviously, we have pow1/n(X) = sqrtn(X). By definition, if n = 0, all components are set to 1 (even when a component equals 0, although 0^0 is not defined).

Amplification. V = Ampn(X) is the signed nth power applied to each component of X. Component vi of V is equal to |xi|^n * sign(xi). [Note that sign(x) = -1 iff x < 0, sign(x) = 1 iff x > 0, and sign(x) = 0 iff x = 0.]
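The component-wise operations above can be sketched as follows; the function names are ours, and vectors are assumed to be Python lists:

```python
import math

def tt_product(*vectors):
    """X1 * ... * Xm: generalized term-to-term product."""
    return [math.prod(comps) for comps in zip(*vectors)]

def nth_root(v, n):
    """sqrt_n(X): nth root of each component."""
    return [x ** (1.0 / n) for x in v]

def nth_power(v, n):
    """pow_n(X): nth power of each component; pow_0 sets every term to 1."""
    return [1.0 if n == 0 else x ** n for x in v]

def amplify(v, n):
    """Amp_n(X): signed nth power, |x|^n * sign(x)."""
    return [math.copysign(abs(x) ** n, x) if x != 0 else 0.0 for x in v]
```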

Diff. Difference V = X - Y is the difference between two vectors. Component vi of V is equal to xi - yi. The resulting V is generally normalized, so in practice we have:

V = N(X - Y)

DDiff. Dotted Difference V = X \ Y is the dotted difference between two vectors. Component vi of V is equal to xi - yi if xi - yi > 0, and 0 otherwise. The resulting V is generally normalized, so in practice we have:

V = N(X \ Y)

Comp. Complementary V = C(X) is the complementary function of a vector. Component vi of V is equal to (encoding-size - xi). The resulting vector V is generally normalized, so in practice we have:

V = N(C(X))
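A minimal sketch of Diff, DDiff, and Comp (before the final normalization step), again assuming list vectors and the prototype's encoding-size of 2^15:

```python
ENCODING_SIZE = 2 ** 15  # assumed encoding-size, as in the prototype

def diff(x, y):
    """X - Y: component-wise difference."""
    return [xi - yi for xi, yi in zip(x, y)]

def ddiff(x, y):
    """X \\ Y: keep a component difference only when it is positive."""
    return [xi - yi if xi - yi > 0 else 0.0 for xi, yi in zip(x, y)]

def comp(x):
    """C(X): complement of each component with respect to encoding-size."""
    return [ENCODING_SIZE - xi for xi in x]
```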

Statistical functions

Mean. The arithmetic mean of the vi. Mean(V) = (v1 + … + vn) / n.

Var. Variance of the vi. Var(V) = ((v1 - Mean(V))^2 + … + (vn - Mean(V))^2) / n. With a flat vector V (all components equal), we have Var(V) = 0.

SD. Standard deviation between vi. SD(V) = sqrt(Var(V)).

VC. Variation Coefficient = SD/Mean. Meaningful when all vi are > 0.
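The three indicators can be sketched directly from their definitions (population variance, i.e. division by n):

```python
import math

def mean(v):
    """Arithmetic mean of the components of V."""
    return sum(v) / len(v)

def var(v):
    """Population variance of the components of V."""
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

def vc(v):
    """Variation coefficient SD/Mean; meaningful when all vi > 0."""
    return math.sqrt(var(v)) / mean(v)
```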

An interesting threshold value is 1 (?). The higher the value, the more "conceptual" the vector is. It is a good indicator of hyperonymy. Precisely, we have:

If X is a hyperonym of Y, then we (statistically) have VC(X) > VC(Y).

Naturally, the reverse also holds: if X is a hyponym of Y, then we (statistically) have VC(X) < VC(Y).

We DO NOT HAVE the converse: if VC(X) > VC(Y) then …


Other Vector functions

Inter. (Intersection function) V = Inter(X, Y):

V = sqrt2(X * Y)

Gamma. (Contextualization function) V = C(X, Y):

V = N(X + Inter(X, Y))

Anti-Gamma. (Anti-Contextualization function) V = AntiC(X, Y):

V = N(X - Inter(X, Y))

Note: in both cases (C and AntiC), the value of V is basically X plus (or minus) the intersection of X and Y. The value Mass(Inter(X, Y)) is the determining factor (this mass is always between 0 and 1).
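These three functions compose directly from the earlier operations; a minimal sketch, assuming list vectors, non-negative components, and encoding-size 2^15 (function names are ours):

```python
import math

ENCODING_SIZE = 2 ** 15  # assumed encoding-size, as in the prototype

def normalize(v):
    """N(X): scale to an implementation norm of encoding-size."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n * ENCODING_SIZE for c in v]

def inter(x, y):
    """Inter(X, Y) = sqrt_2(X * Y): geometric mean of each component pair."""
    return [math.sqrt(xi * yi) for xi, yi in zip(x, y)]

def gamma(x, y):
    """C(X, Y) = N(X + Inter(X, Y)): X pulled towards what it shares with Y."""
    return normalize([xi + ii for xi, ii in zip(x, inter(x, y))])

def anti_gamma(x, y):
    """AntiC(X, Y) = N(X - Inter(X, Y)): X pushed away from what it shares with Y."""
    return normalize([xi - ii for xi, ii in zip(x, inter(x, y))])
```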


Request computing methods

A request is a set of words w1, … , wp. To each word wi is associated a vector Xi as found in the lexicon.

Sum. The resulting vector is the center of gravity of all given vectors.

V = Sum(X1, … , Xp)
= N(X1 + … + Xp)

Sum-product. The resulting vector is the center of gravity plus the common intersection.

V = SumP(X1, … , Xp)
= N( N(X1 + … + Xp) + N(sqrtp(X1 * … * Xp)) )
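The two request vectors can be sketched as follows (list vectors, encoding-size 2^15; names are ours). Note that SumP normalizes the sum and the p-th root of the product separately before combining them:

```python
import math

ENCODING_SIZE = 2 ** 15  # assumed encoding-size, as in the prototype

def normalize(v):
    """N(X): scale to an implementation norm of encoding-size."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n * ENCODING_SIZE for c in v]

def request_sum(vectors):
    """Sum: normalized centre of gravity of the request vectors."""
    return normalize([sum(comps) for comps in zip(*vectors)])

def request_sum_product(vectors):
    """SumP: centre of gravity reinforced by the common intersection."""
    p = len(vectors)
    s = normalize([sum(comps) for comps in zip(*vectors)])
    i = normalize([math.prod(comps) ** (1.0 / p) for comps in zip(*vectors)])
    return normalize([a + b for a, b in zip(s, i)])
```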

Static Resonance. The request vector is the normalized sum of the applications of the resonance function between each vector and the others.

V = StaticResm(X1, … , Xp)
= N( Res(X1, X2 + … + Xp) + … + Res(Xp, X1 + … + Xp-1) )

This function is static (as opposed to dynamic) because only V is adjusted and the Xi are fixed. The static resonance function Res, as computed for step n, is defined recursively as follows:

step 0: Res0(Y, X1, … , Xp) = Y

step n+1: Resn+1(Y, X1, … , Xp) = Resn(Y, X1, … , Xp) + sqrt2(Resn(Y, X1, … , Xp) * X1) + … + sqrt2(Resn(Y, X1, … , Xp) * Xp).

Dynamic Resonance. Static resonance (m steps) is applied n times to each Xi.

V = DynResm(X1, … , Xp)

step 0: Xi(0) = StaticResm(Xi, X1, … , Xi-1, Xi+1, … , Xp)

step n+1: Xi(n+1) = StaticResm(Xi(n), X1(n), … , Xi-1(n), Xi+1(n), … , Xp(n))

Both Resonance functions are deemed to be convergent.
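Both resonance loops can be sketched as below. This is our reading of the recursive definition (step 0 starting from Y, components assumed non-negative); the function names and the interpretation of the argument lists are ours:

```python
import math

def static_resonance(y, others, m):
    """Res_m(Y, X1, ..., Xp): m reinforcement steps of Y by the Xk,
    each step adding sqrt_2(Res_n * Xk) term to term for every Xk."""
    v = list(y)
    for _ in range(m):
        nxt = list(v)
        for x in others:
            nxt = [a + math.sqrt(vi * xi) for a, vi, xi in zip(nxt, v, x)]
        v = nxt
    return v

def dynamic_resonance(vectors, m, n):
    """DynRes: apply static resonance (m steps) to every vector, n times;
    each round uses the vectors from the previous round."""
    vs = [list(v) for v in vectors]
    for _ in range(n):
        vs = [static_resonance(vs[i], vs[:i] + vs[i + 1:], m)
              for i in range(len(vs))]
    return vs
```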

Last update: 17 April 2001
Mathieu Lafourcade, LIRMM - 161, rue ADA - 34392 Montpellier Cedex 5 - France - Tel: (33) 04 67 41 85 71 - Fax: (33) 04 67 41 85 00 - email: lafourca@lirmm.fr