# Online Clustering

$D=\{X^1,X^2,...,X^N\}$ $X\in R^d\$ $Cost=(Nd)$

What if we only care about clusters.

$C=\{b^1,b^2,...,b^M\}$ $M<<N$ $Cost=(Md+N)$
• Which cluster dose a point belong? Win Function

We define the $$X^i$$ belongs to the nearest cluster.

$win(i)=j,\quad s.t.\,min\{|X^i-b^j|\}$
• Now, assume that we already know $$M$$

So first we choose MM random $$b$$ from data points, then we iterate all data point, trying to minimize this Error function.

$E=\frac{1}{N}\sum_{i=1}^{N}||X^i-b^{win(i)}||^2$

$E=\frac{1}{N}\sum_{i=1}^{N}||X^i-b^{win(i)}||^2$ $b^{win(i)}(t+1)=b^{win(i)}(t)-\alpha\nabla E$ $b^{win(i)}(t+1)=b^{win(i)}(t)+\alpha(X^i-b^{win(i)})$
• Adjustment of $$\alpha(t)$$
$\alpha(t)=\alpha(0)e^{-\frac{t}{\tau}}$
• Situation that we do not know $$M$$

Just choose $$M$$ from 1 to $$N$$, find the kneel point where $$E(M)$$ does not decrease sharply.

• Modifying Clustering to fit non-linear situation

In such case, we can use $$M$$ red cluster to this type of data, then connect those cluster center.

We just need to modify the cluster function in order to make sure it cluster in order. $$b^1,b^2,b^3...b^M$$

• Neighbor Factor

We add a neighbor factor to achieve our goal, instead of update a single weight, we update all weight simutaouensly to ensure the order. $$h(c,j)=e^{\frac{-(j-c)^2}{\sigma^2}}$$

In this gaussian distribution, sigma represent how much neighbor will be infected in one iteration, so this will decrease as iteration times increase, basically it follows the same exponatial decrease formula like the $$learning\space rate$$ does.

So we now have:

$b^{j}(t+1)=b^{j}(t)+h(win(i),j)\alpha(X^i-b^{j})(learning\space rate=\alpha)$
• 2-D case

In the above, we only consider when the $$b_i$$ lies in a one dimensional line, but we can have other high dimensional space like a plane, so we can let $$b_i$$ lies in a 2-D net or a 3-D cube to achieve our goal. $$b^{i}\rightarrow b^{i,j}$$