A strategy to quantize the embedding layer

Basic idea

Embedding mainly arises in word pre-training; word2vec and GloVe are the two commonly used embedding methods. Generally speaking, the embedding matrix has size \(V \times h\), where \(V\) is the size of the one-hot vector (the vocabulary size) and \(h\) is the size of the vector after embedding. For even a moderately large corpus, the parameters of this layer are very large, mainly because \(V\) is too large. The main idea is therefore not to represent a word with a one-hot vector, but with a code \(C_w\), expressed as:

\[C_{w}=\left(C_{w}^{1}, C_{w}^{2}, \ldots, C_{w}^{M}\right) \]

That is, each word is represented by an \(M\)-dimensional code, where \(C_w^i \in [1, K]\). Each component \(C_w^i\) can therefore be regarded as a \(K\)-dimensional one-hot vector, and \(C_w\) is a collection of \(M\) such one-hot vectors. To embed a word from its code \(C_w\), we need \(M\) codebook matrices \(E_1, E_2, \dots, E_M\).
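As a rough sketch of this lookup (in NumPy, with toy values of \(M\), \(K\), and \(h\) that are assumptions for illustration, not from the post), each word stores only its \(M\)-tuple code, and each component \(C_w^i\) selects one row from the corresponding codebook \(E_i\):

```python
import numpy as np

M, K, h = 4, 4, 8                        # toy sizes: M code components, K codewords, dim h
rng = np.random.default_rng(0)

# M codebooks E_1..E_M, each of shape (K, h), stacked into one (M, K, h) array
E = rng.normal(size=(M, K, h))

C_w = (2, 1, 4, 3)                       # the code of one word (entries in 1..K, 1-based)

# Selecting E_i(C_w^i) is just an index into codebook i; this yields M vectors of size h,
# i.e. the "collection of one-hot lookups" described above.
selected = [E[i, c - 1] for i, c in enumerate(C_w)]
print(len(selected), selected[0].shape)  # 4 (8,)
```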

For example

If we have \(C_{dog} = (3, 2, 4, 1)\) and \(C_{dogs} = (3, 2, 4, 2)\), then \(K = 4\) and \(M = 4\), and the codebooks are \(E_1 = \{e_{11}, e_{12}, e_{13}, e_{14}\}\), \(E_2 = \{e_{21}, e_{22}, e_{23}, e_{24}\}\), \(\dots\), \(E_4\). Each \(e_{ij}\) has dimension \(1 \times h\), and the embedding process is:

\[E\left(C_{dog}\right)=\sum_{i=1}^{M} E_{i}\left(C_{dog}^{i}\right)=E_1(3) + E_2(2)+E_3(4)+E_4(1) = e_{13}+e_{22}+e_{34}+e_{41} \]

So the parameter matrix of the embedding process has size \(M \times K \times h\) instead of \(V \times h\), which is much smaller whenever \(M \times K \ll V\).
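A short worked version of the example above (NumPy; the codebook values are random placeholders, and the vocabulary size \(V\) used for the parameter comparison is an assumption for illustration):

```python
import numpy as np

M, K, h = 4, 4, 8
rng = np.random.default_rng(0)
E = rng.normal(size=(M, K, h))           # M codebooks of shape (K, h)

def embed(code):
    """E(C_w) = sum_i E_i(C_w^i): pick one row per codebook (1-based codes) and sum them."""
    return sum(E[i, c - 1] for i, c in enumerate(code))

C_dog, C_dogs = (3, 2, 4, 1), (3, 2, 4, 2)
v_dog, v_dogs = embed(C_dog), embed(C_dogs)
print(v_dog.shape)                       # (8,) -> an h-dimensional word vector

# "dog" and "dogs" share the first three code components, so their vectors differ
# only by the last summand, e_{4,1} vs e_{4,2}:
print(np.allclose(v_dog - v_dogs, E[3, 0] - E[3, 1]))        # True

# Parameter comparison, with an assumed vocabulary of V = 100_000 words:
V = 100_000
print(V * h, "parameters for a standard V x h embedding")     # 800000
print(M * K * h, "parameters for the M x K x h codebooks")    # 128
```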

Original article: https://www.cnblogs.com/wevolf/p/13091540.html
