Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RabitQ quantization support #32

Open
hicder opened this issue Oct 25, 2024 · 4 comments
Open

RabitQ quantization support #32

hicder opened this issue Oct 25, 2024 · 4 comments
Assignees

Comments

@hicder
Copy link
Owner

hicder commented Oct 25, 2024

Implement this paper: https://arxiv.org/abs/2405.12497 as a new quantization type

@tvhong tvhong self-assigned this Oct 27, 2024
@tvhong
Copy link
Contributor

tvhong commented Nov 5, 2024

Sample code: https://github.com/gaoj0017/RaBitQ

@tvhong
Copy link
Contributor

tvhong commented Nov 5, 2024

Here are the notations:
image

Here is the summary of the indexing and querying algorithms:
image

@tvhong
Copy link
Contributor

tvhong commented Nov 5, 2024

Let's break the Index Phase down:

1 Normalize the set of vectors (Section 3.1.1)

This means find the centroid of all data vectors ($c$), then normalize all data vectors using: $o := \frac{o_r - c} {∥ o_r - c∥}$

2 Sample a random orthogonal matrix 𝑃 to construct the codebook C𝑟𝑎𝑛𝑑 (Section 3.1.2)

Just need to create a random orthogonal matrix P of size $D \times D$, where $D$ is the dimension of the input vectors. Here is the code from the paper's repo:

def Orthogonal(D):
    G = np.random.randn(D, D).astype('float32')
    Q, _ = np.linalg.qr(G)
    return Q

https://github.com/gaoj0017/RaBitQ/blob/main/data/rabitq.py#L17-L21

3 Compute the quantization codes $\bar{x}_b$ (Section 3.1.3)

$\bar{x}_b$ is a D-bit vector that can be obtained from $sign(P^{-1}o)$, where sign returns 1 if the element is positive and 0 otherwise.

4 Pre-compute the values of $∥o_r − c∥$ and $\langle \bar{o}, o \rangle$

Compute these values and store them for later use. Note that we already have all the info from the previous steps.

$\mathbf{\bar{o}} = P\mathbf{\bar{x}}$ and $\mathbf{\bar{x}}$ can be retrieved from $\mathbf{\bar{x}_b}$ using $\mathbf{\bar{x}} = (2\mathbf{\bar{x}_b} - \mathbf{1_D}) / \sqrt{D}$ where $\mathbf{1_D}$ is the D-dimensional vector with all entries equal 1.

@tvhong
Copy link
Contributor

tvhong commented Nov 10, 2024

1 Normalize and inversely transform the raw query vector and obtain q′

Normalize:
$q = \frac{q_r - c} {\Vert q_r - c\Vert}$

Inverse transform:
$q' = P^{-1}q$

2 Quantize q′ into $\bar{q}$ (Section 3.3.1)

Convert $q'$ to $\bar{q}$, which is the uniform scalar quantization of $q'$. We'll represent $\bar{q}$ using a vector of size $D$ where each element is a $B_q$-bit unsigned integer ($B_q = 4$ is sufficient for very large $D$).

$\bar{q}_u[i] = \left\lfloor \frac{q'[i] - v_l}{\Delta} + u_i \right\rfloor$ (equation 18)

where

  • $v_l$ is the min value in $q'$
  • $v_r$ is the max value in $q'$
  • $\Delta = \frac{v_r - v_l}{2^{B_q} - 1}$

3 foreach input ID of the data vectors do

4 Compute the value of the estimator $\frac{\langle\bar{o},q\rangle}{\langle\bar{o},o\rangle}$ as an approximation of $\langle o,q \rangle$ (Section 3.3.2)

Here, we can retrieve $\langle\bar{o},o\rangle$ from the index.

Then calculate $\langle\bar{o},q\rangle$ as follow:

$$\begin{matrix} \langle\bar{o},q\rangle \\\ = \langle\bar{x},q'\rangle\ (equation 17)\\\ \approx \langle\bar{x},\bar{q}\rangle \\\ \approx \langle\bar{x}_b, \bar{q}_u\rangle \\\ \end{matrix}$$

There are 2 ways to compute $\langle\bar{x}_b, \bar{q}_u\rangle$: using bitwise operation for 1 data vector at a time, or SIMD-based implementation. I'll need to revisit this later to understand it better.

Then, we can compute

$$\langle o,q \rangle \approx \frac{\langle\bar{o},q\rangle}{\langle\bar{o},o\rangle} \approx \frac{\langle\bar{x}_b, \bar{q}_u\rangle}{\langle\bar{o},o\rangle}$$

5 Compute an estimated distance between the raw query and the data vector based on Equation (2)

Here is equation (2):
$\Vert o_r - q_r\Vert^2 =\Vert o_r - c\Vert^2 + \Vert q_r - c\Vert^2 - 2 \cdot \Vert o_r - c\Vert \cdot \Vert q_r - c\Vert \cdot \langle q, o\rangle$

Where

  • $\Vert o_r - c\Vert$ is in the index built earlier
  • $\Vert q_r - c\Vert$ is straight forward to compute
  • $\langle q, o\rangle$ is from step 4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants