Commit 5808e14

add VECTOR doc (#18791)
* add VECTOR doc
1 parent 7e5695d commit 5808e14

4 files changed: 102 additions, 0 deletions

src/current/_includes/v24.2/misc/enterprise-features.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ Feature | Description
 [Multi-Region Capabilities]({% link {{ page.version.version }}/multiregion-overview.md %}) | Row-level control over where your data is stored to help you reduce read and write latency and meet regulatory requirements.
 [PL/pgSQL]({% link {{ page.version.version }}/plpgsql.md %}) | Use a procedural language in [user-defined functions]({% link {{ page.version.version }}/user-defined-functions.md %}) and [stored procedures]({% link {{ page.version.version }}/stored-procedures.md %}) to improve performance and enable more complex queries.
 [Node Map]({% link {{ page.version.version }}/enable-node-map.md %}) | Visualize the geographical distribution of a cluster by plotting its node localities on a world map.
+[`VECTOR` type]({% link {{ page.version.version }}/vector.md %}) | Represent data points in multi-dimensional space, using fixed-length arrays of floating-point numbers.

 ## Recovery and streaming

src/current/_includes/v24.2/sidebar-data/sql.json

Lines changed: 6 additions & 0 deletions
@@ -1015,6 +1015,12 @@
       "urls": [
         "/${VERSION}/uuid.html"
       ]
+    },
+    {
+      "title": "<code>VECTOR</code>",
+      "urls": [
+        "/${VERSION}/vector.html"
+      ]
     }
   ]
 },

src/current/v24.2/data-types.md

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ Type | Description | Example
 [`TSQUERY`]({% link {{ page.version.version }}/tsquery.md %}) | A list of lexemes and operators used in [full-text search]({% link {{ page.version.version }}/full-text-search.md %}). | `'list' & 'lexem' & 'oper' & 'use' & 'full' & 'text' & 'search'`
 [`TSVECTOR`]({% link {{ page.version.version }}/tsvector.md %}) | A list of lexemes with optional integer positions and weights used in [full-text search]({% link {{ page.version.version }}/full-text-search.md %}). | `'full':13 'integ':7 'lexem':4 'list':2 'option':6 'posit':8 'search':15 'text':14 'use':11 'weight':10`
 [`UUID`]({% link {{ page.version.version }}/uuid.md %}) | A 128-bit hexadecimal value. | `7f9c24e8-3b12-4fef-91e0-56a2d5a246ec`
+[`VECTOR`]({% link {{ page.version.version }}/vector.md %}) | A fixed-length array of floating-point numbers. | `[1.0, 0.0, 0.0]`

 ## Data type conversions and casts

src/current/v24.2/vector.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
title: VECTOR
summary: The VECTOR data type stores fixed-length arrays of floating-point numbers, which represent data points in multi-dimensional space.
toc: true
docs_area: reference.sql
---

{% include enterprise-feature.md %}

{{site.data.alerts.callout_info}}
{% include feature-phases/preview.md %}
{{site.data.alerts.end}}

The `VECTOR` data type stores fixed-length arrays of floating-point numbers, which represent data points in multi-dimensional space. Vector search is often used in AI applications such as Large Language Models (LLMs) that rely on vector representations.

For details on valid `VECTOR` comparison operators, refer to [Syntax](#syntax). For the list of supported `VECTOR` functions, refer to [Functions and Operators]({% link {{ page.version.version }}/functions-and-operators.md %}#pgvector-functions).

{{site.data.alerts.callout_info}}
`VECTOR` functionality is compatible with the [`pgvector`](https://github.com/pgvector/pgvector) extension for PostgreSQL. Vector indexing is **not** supported at this time.
{{site.data.alerts.end}}

## Syntax

A `VECTOR` value is expressed as an [array]({% link {{ page.version.version }}/array.md %}) of [floating-point numbers]({% link {{ page.version.version }}/float.md %}). The array size corresponds to the number of `VECTOR` dimensions. For example, the following `VECTOR` has 3 dimensions:

~~~
[1.0, 0.0, 0.0]
~~~

You can specify the dimensions when defining a `VECTOR` column. This will enforce the number of dimensions in the column values. For example:

~~~ sql
ALTER TABLE foo ADD COLUMN bar VECTOR(3);
~~~
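
As a minimal sketch of how the dimension constraint behaves, assume a hypothetical table `foo` with an `id` column (neither is defined elsewhere on this page). The first `INSERT` supplies three dimensions and should be accepted, while the second supplies two and is expected to be rejected with a dimension mismatch error:

~~~ sql
-- Hypothetical table, for illustration only.
CREATE TABLE foo (id INT PRIMARY KEY);
ALTER TABLE foo ADD COLUMN bar VECTOR(3);

-- Three dimensions: matches VECTOR(3).
INSERT INTO foo (id, bar) VALUES (1, '[1.0, 0.0, 0.0]');

-- Two dimensions: does not match VECTOR(3), so this statement is expected to fail.
INSERT INTO foo (id, bar) VALUES (2, '[1.0, 0.0]');
~~~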

The following `VECTOR` comparison operators are valid:

- `=` (equals). Compare vectors for equality in filtering and conditional queries.
- `<>` (not equal to). Compare vectors for inequality in filtering and conditional queries.
- `<->` (L2 distance). Calculate the Euclidean distance between two vectors, as used in [nearest neighbor search](https://en.wikipedia.org/wiki/Nearest_neighbor_search) and clustering algorithms.
- `<#>` (negative inner product). Calculate the negated [inner product](https://en.wikipedia.org/wiki/Inner_product_space) of two vectors, as used in similarity searches where the inner product represents the similarity score.
- `<=>` (cosine distance). Calculate the [cosine distance](https://en.wikipedia.org/wiki/Cosine_similarity) between vectors, such as in text and image similarity measures where the orientation of vectors is more important than their magnitude.
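
To make the distance operators concrete, the following sketch applies each one to the same pair of illustrative vectors, `[1, 2, 3]` and `[4, 5, 6]` (values chosen here for easy arithmetic, not taken from elsewhere on this page). It assumes that string literals can be cast with `::VECTOR`, as in `pgvector`. Worked by hand, the L2 distance is √27 ≈ 5.196, the inner product is 32 (so `<#>` should yield -32), and the cosine distance is 1 - 32/√1078 ≈ 0.025:

~~~ sql
SELECT
    -- Euclidean (L2) distance between the two vectors.
    '[1, 2, 3]'::VECTOR <-> '[4, 5, 6]'::VECTOR AS l2_distance,
    -- Inner product, negated.
    '[1, 2, 3]'::VECTOR <#> '[4, 5, 6]'::VECTOR AS negative_inner_product,
    -- One minus the cosine similarity.
    '[1, 2, 3]'::VECTOR <=> '[4, 5, 6]'::VECTOR AS cosine_distance;
~~~

Because `<#>` returns the negated inner product, sorting by it in ascending order places the most similar vectors first, consistent with `<->` and `<=>`.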

## Size

The size of a `VECTOR` value is variable, but it's recommended to keep values under 1 MB to ensure performance. Above that threshold, [write amplification]({% link {{ page.version.version }}/architecture/storage-layer.md %}#write-amplification) and other considerations may cause significant performance degradation.

## Functions

For the list of supported `VECTOR` functions, refer to [Functions and Operators]({% link {{ page.version.version }}/functions-and-operators.md %}#pgvector-functions).

## Example

Create a table with a `VECTOR` column, specifying `3` dimensions:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE TABLE items (
    category STRING,
    vector VECTOR(3),
    INDEX (category)
);
~~~

Insert some sample data into the table:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO items (category, vector) VALUES
    ('electronics', '[1.0, 0.0, 0.0]'),
    ('electronics', '[0.9, 0.1, 0.0]'),
    ('furniture', '[0.0, 1.0, 0.0]'),
    ('furniture', '[0.0, 0.9, 0.1]'),
    ('clothing', '[0.0, 0.0, 1.0]');
~~~

Use the [`<->` operator](#syntax) to sort values with the `electronics` category by their similarity to `[1.0, 0.0, 0.0]`, based on Euclidean (L2) distance:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT category, vector FROM items WHERE category = 'electronics' ORDER BY vector <-> '[1.0, 0.0, 0.0]' LIMIT 5;
~~~

~~~
   category   |   vector
--------------+--------------
  electronics | [1,0,0]
  electronics | [0.9,0.1,0]
~~~
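
To inspect the distances behind this ordering, a variation on the query can return the `<->` result as a column and drop the category filter so that all rows are ranked. This is a sketch; `distance` is an illustrative alias:

~~~ sql
-- Rank every row by its L2 distance to the query vector and show that distance.
SELECT category, vector, vector <-> '[1.0, 0.0, 0.0]' AS distance
FROM items
ORDER BY distance
LIMIT 3;
~~~

With the sample data inserted earlier, `[1.0, 0.0, 0.0]` is at distance 0 from the query vector and `[0.9, 0.1, 0.0]` is at distance √0.02 ≈ 0.141, so the two `electronics` rows should still rank first.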

## See also

- [Functions and Operators]({% link {{ page.version.version }}/functions-and-operators.md %}#pgvector-functions)
- [Data Types]({% link {{ page.version.version }}/data-types.md %})
