Commit 7b58cf2

add host build scripts
1 parent 4fc79e2 commit 7b58cf2

30 files changed, +26112 -5253 lines changed

README.md

+232-3
@@ -1,5 +1,234 @@
1-
# SQLITE3 WITH ICU SUPPORT FOR ANDROID
1+
# SQLite3 with ICU and extension support for Android
22

3-
This project copied and modified from [SQLite Android Bindings](http://www.sqlite.org/android/zip/SQLite+Android+Bindings.zip?uuid=trunk) to support **LOAD EXTENSION** and **ICU**
3+
This project is copied and modified from [SQLite Android Bindings](http://www.sqlite.org/android/zip/SQLite+Android+Bindings.zip?uuid=trunk) to support **extension loading**, **full-text search** and **multi-language word segmentation**. Please read the [SQLite Android Bindings Documentation](https://sqlite.org/android/doc/trunk/www/index.wiki) for more information.
44

5-
This project is for personal use, **DO NOT ENABLE LOAD EXTESNION FOR SECURITY REASON** unless you can ensure the safty.
5+
Be aware of the following before using it:
6+
7+
* This project uses SQLite3 version `3.17.0`.
8+
* The currently supported architectures are `armeabi-v7a` and `arm64-v8a`.
9+
* Enabling extension loading carries security risks, as described in [this topic](https://www.sqlite.org/c3ref/enable_load_extension.html); enable it with care.
10+
11+
## Command-line interface
12+
13+
This project provides a command-line tool, currently supported on Unix-like systems only. Run the following commands to build it:
14+
15+
```sh
16+
$ cd /<your-project-dir>/host
17+
$ chmod +x build.sh
18+
$ ./build.sh
19+
```
20+
21+
You will now find an executable named `sqlite3` and three extension libraries in the directory `./out`. Run the following commands to check that it works:
22+
23+
```sql
24+
$ cd out
25+
$ ./sqlite3
26+
SQLite version 3.17.0 2017-02-13 16:02:40
27+
Enter ".help" for usage hints.
28+
Connected to a transient in-memory database.
29+
Use ".open FILENAME" to reopen on a persistent database.
30+
sqlite> SELECT sqlite_compileoption_used('ENABLE_LOAD_EXTENSION');
31+
1
32+
sqlite> SELECT load_extension('./libspellfix');
33+
34+
sqlite> CREATE VIRTUAL TABLE spellfix USING spellfix1;
35+
sqlite> INSERT INTO spellfix(word) VALUES('frustrate');
36+
sqlite> INSERT INTO spellfix(word) VALUES('Frustration');
37+
sqlite> SELECT word FROM spellfix WHERE word MATCH 'frus';
38+
frustrate
39+
Frustration
40+
```
41+
42+
See the file `build.sh` for the compile options in use. Many more options are available; for details, read [How To Compile SQLite](https://www.sqlite.org/howtocompile.html).
43+
44+
## Build native libraries
45+
46+
First, make sure the Android `NDK` is installed, then add the following line to your `local.properties` file to build the libraries:
47+
48+
```
49+
ndk.dir=/your/ndk/directory
50+
```
51+
52+
## Application programming
53+
54+
Load the native library:
55+
56+
```java
57+
System.loadLibrary("sqliteX");
58+
```
59+
60+
Replace the `android.database.sqlite` namespace with `org.sqlite.database.sqlite`. For example, the following:
61+
62+
```java
63+
import android.database.sqlite.SQLiteDatabase;
64+
```
65+
66+
should be replaced with:
67+
68+
```java
69+
import org.sqlite.database.sqlite.SQLiteDatabase;
70+
```
71+
72+
For more details, please read [this topic](https://sqlite.org/android/doc/trunk/www/usage.wiki).
73+
74+
## FTS3, FTS4 and ICU support
75+
76+
> FTS3 and FTS4 are SQLite virtual table modules that allows users to perform full-text searches on a set of documents.
77+
78+
We use `FTS3` and `FTS4` for full-text search, and the `icu` tokenizer for multi-language word segmentation in SQLite.
79+
80+
The following code shows how to use `FTS` and `ICU`.
81+
82+
```java
83+
SQLiteDatabase db = helper.getWritableDatabase();
84+
// Create an FTS table with a single column - "content"
85+
// that uses the "icu" tokenizer
86+
db.execSQL("CREATE VIRTUAL TABLE icu_fts USING fts4(tokenize=icu)");
87+
// Insert texts into the table created before
88+
db.execSQL("INSERT INTO icu_fts VALUES('Welcome to China.')");
89+
db.execSQL("INSERT INTO icu_fts VALUES('Welcome to Beijing.')");
90+
db.execSQL("INSERT INTO icu_fts VALUES('中国欢迎你!')");
91+
db.execSQL("INSERT INTO icu_fts VALUES('北京欢迎你!')");
92+
// Perform full-text searches
93+
Cursor c = db.rawQuery("SELECT * FROM icu_fts WHERE icu_fts MATCH 'welcome'", null);
94+
while (c.moveToNext()) {
95+
Log.d(TAG, "search for 'welcome': " + c.getString(0));
96+
// Should be:
97+
// Welcome to China.
98+
// Welcome to Beijing.
99+
}
100+
c.close();
101+
c = db.rawQuery("SELECT * FROM icu_fts WHERE icu_fts MATCH '欢迎'", null);
102+
while (c.moveToNext()) {
103+
Log.d(TAG, "search for '欢迎': " + c.getString(0));
104+
// Should be:
105+
// 中国欢迎你!
106+
// 北京欢迎你!
107+
}
108+
c.close();
109+
```
110+
111+
You can use the FTS binary set operators to build logical queries, and combine them with the auxiliary functions (`snippet()`, `offsets()` and `matchinfo()`) for more complex searches. For more details, please read [the documentation](https://www.sqlite.org/fts3.html).
112+
113+
## Extension loading
114+
115+
> SQLite has the ability to load extensions (including new application-defined SQL functions, collating sequences, virtual tables, and VFSes) at run-time. This feature allows the code for extensions to be developed and tested separately from the application and then loaded on an as-needed basis.
116+
117+
In short, an SQLite extension is a plugin that implements a set of functions and can be loaded into SQLite dynamically.
118+
119+
The following code shows how to check if extension loading is supported:
120+
121+
```java
122+
SQLiteDatabase db = helper.getWritableDatabase();
123+
Cursor c = db.rawQuery("SELECT sqlite_compileoption_used('ENABLE_LOAD_EXTENSION')", null);
124+
// The result must not be 0
125+
assert(c.moveToFirst() && c.getInt(0) != 0);
126+
```
127+
128+
Enable or disable extension loading:
129+
130+
```java
131+
// The following code CAN NOT run in a transaction
132+
133+
// Enable extension loading
134+
db.enableLoadExtension(true);
135+
// Disable extension loading
136+
db.enableLoadExtension(false);
137+
```
138+
139+
Once extension loading is enabled, you can load your extensions. Taking `spellfix` as an example:
140+
141+
```java
142+
// Load successfully if there are no exceptions thrown
143+
Cursor c = db.rawQuery("SELECT load_extension('libspellfix')", null);
144+
c.moveToFirst();
145+
Log.i(TAG, "Load spellfix, result = " + c.getInt(0));
146+
```
147+
148+
Writing your own extension is also straightforward. Be sure to read [this topic](https://www.sqlite.org/loadext.html) first.
149+
150+
## Builtin extensions
151+
152+
There are three built-in extensions: `offsets_rank`, `okapi_bm25` and `spellfix`. Their source code is in the directory `builtin_extensions`. They are enabled by default; to disable them, add the following line to your `local.properties` file:
153+
154+
```
155+
useBuiltinExtensions=false
156+
```
157+
158+
### The spellfix1 virtual table
159+
160+
[The documentation](https://www.sqlite.org/spellfix1.html) says:
161+
162+
> This spellfix1 virtual table can be used to search a large vocabulary for close matches. For example, spellfix1 can be used to suggest corrections to misspelled words. Or, it could be used with FTS4 to do full-text search using potentially misspelled words.
163+
164+
You can download the latest source code from [here](https://www.sqlite.org/src/finfo?name=ext/misc/spellfix.c).
165+
166+
A quick look:
167+
168+
```sql
169+
sqlite> SELECT load_extension('./libspellfix');
170+
171+
sqlite> CREATE VIRTUAL TABLE demo USING spellfix1;
172+
sqlite> INSERT INTO demo(word, rank) VALUES('frustrate', 2);
173+
sqlite> INSERT INTO demo(word, rank) VALUES('Frustration', 3);
174+
sqlite> INSERT INTO demo(word, rank) VALUES('frustate', 1);
175+
sqlite> SELECT word FROM demo WHERE word MATCH 'fru*';
176+
frustrate
177+
Frustration
178+
frustate
179+
```
180+
181+
More details can be found in [the spellfix1 documentation](https://www.sqlite.org/spellfix1.html).
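To make the idea of a "close match" concrete, here is a minimal sketch of plain Levenshtein edit distance in C. It is only an illustration: it is not the algorithm spellfix1 itself uses (spellfix1 has its own edit-cost model, described in its documentation), and the function name `edit_distance` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Plain Levenshtein distance over bytes, using two rolling rows.
 * Illustration only; NOT the cost model used by spellfix1.
 * Returns -1 if memory cannot be allocated.
 */
int edit_distance(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    int *prev = malloc((lb + 1) * sizeof *prev);
    int *curr = malloc((lb + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }
    /* Distance from the empty prefix of a to each prefix of b. */
    for (size_t j = 0; j <= lb; j++) prev[j] = (int)j;
    for (size_t i = 1; i <= la; i++) {
        curr[0] = (int)i;
        for (size_t j = 1; j <= lb; j++) {
            int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            int best = prev[j - 1] + cost;                    /* match or substitute */
            if (prev[j] + 1 < best) best = prev[j] + 1;       /* delete from a */
            if (curr[j - 1] + 1 < best) best = curr[j - 1] + 1; /* insert into a */
            curr[j] = best;
        }
        /* The row just filled becomes the previous row. */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }
    int d = prev[lb];
    free(prev);
    free(curr);
    return d;
}
```

With this measure, the misspelling `frustate` from the demo table is one edit away from `frustrate`, which is why it is a natural candidate for correction.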
182+
183+
### offsets_rank
184+
185+
An extension to use with the function [offsets()](https://www.sqlite.org/fts3.html#the_offsets_function) to calculate simple relevancy of an FTS match. The value returned is the relevancy score (a real value greater than or equal to zero). A larger value indicates a more relevant document.
186+
187+
The value returned by the function `offsets()` contains four integers for each term match; the last one is the size of the matching term in bytes. Typically this size equals the length of the query term, but not always. For example, suppose an FTS table created with the option `tokenize=porter` contains the following records:
188+
189+
```
190+
docid content
191+
------ -------
192+
1 sleep
193+
2 sleeping
194+
```
195+
196+
When we execute the queries:
197+
198+
```sql
199+
SELECT docid, content, offsets(fts) FROM fts WHERE fts MATCH 'sleeping';
200+
SELECT docid, content, offsets(fts) FROM fts WHERE fts MATCH 'sleep';
201+
```
202+
203+
we get exactly the same results:
204+
205+
```
206+
docid content offsets
207+
------ ------- -------
208+
1 sleep 0 0 0 5
209+
2 sleeping 0 0 0 8
210+
```
211+
212+
However, a search for `sleeping` should give the record `sleeping` a higher score. The function `offsets_rank` parses the value returned by the function `offsets()` and adjusts the relevancy score accordingly. The following query returns the documents that match the full-text query, sorted from most to least relevant:
213+
214+
```sql
215+
SELECT docid, content FROM fts WHERE fts MATCH 'sleeping'
216+
ORDER BY offsets_rank(offsets(fts)) DESC;
217+
```
218+
219+
The results will be:
220+
221+
```
222+
docid content
223+
------ --------
224+
2 sleeping
225+
1 sleep
226+
```
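The scoring described above can be sketched as two small pure C functions, independent of SQLite. This is a simplified reading of `offsets_rank.c` from this commit, not a drop-in replacement: the names `sum_match_sizes` and `adjust_rank` are hypothetical, and the parser uses `strtol` rather than the fixed-size buffer of the original. The term-length adjustment mirrors the optional second argument that the C source accepts.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sum the fourth integer (match size in bytes) of every 4-integer group in
 * an offsets() string such as "0 0 0 5 0 1 10 8".
 * Returns -1 if the text is malformed (non-numeric data or an incomplete group).
 */
long sum_match_sizes(const char *offsets) {
    assert(offsets != NULL);
    long total = 0;
    int field = 0;            /* position inside the current 4-integer group */
    const char *p = offsets;
    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading whitespace itself */
        if (end == p) break;            /* nothing more could be converted */
        field++;
        if (field == 4) {               /* fourth value is the match size */
            total += v;
            field = 0;
        }
        p = end;
    }
    while (*p == ' ') p++;              /* tolerate trailing spaces */
    if (*p != '\0' || field != 0) return -1;
    return total;
}

/*
 * Mirror of the adjustment in offsets_rank.c: when the caller supplies the
 * query term length and the matched text is longer than it, the score drops
 * to just below that length, so exact-length matches sort first.
 */
long adjust_rank(long rank, long term_len) {
    if (term_len > 0 && rank > term_len) return term_len - 1;
    return rank;
}
```

For the table above, `sum_match_sizes("0 0 0 8")` is 8 and `sum_match_sizes("0 0 0 5")` is 5, so `sleeping` sorts first in descending order. With a term length of 5 (a search for `sleep`), `adjust_rank(8, 5)` gives 4, which places `sleep` ahead of `sleeping`.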
227+
228+
### okapi_bm25
229+
230+
This extension is forked from [sqlite-okapi-bm25](https://github.com/neozenith/sqlite-okapi-bm25), which is released under the [MIT License](https://opensource.org/licenses/MIT). The ranking function uses the built-in [matchinfo](https://www.sqlite.org/fts3.html#matchinfo) function to obtain the data needed to calculate the scores. Make sure you have read both documents before using it.
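For orientation, the per-term part of the standard Okapi BM25 formula can be sketched in plain C as below. This is a minimal sketch, not the code in `okapi_bm25.c`: the function name `bm25_term_score` is hypothetical, the constants `k1 = 1.2` and `b = 0.75` are the conventional defaults and are assumed here, and the inverse document frequency is passed in by the caller rather than derived from `matchinfo` data.

```c
#include <assert.h>

/* Conventional BM25 constants; assumed here, not read from okapi_bm25.c. */
#define BM25_K1 1.2
#define BM25_B  0.75

/*
 * Contribution of one query term to a document's BM25 score.
 *   idf        inverse document frequency of the term (computed by the caller)
 *   term_freq  number of occurrences of the term in this document
 *   doc_len    length of this document in tokens
 *   avg_len    average document length in the table, must be > 0
 */
double bm25_term_score(double idf, double term_freq, double doc_len, double avg_len) {
    assert(avg_len > 0.0);
    /* Length normalisation: documents longer than average are penalised. */
    double norm = 1.0 - BM25_B + BM25_B * (doc_len / avg_len);
    return idf * (term_freq * (BM25_K1 + 1.0)) / (term_freq + BM25_K1 * norm);
}
```

The full document score is the sum of this value over all query terms. Term frequency raises the score with diminishing returns, and a document longer than average scores lower than an average-length one with the same term frequency.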
231+
232+
## License
233+
234+
Except for the files that come from [SQLite](https://www.sqlite.org/copyright.html), all files are released under the [MIT License](https://opensource.org/licenses/MIT).

builtin_extensions/.gitignore

+2
@@ -0,0 +1,2 @@
1+
/obj
2+
/libs

builtin_extensions/jni/Android.mk

+31
@@ -0,0 +1,31 @@
1+
LOCAL_PATH := $(call my-dir)
2+
3+
# build okapi bm25
4+
include $(CLEAR_VARS)
5+
6+
LOCAL_CFLAGS := -std=c99 -Os
7+
8+
LOCAL_MODULE := okapi_bm25
9+
LOCAL_SRC_FILES := okapi_bm25.c
10+
11+
include $(BUILD_SHARED_LIBRARY)
12+
13+
# build offsets rank
14+
include $(CLEAR_VARS)
15+
16+
LOCAL_CFLAGS := -std=c99 -Os
17+
18+
LOCAL_MODULE := offsets_rank
19+
LOCAL_SRC_FILES := offsets_rank.c
20+
21+
include $(BUILD_SHARED_LIBRARY)
22+
23+
# build spell fix
24+
include $(CLEAR_VARS)
25+
26+
LOCAL_CFLAGS := -std=c99 -Os
27+
28+
LOCAL_MODULE := spellfix
29+
LOCAL_SRC_FILES := spellfix1.c
30+
31+
include $(BUILD_SHARED_LIBRARY)

builtin_extensions/jni/Application.mk

+1
@@ -0,0 +1 @@
1+
APP_ABI := armeabi-v7a arm64-v8a

builtin_extensions/jni/offsets_rank.c

+110
@@ -0,0 +1,110 @@
1+
#include <stdlib.h>
2+
#include "sqlite3ext.h"
3+
SQLITE_EXTENSION_INIT1
4+
5+
/**
6+
* Take match size from value returned by function offsets()
7+
*/
8+
static char *next_match_size(char *value, int *size) {
9+
int i = 0;
10+
int token_count = 0;
11+
char x[32];
12+
while (*value != '\0') {
13+
if (*value == ' ') {
14+
x[i] = '\0';
15+
i = 0;
16+
token_count++;
17+
if (token_count == 4) {
18+
*size = atoi(x);
19+
return value + 1;
20+
}
21+
} else {
22+
if (i < (int)sizeof(x) - 1) x[i++] = *value; // never write past the end of x
23+
}
24+
value++;
25+
}
26+
x[i] = '\0';
27+
token_count++;
28+
if (token_count == 4) {
29+
*size = atoi(x);
30+
} else {
31+
*size = -1;
32+
}
33+
return NULL;
34+
}
35+
36+
/**
37+
* SQLite user defined function to use with offsets() to calculate the simple relevancy of
38+
* an FTS match. The value returned is the relevancy score (a real value greater than or
39+
* equal to zero). A larger value indicates a more relevant document.
40+
*
41+
* The value returned by the function offsets() contains 4 integer values for each
42+
* term match; the last value is the size of the matching term in bytes. Typically the size
43+
* equals the length of the given term, but not always. For example, suppose an fts table created
44+
* with the option tokenize=porter contains the following records:
45+
*
46+
* docid content
47+
* ------ -------
48+
* 1 sleep
49+
* 2 sleeping
50+
*
51+
* when we execute query:
52+
*
53+
* select docid, content, offsets(fts) from fts where fts match 'sleeping'
54+
*
55+
* we will get 2 records like:
56+
*
57+
* docid content offsets
58+
* ------ ------- -------
59+
* 1 sleep 0 0 0 5
60+
* 2 sleeping 0 0 0 8
61+
*
62+
* This is not reasonable: we want 'sleeping' ahead of 'sleep'. In other words, we want a
63+
* larger matching size to represent a more relevant document.
64+
*
65+
* The following query returns the docids of documents that match the full-text query <query>
66+
* sorted from most to least relevant.
67+
*
68+
* select docid from fts where fts match <query>
69+
* order by offsets_rank(offsets(fts)) desc
70+
*
71+
*/
72+
static void offsets_rank(sqlite3_context *pCtx, int nVal, sqlite3_value **apVal) {
73+
// Obtain the offsets value
74+
char *offsets_value = nVal > 0 ? (char *)sqlite3_value_text(apVal[0]) : NULL;
75+
if (offsets_value == NULL) {
76+
sqlite3_result_int(pCtx, 1);
77+
return;
78+
}
79+
80+
// Obtain the term length
81+
int termLen = 0;
82+
if (nVal > 1) {
83+
termLen = sqlite3_value_int(apVal[1]);
84+
}
85+
86+
// Calculate rank
87+
int rank = 0;
88+
int len = 0;
89+
char *next_start = offsets_value;
90+
while (next_start != NULL) {
91+
next_start = next_match_size(next_start, &len);
92+
if (len > 0) {
93+
rank += len;
94+
}
95+
}
96+
97+
// Adjust rank
98+
if (termLen > 0 && rank > termLen) {
99+
rank = termLen - 1;
100+
}
101+
102+
sqlite3_result_int(pCtx, rank);
103+
}
104+
105+
int sqlite3_extension_init(sqlite3 *db, char **pzErrMsg, const sqlite3_api_routines *pApi) {
106+
SQLITE_EXTENSION_INIT2(pApi)
107+
108+
int rc = sqlite3_create_function(db, "offsets_rank", -1, SQLITE_UTF8, 0, offsets_rank, 0, 0);
109+
return rc;
110+
}
