| 1 | | -# SQLITE3 WITH ICU SUPPORT FOR ANDROID |
| 1 | +# SQLite3 with ICU and extension support for Android |
2 | 2 |
|
3 |
| -This project copied and modified from [SQLite Android Bindings](http://www.sqlite.org/android/zip/SQLite+Android+Bindings.zip?uuid=trunk) to support **LOAD EXTENSION** and **ICU** |
| 3 | +This project is copied and modified from [SQLite Android Bindings](http://www.sqlite.org/android/zip/SQLite+Android+Bindings.zip?uuid=trunk) to support **extension loading**, **full-text search** and **multi-language word segmentation**. Please read the [SQLite Android Bindings Documentation](https://sqlite.org/android/doc/trunk/www/index.wiki) for more information. |
4 | 4 |
|
5 |
| -This project is for personal use, **DO NOT ENABLE LOAD EXTESNION FOR SECURITY REASON** unless you can ensure the safty. |
| 5 | +Be aware of the following details before using it: |
| 6 | + |
| 7 | +* This project uses SQLite3 version `3.17.0`. |
| 8 | +* The currently supported architectures are `armeabi-v7a` and `arm64-v8a`. |
| 9 | +* Enabling extension loading carries security risks, as described in [the `sqlite3_enable_load_extension` documentation](https://www.sqlite.org/c3ref/enable_load_extension.html). Enable it with care. |
| 10 | + |
| 11 | +## Command-line interface |
| 12 | + |
| 13 | +This project provides command-line tools, which currently support Unix-like operating systems only. Run the following commands to build them: |
| 14 | + |
| 15 | +```sh |
| 16 | +$ cd /<your-project-dir>/host |
| 17 | +$ chmod +x build.sh |
| 18 | +$ ./build.sh |
| 19 | +``` |
| 20 | + |
| 21 | +You will now find an executable named `sqlite3` and 3 extension libraries in the directory `./out`. Run the following commands to check that it works: |
| 22 | + |
| 23 | +```sql |
| 24 | +$ cd out |
| 25 | +$ ./sqlite3 |
| 26 | +SQLite version 3.17.0 2017-02-13 16:02:40 |
| 27 | +Enter ".help" for usage hints. |
| 28 | +Connected to a transient in-memory database. |
| 29 | +Use ".open FILENAME" to reopen on a persistent database. |
| 30 | +sqlite> SELECT sqlite_compileoption_used('ENABLE_LOAD_EXTENSION'); |
| 31 | +1 |
| 32 | +sqlite> SELECT load_extension('./libspellfix'); |
| 33 | + |
| 34 | +sqlite> CREATE VIRTUAL TABLE spellfix USING spellfix1; |
| 35 | +sqlite> INSERT INTO spellfix(word) VALUES('frustrate'); |
| 36 | +sqlite> INSERT INTO spellfix(word) VALUES('Frustration'); |
| 37 | +sqlite> SELECT word FROM spellfix WHERE word MATCH 'frus'; |
| 38 | +frustrate |
| 39 | +Frustration |
| 40 | +``` |
| 41 | + |
| 42 | +See the file `build.sh` for the compile options used. Many more options are available; see [How To Compile SQLite](https://www.sqlite.org/howtocompile.html) for details. |
| 43 | + |
| 44 | +## Build native libraries |
| 45 | + |
| 46 | +First, make sure you have the Android `NDK` installed, then add the following line to your `local.properties` file to build the libraries: |
| 47 | + |
| 48 | +``` |
| 49 | +ndk.dir=/your/ndk/directory |
| 50 | +``` |
| 51 | + |
| 52 | +## Application programming |
| 53 | + |
| 54 | +Load the native library: |
| 55 | + |
| 56 | +```java |
| 57 | +System.loadLibrary("sqliteX"); |
| 58 | +``` |
| 59 | + |
| 60 | +Replace the `android.database.sqlite` namespace with `org.sqlite.database.sqlite`. For example, the following: |
| 61 | + |
| 62 | +```java |
| 63 | +import android.database.sqlite.SQLiteDatabase; |
| 64 | +``` |
| 65 | + |
| 66 | +should be replaced with: |
| 67 | + |
| 68 | +```java |
| 69 | +import org.sqlite.database.sqlite.SQLiteDatabase; |
| 70 | +``` |
| 71 | + |
| 72 | +For more details, please read [this topic](https://sqlite.org/android/doc/trunk/www/usage.wiki). |
| 73 | + |
| 74 | +## FTS3, FTS4 and ICU support |
| 75 | + |
| 76 | +> FTS3 and FTS4 are SQLite virtual table modules that allows users to perform full-text searches on a set of documents. |
| 77 | + |
| 78 | +We use `FTS3` and `FTS4` to perform full-text searches, and `icu` to perform multi-language word segmentation in SQLite. |
| 79 | + |
| 80 | +The following code shows how to use `FTS` and `ICU`. |
| 81 | + |
| 82 | +```java |
| 83 | +SQLiteDatabase db = helper.getWritableDatabase(); |
| 84 | +// Create an FTS table with a single column - "content" |
| 85 | +// that uses the "icu" tokenizer |
| 86 | +db.execSQL("CREATE VIRTUAL TABLE icu_fts USING fts4(tokenize=icu)"); |
| 87 | +// Insert text into the table created above |
| 88 | +db.execSQL("INSERT INTO icu_fts VALUES('Welcome to China.')"); |
| 89 | +db.execSQL("INSERT INTO icu_fts VALUES('Welcome to Beijing.')"); |
| 90 | +db.execSQL("INSERT INTO icu_fts VALUES('中国欢迎你!')"); |
| 91 | +db.execSQL("INSERT INTO icu_fts VALUES('北京欢迎你!')"); |
| 92 | +// Perform full-text searches |
| 93 | +Cursor c = db.rawQuery("SELECT * FROM icu_fts WHERE icu_fts MATCH 'welcome'", null); |
| 94 | +while (c.moveToNext()) { |
| 95 | + Log.d(TAG, "search for 'welcome': " + c.getString(0)); |
| 96 | + // Should be: |
| 97 | + // Welcome to China. |
| 98 | + // Welcome to Beijing. |
| 99 | +} |
| 100 | +c.close(); |
| 101 | +c = db.rawQuery("SELECT * FROM icu_fts WHERE icu_fts MATCH '欢迎'", null); |
| 102 | +while (c.moveToNext()) { |
| 103 | + Log.d(TAG, "search for '欢迎': " + c.getString(0)); |
| 104 | + // Should be: |
| 105 | + // 中国欢迎你! |
| 106 | + // 北京欢迎你! |
| 107 | +} |
| 108 | +c.close(); |
| 109 | +``` |
| 110 | + |
| 111 | +You can use binary operators such as `OR` to perform logical searches, and combine them with the auxiliary functions (`snippet()`, `offsets()` and `matchinfo()`) for more complex searches. For more details, please read [the documentation](https://www.sqlite.org/fts3.html). |
| 112 | + |
| 113 | +## Extension loading |
| 114 | + |
| 115 | +> SQLite has the ability to load extensions (including new application-defined SQL functions, collating sequences, virtual tables, and VFSes) at run-time. This feature allows the code for extensions to be developed and tested separately from the application and then loaded on an as-needed basis. |
| 116 | + |
| 117 | +Basically, a SQLite extension is a "plugin" that implements a set of specific functions and can be loaded into SQLite dynamically. |
| 118 | + |
| 119 | +The following code shows how to check if extension loading is supported: |
| 120 | + |
| 121 | +```java |
| 122 | +SQLiteDatabase db = helper.getWritableDatabase(); |
| 123 | +Cursor c = db.rawQuery("SELECT sqlite_compileoption_used('ENABLE_LOAD_EXTENSION')", null); |
| 124 | +// Move to the first row before reading it; the result must not be 0 |
| 125 | +assert c.moveToFirst() && c.getInt(0) != 0; |
| 126 | +``` |
| 127 | + |
| 128 | +Enable or disable extension loading: |
| 129 | + |
| 130 | +```java |
| 131 | +// The following code CANNOT be run inside a transaction |
| 132 | + |
| 133 | +// Enable extension loading |
| 134 | +db.enableLoadExtension(true); |
| 135 | +// Disable extension loading |
| 136 | +db.enableLoadExtension(false); |
| 137 | +``` |
| 138 | + |
| 139 | +Once extension loading is enabled, you can load your extensions. Taking `spellfix` as an example: |
| 140 | + |
| 141 | +```java |
| 142 | +// The extension loaded successfully if no exception is thrown |
| 143 | +Cursor c = db.rawQuery("SELECT load_extension('libspellfix')", null); |
| 144 | +c.moveToFirst(); |
| 145 | +Log.i(TAG, "Load spellfix, result = " + c.getInt(0)); |
| 146 | +``` |
| 147 | + |
| 148 | +Writing your own extensions is also simple; be sure to read [Run-Time Loadable Extensions](https://www.sqlite.org/loadext.html) first. |
| 149 | + |
| 150 | +## Builtin extensions |
| 151 | + |
| 152 | +There are 3 builtin extensions: `offsets_rank`, `okapi_bm25` and `spellfix`. Their source code is located in the directory `builtin_extensions`. These extensions are enabled by default; you can disable them by adding the following line to your `local.properties` file. |
| 153 | + |
| 154 | +``` |
| 155 | +useBuiltinExtensions=false |
| 156 | +``` |
| 157 | + |
| 158 | +### The spellfix1 virtual table |
| 159 | + |
| 160 | +[The documentation](https://www.sqlite.org/spellfix1.html) says: |
| 161 | + |
| 162 | +> This spellfix1 virtual table can be used to search a large vocabulary for close matches. For example, spellfix1 can be used to suggest corrections to misspelled words. Or, it could be used with FTS4 to do full-text search using potentially misspelled words. |
| 163 | + |
| 164 | +You can download the latest source code from [here](https://www.sqlite.org/src/finfo?name=ext/misc/spellfix.c). |
| 165 | + |
| 166 | +A quick look: |
| 167 | + |
| 168 | +```sql |
| 169 | +sqlite> SELECT load_extension('./libspellfix'); |
| 170 | + |
| 171 | +sqlite> CREATE VIRTUAL TABLE demo USING spellfix1; |
| 172 | +sqlite> INSERT INTO demo(word, rank) VALUES('frustrate', 2); |
| 173 | +sqlite> INSERT INTO demo(word, rank) VALUES('Frustration', 3); |
| 174 | +sqlite> INSERT INTO demo(word, rank) VALUES('frustate', 1); |
| 175 | +sqlite> SELECT word FROM demo WHERE word MATCH 'fru*'; |
| 176 | +frustrate |
| 177 | +Frustration |
| 178 | +frustate |
| 179 | +``` |
| 180 | + |
| 181 | +More details can be found in [the spellfix1 documentation](https://www.sqlite.org/spellfix1.html). |
| 182 | + |
| 183 | +### offsets_rank |
| 184 | + |
| 185 | +An extension to use with the function [offsets()](https://www.sqlite.org/fts3.html#the_offsets_function) to calculate simple relevancy of an FTS match. The value returned is the relevancy score (a real value greater than or equal to zero). A larger value indicates a more relevant document. |
| 186 | + |
| 187 | +The value returned by the function `offsets()` contains 4 integers for each matching term; the last of them is the size of the matching term in bytes. Typically this size equals the size of the queried term, but not always. For example, suppose an FTS table is created with the option `tokenize=porter` and contains the following records: |
| 188 | + |
| 189 | +``` |
| 190 | +docid content |
| 191 | +------ ------- |
| 192 | +1 sleep |
| 193 | +2 sleeping |
| 194 | +``` |
| 195 | + |
| 196 | +Suppose we execute the following queries: |
| 197 | + |
| 198 | +```sql |
| 199 | +SELECT docid, content, offsets(fts) FROM fts WHERE fts MATCH 'sleeping'; |
| 200 | +SELECT docid, content, offsets(fts) FROM fts WHERE fts MATCH 'sleep'; |
| 201 | +``` |
| 202 | + |
| 203 | +Both queries return exactly the same results: |
| 204 | + |
| 205 | +``` |
| 206 | +docid content offsets |
| 207 | +------ ------- ------- |
| 208 | +1 sleep 0 0 0 5 |
| 209 | +2 sleeping 0 0 0 8 |
| 210 | +``` |
| 211 | + |
| 212 | +However, a search for `sleeping` should give the record `sleeping` a higher score than the record `sleep`. The function `offsets_rank` parses the value returned by the function `offsets()` and adjusts the relevancy score accordingly. The following query returns the documents that match the full-text query, sorted from most to least relevant: |
| 213 | + |
| 214 | +```sql |
| 215 | +SELECT docid, content FROM fts WHERE fts MATCH 'sleeping' |
| 216 | + ORDER BY offsets_rank(offsets(fts)) DESC; |
| 217 | +``` |
| 218 | + |
| 219 | +The results are: |
| 220 | + |
| 221 | +``` |
| 222 | +docid content |
| 223 | +------ -------- |
| 224 | +2 sleeping |
| 225 | +1 sleep |
| 226 | +``` |
| 227 | + |
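To make the format of the `offsets()` value concrete, here is a minimal Java sketch that splits it into groups of 4 integers (column number, term number, byte offset, size in bytes). The class `OffsetsSketch` and its scoring rule are hypothetical and for illustration only; they are not the implementation of the `offsets_rank` extension, whose actual formula lives in `builtin_extensions`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of this project): parses the text returned by offsets(). */
final class OffsetsSketch {
    /** One match: column number, term number, byte offset and size in bytes. */
    static final class Match {
        final int column;
        final int term;
        final int byteOffset;
        final int byteSize;

        Match(int column, int term, int byteOffset, int byteSize) {
            this.column = column;
            this.term = term;
            this.byteOffset = byteOffset;
            this.byteSize = byteSize;
        }
    }

    /** Splits a space-separated offsets() value into groups of 4 integers. */
    static List<Match> parse(String offsets) {
        List<Match> matches = new ArrayList<>();
        String trimmed = offsets.trim();
        if (trimmed.isEmpty()) {
            return matches;
        }
        String[] parts = trimmed.split("\\s+");
        if (parts.length % 4 != 0) {
            throw new IllegalArgumentException("expected groups of 4 integers");
        }
        for (int i = 0; i < parts.length; i += 4) {
            matches.add(new Match(
                    Integer.parseInt(parts[i]),
                    Integer.parseInt(parts[i + 1]),
                    Integer.parseInt(parts[i + 2]),
                    Integer.parseInt(parts[i + 3])));
        }
        return matches;
    }

    /**
     * Illustrative score only (an assumption, not the extension's formula):
     * each match contributes the ratio of the shorter to the longer of the
     * matched size and the query size, so exact-length matches score highest.
     */
    static double score(String offsets, int queryBytes) {
        if (queryBytes <= 0) {
            throw new IllegalArgumentException("queryBytes must be positive");
        }
        double total = 0.0;
        for (Match m : parse(offsets)) {
            total += (double) Math.min(m.byteSize, queryBytes)
                    / Math.max(m.byteSize, queryBytes);
        }
        return total;
    }
}
```

Under this illustrative rule, a query of 8 bytes (`sleeping`) scores the offsets value `0 0 0 8` higher than `0 0 0 5`, which mirrors the ordering shown above.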
| 228 | +### okapi_bm25 |
| 229 | + |
| 230 | +This extension is a fork of [sqlite-okapi-bm25](https://github.com/neozenith/sqlite-okapi-bm25), which is released under the [MIT License](https://opensource.org/licenses/MIT). The ranking function uses the built-in [matchinfo](https://www.sqlite.org/fts3.html#matchinfo) function to obtain the data necessary to calculate the scores. Make sure you have read both documents before using it. |
| 231 | + |
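For orientation, the textbook Okapi BM25 score of a single query term can be sketched in Java as below. It takes the kind of statistics that `matchinfo` can report (total document count, number of documents containing the term, term frequency, document length and average document length). The class name `Bm25Sketch` and the constants `K1 = 1.2` and `B = 0.75` are assumptions chosen for illustration; the extension's own constants and exact formula may differ, so check its source before relying on specific values.

```java
/** Hypothetical sketch of the textbook Okapi BM25 formula (not this project's code). */
final class Bm25Sketch {
    // Commonly used default parameters; assumed here for illustration.
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Classic inverse document frequency: ln((N - n + 0.5) / (n + 0.5)). */
    static double idf(long totalDocs, long docsWithTerm) {
        return Math.log((totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    }

    /** BM25 contribution of one query term for one document. */
    static double termScore(long totalDocs, long docsWithTerm,
                            double termFrequency,
                            double docLength, double avgDocLength) {
        // Length normalization: longer-than-average documents are penalized.
        double norm = K1 * (1.0 - B + B * docLength / avgDocLength);
        return idf(totalDocs, docsWithTerm)
                * termFrequency * (K1 + 1.0) / (termFrequency + norm);
    }
}
```

The score of a whole query is the sum of `termScore` over its terms. With the classic IDF used here, a term that appears in more than half of the documents gets a negative weight, which is one reason some implementations use a modified IDF.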
| 232 | +## License |
| 233 | + |
| 234 | +Except for the files included in [SQLite](https://www.sqlite.org/copyright.html), all files are released under the [MIT License](https://opensource.org/licenses/MIT). |