|
| 1 | += Migrating your Data from MySQL to Couchbase Server |
| 2 | +:description: Using MySQL as a starting point, this guide demonstrates \ |
| 3 | +how to migrate your existing data from SQL tables to documents stored in a Couchbase bucket. |
| 4 | +:page-topic-type: guide |
| 5 | +:page-pagination: next |
| 6 | + |
| 7 | +[abstract] |
| 8 | +{description} |
| 9 | + |
| 10 | +== Introduction |
| 11 | + |
| 12 | +Couchbase offers a number of strategies for migrating your existing data; |
| 13 | +in this example, we will begin with a sample student record database stored in MySQL, |
| 14 | +and use the Couchbase server tool `cbimport` to copy the data into a Couchbase cluster. |
| 15 | + |
| 16 | +== Prerequisites |
| 17 | + |
| 18 | +Before you begin this exercise, you should have installed and set up a Couchbase cluster on your local machine. |
| 19 | +You will find instructions for creating a fresh cluster here: xref:getting-started:do-a-quick-install.adoc[Couchbase Server Installation] |
| 20 | + |
| 21 | +To use `cbimport`, you will need to install the Couchbase `CLI` package. |
| 22 | +You will find the location of the package and instructions for installing it, here: xref:cli:cli-intro.adoc[] |
| 23 | + |
| 24 | +If you're running through the examples, |
| 25 | +then you will also need an existing MySQL installation with the preexisting table structure |
| 26 | +defined in xref:student-record-sql-database-section[the following section]. |
| 27 | + |
| 28 | +IMPORTANT: This tutorial makes use of the MySQL JSON functions that were introduced in version `5.7.22`. |
| 29 | +Make sure you have installed MySQL version `5.7.22` or later. |
| 30 | + |
| 31 | +[#student-record-sql-database-section] |
| 32 | +== Student Record database |
| 33 | + |
| 34 | +The database we will convert will consist of a relational structure with three tables: |
| 35 | + |
| 36 | +[plantuml,student-record-erd] |
| 37 | +.Student records SQL database |
| 38 | +.... |
| 39 | +include::partial$diagrams/student-record-erd.puml[] |
| 40 | +.... |
| 41 | + |
| 42 | +which we will convert to a document model suitable for storage in our Couchbase bucket: |
| 43 | + |
| 44 | +[#document-model] |
| 45 | +[plantuml,student-document-database-design] |
| 46 | +.Student document model |
| 47 | +.... |
| 48 | +include::partial$diagrams/student-document-database-design.puml[] |
| 49 | +.... |
| 50 | + |
| 51 | +You will see that our document model is not an exact mapping of the SQL database: |
| 52 | +we have taken the `enrollments` records and added them directly as a list of sub-documents |
| 53 | +within each student record: |
| 54 | + |
| 55 | +[source, json] |
| 56 | +---- |
| 57 | +[ |
| 58 | +{"student-id": 1, |
| 59 | +"enrollments": [{"course-id": 3, "final-score": null, "date-enrolled": "2024-04-18", "date-completed": null}, {"course-id": 1, "final-score": null, "date-enrolled": "2025-03-05", "date-completed": null}], "student-name": "Harriet Hill", "date-of-birth": "1970-03-06"}, |
| 60 | +{"student-id": 2, "enrollments": [{"course-id": 1, "final-score": null, "date-enrolled": "2025-03-01", "date-completed": null}], "student-name": "Steven Morris", "date-of-birth": "1984-03-05"}, |
| 61 | +{"student-id": 3, "enrollments": [{"course-id": null, "final-score": null, "date-enrolled": null, "date-completed": null}], "student-name": "Jenny Mills", "date-of-birth": "1969-11-06"} |
| 62 | +] |
| 63 | +---- |
| 64 | + |
| 65 | +TIP: In the early stages of your migration, it is a good idea to design the new structure of your Couchbase collections. |
| 66 | +This will make it easier to work out the `cbimport` command parameters you will need for the migration. |
| 67 | + |
| 68 | + |
| 69 | +== Step {counter:step}: Extract your Course data from MySQL |
| 70 | + |
| 71 | +The first stage of your migration is to extract the data a file format that the `cbimport` utility can work with. |
| 72 | +`cbimport` can work with comma-separated value files or JSON-formatted files. |
| 73 | +Because we already know that we will be embedding our `enrollment` records into the record for each student, |
| 74 | +makes sense to use the more versatile JSON structure. |
| 75 | + |
| 76 | +Fortunately, MySQL has a number of SQL functions that make working with JSON data fairly straightforward, |
| 77 | +so we'll start by migrating the `course` table into a JSON file: |
| 78 | + |
| 79 | +[source, mysql] |
| 80 | +.Extract the `course` table into the file: `/var/lib/mysql-files/courses.json` |
| 81 | +---- |
| 82 | +SELECT JSON_OBJECT( |
| 83 | + 'course-id', course.`course-id`, |
| 84 | + 'course-name', course.`course-name`, |
| 85 | + 'faculty', course.faculty, |
| 86 | + 'credit-points', course.`credit-points` |
| 87 | + ) FROM course |
| 88 | +INTO OUTFILE '/var/lib/mysql-files/courses.json' |
| 89 | +---- |
| 90 | + |
| 91 | +Using the `JSON_OBJECT` function, the command will `SELECT` every record in the table and output it to a file. |
| 92 | +Each line of the file will correspond to a single record: |
| 93 | + |
| 94 | +[source,jsonlines] |
| 95 | +---- |
| 96 | +{"faculty": "Art", "course-id": 1, "course-name": "Art History", "credit-points": 50} |
| 97 | +{"faculty": "Art", "course-id": 2, "course-name": "Fine Art", "credit-points": 30} |
| 98 | +{"faculty": "Design", "course-id": 3, "course-name": "Graphic Design", "credit-points": 70} |
| 99 | +{"faculty": "English", "course-id": 4, "course-name": "Creative Writing", "credit-points": 70} |
| 100 | +---- |
| 101 | + |
| 102 | +NOTE: Strictly speaking, the JSON output is not a well-formed JSON document because it isn't structured as an array. |
| 103 | +Nevertheless, `cbimport` will read each line as a separate record. |
| 104 | + |
| 105 | +== Step {counter:step}: Extract your Student data from MySQL |
| 106 | + |
| 107 | +This case is slightly different because we want to include the enrollment details with each student record |
| 108 | +(see the xref:document-model[]) |
| 109 | + |
| 110 | +We can handle this JSON structure by using a more involved SELECT: |
| 111 | +as well as extracting the student records, we can simultaneously pull in the enrollments for each student: |
| 112 | + |
| 113 | +[source,mysql] |
| 114 | +.Extract `students` and their `enrollments`. |
| 115 | +---- |
| 116 | +SELECT JSON_OBJECT( |
| 117 | + 'student-id', student.`student-id`, |
| 118 | + 'student-name', student.name, |
| 119 | + 'date-of-birth', student.`date-of-birth`, |
| 120 | + 'enrollments', IF (COUNT(enrollment.`course-id`) = 0, JSON_ARRAY(), JSON_ARRAYAGG( |
| 121 | + JSON_OBJECT( |
| 122 | + 'course-id', enrollment.`course-id`, |
| 123 | + 'date-enrolled', enrollment.`date-enrolled`, |
| 124 | + 'date-completed', enrollment.`date-completed`, |
| 125 | + 'final-score', enrollment.`score` |
| 126 | + ) |
| 127 | + )) |
| 128 | + ) |
| 129 | +FROM student |
| 130 | + LEFT OUTER JOIN enrollment ON enrollment.`student-id` = student.`student-id` |
| 131 | +GROUP BY student.`student-id` |
| 132 | +INTO OUTFILE '/var/lib/mysql-files/students.json'; |
| 133 | +---- |
| 134 | + |
| 135 | +In addition to the `JSON_OBJECT` function call that extracts the student details, |
| 136 | +we are also using the `JSON_ARRAYAGG` function to build an array within each student record. |
| 137 | +The data for this list is retrieved through the `LEFT OUTER JOIN` |
| 138 | +which provides the foreign key link between the student and the enrollment record. |
| 139 | + |
| 140 | +We also use the ``IF (COUNT(enrollment.`course-id`) = 0`` statement |
| 141 | +to ensure that there are existing enrollment records attached to the current student. |
| 142 | +If there are no enrollment records, then that portion of the query uses `JSON_ARRAY()` to return an empty list. |
| 143 | + |
| 144 | +[source, jsonlines] |
| 145 | +---- |
| 146 | +{"student-id": 1, "enrollments": [{"course-id": 3, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}, {"course-id": 1, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Hilary Wells", "date-of-birth": "1990-08-09"} |
| 147 | +{"student-id": 2, "enrollments": [{"course-id": 2, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Ashley Matthews", "date-of-birth": "1987-07-01"} |
| 148 | +{"student-id": 3, "enrollments": [{"course-id": 1, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Boregard Johnson", "date-of-birth": "1985-03-23"} |
| 149 | +{"student-id": 4, "enrollments": [], "student-name": "Toni Jones", "date-of-birth": "1984-10-02"} |
| 150 | +---- |
| 151 | + |
| 152 | +== Step {counter:step}: Create your bucket, scope, and collections. |
| 153 | + |
| 154 | +You will need to create the bucket, scope, and collections to hold the data on your Couchbase cluster. |
| 155 | + |
| 156 | +For information on creating buckets, scopes, and collections, |
| 157 | +read the sections on xref:manage:manage-buckets/bucket-management-overview.adoc[Managing Buckets] |
| 158 | +and xref:manage:manage-scopes-and-collections/manage-scopes-and-collections.adoc[Managing Scopes and Collections] |
| 159 | + |
| 160 | +''' |
| 161 | + |
| 162 | +.Set up your cluster |
| 163 | + |
| 164 | +. Using the Couchbase admin console, the command line tool, or the REST API, |
| 165 | +create a new bucket on your cluster called `student-bucket`. |
| 166 | + |
| 167 | +. Create a new scope called `art-school-scope` within `student-bucket`. |
| 168 | + |
| 169 | +. Create two new collections (`student-record-collection` and `course-record-collection`) inside `art-school-scope`. |
| 170 | + |
| 171 | + |
| 172 | +== Step {counter:step}: Import your data |
| 173 | + |
| 174 | +In this step, you will use `cbimport` to load your two JSON files into your cluster. |
| 175 | + |
| 176 | +''' |
| 177 | + |
| 178 | +.Import the course data |
| 179 | + |
| 180 | +Use the following command to import `courses.json` into your cluster. |
| 181 | + |
| 182 | +[source,console] |
| 183 | +---- |
| 184 | +./cbimport json --cluster 127.0.0.1:8091 \ |
| 185 | + --username Administrator --password password \ |
| 186 | + --bucket student-bucket \ |
| 187 | + --dataset file:///var/lib/mysql-files/courses.json \ |
| 188 | + --format lines \ |
| 189 | + --generate-key %course-id% \ |
| 190 | + --scope-collection-exp art-school-scope.course-record-collection |
| 191 | +---- |
| 192 | + |
| 193 | +The parameters used are as follows: |
| 194 | + |
| 195 | +[horizontal,labelwith=25,itemwidth=75] |
| 196 | +`--cluster`:: |
| 197 | +The address and port of the Couchbase cluster receiving the imported data. |
| 198 | + |
| 199 | +`--username`:: |
| 200 | +A valid admin-level user to log on to the cluster |
| 201 | + |
| 202 | +`--bucket`:: |
| 203 | +The name of the destination bucket for the imported data. |
| 204 | + |
| 205 | +`--dataset`:: |
| 206 | +The full path of the JSON file where the import data can be found. |
| 207 | ++ |
| 208 | +IMPORTANT: Remember to include the `file:://` prefix. |
| 209 | + |
| 210 | +`--format`:: |
| 211 | +This is the of the JSON data that `cbimport` is importing. |
| 212 | +The value can be `lines` or `lists`. |
| 213 | +For this exercise, the value should be set to `lines`. |
| 214 | ++ |
| 215 | +For a detailed explanation of the `--format`, see xref:tools:cbimport-json.adoc#DATASET_FORMATS[Dataset formats]. |
| 216 | + |
| 217 | +`--generate-key`:: |
| 218 | +This tells `cbimport` how to generate the key for the imported data. |
| 219 | +You can use any combination of fields in the data to generate the key. |
| 220 | +In this exercise, we simply set the key to match the `course-id` field in the imported data. |
| 221 | + |
| 222 | +`--scope-collection-exp`:: |
| 223 | +This defines an expression that tells `cbimport` which scope and collection the data will be imported to. |
| 224 | +The expression can be a static value (as we have used above), or a combination of field identifiers from the import data. |
| 225 | ++ |
| 226 | +For more information, see the section on the xref:tools:cbimport-json.adoc#SCOPE_COLLECTION_PARSER[Scope/Collection Parser] |
| 227 | + |
| 228 | + |
| 229 | +''' |
| 230 | + |
| 231 | +.Import the student data |
| 232 | + |
| 233 | +The student JSON file can be imported in much the same way: |
| 234 | + |
| 235 | +[source, console] |
| 236 | +---- |
| 237 | +./cbimport json --cluster 127.0.0.1:8091 \ |
| 238 | +--username Administrator \ |
| 239 | +--password password \ |
| 240 | +--bucket student-bucket \ |
| 241 | +--dataset file:///var/lib/mysql-files/students.json \ |
| 242 | +--format lines \ |
| 243 | +--generate-key %student-id% \ |
| 244 | +--scope-collection-exp art-school-scope.student-record-collection |
| 245 | +---- |
| 246 | + |
| 247 | +== Step {counter:step}: Check your data |
| 248 | + |
| 249 | +Use the web admin console to examine your imported records to make sure they are correct. |
| 250 | + |
| 251 | +image::tutorials:cbimported-data.png[] |
| 252 | + |
| 253 | +== Further reading |
| 254 | + |
| 255 | +For more information about `cbimport`, read the xref:tools:cbimport.adoc[cbimport guide]. |
| 256 | + |
| 257 | +If you would like to know more about MySQL JSON functions, |
| 258 | +then you will find a comprehensive reference https://dev.mysql.com/doc/refman/9.2/en/json-function-reference.html[here]. |
| 259 | + |
| 260 | + |
| 261 | + |
| 262 | + |
| 263 | + |
| 264 | + |
| 265 | + |
| 266 | + |
0 commit comments