Skip to content

Commit b949ede

Browse files
committed
[DOC-13093]: Write a tutorial for SQL migration.
Add SQL to Capella migration tutorial This commit introduces a new guide on migrating data from MySQL to Couchbase Capella, providing step-by-step instructions for users. Additionally, the navigation file has been updated to include a link to the new tutorial. Rename SQL migration tutorial to specify Couchbase Server. Update tutorial to specify MySQL and minimum version requirement Clarify that the tutorial focuses on migrating data specifically from MySQL. Add a note requiring MySQL version 5.7.22 or later due to JSON function usage. Minor grammar and style adjustments. Add reference to MySQL JSON functions in the SQL migration guide Provide a link to the MySQL JSON function reference for additional context and detailed information. Changes for preview Revised headers for consistency in cbimport guide and updated example JSON data in the SQL migration tutorial to better align with a new data model. Added detailed instructions and images for setting up buckets, scopes, collections, and importing data. Update SQL migration tutorial with the document model and examples Add detailed steps, diagrams, and JSON examples for migrating SQL data to a document model, including student and course data. Update the enrollment schema with a 'score' field and clarify how to structure JSON for Couchbase imports. New tutorial page and addition to the navigation menu.
1 parent 964a692 commit b949ede

File tree

6 files changed

+285
-1
lines changed

6 files changed

+285
-1
lines changed

antora.yml

+5
Original file line numberDiff line numberDiff line change
@@ -10,5 +10,10 @@ ext:
1010
sources:
1111
docs-server:
1212
branches: [release/7.6]
13+
backup:
14+
branches: [ master ]
15+
startPaths: docs
1316
override:
1417
startPage: server:introduction:intro.adoc
18+
19+
352 KB
Loading
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
= Migrating your Data from MySQL to Capella
2+
:description: Using MySQL as a starting point, this guide demonstrates \
3+
how to migrate your existing data from SQL tables to a Couchbase Capella instance.
4+
:page-topic-type: guide
5+
:page-pagination: full
6+
7+
[abstract]
8+
{description}
9+
10+
== Introduction
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,266 @@
1+
= Migrating your Data from MySQL to Couchbase Server
2+
:description: Using MySQL as a starting point, this guide demonstrates \
3+
how to migrate your existing data from SQL tables to documents stored in a Couchbase bucket.
4+
:page-topic-type: guide
5+
:page-pagination: next
6+
7+
[abstract]
8+
{description}
9+
10+
== Introduction
11+
12+
Couchbase offers a number of strategies for migrating your existing data;
13+
in this example, we will begin with a sample student record database stored in MySQL,
14+
and use the Couchbase server tool `cbimport` to copy the data into a Couchbase cluster.
15+
16+
== Prerequisites
17+
18+
Before you begin this exercise, you should have installed and set up a Couchbase cluster on your local machine.
19+
You will find instructions for creating a fresh cluster here: xref:getting-started:do-a-quick-install.adoc[Couchbase Server Installation]
20+
21+
To use `cbimport`, you will need to install the Couchbase `CLI` package.
22+
You will find the location of the package and instructions for installing it, here: xref:cli:cli-intro.adoc[]
23+
24+
If you're running through the examples,
25+
then you will also need an existing MySQL installation with the preexisting table structure
26+
defined in xref:student-record-sql-database-section[the following section].
27+
28+
IMPORTANT: This tutorial makes use of the MySQL JSON functions that were introduced in version `5.7.22`.
29+
Make sure you have installed MySQL version `5.7.22` or later.
30+
31+
[#student-record-sql-database-section]
32+
== Student Record database
33+
34+
The database we will convert will consist of a relational structure with three tables:
35+
36+
[plantuml,student-record-erd]
37+
.Student records SQL database
38+
....
39+
include::partial$diagrams/student-record-erd.puml[]
40+
....
41+
42+
which we will convert to a document model suitable for storage in our Couchbase bucket:
43+
44+
[#document-model]
45+
[plantuml,student-document-database-design]
46+
.Student document model
47+
....
48+
include::partial$diagrams/student-document-database-design.puml[]
49+
....
50+
51+
You will see that our document model is not an exact mapping of the SQL database:
52+
we have taken the `enrollments` records and added them directly as a list of sub-documents
53+
within each student record:
54+
55+
[source, json]
56+
----
57+
[
58+
{"student-id": 1,
59+
"enrollments": [{"course-id": 3, "final-score": null, "date-enrolled": "2024-04-18", "date-completed": null}, {"course-id": 1, "final-score": null, "date-enrolled": "2025-03-05", "date-completed": null}], "student-name": "Harriet Hill", "date-of-birth": "1970-03-06"},
60+
{"student-id": 2, "enrollments": [{"course-id": 1, "final-score": null, "date-enrolled": "2025-03-01", "date-completed": null}], "student-name": "Steven Morris", "date-of-birth": "1984-03-05"},
61+
{"student-id": 3, "enrollments": [{"course-id": null, "final-score": null, "date-enrolled": null, "date-completed": null}], "student-name": "Jenny Mills", "date-of-birth": "1969-11-06"}
62+
]
63+
----
64+
65+
TIP: In the early stages of your migration, it is a good idea to design the new structure of your Couchbase collections.
66+
This will make it easier to work out the `cbimport` command parameters you will need for the migration.
67+
68+
69+
== Step {counter:step}: Extract your Course data from MySQL
70+
71+
The first stage of your migration is to extract the data a file format that the `cbimport` utility can work with.
72+
`cbimport` can work with comma-separated value files or JSON-formatted files.
73+
Because we already know that we will be embedding our `enrollment` records into the record for each student,
74+
makes sense to use the more versatile JSON structure.
75+
76+
Fortunately, MySQL has a number of SQL functions that make working with JSON data fairly straightforward,
77+
so we'll start by migrating the `course` table into a JSON file:
78+
79+
[source, mysql]
80+
.Extract the `course` table into the file: `/var/lib/mysql-files/courses.json`
81+
----
82+
SELECT JSON_OBJECT(
83+
'course-id', course.`course-id`,
84+
'course-name', course.`course-name`,
85+
'faculty', course.faculty,
86+
'credit-points', course.`credit-points`
87+
) FROM course
88+
INTO OUTFILE '/var/lib/mysql-files/courses.json'
89+
----
90+
91+
Using the `JSON_OBJECT` function, the command will `SELECT` every record in the table and output it to a file.
92+
Each line of the file will correspond to a single record:
93+
94+
[source,jsonlines]
95+
----
96+
{"faculty": "Art", "course-id": 1, "course-name": "Art History", "credit-points": 50}
97+
{"faculty": "Art", "course-id": 2, "course-name": "Fine Art", "credit-points": 30}
98+
{"faculty": "Design", "course-id": 3, "course-name": "Graphic Design", "credit-points": 70}
99+
{"faculty": "English", "course-id": 4, "course-name": "Creative Writing", "credit-points": 70}
100+
----
101+
102+
NOTE: Strictly speaking, the JSON output is not a well-formed JSON document because it isn't structured as an array.
103+
Nevertheless, `cbimport` will read each line as a separate record.
104+
105+
== Step {counter:step}: Extract your Student data from MySQL
106+
107+
This case is slightly different because we want to include the enrollment details with each student record
108+
(see the xref:document-model[])
109+
110+
We can handle this JSON structure by using a more involved SELECT:
111+
as well as extracting the student records, we can simultaneously pull in the enrollments for each student:
112+
113+
[source,mysql]
114+
.Extract `students` and their `enrollments`.
115+
----
116+
SELECT JSON_OBJECT(
117+
'student-id', student.`student-id`,
118+
'student-name', student.name,
119+
'date-of-birth', student.`date-of-birth`,
120+
'enrollments', IF (COUNT(enrollment.`course-id`) = 0, JSON_ARRAY(), JSON_ARRAYAGG(
121+
JSON_OBJECT(
122+
'course-id', enrollment.`course-id`,
123+
'date-enrolled', enrollment.`date-enrolled`,
124+
'date-completed', enrollment.`date-completed`,
125+
'final-score', enrollment.`score`
126+
)
127+
))
128+
)
129+
FROM student
130+
LEFT OUTER JOIN enrollment ON enrollment.`student-id` = student.`student-id`
131+
GROUP BY student.`student-id`
132+
INTO OUTFILE '/var/lib/mysql-files/students.json';
133+
----
134+
135+
In addition to the `JSON_OBJECT` function call that extracts the student details,
136+
we are also using the `JSON_ARRAYAGG` function to build an array within each student record.
137+
The data for this list is retrieved through the `LEFT OUTER JOIN`
138+
which provides the foreign key link between the student and the enrollment record.
139+
140+
We also use the ``IF (COUNT(enrollment.`course-id`) = 0`` statement
141+
to ensure that there are existing enrollment records attached to the current student.
142+
If there are no enrollment records, then that portion of the query uses `JSON_ARRAY()` to return an empty list.
143+
144+
[source, jsonlines]
145+
----
146+
{"student-id": 1, "enrollments": [{"course-id": 3, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}, {"course-id": 1, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Hilary Wells", "date-of-birth": "1990-08-09"}
147+
{"student-id": 2, "enrollments": [{"course-id": 2, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Ashley Matthews", "date-of-birth": "1987-07-01"}
148+
{"student-id": 3, "enrollments": [{"course-id": 1, "final-score": 0, "date-enrolled": "2025-03-10", "date-completed": null}], "student-name": "Boregard Johnson", "date-of-birth": "1985-03-23"}
149+
{"student-id": 4, "enrollments": [], "student-name": "Toni Jones", "date-of-birth": "1984-10-02"}
150+
----
151+
152+
== Step {counter:step}: Create your bucket, scope, and collections.
153+
154+
You will need to create the bucket, scope, and collections to hold the data on your Couchbase cluster.
155+
156+
For information on creating buckets, scopes, and collections,
157+
read the sections on xref:manage:manage-buckets/bucket-management-overview.adoc[Managing Buckets]
158+
and xref:manage:manage-scopes-and-collections/manage-scopes-and-collections.adoc[Managing Scopes and Collections]
159+
160+
'''
161+
162+
.Set up your cluster
163+
164+
. Using the Couchbase admin console, the command line tool, or the REST API,
165+
create a new bucket on your cluster called `student-bucket`.
166+
167+
. Create a new scope called `art-school-scope` within `student-bucket`.
168+
169+
. Create two new collections (`student-record-collection` and `course-record-collection`) inside `art-school-scope`.
170+
171+
172+
== Step {counter:step}: Import your data
173+
174+
In this step, you will use `cbimport` to load your two JSON files into your cluster.
175+
176+
'''
177+
178+
.Import the course data
179+
180+
Use the following command to import `courses.json` into your cluster.
181+
182+
[source,console]
183+
----
184+
./cbimport json --cluster 127.0.0.1:8091 \
185+
--username Administrator --password password \
186+
--bucket student-bucket \
187+
--dataset file:///var/lib/mysql-files/courses.json \
188+
--format lines \
189+
--generate-key %course-id% \
190+
--scope-collection-exp art-school-scope.course-record-collection
191+
----
192+
193+
The parameters used are as follows:
194+
195+
[horizontal,labelwith=25,itemwidth=75]
196+
`--cluster`::
197+
The address and port of the Couchbase cluster receiving the imported data.
198+
199+
`--username`::
200+
A valid admin-level user to log on to the cluster
201+
202+
`--bucket`::
203+
The name of the destination bucket for the imported data.
204+
205+
`--dataset`::
206+
The full path of the JSON file where the import data can be found.
207+
+
208+
IMPORTANT: Remember to include the `file:://` prefix.
209+
210+
`--format`::
211+
This is the of the JSON data that `cbimport` is importing.
212+
The value can be `lines` or `lists`.
213+
For this exercise, the value should be set to `lines`.
214+
+
215+
For a detailed explanation of the `--format`, see xref:tools:cbimport-json.adoc#DATASET_FORMATS[Dataset formats].
216+
217+
`--generate-key`::
218+
This tells `cbimport` how to generate the key for the imported data.
219+
You can use any combination of fields in the data to generate the key.
220+
In this exercise, we simply set the key to match the `course-id` field in the imported data.
221+
222+
`--scope-collection-exp`::
223+
This defines an expression that tells `cbimport` which scope and collection the data will be imported to.
224+
The expression can be a static value (as we have used above), or a combination of field identifiers from the import data.
225+
+
226+
For more information, see the section on the xref:tools:cbimport-json.adoc#SCOPE_COLLECTION_PARSER[Scope/Collection Parser]
227+
228+
229+
'''
230+
231+
.Import the student data
232+
233+
The student JSON file can be imported in much the same way:
234+
235+
[source, console]
236+
----
237+
./cbimport json --cluster 127.0.0.1:8091 \
238+
--username Administrator \
239+
--password password \
240+
--bucket student-bucket \
241+
--dataset file:///var/lib/mysql-files/students.json \
242+
--format lines \
243+
--generate-key %student-id% \
244+
--scope-collection-exp art-school-scope.student-record-collection
245+
----
246+
247+
== Step {counter:step}: Check your data
248+
249+
Use the web admin console to examine your imported records to make sure they are correct.
250+
251+
image::tutorials:cbimported-data.png[]
252+
253+
== Further reading
254+
255+
For more information about `cbimport`, read the xref:tools:cbimport.adoc[cbimport guide].
256+
257+
If you would like to know more about MySQL JSON functions,
258+
then you will find a comprehensive reference https://dev.mysql.com/doc/refman/9.2/en/json-function-reference.html[here].
259+
260+
261+
262+
263+
264+
265+
266+

modules/tutorials/partials/diagrams/student-record-erd.puml

+2-1
Original file line numberDiff line numberDiff line change
@@ -20,8 +20,9 @@ entity enrollment {
2020
* student-id
2121
* course-id
2222
--
23-
* date-enrolled
23+
date-enrolled
2424
date-completed
25+
score
2526
}
2627

2728

modules/tutorials/partials/nav.adoc

+2
Original file line numberDiff line numberDiff line change
@@ -6,3 +6,5 @@
66
** xref:tutorials:java-tutorial/retrieving-documents.adoc[]
77
** xref:tutorials:java-tutorial/adding-course-enrollments.adoc[]
88
** xref:tutorials:java-tutorial/tutorial-troubleshooting.adoc[]
9+
* Migration Tutorial
10+
** xref:tutorials:migration-tutorial/sql-migration-tutorial-couchbase-server.adoc[]

0 commit comments

Comments
 (0)