Commit 263c77a

How to safely use incremental import in a clustered environment (#2730)
Copy of [the PR #2704](#2704)
1 parent: be18c88 · commit: 263c77a


modules/ROOT/pages/tools/neo4j-admin/neo4j-admin-import.adoc

Lines changed: 19 additions & 8 deletions
@@ -11,7 +11,16 @@ You should use this tool when:
 
 * Import performance is important because you have a large amount of data (millions/billions of entities).
 * The database can be taken offline and you have direct access to one of the servers hosting your Neo4j DBMS.
-* The database is either empty or its content is unchanged since a previous incremental import.
+* The database is non-existent or empty and you need to perform the initial data load.
+* You need to update your graph with a large amount of data.
+In this case, importing data incrementally can be more performant than transactional insertion.
++
+[NOTE]
+====
+The incremental import can be done either within a single command or in stages.
+For details, see <<_incremental_import_in_a_single_command>> and <<incremental-import-stages>>.
+====
++
 * The CSV data is clean/fault-free (nodes are not duplicated and relationships' start and end nodes exist).
 This tool can handle data faults but performance is not optimized.
 If your data has a lot of faults, it is recommended to clean it using a dedicated tool before import.
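As a rough illustration of the note added in this hunk, the commands below sketch the two modes it mentions. The database name (`neo4j`), labels, relationship types, and CSV paths are placeholders, and the `--stage` usage follows the staged-import section the note links to; check that section for the exact options supported by your version.

```bash
# Minimal sketch, not the documented reference: names and paths are illustrative.

# Incremental import in a single command (database stopped for the whole run):
bin/neo4j-admin database import incremental --force \
    --nodes=Person=import/new-people.csv \
    --relationships=KNOWS=import/new-knows.csv \
    neo4j

# Incremental import in stages: run the same command once per stage
# (e.g. --stage=prepare, then --stage=build, then --stage=merge)
# to limit how long the database has to stay offline.
bin/neo4j-admin database import incremental --force --stage=prepare \
    --nodes=Person=import/new-people.csv \
    --relationships=KNOWS=import/new-knows.csv \
    neo4j
```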
@@ -626,16 +635,18 @@ Incremental import into an existing database.
 
 === Usage and limitations
 
-[WARNING]
-====
 The importer works well on standalone servers.
 
-In clustering environments with multiple copies of the database, the updated database must be used as a source to reseed the rest of the database copies.
-You can use the procedure xref:procedures.adoc#procedure_dbms_cluster_recreateDatabase[`dbms.cluster.recreateDatabase()`].
-For details, see xref:database-administration/standard-databases/recreate-database.adoc[Recreate a database].
+To safely perform an incremental import in a clustered environment, follow these steps:
+
+. Run the incremental import command on a single server in the cluster.
+This server can then be used as the xref:clustering/databases.adoc#cluster-designated-seeder[designated seeder] from which other cluster members can copy the database.
+. Reconfigure the database topology to a single primary by running the xref:procedures.adoc#procedure_dbms_cluster_recreateDatabase[`dbms.cluster.recreateDatabase()`] procedure.
+. Then stop the database using xref:database-administration/standard-databases/create-databases.adoc#manage-databases-stop[STOP DATABASE].
+. Perform the incremental import on the server that hosts the database.
+. Then start the database with xref:database-administration/standard-databases/create-databases.adoc#manage-databases-start[START DATABASE].
+. Lastly, restore the desired database topology using xref:database-administration/standard-databases/alter-databases.adoc#[ALTER DATABASE].
 
-Starting the clustered database after an incremental import without reseeding or performing the incremental import on a single server while the database remains online on other clustered members may result in unpredictable consequences, including data inconsistency between cluster members.
-====
 
 The incremental import command can be used to add:
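To make the new step list easier to follow, here is a hedged sketch of the same workflow as console commands, assuming the import itself runs once on the single remaining primary. The database name, topology counts, CSV paths, and the `dbms.cluster.recreateDatabase()` options map are assumptions rather than documented syntax; the linked procedure and `ALTER DATABASE` pages have the authoritative forms. Connection and authentication flags for `cypher-shell` are omitted.

```bash
# Rough sketch of the clustered workflow described in the added steps.
# All names, paths, counts, and the recreateDatabase() options map are placeholders.

# 1. Recreate the database with a single primary (run against the system database).
bin/cypher-shell -d system \
  "CALL dbms.cluster.recreateDatabase('neo4j', {primaries: 1, secondaries: 0})"

# 2. Stop the database before importing.
bin/cypher-shell -d system "STOP DATABASE neo4j WAIT"

# 3. Perform the incremental import on the server that now hosts the database.
bin/neo4j-admin database import incremental --force \
  --nodes=Person=import/new-people.csv neo4j

# 4. Start the database again.
bin/cypher-shell -d system "START DATABASE neo4j WAIT"

# 5. Restore the desired topology; the other members copy the updated store
#    from this server (the designated seeder).
bin/cypher-shell -d system "ALTER DATABASE neo4j SET TOPOLOGY 3 PRIMARIES"
```

Keeping the database on a single primary until the import finishes avoids the data inconsistency between cluster members that the removed warning describes.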
