Raphael/bulk details (#203)

rderbier · ryanfoxtyler · web-flow · commit f85bf59bcd92 · 2025-09-09T18:06:23.000-04:00
* Update bulk-loader.mdx

details about p directory

* Update bulk-loader.mdx

add increment to trigger snapshot

* Update bulk-loader.mdx

* Update dgraph/admin/bulk-loader.mdx

---------

Co-authored-by: Ryan Fox-Tyler <60440289+ryanfoxtyler@users.noreply.github.com>
diff --git a/dgraph/admin/bulk-loader.mdx b/dgraph/admin/bulk-loader.mdx
@@ -267,20 +267,26 @@ In case your dataset is small (a few gigabytes) it would be convenient to start
 by initializing just one Alpha node and then let the snapshot be streamed among
 the other Alpha replicas. You can follow these steps:
 
-1. Run Bulk Loader only on one server
+1. Run Bulk Loader only on one Alpha server
 
-2. Once the `p` directory has been created by the Bulk Loader, then start
-   **only** the first Alpha replica
+2. Once the Bulk Loader has generated the `p` directory (default path is
+   `out/0/p`), copy it to the Alpha volume.
 
-3. Wait for 1 minute to ensure that a snapshot has been taken by the first Alpha
+3. Start **only** the first Alpha replica
+
+4. Generate some mutations. Without mutations, the Alpha does not create a
+   snapshot. You can run `dgraph increment -n 10000` to generate mutations
+   on an internal counter without affecting your data.
+
+5. Wait for 1 minute to ensure that a snapshot has been taken by the first Alpha
    node replica. You can confirm that a snapshot has been taken by looking for
    the following message":
 
    ```txt
    I1227 13:12:24.202196   14691 draft.go:571] Creating snapshot at index: 30. ReadTs: 4.
    ```
 
-4. After confirming that the snapshot has been taken, you can start the other
+6. After confirming that the snapshot has been taken, you can start the other
    Alpha node replicas (number of Alpha nodes must be equal to the `--replicas`
    flag value set in the Zero nodes). Now the Alpha node (the one started in
    step 2) logs similar messages:
@@ -305,9 +311,9 @@ When your dataset is pretty big (larger than 10 GB) it is faster that you just
 copy the generated `p` directory (by the Bulk Loader) among all the Alphas
 nodes. You can follow these steps:
 
-1. Run Bulk Loader only on one server
-2. Copy (or use `rsync`) the `p` directory to the other servers (the servers you
-   are using to start the other Alpha nodes)
+1. Run Bulk Loader only on one Alpha server
+2. Copy (or use `rsync`) the generated `out/0/p` directory to all Alpha nodes
+   (the servers you are using to start the Alpha nodes)
 3. Now, start all Alpha nodes at the same time
 
 If the process went well **all** Alpha nodes take a snapshot after 1 minute. You