@@ -24,7 +24,7 @@ commitments (_i.e._, Merkle roots) of the block data (_i.e._, the list of trans
 
 To make DAS possible, Celestia uses a 2-dimensional Reed-Solomon
 encoding scheme to encode the block data: every block data is split
-into $k \times k$ chunks, arranged in a $k \times k$ matrix, and extended with parity
+into $k \times k$ shares, arranged in a $k \times k$ matrix, and extended with parity
 data into a $2k \times 2k$ extended matrix by applying multiple
 times Reed-Solomon encoding.
 
@@ -35,19 +35,19 @@ as the block data commitment in the block header.
 ![2D Reed-Soloman (RS) Encoding](/img/learn/reed-solomon-encoding.png)
 
 To verify that the data is available, Celestia light nodes are sampling
-the $2k \times 2k$ data chunks.
+the $2k \times 2k$ data shares.
 
 Every light node randomly chooses a set of unique coordinates in the
-extended matrix and queries full nodes for the data chunks and the
+extended matrix and queries full nodes for the data shares and the
 corresponding Merkle proofs at those coordinates. If light nodes
 receive a valid response for each sampling query, then there is a
 [high probability guarantee](https://github.com/celestiaorg/celestia-node/issues/805#issuecomment-1150081075)
 that the whole block's data is available.
 
-Additionally, every received data chunk with a correct Merkle proof
+Additionally, every received data share with a correct Merkle proof
 is gossiped to the network. As a result, as long as the Celestia light
-nodes are sampling together enough data chunks (_i.e._, at least
-$k \times k$ unique chunks),
+nodes are sampling together enough data shares (_i.e._, at least
+$k \times k$ unique shares),
 the full block can be recovered by honest full nodes.
 
 For more details on DAS, take a look at the [original paper](https://arxiv.org/abs/1809.09044).
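
The "high probability guarantee" in the hunk above can be made concrete with a small calculation. The sketch below is not taken from the Celestia codebase. It assumes the bound from the linked paper that a block producer must withhold at least $(k+1)^2$ of the $(2k)^2$ extended shares to make the block unrecoverable, and it assumes a light node draws its coordinates uniformly without replacement. The function name `miss_probability` is hypothetical.

```python
def miss_probability(k: int, samples: int) -> float:
    """Chance that `samples` unique random shares all avoid the withheld set.

    Assumption: the adversary withholds the minimum (k + 1)**2 shares out of
    the (2k)**2 shares of the extended matrix.
    """
    total = (2 * k) ** 2
    withheld = (k + 1) ** 2
    if not 0 <= samples <= total:
        raise ValueError("samples must be between 0 and (2k)^2")

    probability = 1.0
    for i in range(samples):
        remaining = total - i  # shares not yet sampled
        probability *= 1 - withheld / remaining
        if probability <= 0.0:
            # Every remaining share is withheld: a miss is now impossible.
            return 0.0
    return probability


if __name__ == "__main__":
    for s in (4, 8, 16, 32):
        print(f"k=64, samples={s}: miss probability {miss_probability(64, s):.6f}")
```

Under these assumptions the miss probability shrinks roughly geometrically in the number of samples, which is why a light node needs only a small number of queries relative to the size of the extended matrix.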
@@ -75,9 +75,9 @@ DA layer.
 The requirement of downloading the $4k$ intermediate Merkle roots is a
 consequence of using a 2-dimensional Reed-Solomon encoding scheme. Alternatively,
 DAS could be designed with a standard (_i.e._, 1-dimensional) Reed-Solomon encoding,
-where the original data is split into $k$ chunks and extended with $k$ additional
-chunks of parity data. Since the block data commitment is the Merkle root of the
-$2k$ resulting data chunks, light nodes no longer need to download $O(n)$ bytes to
+where the original data is split into $k$ shares and extended with $k$ additional
+shares of parity data. Since the block data commitment is the Merkle root of the
+$2k$ resulting data shares, light nodes no longer need to download $O(n)$ bytes to
 validate block headers.
 
 The downside of the standard Reed-Solomon encoding is dealing with malicious
@@ -86,7 +86,7 @@ block producers that generate the extended data incorrectly.
 This is possible as **Celestia does not require a majority of the consensus
 (_i.e._, block producers) to be honest to guarantee data availability.**
 Thus, if the extended data is invalid, the original data might not be
-recoverable, even if the light nodes are sampling sufficient unique chunks
+recoverable, even if the light nodes are sampling sufficient unique shares
 (_i.e._, at least $k$ for a standard encoding and $k \times k$ for a
 2-dimensional encoding).
 
@@ -112,20 +112,20 @@ To this end, Celestia is using Namespaced Merkle trees (NMTs).
 An NMT is a Merkle tree with the leafs ordered by the namespace identifiers
 and the hash function modified so that every node in the tree includes the
 range of namespaces of all its descendants. The following figure shows an
-example of an NMT with height three (_i.e._, eight data chunks). The data is
+example of an NMT with height three (_i.e._, eight data shares). The data is
 partitioned into three namespaces.
 
 ![Namespaced Merkle Tree](/img/learn/nmt.png)
 
 When an application requests the data for namespace 2, the DA layer must
-provide the data chunks `D3`, `D4`, `D5`, and `D6` and the nodes `N2`, `N8`
+provide the data shares `D3`, `D4`, `D5`, and `D6` and the nodes `N2`, `N8`
 and `N7` as proof (note that the application already has the root `N14` from
 the block header).
 
 As a result, the application is able to check that the provided data is part
 of the block data. Furthermore, the application can verify that all the data
 for namespace 2 was provided. If the DA layer provides for example only the
-data chunks `D4` and `D5`, it must also provide nodes `N12` and `N11` as proofs.
+data shares `D4` and `D5`, it must also provide nodes `N12` and `N11` as proofs.
 However, the application can identify that the data is incomplete by checking
 the namespace range of the two nodes, _i.e._, both `N12` and `N11` have descendants
 part of namespace 2.
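
The completeness check described in the last hunk can be sketched in a few lines. This is an illustration only, not the `nmt` library's API: the `Node` type, the hashing layout, the one-byte namespace encoding, and the assignment of shares `D0` to `D7` to namespaces 1, 2, and 3 are all assumptions chosen so that the node numbering (`N0` to `N7` for leaves, `N14` for the root) lines up with the names used in the text.

```python
import hashlib
from typing import NamedTuple


class Node(NamedTuple):
    min_ns: int  # smallest namespace among the node's descendants
    max_ns: int  # largest namespace among the node's descendants
    digest: bytes


def leaf(ns: int, data: bytes) -> Node:
    # Hypothetical leaf hashing: domain-separation byte, namespace, data.
    return Node(ns, ns, hashlib.sha256(b"\x00" + bytes([ns]) + data).digest())


def parent(left: Node, right: Node) -> Node:
    # The namespace range is hashed too, so it cannot be forged afterwards.
    payload = b"\x01" + b"".join(
        bytes([n.min_ns, n.max_ns]) + n.digest for n in (left, right)
    )
    return Node(
        min(left.min_ns, right.min_ns),
        max(left.max_ns, right.max_ns),
        hashlib.sha256(payload).digest(),
    )


def build_nodes(leaves: list[Node]) -> list[Node]:
    """Return all nodes level by level: N0..N7 leaves, ..., N14 root."""
    nodes, level = list(leaves), list(leaves)
    while len(level) > 1:
        level = [parent(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        nodes.extend(level)
    return nodes


def namespace_is_complete(proof: list[Node], ns: int) -> bool:
    """A proof node whose range contains `ns` hides shares of that namespace."""
    return not any(node.min_ns <= ns <= node.max_ns for node in proof)


# Assumed layout: D0..D2 in namespace 1, D3..D6 in namespace 2, D7 in namespace 3.
namespaces = [1, 1, 1, 2, 2, 2, 2, 3]
leaves = [leaf(ns, f"D{i}".encode()) for i, ns in enumerate(namespaces)]
N = build_nodes(leaves)

# Full response: shares D3..D6 with proof nodes N2, N8, N7.
full_ok = namespace_is_complete([N[2], N[8], N[7]], ns=2)

# Recompute the root from the returned shares and the proof nodes.
left_half = parent(N[8], parent(N[2], leaves[3]))
right_half = parent(parent(leaves[4], leaves[5]), parent(leaves[6], N[7]))
recomputed_root = parent(left_half, right_half)

# Partial response: only D4 and D5, with proof nodes N12 and N11.
partial_ok = namespace_is_complete([N[12], N[11]], ns=2)
```

With this layout, `N12` spans namespaces 1 to 2 and `N11` spans namespaces 2 to 3, so the partial response is flagged as incomplete even though its Merkle proof still verifies against the root.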