Commit

Merge pull request #97 from nspcc-dev/upd/split-object-updates
Upd/split object updates
roman-khimov authored Feb 12, 2024
2 parents 968f686 + 6c234c5 commit 046e623
Showing 10 changed files with 35 additions and 7 deletions.
1 change: 1 addition & 0 deletions .github/workflows/CODEOWNERS
@@ -0,0 +1 @@
* @carpawell @cthulhu-rider @roman-khimov
2 changes: 1 addition & 1 deletion 01-arch/03-objects.md
@@ -4,6 +4,6 @@ NeoFS stores all data in the form of objects, thus providing an object-based sto

ObjectID is a hash that equals Headers hashes plus Payload hashes. Any object includes a system header, extended headers, and a payload. A system header is an obligatory field, while extended headers may be omitted. However, any extended header should follow a particular structure (e.g. IntegrityHeader is a must). A user can add any extended header in the form of a key-value pair, though keeping in mind that it cannot be duplicated with several values. One attribute -- one value. Please note that any object initially has FileName, so that you cannot create an extended header with it as a key.

The maximum size for an object is fixed and can be changed only for the whole network in the main contract. It means that if a file is too heavy, it will be automatically divided into smaller objects. These smaller parts are put in a container and placed on a Storage Node. Later, they can be assembled into the initial object. This assembly is performed by the storage nodes upon a corresponding request for a linking object. Once your file is converted into an object (or several objects), this object cannot be changed.
The maximum size for an object is fixed and can be changed only for the whole network in the main contract. It means that if a file is too heavy, it will be automatically divided into smaller objects. These smaller parts are put in a container and placed on a Storage Node. Later, they can be assembled into the original object. This assembly is performed by the storage nodes upon a corresponding request for a linking object. Once your file is converted into an object (or several objects), this object cannot be changed.
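
The size-based division described above can be illustrated with a minimal sketch. It is not the NeoFS implementation: `split_payload` and `max_object_size` are hypothetical names, and the real limit is a network-wide setting rather than a function argument.

```python
def split_payload(payload: bytes, max_object_size: int) -> list[bytes]:
    """Cut a payload into consecutive chunks of at most max_object_size bytes."""
    if max_object_size <= 0:
        raise ValueError("max_object_size must be positive")
    # Hypothetical helper: every chunk except possibly the last one is full-sized.
    return [
        payload[offset:offset + max_object_size]
        for offset in range(0, len(payload), max_object_size)
    ]


chunks = split_payload(b"abcdefghij", 4)
# Joining the chunks in order gives back the initial payload.
restored = b"".join(chunks)
```

Because the chunks are kept in order, concatenating them is enough to rebuild the payload, which mirrors how the parts are later assembled into the original object.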

One can define the format of the object in an API Specification. For more information, see [API Specification](https://github.com/nspcc-dev/neofs-api/tree/master/proto-docs).
31 changes: 26 additions & 5 deletions 01-arch/04-object_split.md
@@ -4,9 +4,11 @@ NeoFS has a limit on the maximal physically stored single object size. If there

For each part of the original object's payload, a separate object with its own `ObjectID` will be created. The large object will not be physically present in the system, but it will be reconstructed from the object parts when requested.

![Large object split](pic/object_split_all)
There are two active versions of split objects. Both are supported, but the second one is more flexible and supports attribute-based ACL rules. The first one is kept for backward compatibility only. All objects participating in the split have the `Split` headers set. Depending on the place in the split hierarchy, it has different field combinations.

All objects participating in the split have the `Split` headers set. Depending on the place in the split hierarchy it has different field combinations. There are four possible cases:
### Object Split V1

![Large object split V1](pic/object_split_all_v1)

* First part \
First part object only has the `split_id` field set, as there is no more information known at this point.
@@ -15,13 +17,32 @@ All objects participating in the split have the `Split` headers set. Depending o
Middle parts have information about the previous part in the `previous` field in addition to the `split_id`.

* Last part \
At this point all the information about the object under split is known. Hence the last part contains not only the `split_id` and `previous` fields, but also the `ObjectID` of the original large object in its `parent` field, signed `ObjectID` in `parent_signature` and original object's `Header` in `parent_header`.
At this point, all the information about the object under split is known. Hence, the last part contains not only the `split_id` and `previous` fields, but also the `ObjectID` of the original large object in its `parent` field, signed `ObjectID` in `parent_signature` and original object's `Header` in `parent_header`.

* Link object \
There are special "Link objects" that have the same common `split_id`, do not have any payload, but contain original object's `ObjectID` in `parent` field, its signature in `parent_signature`, original object's `Header` in `parent_header` and the list of all object parts with payload in repeated `children` field. Link objects help to speed up the large object reconstruction and `HEAD` requests processing. If a Link object is lost, the original large object still will be reconstructed from its parts, but it will require more actions from NeoFS nodes.
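
The V1 field combinations listed above can be sketched as follows. This is only an illustration under stated assumptions: `SplitV1`, `toy_id` and `build_v1_headers` are hypothetical names, the SHA-256 of a chunk stands in for a real `ObjectID`, and the signature is a placeholder string rather than a real cryptographic signature.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SplitV1:
    """Toy model of the V1 `Split` header; only the fields named in the text."""
    split_id: str
    previous: Optional[str] = None
    parent: Optional[str] = None
    parent_signature: Optional[str] = None
    parent_header: Optional[dict] = None
    children: list[str] = field(default_factory=list)


def toy_id(data: bytes) -> str:
    # Assumption: a plain SHA-256 digest replaces the real ObjectID calculation.
    return hashlib.sha256(data).hexdigest()


def build_v1_headers(chunks: list[bytes], parent_header: dict) -> list[SplitV1]:
    """Return the headers of all parts in order, followed by the link object's header."""
    if len(chunks) < 2:
        raise ValueError("a split needs at least two parts")
    split_id = str(uuid.uuid4())
    parent_id = toy_id(b"".join(chunks))
    signature = "sig(" + parent_id + ")"  # placeholder, not a real signature
    headers, part_ids, previous = [], [], None
    for index, chunk in enumerate(chunks):
        header = SplitV1(split_id=split_id, previous=previous)
        if index == len(chunks) - 1:
            # Only the last part knows everything about the original object.
            header.parent = parent_id
            header.parent_signature = signature
            header.parent_header = parent_header
        headers.append(header)
        previous = toy_id(chunk)
        part_ids.append(previous)
    # The link object has no payload and lists all parts in `children`.
    headers.append(SplitV1(split_id=split_id, parent=parent_id,
                           parent_signature=signature,
                           parent_header=parent_header, children=part_ids))
    return headers
```

In this sketch the first header carries only `split_id`, each later part adds `previous`, and the parent information appears only on the last part and on the link object.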

All of the split hierarchy objects may be physically stored on different nodes. During reconstruction, at first the link object or the last part object will be found. If it's a HEAD request, the link object or the last part object will have all the information required to return the original large object's HEAD response. For a GET request, the payload will be taken from part objects listed in the `split.children` header. As they are ordered, it will be possible to begin streaming the payload as soon as the first part object becomes available. If Link object is lost, some additional time will be spent on reconstructing the list from `split.previous` header fields.
### Object Split V2
\pagebreak

![Large object split V2](pic/object_split_all_v2)

* First part \
First part object only has the `parent_header` field set. The parent header is not finished: since the payload is not yet fully known, it cannot be measured, hashed or signed (and no object ID can be assigned). However, it carries all the user information attached to the object (e.g. attributes) and acts as a representative of the original object until all the parts are uploaded to NeoFS.

* Middle parts \
Middle parts have information about the previous part in the `previous` field and about the first part in the `first` field.

* Last part \
At this point, all the information about the object under split is known. Hence, the last part contains not only the `first` and `previous` fields, but also the `ObjectID` of the original large object in its `parent` field, signed `ObjectID` in `parent_signature` and original object's `Header` in `parent_header`.

* Link object \
There are special "Link objects" that have the same common `first` field, but also contain original object's `ObjectID` in `parent` field, its signature in `parent_signature`, original object's `Header` in `parent_header` and a list of all the object parts' IDs paired with their sizes encoded in the payload. Link objects help to speed up the large object reconstruction and `HEAD` requests processing. If a Link object is lost, the original large object still will be reconstructed from its parts, but it will require more actions from NeoFS nodes.
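
For comparison, the V2 layout can be sketched in the same spirit. Again, this is a toy model under assumptions: `build_v2_headers` and `toy_id` are hypothetical, headers are plain dictionaries, SHA-256 digests stand in for object IDs, and the `payload_length` and `payload_hash` keys are invented here to represent what "finishing" the parent header means.

```python
import hashlib


def toy_id(data: bytes) -> str:
    # Assumption: a plain SHA-256 digest replaces the real ObjectID calculation.
    return hashlib.sha256(data).hexdigest()


def build_v2_headers(chunks, user_header):
    """Return (part_headers, link_object) for a V2-style split, as plain dicts."""
    if len(chunks) < 2:
        raise ValueError("a split needs at least two parts")
    payload = b"".join(chunks)
    part_ids = [toy_id(chunk) for chunk in chunks]
    # The parent header can only be finished once the whole payload is known.
    finished = dict(user_header, payload_length=len(payload),
                    payload_hash=toy_id(payload))
    parent_id = toy_id(repr(sorted(finished.items())).encode())
    signature = "sig(" + parent_id + ")"  # placeholder, not a real signature
    first_id = part_ids[0]
    # First part: only the unfinished parent header with user information.
    parts = [{"parent_header": dict(user_header)}]
    for index in range(1, len(chunks)):
        header = {"first": first_id, "previous": part_ids[index - 1]}
        if index == len(chunks) - 1:
            header.update(parent=parent_id, parent_signature=signature,
                          parent_header=finished)
        parts.append(header)
    link = {
        "first": first_id,
        "parent": parent_id,
        "parent_signature": signature,
        "parent_header": finished,
        # In V2 the list of parts lives in the link payload as (ID, size) pairs.
        "payload": [(pid, len(chunk)) for pid, chunk in zip(part_ids, chunks)],
    }
    return parts, link
```

The main difference from the V1 sketch is visible in the first part, which already exposes the user attributes, and in the link object, which stores part IDs with their sizes in its payload instead of a `children` header field.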

All the split hierarchy objects may be physically stored on different nodes. During reconstruction, at first the link object or the last part object will be found. If it's a HEAD request, the link object or the last part object will have all the information required to return the original large object's HEAD response. For a GET request, the payload will be taken from part objects listed in the Link object's payload. As they are ordered, it will be possible to begin streaming the payload as soon as the first part object becomes available. If a Link object is lost, some additional time will be spent on reconstructing the list from `split.previous` header fields.
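
When the Link object is lost, the ordered list of parts has to be rebuilt from the `previous` references, starting from the last part. A minimal sketch of that walk, assuming the part headers have already been fetched into a dictionary (`order_parts` and `previous_of` are hypothetical names):

```python
from typing import Optional


def order_parts(last_part_id: str, previous_of: dict[str, Optional[str]]) -> list[str]:
    """Rebuild the first-to-last part order by following `previous` links backwards."""
    ordered = []
    seen = set()
    current = last_part_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in the previous chain")
        if current not in previous_of:
            raise KeyError("part " + current + " is missing")
        seen.add(current)
        ordered.append(current)
        # The first part has no previous part, which ends the walk.
        current = previous_of[current]
    ordered.reverse()
    return ordered


chain = order_parts("p3", {"p1": None, "p2": "p1", "p3": "p2"})
```

Each step needs the header of one more part, which is why this path takes longer than reading the ready-made list from a Link object.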

If the whole payload is available, a large object may be split on the client side using local tools like `neofs-cli`. In this case, the resulting object set will be signed with the user's key. Such a split type can be called a "Static split".

If the whole payload is available, a large object may be split on the client side using local tools like `neofs-cli`. In this case the resulting object set will be signed with user's key. Such a split type can be called a "Static split".
NeoFS supports attribute-based ACL rules. Before the last part and the link object are created, the parts of the original object should be validated against the information about the original object that the first part carries.

When the large object's payload is not fully available right away, or it is too big to be split locally, the object upload can be started in a Session with another NeoFS node and be streamed in a PUT operation, part by part. Object parts will be automatically created as soon as the payload hits the MaxObjectSize limit. In this case, the resulting object set will be signed with a session key signed by the user's key. This split type can be called a "Dynamic split".
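
The part-by-part behaviour of a dynamic split can be sketched as a generator that emits a part whenever the buffered payload reaches the size limit. This is an assumption-laden illustration, not NeoFS code: `stream_parts` is a hypothetical name, and sessions, signing and the PUT operation itself are left out.

```python
from typing import Iterable, Iterator


def stream_parts(stream: Iterable[bytes], max_object_size: int) -> Iterator[bytes]:
    """Yield payload parts as soon as max_object_size bytes have been buffered."""
    if max_object_size <= 0:
        raise ValueError("max_object_size must be positive")
    buffer = bytearray()
    for block in stream:
        buffer.extend(block)
        # Emit full-sized parts without waiting for the end of the stream.
        while len(buffer) >= max_object_size:
            yield bytes(buffer[:max_object_size])
            del buffer[:max_object_size]
    if buffer:
        # Whatever is left becomes the last, possibly shorter, part.
        yield bytes(buffer)


parts = list(stream_parts([b"abc", b"defg", b"hi"], 4))
```

Unlike a static split, the total payload length is never needed up front; only the current buffer is held in memory.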
File renamed without changes.
File renamed without changes.
File renamed without changes
2 changes: 2 additions & 0 deletions 01-arch/pic/object_split_all_v2.drawio
@@ -0,0 +1,2 @@
<?xml version="1.0" encoding="UTF-8"?>
<mxfile host="Electron" modified="2024-02-05T18:05:14.575Z" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/23.0.2 Chrome/120.0.6099.109 Electron/28.1.0 Safari/537.36" etag="5lw1Y5zD3fnaqyJaEmA_" version="23.0.2" type="device"><diagram id="b3hDvS4LCimOuX0njIPD" name="Page-1">7V1dc9o4FP01vHSmjD+wgUcw0HQ2STub2d1kXzoKFqDGWKwtAvTXr2RLgJEAU2oRGmUyGXRtybLO0cc9XCk1N5guPyVgNrnDIYxqjhUua26v5jjtVov+ZYZVbvBafm4YJyjMTfbG8IB+QG60uHWOQpgWbiQYRwTNisYhjmM4JAUbSBK8KN42wlHxqTMwhpLhYQgi2foPCskkt7ac5sZ+A9F4Ip5s++38yhSIm/mbpBMQ4sWWye3X3CDBmOSfpssARqztRLvk+QZ7rvKKpWQlqipqlMCYlClhdd8b+9+fe+7NU+fu39Hs6+pm8ZEX+wqiOS/2FsUv1PLl+Ttr392Hpgs0jUBMU90RjskDv2LR9HCCovAWrPCcVSclYPgiUt0JTtAPej+I6CWbGujlhHD4HZ+VhqIowBFOsue4vj+gP4WcD6xE/qwEpjTvV/Hu9o7pDiwLN96ClIha4igCsxQ9Z/VmGacgGaO4iwnBU37TOAEhouVsVWiU/bCa8xag7SRKSPA8DmHIM6+Bzx83RUNRagTSlH+WwRNIwITA5ZaJg/kJ4ikkyYrewq82RJZVsQstNry125ydky3OOhbvo4D3lfG66A176AdOoBPI5EhkwhmLvrFcHXrlc8+TGEXflmSESPALFA0e45xiRVKsMQARGsfUFsERy8vaDNE+3OFmgmesxBkYonh8m93Ta2wsf/K2YCZM846iDK4JCkMYZ3ASQMDzmtgzjGKStZXXpb+0mQOr7tU8WqWApu1Nmv6y2xNKnJi+EEAZuJCybwEZAxWwH+yVx7nAsXf8ktj7FUHvS9DPwCrCgANPm+lzjzXUvZ21E005LOWIlMtSbtamhh+/nh+ec2F+2A2JILXAqXUsg3cVeDdb+vDuPL3i1eN/rWUbpD3ij9LX+OljW4J7AkEIk3o6ixCpj1DC3j+fE+SlzYkcoG92xRxg66itF7KynzU3JMzL0mUvN2yrcenBwDrMDvGCOT0a59AjCAaD66ZHVUOE7ZZdL1ZGA9n5UNDgW0oRAmSewJwQ9XrdEKISQniXXkTasgOhIkRuM2yolg2tSy8ZXXnJSL17Qi325fUJpk74vtEnDugTAj/hoirmG6dhqQjVqIpRnsQoSaA4ezH6LhySdd88R6HYg35V4B+RKKxlp9PtnjmfDAZctjQEKCFBaCZAUyKAkSAqBFyhQWgGvLVvOck7/rcIxmNWN9b/7/eO/FcL7TZXQzgC86hSwFXCQmWIK1Un911B+DPSUI7qSaKAXgh9A+GZECrceL0Qyn7bMS/+w/0X+vem83DTf/hwznT8u/jzVfBC4dDrnY8bsrxjFmC8aIRjXjKabseYOI1KnTKr5BrNq4wTslfGRR7HiDzXIPLYJVQeS6vK05D9PEnlcd73kFNyQFl3zrNUHjX6VYEv+3w7Kk8Q9HpnqjyGACepPHoJIEcemEVGhYCrVJ6qAA++oD/+bkZW4HhWfLcIXpKur4g8OyTymIG/GhootR+tHd8/FlSSwFeE56mJOioRdVQFQ8qvDM9niFKXkL/9MzFp2mLScsKcplppHT88ORrJLByOqBPVDhgqwUpFCXHfr59SZEpwccI14sQViBOe9+ZCUHxZA5XEib3fYr2PEafkgOKXp8NbCUHxXQn8HXGi3x8MSosT2xOztH2u61x3HEoI0kk2YLDhA7PqEAaje8
KUczpDdMaoKJ1Zs0tG64igM0RFibcsVR8SL8zEUA0NygeuVLVz1t+rYqnFi7NlrOt2T6tjgqtxBlCKFHtj1oxIUb1I4R/bIXPpkEZf/nbD7JzTP0q0NAY6KuvcLLWD0myd0+ZaKg7f0MyIcpspzd45LXRwy0oNVS0obXkdcQuSMby8bslO9vmddcuCcCEz7GQR025z3hw65kdFrcqO+bHlRciOhplTzCxAyo0v67563nE/WrfWyOsPs9bQBL7yLB+d4DvydxhmWaEDeeWpPlqRL7GHlsIfBPRPv2+iLKsigvoIH61MMJtp9SKuPK1HK+KyR2EQrxJx5XE8WhE3odR6EVceuaNVRJIDFCSsYRx22EHLNDVkbi9zgrsTMhX+P22VZPW4nXhiGNDG5cme8ODz1Eqklog8bn3eykVTm0wsIfLIUIivRPA8GcIDbyqihQkTRQ7BKM6MgWHh4GgZxC3QPNUOOm5LYAQIeoWF6qqA5E/4yoi7pQZwQNYLAatYQv7iPNOGDVI5frEcb6eYvFmkYjJSrd/5DJ4p4hx+U575SvqUIF6+je1SPGu0iwRxdxlSlmiNHaY5Tb1Ua8lKxV8xQSSCIbXeghV1WyV0f+Ig9ZYifHiX0kU1cTFBBD7Q6YxdXSRgVuR3QWzcKzBuq5BSgKjvd12F5skn5p8Lqlq3Z2kB02m4xTHLlp0X16kLq5aJTsy0bwct32fC+BtAq+kdR8vVjFaJZYlutHh85YXRcoXI/5b6VonJ/X2OhLsnIrZkn17vt4QteUvUxTvW2xgGXds9jpXl/BKsaHLzb2Hydc/mf+u4/f8B</diagram></mxfile>
Binary file added 01-arch/pic/object_split_all_v2.pdf
Binary file not shown.
4 changes: 4 additions & 0 deletions 01-arch/pic/object_split_all_v2.svg
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,5 +1,5 @@
FROM ubuntu:20.04
MAINTAINER Stanislav Bogatyrev <stanislav@nspcc.ru>
MAINTAINER NeoSPCC <info@nspcc.io>

ENV DEBIAN_FRONTEND noninteractive
