Dataset Concepts

DSri Seah edited this page Mar 11, 2025 · 6 revisions

note: this is more of a design doc that tracks the evolution of the system

New to URSYS in 2024 is the Dataset architecture for managing a set of related data collections under one address.

A Dataset consists of several DataBins, each of which has a unique binID name and a binType. A client application requests a particular Dataset by specifying its dataURI and authentication credentials. Once the dataset request succeeds, the client can perform CRUD, Search, and Query operations on any DataBin by specifying its binID and the operation.
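To make the relationship concrete, here is a minimal sketch of how a Dataset, its DataBins, and their items might be modeled. The type names (DatasetSketch, DataBinSketch, DataObj), the helper functions, and the example dataURI 'example/demo' are assumptions made for illustration; they are not the actual URSYS declarations.

```typescript
// Hypothetical shapes for illustration; real URSYS types may differ.
type DataObj = { _id: string; [field: string]: unknown };

interface DataBinSketch {
  binID: string;   // unique within the dataset
  binType: string; // e.g. 'ItemList'
  items: Map<string, DataObj>;
}

interface DatasetSketch {
  dataURI: string; // the one address for all related bins
  bins: Map<string, DataBinSketch>;
}

// Build a dataset from [binID, binType] pairs, rejecting duplicate binIDs.
function makeDataset(dataURI: string, binSpecs: Array<[string, string]>): DatasetSketch {
  const bins = new Map<string, DataBinSketch>();
  for (const [binID, binType] of binSpecs) {
    if (bins.has(binID)) throw new Error(`duplicate binID '${binID}'`);
    bins.set(binID, { binID, binType, items: new Map<string, DataObj>() });
  }
  return { dataURI, bins };
}

// Every data operation starts by resolving the bin from its binID.
function getBin(ds: DatasetSketch, binID: string): DataBinSketch {
  const bin = ds.bins.get(binID);
  if (bin === undefined) throw new Error(`no bin '${binID}' in ${ds.dataURI}`);
  return bin;
}

const demoSet = makeDataset('example/demo', [['comments', 'ItemList']]);
getBin(demoSet, 'comments').items.set('c1', { _id: 'c1', text: 'hello' });
```

Keying both the bins and their items by string ID keeps lookups simple, which matches the way operations are addressed by binID and items by _id.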

The API for Datasets is exposed through the modules sna-dataclient.ts and sna-dataserver.mts. The server's instance of the loaded dataset is kept synchronized with each client's copy. From the client's perspective, data mutations use a write-followed-by-notify pattern, with notifications delivered through an URSYS EventMachine pub/sub interface. Data reads, by comparison, are served from the cached dataset, which is guaranteed to be "up to date" at the time of the request.
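The write-then-notify and read-from-cache ideas can be sketched in isolation. This example collapses the server round trip into a local write so that only the pub/sub shape is visible. TinyEventMachine, CachedBinSketch, and the 'bin-changed' event name are hypothetical stand-ins, not the URSYS EventMachine API.

```typescript
// Hypothetical stand-in for the URSYS EventMachine; the real class and its
// method names may differ.
type ChangeListener = (items: string[]) => void;

class TinyEventMachine {
  private listeners = new Map<string, Set<ChangeListener>>();

  subscribe(evt: string, cb: ChangeListener): void {
    const group = this.listeners.get(evt) ?? new Set<ChangeListener>();
    group.add(cb);
    this.listeners.set(evt, group);
  }

  unsubscribe(evt: string, cb: ChangeListener): void {
    this.listeners.get(evt)?.delete(cb);
  }

  notify(evt: string, items: string[]): void {
    this.listeners.get(evt)?.forEach(cb => cb(items));
  }
}

class CachedBinSketch {
  private cache: string[] = [];
  private events: TinyEventMachine;

  constructor(events: TinyEventMachine) {
    this.events = events;
  }

  // Write: apply the mutation, then notify subscribers with a snapshot.
  write(item: string): void {
    this.cache.push(item);
    this.events.notify('bin-changed', this.read());
  }

  // Read: served from the local cache, no round trip involved.
  read(): string[] {
    return [...this.cache];
  }
}

const changeLog: string[][] = [];
const machine = new TinyEventMachine();
const notesBin = new CachedBinSketch(machine);
const logChange: ChangeListener = items => changeLog.push(items);

machine.subscribe('bin-changed', logChange);
notesBin.write('first');
machine.unsubscribe('bin-changed', logChange);
notesBin.write('second'); // no longer observed, but still readable
```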

Component: Dataserver

source: sna-dataserver.mts

The dataserver is a SNA Component that receives URSYS SYNC messages for "whole dataset" and "data CRUD+Query" operations from multiple dataclients on the web. The dataserver is the single source of truth for data; client-based data operations are synchronized to and from the dataserver in normal sync mode.

Dataserver Configuration

  • The PreConfig object must contain runtime_dir which is used to resolve the "file bucket" address where all runtime data is stored.
  • The dataclient must execute the SELECT DATASET command to load the data from permanent storage before any of the API methods will work.

Dataserver Message API

  • SYNC:SRV_DSET handles "whole dataset" operations declared in DatasetOp on a provided dataURI, which is sent from a dataclient's remote adapter. At the time of this writing, the operations are LOAD, UNLOAD, PERSIST, GET_MANIFEST and GET_DATA.
  • SYNC:SRV_DATA handles "databin CRUDQ" operations declared in DataSyncOp on a provided optional dataURI; data operations default to the "selected dataset". Current operations are CLEAR, GET, ADD, UPDATE, WRITE, DELETE, REPLACE, FIND and QUERY.
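A handler for SYNC:SRV_DATA could be sketched as a dispatcher that resolves the bin and switches on the operation name. The message shape (SyncDataMsg), the reply shape, and the exact semantics assumed here (ADD rejects duplicate _id values, UPDATE merges fields into an existing item) are assumptions for illustration only. Operations not covered by the sketch return an error rather than guessing at their behavior.

```typescript
type SyncItem = { _id: string; [field: string]: unknown };
type DataSyncOpName =
  | 'CLEAR' | 'GET' | 'ADD' | 'UPDATE' | 'WRITE'
  | 'DELETE' | 'REPLACE' | 'FIND' | 'QUERY';

// Hypothetical message and reply shapes; the real protocol may differ.
interface SyncDataMsg {
  op: DataSyncOpName;
  binID: string;
  dataURI?: string; // optional: defaults to the "selected dataset"
  items?: SyncItem[];
  ids?: string[];
}
interface SyncDataReply { items?: SyncItem[]; error?: string }

// Bins of the currently selected dataset, keyed by binID.
const selectedBins = new Map<string, Map<string, SyncItem>>();
selectedBins.set('comments', new Map<string, SyncItem>());

function handleSrvData(msg: SyncDataMsg): SyncDataReply {
  const bin = selectedBins.get(msg.binID);
  if (bin === undefined) return { error: `unknown binID '${msg.binID}'` };
  const items = msg.items ?? [];
  switch (msg.op) {
    case 'ADD': {
      // Validate everything first so a failed ADD changes nothing.
      for (const item of items) {
        if (bin.has(item._id)) return { error: `duplicate _id '${item._id}'` };
      }
      for (const item of items) bin.set(item._id, item);
      return { items };
    }
    case 'UPDATE': {
      for (const item of items) {
        if (!bin.has(item._id)) return { error: `missing _id '${item._id}'` };
      }
      const updated: SyncItem[] = [];
      for (const item of items) {
        const merged: SyncItem = { ...bin.get(item._id), ...item };
        bin.set(item._id, merged);
        updated.push(merged);
      }
      return { items: updated };
    }
    case 'GET': {
      const ids = msg.ids ?? Array.from(bin.keys());
      const found = ids
        .map(id => bin.get(id))
        .filter((it): it is SyncItem => it !== undefined);
      return { items: found };
    }
    case 'DELETE': {
      const removed: SyncItem[] = [];
      for (const id of msg.ids ?? []) {
        const it = bin.get(id);
        if (it !== undefined) { bin.delete(id); removed.push(it); }
      }
      return { items: removed };
    }
    case 'CLEAR':
      bin.clear();
      return { items: [] };
    default:
      return { error: `op '${msg.op}' is not covered by this sketch` };
  }
}

handleSrvData({ op: 'ADD', binID: 'comments', items: [{ _id: 'c1', text: 'draft' }] });
const updateReply = handleSrvData({
  op: 'UPDATE', binID: 'comments', items: [{ _id: 'c1', text: 'final' }],
});
```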

Dataserver Direct API

In addition to the message-based API, DataSet provides direct API methods. These are currently unused, but are provided for future server-side dataset access.

  • LoadDataset()
  • CloseDataset()
  • PersistDataset()
  • OpenBin()
  • CloseBin()

Dataserver Extensibility

The SYNC:SRV_DSET and SYNC:SRV_DATA protocols are designed to be independent of the filesystem, but dataserver is a default implementation that uses a default dataobject adapter to serialize/deserialize data to the filesystem.

Component: Dataclient

source: sna-dataclient.ts

The dataclient is a SNA Component that sends URSYS SYNC messages to select the dataset to use and to perform "data CRUD+QUERY" operations on the databins in that dataset. It also receives update messages whenever the dataserver updates its data.

Dataclient Configuration

  • The PreConfig object must have a dataset property containing uri (a dataURI that is sent to the server) and mode (default 'sync', which enables two-way synchronization).

The dataclient otherwise initializes itself through its built-in PreHook declarations to call Configure() and Activate() during the application startup cycle, making use of the PreConfig object parameters.
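As a rough illustration of the configuration contract, the sketch below validates a PreConfig-like object and fills in the default mode. The interface names and the resolveDatasetConfig() helper are hypothetical; only the dataset, uri, and mode property names and the 'sync' default come from the description above.

```typescript
// Hypothetical shapes; only `dataset`, `uri`, and `mode` come from the text.
interface DatasetPreConfig {
  uri: string;   // dataURI sent to the server
  mode?: string; // defaults to 'sync'
}
interface ClientPreConfig {
  dataset?: DatasetPreConfig;
}

// Validate the config and apply the default sync mode.
function resolveDatasetConfig(cfg: ClientPreConfig): Required<DatasetPreConfig> {
  if (cfg.dataset === undefined || cfg.dataset.uri.length === 0) {
    throw new Error('PreConfig.dataset.uri is required');
  }
  return { uri: cfg.dataset.uri, mode: cfg.dataset.mode ?? 'sync' };
}

const resolvedCfg = resolveDatasetConfig({ dataset: { uri: 'example/demo' } });
```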

(TBD) Dataclient Authentication

The SYNC:SRV_DSET and SYNC:SRV_DATA protocols for data operations will rely on an access token that is derived from the URSYS authentication token. Currently, though, this support is only stubbed-in and ignored by dataserver. The idea is that once a web app using dataclient is logged-in, the SELECT DATASET operation will negotiate the handshake.

Dataclient Direct API

There is no message API that is exposed for users, as the direct API is suitable. Behind the scenes, however, the dataclient sends SYNC:SRV_DSET and SYNC:SRV_DATA messages to talk to the dataserver and receives SYNC:CLI_DATA messages to synchronize its dataset instance.

The following methods assume that the selected dataset was set through PreConfig as dataURI.

  • Get(binID, ...)
  • Add(binID, ...)
  • Update(binID, ...)
  • Write(binID, ...)
  • Delete(binID, ...)
  • DeleteIDs(binID, ...)
  • Replace(binID, ...)
  • Clear(binID, ...)
  • Find(binID, ...)
  • Query(binID, ...)
  • Subscribe(evt, callback)
  • Unsubscribe(evt, callback)

In the default case syncMode=='sync', these calls are routed to the dataserver, but changes are not applied locally until the server sends a SYNC:CLI_DATA message. As all web apps using the dataclient module implement this message, this ensures that everyone receives the same change.
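The round trip described above can be simulated without any networking. In this sketch, MockDataserver and MockDataclient are hypothetical classes: receiveAdd() stands in for a SYNC:SRV_DATA message, and broadcastPending() stands in for the server sending SYNC:CLI_DATA to every connected client. The explicit broadcast step exists only to make the delay between the call and the local update observable.

```typescript
type SyncedItem = { _id: string; text: string };
type CliDataHandler = (binID: string, added: SyncedItem[]) => void;

class MockDataserver {
  private master = new Map<string, SyncedItem[]>();
  private handlers: CliDataHandler[] = [];
  private pending: Array<{ binID: string; added: SyncedItem[] }> = [];

  connect(handler: CliDataHandler): void {
    this.handlers.push(handler);
  }

  // Stands in for a SYNC:SRV_DATA message carrying an ADD operation.
  receiveAdd(binID: string, items: SyncedItem[]): void {
    const bin = this.master.get(binID) ?? [];
    this.master.set(binID, bin.concat(items));
    this.pending.push({ binID, added: items });
  }

  // Stands in for broadcasting SYNC:CLI_DATA to every connected client.
  broadcastPending(): void {
    const queued = this.pending;
    this.pending = [];
    for (const update of queued) {
      for (const handler of this.handlers) handler(update.binID, update.added);
    }
  }
}

class MockDataclient {
  private mirror = new Map<string, SyncedItem[]>();
  private server: MockDataserver;

  constructor(server: MockDataserver) {
    this.server = server;
    server.connect((binID, added) => this.applyCliData(binID, added));
  }

  // Routed to the server; the local mirror is intentionally left untouched.
  Add(binID: string, items: SyncedItem[]): void {
    this.server.receiveAdd(binID, items);
  }

  // Reads come from the local mirror.
  Get(binID: string): SyncedItem[] {
    return [...(this.mirror.get(binID) ?? [])];
  }

  private applyCliData(binID: string, added: SyncedItem[]): void {
    const bin = this.mirror.get(binID) ?? [];
    this.mirror.set(binID, bin.concat(added));
  }
}

const hub = new MockDataserver();
const alice = new MockDataclient(hub);
const bob = new MockDataclient(hub);

alice.Add('comments', [{ _id: 'c1', text: 'hi' }]);
const beforeSync = alice.Get('comments').length; // still empty locally
hub.broadcastPending();
const afterSync = [alice.Get('comments').length, bob.Get('comments').length];
```

Because the originating client waits for the same broadcast as everyone else, all mirrors apply changes in the order the server processed them.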

(TBD) Accessing Other Datasets through DataClient

In the future it will be possible to access read-only datasets by specifying syncMode=='sync-ro', but in the meantime it's possible to use DatasetAdapter.getDataObj(dataURI) to request the dataset object associated with the dataURI, subject to (TBD) access control.

The DataClient also specifies two implemented methods:

  • async DS_RemoteFind(dataURI, binID, matchCrit?) to find matching items in specified dataURI/binID
  • async DS_RemoteQuery(dataURI, binID, query?) to query matching items and return them in a RecordSet.

The latter is intended to be used for accessing read-only shared data across the dataset server, subject to (TBD) access controls.

Dataset Portability

The SNA dataset client/server is a default implementation built on top of a protocol that assumes the following:

  • a dataURI that uniquely identifies a set of named dataBins (collections), whose contents are serialized in a platform-independent form and persisted to storage.
  • a set of common CRUD+QUERY operations based on DataObjects with an _id field.
  • a message protocol that maps to the operations for managing datasets (DatasetOp) and the collections within (DataSyncOp)
  • a write-through cache implementation that synchronizes the master dataset on the server with multiple client dataset mirrors that are updated behind-the-scenes
  • a notification system through which clients can be aware of changes that were made behind-the-scenes
  • an authentication and access token system to determine access
  • the Dataset class holds the collection
  • the abstract-class-databin class is the ancestor providing the common set of CRUD+QUERY operations across multiple collection types
  • the Dataset designated by a dataURI has an associated manifest object that maps dataBins by name to their platform-dependent storage locations

While the default protocol is URSYS messaging, the actual implementation of the above protocol is up to the developer. In our case, sna-dataserver.mts and sna-dataclient.ts are our default implementation of the dataset concept, and we've approached it as follows:

  • the dataset and data operations are conducted over URSYS with the SYNC:SRV_DSET, SYNC:SRV_DATA, and SYNC:CLI_DATA messages
  • the dataURI is mapped to a directory on the server's filesystem
  • the dataset collections are serialized as JSON files in the dataURI's associated directory

These functions have been isolated into adapters:

  • sna-dataclient makes use of an object using the interface IDS_DatasetAdapter which provides selectDatabase(), getDataObj() and syncData() methods. The default implementation uses URSYS messaging and stores accessToken, sending SYNC:SRV_DSET, SYNC:SRV_DATA, and receiving SYNC:CLI_DATA.
  • sna-dataserver implements separate URSYS message handlers for SYNC:SRV_DSET and SYNC:SRV_DATA, sending SYNC:CLI_DATA whenever the master Dataset instance is updated.
  • sna-dataserver implements the interface IDS_DataObjectAdapter which is the bridge between the filesystem and data objects that represent the persisted data of a dataset. The abstract base class abstract-dataobj.adapter.ts is extended by the default implementation sna-dataobj-adapter.mts, providing methods getDatasetInfo(), readDatasetObj(), readDataBinObj(), writeDatasetObj() and writeDataBinObj().

Note

The server-side message handler could be considered an "adapter", but as the entire sna-dataserver.mts is a default implementation this handler isn't broken-out into its own thing like ProtocolAdapter. The DataObjectAdapter, however, could be replaced in this dataserver implementation to use a different storage backend.
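To illustrate what swapping the storage backend might look like, here is a minimal sketch with an in-memory adapter. BinStorageSketch is a hypothetical, synchronous reduction that borrows only two method names (readDataBinObj, writeDataBinObj) from the list above. The real IDS_DataObjectAdapter has more methods, and its signatures and return types may differ.

```typescript
type StoredObj = { _id: string; [field: string]: unknown };

// Hypothetical, synchronous reduction of the adapter idea.
interface BinStorageSketch {
  readDataBinObj(dataURI: string, binID: string): StoredObj[];
  writeDataBinObj(dataURI: string, binID: string, items: StoredObj[]): void;
}

// Keeps serialized JSON in memory instead of writing files to disk.
class MemoryStorage implements BinStorageSketch {
  private blobs = new Map<string, string>();

  readDataBinObj(dataURI: string, binID: string): StoredObj[] {
    const blob = this.blobs.get(`${dataURI}#${binID}`);
    return blob === undefined ? [] : (JSON.parse(blob) as StoredObj[]);
  }

  writeDataBinObj(dataURI: string, binID: string, items: StoredObj[]): void {
    this.blobs.set(`${dataURI}#${binID}`, JSON.stringify(items));
  }
}

// Server-side code depends only on the interface, not on the backend.
function appendAndPersist(
  storage: BinStorageSketch,
  dataURI: string,
  binID: string,
  item: StoredObj,
): StoredObj[] {
  const items = storage.readDataBinObj(dataURI, binID);
  items.push(item);
  storage.writeDataBinObj(dataURI, binID, items);
  return items;
}

const memStore = new MemoryStorage();
appendAndPersist(memStore, 'example/demo', 'comments', { _id: 'c1', text: 'hi' });
const reloaded = appendAndPersist(memStore, 'example/demo', 'comments', { _id: 'c2', text: 'yo' });
```

A filesystem or database backend would implement the same interface, leaving appendAndPersist() and any other caller unchanged.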


ToDo List

@run-sna.mts

  • uses SNA to build a project directory
  • SNA.Build() scans for .mts files for server
  • SNA.Build() scans for .ts files for web client

server.mts

  • loads SNA server module
  • invokes SNA.Start()

sna-node.mts

  • registers sna-dataserver SNA_MOD
  • implements SNA_Start() export as Start()

sna-dataserver

  • exports SNA_MOD PreHook for 'EXPRESS_READY'
  • handles 'SYNC:SRV_DATA' and 'SYNC:SRV_DSET'
  • receives dataURI from configured project info
  • can load or generate manifest from dataURI
  • can initialize dataset from manifest
  • can manage multiple datasets loaded
  • can load dataset object from disk
  • can serialize dataset object to disk
  • implements DATASET and DATA operations through protocol

app.ts

  • imports SNA client module
  • prompts for login creds
  • successfully authenticates websocket
  • receives dataURI, authToken from appserver
  • saves session dataURI, authToken, userID
  • registers dc-comments as an SNA_MOD
  • invokes SNA.Start() to kick stuff off

sna-web.ts

  • registers sna-dataclient as an SNA_MOD
  • hook DOM_READY for SNA_NetConnect()
  • hook NET_CONNECT for hot module reloading
  • implements SNA_Start() export as Start()

sna-dataclient

  • exports SNA_MOD PreHook for 'NET_DATASET'
  • can get session dataURI
  • can submit session authToken to request dataset
  • can determine syncmode from response
  • can initialize dataset instance from getDataset()
  • handles SYNC:CLI_DATA protocol conditionally, depending on syncmode
  • provides means to select dataset by dataURI
  • provides CRUDQ interface for dataset and its databins
  • provides eventmachine notify pubsub

dc-comments

  • exports SNA_MOD PreHook for 'LOAD_DATA'
  • exports SNA_MOD PreConfig to receive GlobalConfig
  • select dataURI and open 'comments' ItemList
  • provide methods for performing comment data manipulation
  • provide eventmachine notify pubsub