Updates following statistics of n
-variate data
d
'th moment tensor,- an array of moment tensors of order
1,2,...,d
.
Given t
realisations of n
-variate data: X ∈ ℜᵗˣⁿ
, and the update X + ∈ ℜᵘˣⁿ
returns array of updated cumulant tensors of order 1,2,...,d
.
To store symmetric tensors uses a SymmetricTensors
type, requires SymmetricTensors.jl. To convert to array, run:
julia> Array(st::SymmetricTensors{T, d})
to convert back, run:
julia> SymmetricTensor(a::Array{T,d})
Requires Cumulants.jl.
As of 01/01/2017 kdomino is the lead maintainer of this package.
Within Julia, run
pkg> add CumulantsUpdates
to install the files. Julia 1.0 or later is required.
For parallel computation just run
julia> addprocs(n)
julia> @everywhere using CumulantsUpdates
To update data X ∈ ℜᵗˣⁿ
by the update X+ ∈ ℜᵘˣⁿ
in the observation
window of size t
and get ℜᵗˣⁿ ∋ X' = vcat(X,X+)[1+u:end, :]
run:
julia> dataupdat(X::Matrix{T}, Xplus::Matrix{T}) where T<:AbstractFloat
the condition u < t
must be fulfilled.
julia> a = ones(4,4)
4×4 Array{Float64,2}:
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
julia> b = zeros(2,4)
2×4 Array{Float64,2}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
julia> dataupdat(a,b)
4×4 Array{Float64,2}:
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
To update the moment tensor M::SymmetricTensor{T, d}
for data X ∈ ℜᵗˣⁿ
, given the update X+ ∈ ℜᵘˣⁿ
where u < t
run
julia> momentupdat(M::SymmetricTensor{T, d}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}
Returns a SymmetricTensor{T, d}
of the moment tensor of updated multivariate data:
ℜᵗˣⁿ ∈ X' = dataupdat(X,X+)
, i.e.: M = moment(X, d)
, momentupdat(M, X, X+) = moment(X', d)
. If u ≪ t
momentupdat()
is much faster than a recalculation.
julia> x = ones(6, 2);
julia> SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0]], 2, 1, 2, true)
julia> y = 2*ones(2,2);
julia> momentupdat(m, x, y)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[3.33333 3.33333; 3.33333 3.33333]
[3.33333 3.33333; 3.33333 3.33333]], 2, 1, 2, true)
Function momentarray(X, d)
can be used to compute an array of moments of order 1, ..., d
of data X ∈ ℜᵗˣⁿ
julia> momentarray(X::Matrix{T}, d::Int, b::Int) where {T<:AbstractFloat, d}
b
- is a SymmetricTensor
parameter, a block size.
One can update an array of moments by calling:
julia> momentupdat(M::Array{SymmetricTensor{T, d}}, X::Matrix{T}, Xplus::Matrix{T}) where {T<:AbstractFloat, d}
Returns an Array{SymmetricTensor{T, d}}
of moment tensors of order 1, ..., d
of updated multivariate data: ℜᵗˣⁿ ∋ X' = dataupdat(X,X+)
, i.e. Mₐᵣ = momentarray(X, d)
, momentupdat(Mₐᵣ, X, X+) = momentarray(X', d)
.
Presented functions are design for sequent update of multivariate cumulant tensors.
Hence it can be applied in a data streaming scheme. Suppose one has data X ∈ ℜᵗˣⁿ
and subsequent coming updates X+ ∈ ℜᵘˣⁿ
such that u ≪ t
. Suppose one want to compute cumulant tensors in an observation window of size t
each time the update comes.
To store data amd moments we use the following structure mutable struct DataMoments{T <: AbstractFloat}
with following fields: X
- data, d
- maximal order of a moment series, b
- a block size, M
- moment series. To initialise, we can use the following constructor:
julia> DataMoments(X::Matrix{T}, d::Int, b::Int) where T <: AbstractFloat
here an array of moments will be computed. To update a DataMoments structure and compute updated cumulants run:
julia> cumulantsupdate!(dm::DataMoments{T}, Xplus::Matrix{T}) where T <: AbstractFloat
The function updates a DataMoment structure and returns a series of cumulants of order 1, ..., dm.d
for updated data. See:
julia> x = ones(10,2);
julia> s = DataMoments(x, 4, 2);
julia> y = zeros(4,2);
julia> cumulantsupdate!(s,y)[4]
SymmetricTensor{Float64,4}(Union{Nothing, Array{Float64,4}}[[-0.1056 -0.1056; -0.1056 -0.1056]
[-0.1056 -0.1056; -0.1056 -0.1056]
[-0.1056 -0.1056; -0.1056 -0.1056]
[-0.1056 -0.1056; -0.1056 -0.1056]], 2, 1, 2, true)
To save the DataMoments structure use:
julia> savedm(dm::DataMoments, dir::String)
To load it use
julia> loaddm(dir::String)
returns a DataMoments structure stored in a given directory.
julia> norm(st::SymmetricTensor{T, d}, p::Union{Float64, Int}) where {T<:AbstractFloat, d}
Returns a p
-norm of the tensor stored as SymmetricTensors
, supported for k ≠ 0
. The output of norm(st, p) = norn(convert(Array, st),p)
. However
norm(st::SymmetricTensor, p)
uses the block structure implemented in SymmetricTensors
, hence is faster and decreases the computer memory requirement.
julia> te = [-0.112639 0.124715 0.124715 0.268717 0.124715 0.268717 0.268717 0.046154];
julia> st = SymmetricTensor(reshape(te, (2,2,2)));
julia> norm(st)
0.5273572868359742
julia> norm(st, 2.5)
0.4468668679541424
julia> norm(st, 1)
1.339089
Given M
a vector of moments of order 1, ..., d
to change it to a vector
of cumulants of order 1, ..., d
using
julia> function moms2cums!(M::Vector{SymmetricTensor{T}}) where T <: AbstractFloat
One can convert a vector of cumulants c
to a vector of moments by running
julia> function cums2moms(c::Vector{SymmetricTensor{T}}) where T <: AbstractFloat
julia> m = momentarray(ones(20,3), 3, 2)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] nothing; nothing nothing]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; nothing [1.0]], 2, 2, 3, false)
julia> moms2cums!(m)
julia> m
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0]
[0.0 0.0; 0.0 0.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[0.0 0.0; 0.0 0.0] [0.0; 0.0]; #undef [0.0]], 2, 2, 3, false)
julia> cums2moms(m)
3-element Array{SymmetricTensor{Float64,N} where N,1}:
SymmetricTensor{Float64,1}(Union{Nothing, Array{Float64,1}}[[1.0, 1.0], [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)
SymmetricTensor{Float64,3}(Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0]
[1.0 1.0; 1.0 1.0] #undef; #undef #undef]
Union{Nothing, Array{Float64,3}}[[1.0 1.0; 1.0 1.0] [1.0; 1.0]; #undef [1.0]], 2, 2, 3, false)
To analyse the computational time of cumulants updates vs Cumulants.jl
recalculation, we supply the executable script comptimes.jl
. The script saves computational times to the res/*.jld
file. The scripts accept following parameters:
-d (Int)
: cumulant's maximum order, by defaultd = 4
,-n (vararg Int)
: numbers of marginal variables, by defaultn = 40
,-t (Int)
: number of realisations of random variable, by defaultt = 500000
,-u (vararg Int)
: number of realisations of update, by defaultu = 10000, 15000, 20000
,-b (Int)
: blocks size, by defaultb = 4
,-p (Int)
: numbers of processes, by defaultp = 3
.
To analyse the computational time of cumulants updates for different block sizes 1 < b ≤ Int(√n)+2
, we supply the executable script comptimesblocks.jl
.
The script saves computational times to the res/*.jld
file. The scripts accept following parameters:
-d (Int)
: cumulant's order, by defaultd = 4
,-n (Int)
: numbers of marginal variables, by defaultn = 48
,-u (vararg Int)
: number of realisations of the update, by defaultu = 10000, 20000
.-p (Int)
: numbers of processes, by defaultp = 3
.
To analyse the computational time of cumulants updates for different number of workers, we supply the executable script comptimesprocs.jl
.
The script saves computational times to the res/*.jld
file. The scripts accept following parameters:
-d (Int)
: cumulant's order, by defaultd = 4
,-n (Int)
: numbers of marginal variables, by defaultn = 48
,-u (vararg Int)
: number of realisations of the update, by defaultu = 10000, 20000
,-b (Int)
: blocks size, by defaultb = 4
,-p (Int)
: maximal numbers of processes, by defaultp = 6
.
To plot computational times run executable res/plotcomptimes.jl
on chosen *.jld
file.
Krzysztof Domino, Piotr Gawron, An algorithm for arbitrary–order cumulant tensor calculation in a sliding window of data streams, International Journal of Applied Mathematics and Computer Science, vol. 29, issue 1, pp. 206, 2019 (https://doi.org/10.2478/amcs-2019-0015 )
This project was partially financed by the National Science Centre, Poland – project number 2014/15/B/ST6/05204.