Note
|
the documentation is not complete yet. I am still writing it… |
This library offers a way to represent optional (or nullable) objects (objects that may not have a value) of type T
by encoding the no-value state in one of the states of T
.
This is an alternative to boost::optional
that offers a limited interface and a guarantee that the size of optional T
is the same as the size of T
.
In order to create an optional object type for your type T
,
you first need to decide how you want to represent the the no-value state in T
.
Suppose you want to use type int
and want -1
to represent the no-value.
Now you have to define an marked value policy: a type that formally reflects how the no-value is represented.
Luckily, for this common case the library comes with a predefined class template that you can use:
mark_int<int, -1>
. With this policy, you can define the type of the optional object, and use it:
markable<mark_int<int, -1>> oi; // optional int
static_assert(sizeof(oi) == sizeof(int), "no size penalty");
Instances of markable
have a very small interface. They are copyable and movable.
The default constructor initializes an object with the value indicated by the marked-value policy.
Another explicit constructor takes the value of T
:
using opt_int = markable<mark_int<int, -1>>;
opt_int oi; // internal value initialized to -1 (as per policy)
opt_int o1 {0}; // internal value initialized to 0
opt_int oN {-1}; // internal value initialized to -1
There are also three observer functions:
-
has_value()
that checks if the currently stored value is that indicated in the marked-value policy (MP
); -
value()
that extracts the stored value, with the precondition that the value is different than the one indicated inMP
; -
representation_value()
that extracts the value as stored internally, with no precondition.
// continuing the previous example
assert (!oi.has_value());
assert ( o0.has_value());
assert (!oN.has_value());
assert (o0.value() == 0);
// oi.value() is UB
// oN.value() is UB
assert (oi.representation_value() == -1);
assert (o0.representation_value() == 0);
assert (oN.representation_value() == -1);
As you can see, there are two ways to set the special marked value: either by default construction, or by providing it explicitly.
It is not possible to change the stored value through function value()
:
it returns a non-mutable reference or value (based on the policy).
In order to 'reset' the optional object you can either move-assign a new value,
or use assign
member functions:
// continuing the previous example
o0 = {}; // reset to marked value
oN = opt_int{2}; // new value
o0.assign(2); // assign value 2
oN.assign_representaiton(2); // also results in value 2
oN.assign_representation(-1); // results in no-value state
Each instance of markable
also provides three nested types:
-
value_type
- value we want to represent, -
reference_type
- what functionvalue
returns: in most cases it isconst value_type&
, -
representation_type
- type of the value that is used for representing values and the no-value: in most of the cases it is same asvalue_type
.
The type of the object that markable<MP>
uses for representing the value
or the information about the lack of value is MP::representaiton_type
.
Sometimes this type is same as MP::value_type
but it is not the rule.
For instance, when representing optional enum class E
with underlying type
int
, MP::value_type
is E
whereas MP::representation_type
is int
:
enum class E { V0, V1 };
using OptE = markable<mark_enum<E, -1>>;
static_assert(std::is_same_v<OptE::value_type, E>);
static_assert(std::is_same_v<OptE::representation_type, int>);
Whatever value, or the no-value, markable<MP>
stores, it is always
representable as a value of type MP::representation_type
:
OptE o;
assert(o.representation_value() == -1);
o.assign_representation(1);
assert(o.representation_value() == 1);
A representation value can always be reinterpreted as a (potentially missing) value.
OptE o;
assert(!o.has_value());
o.assign_storage(1);
assert(o.has_value());
assert(o.value() == E::V1);
Thus, markable<MP>
provides two interfaces: one for modifying and observing the
(possibly missing) MP::value_type
, and the other for modifying and observing
the (always present) MP::representation_type
.
What type is being stored and how the marked value is encoded is controlled by marked-value policy. You can either define your own, or use one of the policies provided with this library:
template <typename Integral, Integral EV> struct mark_int;
Can be used with all types that can be used as template non-type parameters, including built-in integral arithmetic types and pointers.
Integral
represents the stored type.
EV
is the marked value representation.
template <typename FPT> struct mark_fp_nan;
A policy for floating-point types, where the marked value is encoded as quiet NaN.
FPT
needs to be a floating-point scalar type.
template <typename T> struct mark_value_init;
A policy for storing any Regular
type, the marked value is represented by a value-initialized T
.
T
must meet the requirements of Regular
: be default constructible, copyable, moveable, and EqualityComparable
.
template <typename Cont> struct mark_stl_empty;
Marked value is created using value initialization. Marked value is checked by calling member function empty()
.
Useful for STL containers where we want the empty container to represent the no-value.
T
must be default constructible and have a member function empty()
.
struct mark_bool;
This is the policy for storing an optional bool
in a compact way, such that the size of markable<mark_bool>
is 1.
template <typename Enum, std::underlying_type_t<Enum> Val> struct mark_enum;
A policy for storing any enum, the marked value is represented by the indicated integral value Val
,
which can be outside the range of valid enum values.
Enum
must be an enumeration type.
template <typename Optional> struct mark_optional;
This policy is used for types that cannot (or do not want to) spare any value to indicate the marked state.
We logically represent optional T
, but store it in boost::optional<T>
or std::experimental::optional<T>
.
Optional
must be an instance of either boost::optional
or std::experimental::optional
.
In order to provide a custom marked-value policy to store a given type T
,
we need to provide a class that derive it from markable_type<T>
and implements two static member functions:
marked_value
and is_marked_value
:
struct mark_string_with_0s : markable_type<std::string>
{
static representation_type marked_value() {
return std::string("\0\0", 2);
}
static bool is_marked_value(const representation_type& v) {
return v.compare(0, v.npos, "\0\0", 2) == 0;
}
};
Base class markable_type<T>
defines all the necessary nested types and some house-keeping functions.
With it, we are declaring what type we will be storing.
Function marked_value
returns a value of the "representation" type (representation_type
) that represents the marked value.
Function is_marked_value
returns true iff the the given value is recognized as the marked value.
In a less likely case where we want to store the represent an optional value of type T
,
but store it internally in a different type, we need to provide more arguments to markable_type<T>
.
Suppose we want to implement the policy for storing type bool
in a storage of size 1 (the same way that mark_bool
does).
We need three states: no-value, true
, and false
. We cannot store it in type bool
because it only has two states.
So, for storage we will use type char
. We will use value 2
('\2'
) to represent the marked state,
value 0
to represent value false
and 1
to represent true
. Now, apart from defining how the marked state is encodes,
we also need to provide a recipe on how to encode a bool
in a char
, and how to extract the bool
value from char
storage.
We need to define additional two static member functions: access_value
and store_value
:
struct compact_bool_policy : markable_type<bool, char, bool> // see below
{
static representation_type marked_value() { return char(2); }
static bool is_marked_value(representation_type v) { return v == 2; }
static reference_type access_value(const storage_type& v) { return bool(v); }
static storage_type store_value(const value_type& v) { return v; }
};
The three types passed to markable_type
denote respectively:
-
value_type
— the type we are modeling. -
representation_type
— the type we use to internally represent all values plus the no-value state. -
reference_type
— what type functionvalue()
should return.
Class template markable_type
also provides the fourth type: storage_type
,
which in this case defaults to representation_type
. Because function value()
should return a bool
and we are storing no bool
we have to create a temporary value,
and return it by value: therefore type reference_type
is not really a reference.
There is also a possibility to select a default mark-value policy for a given type.
default_markable<int> iN, i0(0);_
Alias default_markable
selects the marked policy and uses it in markable
template. This mechanism tries to pick the best policy:
-
mark_int<T, std::numeric_limits<T>::max()>
for built-in integral types, -
mark_fp_nan<T>
for floating-point types, -
mark_bool
forbool
, -
mark_enum<T, std::numeric_limits<typename std::underlying_type<T>::type>::max()
for enumerations, -
mark_value_init<T>
otherwise.
It is possible that the default policy selection will create an ill-formed type. Also, note that the policy selected by default may not work for your use case. Manual policy selection might be preferred, because you are in control of what is actually selected.
default_markable
also selects order_by_value
as the ordering policy. This is
slightly slower, but safer than and sometimes more intuitive than
order_by_representation
.
Sometimes there is no spare value of T
, but there is a spare combination of member values in DUAL<T>
,
where DUAL<T>
a type layout-compatible with T
but without invariants. Consider the following type
representing a range of integers:
class range
{
int min_, max_;
bool invariant() const { return min_ <= max_; }
public:
range(int min, int max) : min_(min), max_(max) { assert (invariant()); }
int min() const { assert (invariant()); return min_; }
int max() const { assert (invariant()); return max_; }
~range() { assert (invariant()); }
};
It is guaranteed and enforced with assertions that min_
is never greater than max_
.
This leaves many spare combinations of values, e.g., {0, -1}
. But if we try to use them,
we violate the invariant, and trigger assertion failure. To address such cases,
markable
provides the dual storage. This will only work if your type is standard-layout.
You need to define a type layout-compatible with range
:
struct range_representation
{
int min_, max_;
};
Now you can request a "dual storage". It is a union that holds either a real type or its weaker counterpart:
union
{
range value_;
range_representation representation_;
};
Now, either we are storing a value (first member is active),
or we are storing the row type (second member is active),
in which we can encode the value impossible in value type.
We do not know which member of the union is active at a given moment,
but it is always safe to inspect the value of member representation_
.
This is guaranteed by the common initial sequence feature of unions in C++.
When we observe the special value, we know that the second member is active.
Otherwise we know that the active member is value_
.
In order to define the mark policy, you have to inherit from markable_dual_storage_type
and define the special value:
struct mark_range : markable_dual_storage_type<mark_range, range, range_representation>
{
static representation_type marked_value() noexcept { return {0, -1}; }
static bool is_marked_value(const representation_type& v) noexcept { return v.min_ > v.max_; }
};
The first argument is the type of the policy we are defining. (We are using the CRTP.) The second is the logical value type, and the third is its "raw" layout-compatible conterpart.
Warning
|
However, there is a certain danger involved when using dual storage.
Types T and DualT passed to markable_dual_storage_type need to preserve certain relation:
they have to be layout-compatible. While this can be enforced statically in C20 with type traits
`std::is_layout_compatible`, there is no way to enforce it in earlier revisions of C,
so you have to make sure this is the case. Otherwise you are risking UB.
|
To guarantee that two Standard Layout types are layout-compatible is difficult, especially if at some point you have to add a member datat to T
. It requires that the members of two types decompose to similar basic types with the same layout. In order to minimize the risk of the types becoming non layout compatible, we recommend te following technique. Define the struct
only for storing members. Than have both T
and its representation counterpart inferit from te member-storing struct. The inheriting types should not define any members:
struct range_members
{
int min_, max_;
};
class range : private range_members
{
bool invariant() const { return min_ <= max_; }
public:
range(int min, int max) : range_members{min, max} { assert (invariant()); }
int min() const { assert (invariant()); return min_; }
int max() const { assert (invariant()); return max_; }
~range() { assert (invariant()); }
};
struct range_representation : range_members
{
range_representation() noexcept : range_members{0, -1} {};
};
This way there is only one place that defines members for both types.
Using markable_dual_storage_type
requires that the dual storage has a noexcept
move constructor and that the marked-value policy function marked_value()
is also noexcept
. This requirement is necessary to guarantee exceptions safety for markable
object. If this requirement is not met the attempt to instantiate a markable object will result in a compile-time error.
If either of the functions is not noexcept
in your case but you still want to use the policy and take the risk that neither of the two functions will ever throw, you can use class template markable_dual_storage_type_unsafe
instead. This type does not enforce noexcept
statically, but will call std::terminate
should any of the two functions throw when performing operations on markable
in critical places.
There are no relational operations provided by default, as it is not obvious how the no-value should compare against other values. If you need to compare markable values, you have two options:
-
provide a custom comparator, where you explicitly state how the marked state is treated, or
-
select ordering policy in the second template parameter to
markable
.
This library offers three ordering policies, for rel-ops and std::hash
:
-
order_none
(the default), no relational operators orstd::hash
. -
order_by_value
, where marked-value is treated as the smallest possible value, distinct from any other value (much like inboost::optional
). -
order_by_representation
, where marked values are ordered by their representation.
using markable_int = markable<
mark_int<int, INT_MAX>, // marked-value policy
order_by_value // ordering policy
>;
markable_int mX{}, m0{0};
assert(m0 == m0);
assert(mX < m0);
assert(std::hash<markable_int>{}(m0) == std::hash<int>(0));
Ordering includes not only the six relational operators, but also the specializations
of std::hash
.
Ordering policy order_by_value
is a good safe default. It is slower than
order_by_representation
because it has to perform a branch, but it works well
with marked-value policies that mark more than one representation state:
struct mark_negative : markable_type<int>
{
static bool is_marked_value(const int& v) { return v < 0; } // any negative value is marked
static int marked_value() { return -1; } // but we can set only one: -1
};
using opt_int_A = markable<mark_negative, order_by_value>;
assert (opt_int_A{} == opt_int_A{-2}); // ok
using opt_int_B = markable<mark_negative, order_by_representation>;
assert (opt_int_B{} == opt_int_B{-1}); // ok
assert (opt_int_B{} == opt_int_B{-2}); // UB
If you are using marked-value policy that marks more than one state, and use
order_by_representation
ordering policy, you have to make sure that the
when comparing two such markables, if their state is marked, its representation
must be exactly MP::marked_value()
. Otherwise, you will get UB (Undefined Behavior).
It is also possible to define a custom ordering policy:
struct order_custom
{
template <typename Markable>
static bool equal(const Markable& l, const Markable& r) { // enables operators == and !=
// your implementation goes here
}
template <typename Markable>
static bool less(const Markable& l, const Markable& r) { // enables operators <, >, <=, >=
// your implementation goes here
}
};
using opt_int = markable<mark_int<int, -1>, order_custom>;
This library also comes with a set of predefined comparator objects, that can be
used in STL containers and algorithms without having to make your markable
type ordered:
using opt_int = markable<mark_int<int, -1>>; // not comparable
std::set<opt_int, less_by_value> osv;
std::set<opt_int, less_by_representation> osr;
std::unordered_set<opt_int, hash_by_value> usv;
std::unordered_set<opt_int, hash_by_representation> usr;
Because markable
uses policies, you can get the opaque typedef
for your markable types for free.
Suppose you have two conceptually different types Count
and Num
, but because they are markable types using the same policy, they end um being one and the same type:
using Count = markable<mark_int<int, -1>>;
using Num = markable<mark_int<int, -1>>;
static_assert(std::is_same<Count, Num>::value, "same type");
In order to make the two markable types distinct, you can alter the type of the policies (but not their semantics) by inheriting from them:
struct mark_count : mark_int<int, -1> {};
struct mark_num : mark_int<int, -1> {};
using Count = markable<mark_count>;
using Num = markable<mark_num>;
static_assert(!std::is_same<Count, Num>::value, "different types!");
This library is not a replacement for boost::optional
. While there is some overlap, both libraries target different use cases.
boost::optional
is really generic: It will work practically with any T
,
to the extent that you can use optional<optional<T>>
. You just give the type T
and you get the optional object wrapper.
In contrast, in markable
from the outset you have to make a choice case-by-case how you want to store the 'marked' value T
.
The policy for managing the marked state is part of the contract, part of semantics, part of the type.
Having a nested markable
is technically possible, but requires sparing two values of T
.
Some type T
may not have a 'spare' value to indicate the marked state. In such case, markable
cannot help you.
In contrast, boost::optional<T>
will work just fine: the additional marked state is stored separately.
In a way, boost::optional<T>
can be thought as boost::variant
<boost::none_t, T>
.
boost::optional<T>
gives you a useful guarantee that if you initialize it to a no-value state, no object of type T
is created. This is useful for run-time performance reasons and allows a two phase initialization. In contrast, markable
upon construction, immediately constructs a T
. markable
simply contains T
as subobject. In contrast, boost::optional
only provides a storage for T
and creates and destroys the contained object as needed.
boost::optional<T>
is almost a container, capable of storing 0 or 1 elements. When it stores one element, you can alter its value. It is impossible in markable
: its value and the "container’s size" are one thing. But in exchange, the latter, because it only provides a non-mutable access to the contained value, it can build the return value on the fly, similarly to the proxy reference in std::vector<bool>
, but because we only provide a non-mutable reference, it is much simpler and safer. This is why we can provide a markable for type bool
(which has no spare value) with the sizeof
of a single char
.
markable
has a minimalistic interface: this is also to avoid any surprises. As a cost it is less expressive and convenient. There are no implicit conversions, no overloaded operators; unlike boost::optional
, it does not have operator==
by default: you have to specify a comparison policy and decide yourself how you want to compare the no-value states.
In general, it is expected that in the first pass of the implementation of your program, you will use boost::optional<T>
as a simple ready-to-use solution. Later, if you determine that boost::optional
kills your performance, you can think of replacing it with markable
and how you want to represent the no-value state.
Another use case is when you are currently using objects of scalar types with encoded special values (like type std::string::size_type
with value std::string::npos
) and you want to change it into something safer but be sure you are adding no runtime overhead. You can change your type to markable<mark_int<string::size_type, string::npos>>
.