Equality in C# and .Net
If you think that comparing two objects for equality is trivial, then you are probably a junior .Net programmer. This short article will introduce you to every mechanism in .Net that is used to determine if two values are equal.
Let's start with the easiest situation first. How do two instances of object
or two instances of a completely regular descendant from object
determine their equality? We'll use the following two variables throughout the examples:
var objA = new SampleClass();
var objB = new SampleClass();
var op_Equality_class = objA == objB;
We haven't defined operator ==
for our SampleClass
. However, since it's a reference type, it is automatically translated into a ceq
CIL instruction. TODO: add from CIL manual.
var objectReferenceEquals_class = object.ReferenceEquals(objA, objB);
ReferenceEquals
is a very trivial static method:
public static bool ReferenceEquals(Object objA, Object objB)
{
return objA == objB;
}
var thisEquals_class = objA.Equals(objB);
The only virtual method of the four, the base Equals(object)
is implemented using a runtime helper:
public virtual bool Equals(Object obj)
{
return RuntimeHelpers.Equals(this, obj);
}
The helper method is implemented in unmanaged code
[System.Security.SecuritySafeCritical] // auto-generated
[ResourceExposure(ResourceScope.None)]
[MethodImplAttribute(MethodImplOptions.InternalCall)]
public new static extern bool Equals(Object o1, Object o2);
The unmanaged code itself is in src/classlibnative/bcltype/objectnative.cpp
:
FCIMPL2(FC_BOOL_RET, ObjectNative::Equals, Object *pThisRef, Object *pCompareRef)
{
CONTRACTL
{
FCALL_CHECK;
INJECT_FAULT(FCThrow(kOutOfMemoryException););
}
CONTRACTL_END;
if (pThisRef == pCompareRef)
FC_RETURN_BOOL(TRUE);
// Since we are in FCALL, we must handle NULL specially.
if (pThisRef == NULL || pCompareRef == NULL)
FC_RETURN_BOOL(FALSE);
MethodTable *pThisMT = pThisRef->GetMethodTable();
// If it's not a value class, don't compare by value
if (!pThisMT->IsValueType())
FC_RETURN_BOOL(FALSE);
// Make sure they are the same type.
if (pThisMT != pCompareRef->GetMethodTable())
FC_RETURN_BOOL(FALSE);
// Compare the contents (size - vtable - sync block index).
DWORD dwBaseSize = pThisRef->GetMethodTable()->GetBaseSize();
if(pThisRef->GetMethodTable() == g_pStringClass)
dwBaseSize -= sizeof(WCHAR);
BOOL ret = memcmp(
(void *) (pThisRef+1),
(void *) (pCompareRef+1),
dwBaseSize - sizeof(Object) - sizeof(int)) == 0;
FC_GC_POLL_RET();
FC_RETURN_BOOL(ret);
}
FCIMPLEND
As you can see from the code, the unmanaged method works for both reference and value types. This is unusual, as value types have their own virtual implementation.
var objectEquals_class = object.Equals(objA, objB);
The implementation of this method is rather straightforward:
public static bool Equals(Object objA, Object objB)
{
if (objA==objB) {
return true;
}
if (objA==null || objB==null) {
return false;
}
return objA.Equals(objB);
}
As you can see, it does some null checks and fall back to the virtual method. This, of course, means that it does the null checks twice if you do not override it.
All four methods compare reference types using reference equality (i.e., object identity). Both static and virtual Equals
methods can change their behavior in child classes and will continue to work if you upcast your values to object
, while ReferenceEquals
will always use reference equality.
Operator ==
can be reimplemented in a child class, but if the value on either side of it is upcast to object
, the version from object
will be used, again utilizing reference equality.
Value types are technically descendants of System.ValueType
which derives from System.Object
, but they are a very special case. They have no object identity, so they are usually compared by value, not by reference. This means that using ReferenceEquals
on value types is a very bad idea: it will always return false.
There's no predefined operator ==
for all value types like there's one for reference types and C# will actually not allow you to use it on value types that do not define it themselves.
Equals
is overridden in System.ValueType
, so you can use it to compare two custom structs either via the static or the virtual method, but one look at the source will make it clear why this is a wrong idea:
[System.Security.SecuritySafeCritical]
public override bool Equals (Object obj) {
BCLDebug.Perf(false, "ValueType::Equals is not fast. "+this.GetType().FullName+" should override Equals(Object)");
if (null==obj) {
return false;
}
RuntimeType thisType = (RuntimeType)this.GetType();
RuntimeType thatType = (RuntimeType)obj.GetType();
if (thatType!=thisType) {
return false;
}
Object thisObj = (Object)this;
Object thisResult, thatResult;
// if there are no GC references in this object we can avoid reflection
// and do a fast memcmp
if (CanCompareBits(this))
return FastEqualsCheck(thisObj, obj);
FieldInfo[] thisFields = thisType.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
for (int i=0; i<thisFields.Length; i++) {
thisResult = ((RtFieldInfo)thisFields[i]).UnsafeGetValue(thisObj);
thatResult = ((RtFieldInfo)thisFields[i]).UnsafeGetValue(obj);
if (thisResult == null) {
if (thatResult != null)
return false;
}
else
if (!thisResult.Equals(thatResult)) {
return false;
}
}
return true;
}
If your struct contains reference types, it will be compared via reflection! If there are no references, the comparison will be much faster, as it will use two helper methods defined in unmanaged code:
[System.Security.SecuritySafeCritical] // auto-generated
[ResourceExposure(ResourceScope.None)]
[MethodImplAttribute(MethodImplOptions.InternalCall)]
private static extern bool CanCompareBits(Object obj);
[System.Security.SecuritySafeCritical] // auto-generated
[ResourceExposure(ResourceScope.None)]
[MethodImplAttribute(MethodImplOptions.InternalCall)]
private static extern bool FastEqualsCheck(Object a, Object b);
The code is in src/vm/comutilnative.cpp
:
// Return true if the valuetype does not contain pointer and is tightly packed
FCIMPL1(FC_BOOL_RET, ValueTypeHelper::CanCompareBits, Object* obj)
{
FCALL_CONTRACT;
_ASSERTE(obj != NULL);
MethodTable* mt = obj->GetMethodTable();
FC_RETURN_BOOL(!mt->ContainsPointers() && !mt->IsNotTightlyPacked());
}
FCIMPLEND
FCIMPL2(FC_BOOL_RET, ValueTypeHelper::FastEqualsCheck, Object* obj1, Object* obj2)
{
FCALL_CONTRACT;
_ASSERTE(obj1 != NULL);
_ASSERTE(obj2 != NULL);
_ASSERTE(!obj1->GetMethodTable()->ContainsPointers());
_ASSERTE(obj1->GetSize() == obj2->GetSize());
TypeHandle pTh = obj1->GetTypeHandle();
FC_RETURN_BOOL(memcmp(obj1->GetData(),obj2->GetData(),pTh.GetSize()) == 0);
}
FCIMPLEND
Both methods on MethodTable
check specific flags so are reasonably fast.
A reference type that is immutable should behave like it possesses no object identity. Obviously, this will not protect you from ReferenceEquals
, but all other comparison mechanisms should be modified. Let's see how string
does that.
string
is a special type and is interned by the runtime. That's why ReferenceEquals
might sometimes return true when you compare different string objects.
// Determines whether two strings match.
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
public override bool Equals(Object obj)
{
if (this == null) //this is necessary to guard against reverse-pinvokes and
throw new NullReferenceException(); //other callers who do not use the callvirt instruction
String str = obj as String;
if (str == null)
return false;
if (Object.ReferenceEquals(this, obj))
return true;
if (this.Length != str.Length)
return false;
return EqualsHelper(this, str);
}
The implementation is straightforward: there are some null check, a reference equality shortcut, a domain specific shortcut and the "meat" of the method is implemented in a helper.
EqualsHelper
compares the contents of two strings character by character using some clever but straightforward optimizations.
Framework design guidelines say that all your operators should have equivalent static methods to be usable from languages that do not support operators. This is what string
does:
public static bool operator == (String a, String b) {
return String.Equals(a, b);
}
[Pure]
public static bool Equals(String a, String b) {
if ((Object)a==(Object)b) {
return true;
}
if ((Object)a==null || (Object)b==null) {
return false;
}
if (a.Length != b.Length)
return false;
return EqualsHelper(a, b);
}
Again, there are some obvious null checks, a reference equality shortcut, a domain specific shortcut and a helper call.
Why did Microsoft implement a separate interface with yet another Equals(T)
method if all objects support Equals(object)
anyway? The answer is performance, of course. Every call to Equals(object)
includes a runtime type check, it's especially problematic for value types which have to be boxed. If your generic type does lots of comparisons of its parameterized members, you need a way to avoid these checks. Let's see how List<T>
does that.
public bool Contains(T item) {
if ((Object) item == null) {
for(int i=0; i<_size; i++)
if ((Object) _items[i] == null)
return true;
return false;
}
else {
EqualityComparer<T> c = EqualityComparer<T>.Default;
for(int i=0; i<_size; i++) {
if (c.Equals(_items[i], item)) return true;
}
return false;
}
}
The first half of the method is the fast past for null checks, where reference equality is the way to go. But what's going on in the second part? What's this EqualityComparer<T>.Default
? It's a cached result of a call to CreateComparer
, the source of which is shown below:
[System.Security.SecuritySafeCritical] // auto-generated
private static EqualityComparer<T> CreateComparer() {
Contract.Ensures(Contract.Result<EqualityComparer<T>>() != null);
RuntimeType t = (RuntimeType)typeof(T);
// Specialize type byte for performance reasons
if (t == typeof(byte)) {
return (EqualityComparer<T>)(object)(new ByteEqualityComparer());
}
// If T implements IEquatable<T> return a GenericEqualityComparer<T>
if (typeof(IEquatable<T>).IsAssignableFrom(t)) {
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(GenericEqualityComparer<int>), t);
}
// If T is a Nullable<U> where U implements IEquatable<U> return a NullableEqualityComparer<U>
if (t.IsGenericType && t.GetGenericTypeDefinition() == typeof(Nullable<>)) {
RuntimeType u = (RuntimeType)t.GetGenericArguments()[0];
if (typeof(IEquatable<>).MakeGenericType(u).IsAssignableFrom(u)) {
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(NullableEqualityComparer<int>), u);
}
}
// See the METHOD__JIT_HELPERS__UNSAFE_ENUM_CAST and METHOD__JIT_HELPERS__UNSAFE_ENUM_CAST_LONG cases in getILIntrinsicImplementation
if (t.IsEnum) {
TypeCode underlyingTypeCode = Type.GetTypeCode(Enum.GetUnderlyingType(t));
// Depending on the enum type, we need to special case the comparers so that we avoid boxing
// Note: We have different comparers for Short and SByte because for those types we need to make sure we call GetHashCode on the actual underlying type as the
// implementation of GetHashCode is more complex than for the other types.
switch (underlyingTypeCode) {
case TypeCode.Int16: // short
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(ShortEnumEqualityComparer<short>), t);
case TypeCode.SByte:
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(SByteEnumEqualityComparer<sbyte>), t);
case TypeCode.Int32:
case TypeCode.UInt32:
case TypeCode.Byte:
case TypeCode.UInt16: //ushort
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(EnumEqualityComparer<int>), t);
case TypeCode.Int64:
case TypeCode.UInt64:
return (EqualityComparer<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(LongEnumEqualityComparer<long>), t);
}
}
// Otherwise return an ObjectEqualityComparer<T>
return new ObjectEqualityComparer<T>();
}
Okay, this is a long and complex method, but it's called only once per type during the execution of the program, so it's a huge gain. What's going on in it, anyway?
It uses reflection to determine the runtime type of its generic argument and returns a different EqualityComparer
for particular types:
ByteEqualityComparer
forbyte
GenericEqualityComparer<T>
for types that implementIEquatable<T>
NullableEqualityComparer<U>
for types that areNullable<IEquatable<U>>
- various
EnumEqualityComparer
s for enum types, depending on their underlying type ObjectEqualityComparer<T>
for all other cases
What do these comparers do? Again, nothing truly surprising.
GenericEqualityComparer<T>
uses IEquatable<T>.Equals
:
[Pure]
public override bool Equals(T x, T y) {
if (x != null) {
if (y != null) return x.Equals(y);
return false;
}
if (y != null) return false;
return true;
}
NullableEqualityComparer<T>
uses IEquatable<T>.Equals
too:
[Pure]
public override bool Equals(Nullable<T> x, Nullable<T> y) {
if (x.HasValue) {
if (y.HasValue) return x.value.Equals(y.value);
return false;
}
if (y.HasValue) return false;
return true;
}
Finally, ObjectEqualityComparer uses Equals(object)
:
[Pure]
public override bool Equals(T x, T y) {
if (x != null) {
if (y != null) return x.Equals(y);
return false;
}
if (y != null) return false;
return true;
}
This ensures that List<T>
or any other type that uses EqualityComparer<T>.Default
will use IEquatable.Equals(T)
if T
implements IEquatable<T>
and Equals(object)
if it doesn't.
What? Another way to compare two types? Why on Earth would you want another one? IStructuralEquatable
is necessary for reference types that possess object identity, but are often equated based on their contents. Arrays are one such class. You should implement IStructuralEquatable
explicitly, since it's an uncommon operation. Also not that apart from the other object, IStructuralEquatable.Equals
takes a IEqualityComparer
parameter. IEqualityComparer
is like IEqualityComparer<T>
, but worse. It cannot be a generic type because some structurally comparable types, like System.Tuple
, are heterogenous. Tecnically, you could implement interfaces IEqualityComparer<T1, T2, T3>
and so on that took even more parameters in the Equals
method, but this would be a very narrow use case.
This is how System.Array
implements the method:
Boolean IStructuralEquatable.Equals(Object other, IEqualityComparer comparer) {
if (other == null) {
return false;
}
if (Object.ReferenceEquals(this, other)) {
return true;
}
Array o = other as Array;
if (o == null || o.Length != this.Length) {
return false;
}
int i = 0;
while (i < o.Length) {
object left = GetValue(i);
object right = o.GetValue(i);
if (!comparer.Equals(left, right)) {
return false;
}
i++;
}
return true;
}
Again, the method is rather straightforward. Some null checks, two shortcuts and an elementwise check.
But what if you have to compare two arrays of arrays of arrays, i.e., values of type SampleClass[][][]
? What kind of comparer would you pass to IStructuralEquatable.Equals
? Enter StructuralComparisons.StructuralEqualityComparer
.
public new bool Equals(Object x, Object y) {
if (x != null) {
IStructuralEquatable seObj = x as IStructuralEquatable;
if (seObj != null){
return seObj.Equals(y, this);
}
if (y != null) {
return x.Equals(y);
} else {
return false;
}
}
if (y != null) return false;
return true;
}
It simply calls IStructuralEquatable.Equals(object)
if the lhs is IStructuralEquatable
and object.Equals(object)
if it isn't. Note that it cannot call IEquatable<T>.Equals
, so it loses a tiny bit of performance, but since it already uses a runtime cast, it's an acceptable loss.
Implementations of Equals
method must satisfy the mathematical properties of equality when both operands are of the same runtime type:
- reflexivity: for any
a
,Equals(a, a)
must return true - symmetry: for any
a
,b
,Equals(a, b)
andEquals(b, a)
must return the same result - transitivity: for any
a
,b
,c
,Equals(a, b) && Equals(b, c)
andEquals(a, c)
must return the same result
GetHashCode
is a performance shortcut method for equality testing. This means that:
- it must be fast: if it's slower than
Equals
, there's not much use in it - it must be deterministic:
Equals(a.GetHashCode(), a.GetHashCode())
must always return true - it must be reliable: it mustn't throw exceptions
- it must be reflect the equality: if
Equals(a, b)
returnstrue
,Equals(a.GetHashCode(), b.GetHashCode())
must also return true - its result should be uniformly distributed
Every time you override the Equals
method you must also override GetHashCode
to satisfy the above requirements. Obviously, an implementation like the one below is a valid, but completely useless one:
public override int GetHashCode() => 0;
A good implementation of GetHashCode
is provided by Jon Skeet (who else) here: http://stackoverflow.com/a/263416/3106795
In general all equality methods must produce consistent results. However, numeric value types are a very well-known exception. Numeric types are promoted to the widest type by ==
while Equals
doesn't do that:
Console.WriteLine(1.0 == 1.0f); //true
Console.WriteLine(1.0f == 1.0); //true
Console.WriteLine(1.0f.Equals(1.0)); //false
Console.WriteLine(1.0.Equals(1.0f)); //true
Similarly, IEEE 754:2008 defines non-reflective operations on floating point types, so operators follow the standard while Equals
matches the general rules:
Console.WriteLine(double.NaN == double.NaN); //false
Console.WriteLine(double.NaN.Equals(double.NaN)); //true
When implementing custom types you should avoid such discrepancies in behaviour if they aren't absolutely necessary.
Don't do anything, you are covered by the runtime. You can theoretically add the following implementation of IEquatable<T>.Equals
, but it's 100% useless:
public Equals(YourClass other) => this==other;
If your type is a container of sorts, implement IStructuralEquatable.Equals
and IStructuralEquatable.GetHashCode
explicitly.
-
Indicate that your type implements
IEquatable<YourValueType>
. -
Implement
Equals(YourValueType)
that compares its two parameters using their members'Equals
methods. If there are generic members, useEqualityComparer<T>.Default.Equals
instead. Make sure your method is reflexive, symmetric and transitive. -
Implement
Equals(object)
that checks that its argument is an instance ofYourValueType
and delegates the rest of the job to the previous method. -
Override
GetHashCode
to match the behavior ofEquals(object)
. Make sure your method is fast, deterministic and reliable. -
Implement a static
Equals(YourValueType a, YourValueType b)
method that compares its two parameters using their members'Equals
methods or other applicable methods if you're comparingdouble
s or other unusual types. -
Implement operator
==
that callsEquals(YourValueType a, YourValueType b)
. Implement operator '!=that calls
!Equals(YourValueType a, YourValueType b)`.
If your type doesn't contain any unusual types, you can and should use the same static helper method Equals(YourValueType a, YourValueType b)
for all other equality methods.