This article applies not only to C#, but also to other languages with references, like C++.
A reference is a special type
that allows you to access data stored in another location separate from
the variable that contains the reference. When a reference variable is
accessed, indirection is performed automatically, i.e. first the actual
address of the remote variable is fetched, then the object stored at that
address is accessed. This behaviour is distinct from pointers, which
employ the same indirection mechanisms, but are treated as normal values
and so dereferencing must be explicit.
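The difference between implicit and explicit indirection can be sketched in C++ roughly as follows. The helper names (set_via_reference, set_via_pointer, demo_indirection) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Writes through a reference: the indirection is implicit,
// so the assignment looks like an assignment to a plain variable.
void set_via_reference(int& target, int v) { target = v; }

// Writes through a pointer: the pointer is an ordinary value holding
// an address, so the dereference has to be spelled out with *.
void set_via_pointer(int* target, int v) { *target = v; }

// Hypothetical demo helper: returns the final value after both writes.
int demo_indirection() {
    int value = 1;
    set_via_reference(value, 2);   // the caller passes the variable itself
    assert(value == 2);
    set_via_pointer(&value, 3);    // the caller must take the address explicitly
    return value;
}
```

Both functions end up modifying the caller's variable; only the syntax at the call site and inside the function differs.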
In C#, there are three keywords that denote a reference: ref, out, and
in. The latter two work only for method parameters; however, ref readonly
can be used instead of in for variables and return types (out cannot be
used in the same way). These map to different metadata in the underlying
CIL code, but all use the internal CLI managed pointer type (T&). ref is
translated directly to the managed pointer type. out is marked with
OutAttribute. Both in and ref readonly are marked with InAttribute, but
notice that the attribute is only allowed on parameters. Rather than
allowing the attribute to be used on return values in addition to
parameters, or using a different type like IsReadOnlyAttribute, the type
itself is actually decorated with a required modifier (modreq) of
InAttribute, which is one of the rare occasions the compiler uses a
modifier instead of an attribute. (After a quick search in the Roslyn
repo, I’ve found other modifiers, like IsVolatile used for volatile
fields, or that the unmanaged constraint also uses a required modifier of
UnmanagedType. Creative.) This also means that technically, you could
have two methods with a ref return that differ only in readonly, as the
required modifier is part of the signature and the CLI allows methods
that differ only in the return type (unlike attributes, which aren’t
part of the signature).
In C++, a normal ref reference can be translated to T&, while an in
reference is T const &, although its behaviour differs from C# (calling
non-readonly methods creates a defensive copy of the value in C#, while
calling non-const methods in C++ is prohibited). There is no analogue to
out in C++, as it is recommended to return these values directly (and
the code is usually optimized so that the least amount of copying is
necessary).
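As a rough sketch of this mapping, the following C++ fragment shows one function per parameter kind. The Counter type and all function names are made up for this illustration; they are not taken from any library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type used only for this illustration.
struct Counter {
    int count = 0;
    void increment() { ++count; }       // non-const: mutates the object
    int get() const { return count; }   // const: read-only access
};

// Rough analogue of a C# ref parameter: the callee may mutate the argument.
void bump(Counter& c) { c.increment(); }

// Rough analogue of a C# in parameter: only const members can be called.
int peek(Counter const& c) {
    // c.increment();  // would not compile: increment() is not const
    return c.get();
}

// Instead of an out parameter, the results are returned directly.
std::pair<bool, int> try_parse_digit(char ch) {
    if (ch >= '0' && ch <= '9') return {true, ch - '0'};
    return {false, 0};
}

// Hypothetical demo helper combining the three functions above.
int demo_parameter_kinds() {
    Counter c;
    bump(c);
    bump(c);
    std::pair<bool, int> parsed = try_parse_digit('7');
    assert(parsed.first);
    return peek(c) + parsed.second;   // 2 + 7
}
```

Note that where C# would silently copy the value before calling a non-readonly method through an in reference, the commented-out line in peek is simply rejected by the C++ compiler.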
So far, I’ve outlined three types of reference, so what is the fourth one? It is a controlled-mutability reference.
In object-oriented languages, such a reference allows the modification
of the target value only via the methods it provides. In C#, any object
reference is a controlled-mutability reference, as you cannot copy the
state from one object to another without invoking its methods or setting
all the fields manually. This also holds true for boxed value types:
even though the CLI permits unboxing the object and obtaining a
reference to its state, the state can only be changed by calling the
methods or setting the fields that allow it. In other words, it always
behaves like a normal object.
I am not aware of any mechanism in C++ that would directly mimic this
functionality. You can turn all non-const references to an object into
controlled-mutability references by deleting or hiding operator= on the
type, so the state of an already constructed object will always be
protected, and then use a wrapper type that will have access to
operator= in case assignability is required. A simpler option is to
create a special pointer type whose operator* returns a const reference
but whose operator-> returns a non-const pointer (operator-> has to
yield a pointer, or another object with its own operator->). Even though
nothing stops you from calling ptr->operator=(val) or
*ptr.operator->() = val, at least the syntax makes it stand out and
makes you think about whether you are doing something safe or not.
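A minimal sketch of such a pointer type might look like this. The names controlled_ptr, Point and demo_controlled_ptr are hypothetical, and operator-> returns a raw pointer here because C++ requires it to yield a pointer (or an object that itself has operator->).

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct Point {
    int x = 0;
    int y = 0;
    void move_by(int dx, int dy) { x += dx; y += dy; }
};

// Hypothetical wrapper: the whole object is visible only as const,
// while member access through -> may still mutate it.
template <typename T>
class controlled_ptr {
    T* target_;
public:
    explicit controlled_ptr(T& target) : target_(&target) {}

    // Read-only view of the whole object, so "*ptr = other" is rejected.
    T const& operator*() const { return *target_; }

    // Non-const member access, so the object's own methods can change it.
    T* operator->() const { return target_; }
};

// Hypothetical demo helper: mutates through a method, then reads the state.
int demo_controlled_ptr() {
    Point p;
    controlled_ptr<Point> ptr(p);
    ptr->move_by(2, 3);     // allowed: mutation goes through a method
    // *ptr = Point{};      // would not compile: *ptr is a const reference
    Point copy = *ptr;      // reading (copying out) the state is fine
    assert(p.x == 2 && p.y == 3);
    return copy.x + copy.y;
}
```

The wrapper does not make overwriting impossible, since ptr->operator=(Point{}) still compiles; it only makes such an assignment visually conspicuous.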
Side note: In C#, you can (semantically) turn any controlled-mutability reference into a reference to an object’s state by implementing the IAssignable<> interface:
    public interface IAssignable<T> where T : struct, IAssignable<T>
    {
        // Overwrites the whole state of this instance with a copy of value.
        void Assign(in T value);
    }
    struct MyStruct : IAssignable<MyStruct>
    {
        public void Assign(in MyStruct value)
        {
            // Inside a struct method, this is an assignable variable.
            this = value;
        }
    }
This works even when a field is readonly.
For reference types, implementing this pattern is a bit harder as you
have to deal with inheritance, and the state of an object can only be
assigned from a state of an object of the same type, so the operation
may fail (virtual bool TryAssign(object value) is probably the simplest way).