Reflecting Over Members of an Aggregate

Implementing ‘reflection’ qualities using standard C++

A little while back a friend of mine and I were talking about serialization of struct objects as raw bytes. He was working with generated objects that contain padding, but the objects needed to be serialized without the padding; for example:

struct Foo
{
  char data0;
  // 3 bytes padding here
  int data1;
};
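
As a rough illustration of the padding in question, the sketch below compares `sizeof(Foo)` with the sum of its member sizes. The names `packed_size` and `padding_size` are hypothetical helpers introduced here, and the concrete numbers in the comments assume a typical platform where `int` is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>

struct Foo
{
  char data0;
  // padding usually inserted here so that data1 is suitably aligned
  int data1;
};

// Bytes we actually want on the wire: just the members, no padding.
constexpr std::size_t packed_size = sizeof(char) + sizeof(int);

// Whatever the compiler added on top (commonly 3 on the assumed platform).
constexpr std::size_t padding_size = sizeof(Foo) - packed_size;

// These hold on any conforming implementation:
static_assert(sizeof(Foo) >= packed_size);
static_assert(offsetof(Foo, data1) >= sizeof(char));
```

Copying `sizeof(Foo)` raw bytes would therefore include `padding_size` bytes of unspecified content, which is exactly what the serialization needs to avoid.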

In the case he described, there are dozens of object types that need to be serialized, and all are:

  • Generated by his organization (so they can’t be modified), and
  • Guaranteed to be aggregates

Being a template meta-programmer, I thought it would be a fun challenge to try to solve this in a generic way using C++17, and in the process I accidentally discovered a generic solution for iterating over all members of any aggregate type.

Aggregates

Before I continue, it’s important to know what an aggregate type actually is, and what its properties are.

Simply put, an aggregate is one of two things:

  • an array, or
  • a struct/class with only public members and public base-classes, with no custom constructors

There are formally more criteria than this, but this is a simplification.
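
The simplified definition can be checked against the compiler with the C++17 trait `std::is_aggregate_v`. The types below (`Plain`, `Derived`, `Hidden`, `WithCtor`) are hypothetical examples made up for this sketch.

```cpp
#include <type_traits>

struct Plain { char a; int b; };            // only public members
struct Derived : Plain { double c; };       // public base class (allowed since C++17)
class Hidden { int secret; };               // private member
struct WithCtor {                           // user-provided constructor
  int x;
  WithCtor(int v) : x(v) {}
};

static_assert(std::is_aggregate_v<int[3]>);     // arrays are aggregates
static_assert(std::is_aggregate_v<Plain>);
static_assert(std::is_aggregate_v<Derived>);
static_assert(!std::is_aggregate_v<Hidden>);
static_assert(!std::is_aggregate_v<WithCtor>);
```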

What is special about Aggregates?

Aggregates are special for a couple of reasons.

The first is that aggregates cannot have custom constructors; they can only use the implicitly generated ones (default/copy/move) or be aggregate-initialized. This fact will be important in a moment.

The second is that, since C++17, aggregates can be used with structured bindings without any extra work from the type's author. For example:

struct Foo{
  char a;
  int b;
};

...

auto [x,y] = Foo{'X', 42};

It’s also important to know that an aggregate can only be decomposed with structured-bindings into the exact number of members that the aggregate has, so the number of members must be known before the binding expression is used.

How does this help?

Knowing the above two points about aggregates is actually all we need to develop a generic solution. If we can find out how many members an aggregate object contains, then we will be able to decompose this object with structured bindings and do something with each member!

Detecting members in an aggregate

The obvious first question is how can we know how many members an aggregate holds?

The C++ language does not offer any sizeof-equivalent for the number of members, so we will have to compute this using some template trickery. This is where the first point about aggregates comes into play: an aggregate has no custom constructors, so brace-initializing it always performs aggregate initialization. This means that for any aggregate, we know that it can be constructed from an expression T{args...}, where args... can contain anywhere from 0 up to the total number of members in the aggregate itself.

So really the question we need to be asking now is: “what is the most arguments I can aggregate-initialize this T from?”

Testing if T is aggregate initializable

The first thing we need is a way to test that T is aggregate initializable at all. Since we don’t actually know the type of each member, we need a placeholder argument that the compiler will accept for any member type inside an unevaluated expression:

// A type that can be implicitly converted to *anything*
struct Anything {
    template <typename T>
    operator T() const; // We don't need to define this function
};

We don’t actually need to define the function at all; we only need to have the type itself so that the C++ type system can detect the implicit conversion to any valid argument type.
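
A small sketch of why the declaration alone is enough: every use below sits in an unevaluated context (`decltype` or a type trait), so the conversion operator is only type-checked and never called. `Point` is restated here so the snippet stands alone.

```cpp
#include <type_traits>

// A type that can be implicitly converted to *anything* (declaration only)
struct Anything {
    template <typename T>
    operator T() const;
};

struct Point { int x, y; };

// The type system sees a conversion to whatever type is asked for...
static_assert(std::is_convertible_v<Anything, int>);
static_assert(std::is_convertible_v<Anything, double*>);

// ...so this expression is well-formed inside decltype, even though
// it could never be linked if it were actually evaluated.
static_assert(std::is_same_v<decltype(Point{Anything{}, Anything{}}), Point>);
```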

From here, all we really need is a simple trait that tests whether the expression T{ Anything{}... } is valid for a specific number of arguments. This is a perfect job for using std::index_sequence along with std::void_t to evaluate the expression in a SFINAE context:

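#include <cstddef>     // std::size_t
#include <type_traits> // std::void_t, std::true_type, std::false_type
#include <utility>     // std::index_sequence, std::make_index_sequence

namespace detail {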
  template <typename T, typename Is, typename=void>
  struct is_aggregate_constructible_from_n_impl
    : std::false_type{};

  template <typename T, std::size_t...Is>
  struct is_aggregate_constructible_from_n_impl<
    T,
    std::index_sequence<Is...>,
    std::void_t<decltype(T{(void(Is),Anything{})...})>
  > : std::true_type{};
} // namespace detail

template <typename T, std::size_t N>
using is_aggregate_constructible_from_n = detail::is_aggregate_constructible_from_n_impl<T,std::make_index_sequence<N>>;

With this, we can now test how many arguments are needed to construct an aggregate using aggregate-initialization:

struct Point{ int x, y; };

// Is constructible from these 3
static_assert(is_aggregate_constructible_from_n<Point,0>::value);
static_assert(is_aggregate_constructible_from_n<Point,1>::value);
static_assert(is_aggregate_constructible_from_n<Point,2>::value);

// Is not constructible for anything above
static_assert(!is_aggregate_constructible_from_n<Point,3>::value);
static_assert(!is_aggregate_constructible_from_n<Point,4>::value);

Testing the max number of initializer members

All we need now is to find the maximum number of arguments that an aggregate can be constructed with.

This could be done in a number of ways:

  1. Count iteratively from 0 up to the first failure,
  2. Count from some pre-defined high number down until we find our first success, or
  3. Binary search between two predefined values until we find the largest count that succeeds

The first two options grow in template instantiation depth with the number of members an aggregate has. The more members there are, the more iterations are required at compile time, which can increase both compile time and complexity.

The third option is more complex to understand, but needs far fewer template instantiations for large aggregates and thus should reduce overall compile time.
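
For comparison, the first option (counting up from 0 until the first failure) might look roughly like the sketch below. This is only an illustration of the linear approach, not the solution used in the rest of the article; `linear_arity` and `constructible_from_n` are hypothetical names, and `Anything` and the detection trait are restated in compact form so the snippet stands alone.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

struct Anything {
    template <typename T>
    operator T() const; // declaration only
};

template <typename T, typename Is, typename = void>
struct constructible_from_n_impl : std::false_type {};

template <typename T, std::size_t... Is>
struct constructible_from_n_impl<
    T, std::index_sequence<Is...>,
    std::void_t<decltype(T{(void(Is), Anything{})...})>> : std::true_type {};

template <typename T, std::size_t N>
using constructible_from_n =
    constructible_from_n_impl<T, std::make_index_sequence<N>>;

// Keep stepping N upward while T still accepts N + 1 arguments.
// One instantiation per member, so depth grows linearly.
template <typename T, std::size_t N = 0,
          bool = constructible_from_n<T, N + 1>::value>
struct linear_arity : linear_arity<T, N + 1> {};

// N + 1 arguments failed, so N is the answer.
template <typename T, std::size_t N>
struct linear_arity<T, N, false> : std::integral_constant<std::size_t, N> {};

struct Point { int x, y; };
struct Triple { char a; int b; double c; };

static_assert(linear_arity<Point>::value == 2);
static_assert(linear_arity<Triple>::value == 3);
```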

For this part, it turns out that @Yakk on Stack Overflow already provided a brilliant solution doing exactly this (modified slightly for this article):

namespace detail {
  template <std::size_t Min, std::size_t Range, template <std::size_t N> class target>
  struct maximize
    : std::conditional_t<
        maximize<Min, Range/2, target>{} == (Min+Range/2)-1,
        maximize<Min+Range/2, (Range+1)/2, target>,
        maximize<Min, Range/2, target>
      >{};
  template <std::size_t Min, template <std::size_t N> class target>
  struct maximize<Min, 1, target>
    : std::conditional_t<
        target<Min>{},
        std::integral_constant<std::size_t,Min>,
        std::integral_constant<std::size_t,Min-1>
      >{};
  template <std::size_t Min, template <std::size_t N> class target>
  struct maximize<Min, 0, target>
    : std::integral_constant<std::size_t,Min-1>
  {};

  template <typename T>
  struct construct_searcher {
    template<std::size_t N>
    using result = is_aggregate_constructible_from_n<T, N>;
  };
}

template <typename T, std::size_t Cap=32>
using constructor_arity = detail::maximize< 0, Cap, detail::construct_searcher<T>::template result >;

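This solution uses a template template parameter to reuse is_aggregate_constructible_from_n from above, binary-searching for the largest number of arguments that can construct a given aggregate. Note that maximize<0, Cap, ...> searches the half-open range [0, Cap), so with the default Cap of 32 the largest arity it can report is 31.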
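
The recursion in maximize can be hard to follow in template form. As a reading aid, here is the same search written as an ordinary constexpr function; this is a sketch for illustration only, and the function `maximize` and the predicates `AtMostTwo` and `Always` are hypothetical stand-ins for the template and for is_aggregate_constructible_from_n.

```cpp
#include <cstddef>

// Searches [min, min + range) for the largest n with pred(n) true,
// assuming pred is true up to some point and false afterwards.
// Returns min - 1 if nothing in the range satisfies pred.
template <typename Pred>
constexpr std::size_t maximize(std::size_t min, std::size_t range, Pred pred) {
    if (range == 0) { return min - 1; }
    if (range == 1) { return pred(min) ? min : min - 1; }

    // Search the lower half first.
    const std::size_t lower = maximize(min, range / 2, pred);

    // If the lower half succeeded all the way to its last element,
    // the answer may lie in the upper half; otherwise we are done.
    if (lower == min + range / 2 - 1) {
        return maximize(min + range / 2, (range + 1) / 2, pred);
    }
    return lower;
}

// Behaves like a 2-member aggregate: constructible from 0, 1, or 2 arguments.
struct AtMostTwo {
    constexpr bool operator()(std::size_t n) const { return n <= 2; }
};

// Always succeeds: shows the upper limit of the search.
struct Always {
    constexpr bool operator()(std::size_t) const { return true; }
};

static_assert(maximize(0, 32, AtMostTwo{}) == 2);
static_assert(maximize(0, 32, Always{}) == 31); // last index in [0, 32)
```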

Testing our solution with the above Point type:

static_assert(constructor_arity<Point>::value == 2u);

Extracting members from an aggregate

Now that we know how many members a given aggregate type can be built from, we can leverage structured bindings to extract the elements out, and perform some basic operation on them.

For our purposes here, let’s simply call a function on each member, in the same way that std::visit does with visitor functions. Note that because a structured binding requires the number of elements to be specified statically, we need one overload per member count:

namespace detail {
  template <typename T, typename Fn>
  auto for_each_impl(T&, Fn&&, std::integral_constant<std::size_t,0>) -> void
  {
    // do nothing (0 members)
  }

  template <typename T, typename Fn>
  auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,1>) -> void
  {
    auto& [m0] = agg;

    fn(m0);
  }

  template <typename T, typename Fn>
  auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,2>) -> void
  {
    auto& [m0, m1] = agg;

    fn(m0); fn(m1);
  }

  template <typename T, typename Fn>
  auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,3>) -> void
  {
    auto& [m0, m1, m2] = agg;

    fn(m0); fn(m1); fn(m2);
  }
  // ...

} // namespace detail

template <typename T, typename Fn>
void for_each_member(T& agg, Fn&& fn)
{
  detail::for_each_impl(agg, std::forward<Fn>(fn), constructor_arity<T>{});
}

We use std::integral_constant for tag dispatch to select the overload that matches the member count, and forward a function Fn that is called on each member. Let’s test this quickly:

int main()
{
    const auto p = Point{1,2};
    for_each_member(p, [](auto x){
        std::cout << x << std::endl;
    });
}

This effectively gives us a working solution.

Back to Serialization

Let’s tie this all together now by bringing in serialization. Now that we have an easy way to access each member of a struct, serialization becomes a simple matter of converting each member to a series of bytes with a callback.

If we ignore endianness, then serialization of packed data can be accomplished as simply as:

template <typename T>
auto to_packed_bytes(const T& data) -> std::vector<std::byte>
{
  auto result = std::vector<std::byte>{};

  // serialize each member!
  for_each_member(data, [&result](const auto& v){
    const auto* const begin = reinterpret_cast<const std::byte*>(&v);
    const auto* const end = begin + sizeof(v);
    result.insert(result.end(), begin, end);
  });

  return result;
}

...
auto data = Foo{'X', 42};
auto result = to_packed_bytes(data);
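
If endianness does matter, one possible variation is to emit each integral member most-significant byte first using shifts instead of copying its raw bytes. The sketch below is an assumption-laden illustration rather than part of the original solution: `to_big_endian_bytes` and `arity` are hypothetical names, only integral members are handled, and the member-visiting machinery is restated in a compact form (a linear count and at most 3 members) so the snippet stands alone.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

struct Anything {
    template <typename T>
    operator T() const; // declaration only
};

template <typename T, typename Is, typename = void>
struct constructible_from_n : std::false_type {};

template <typename T, std::size_t... Is>
struct constructible_from_n<T, std::index_sequence<Is...>,
    std::void_t<decltype(T{(void(Is), Anything{})...})>> : std::true_type {};

// Compact linear member count (stand-in for constructor_arity).
template <typename T, std::size_t N = 0,
          bool = constructible_from_n<T, std::make_index_sequence<N + 1>>::value>
struct arity : arity<T, N + 1> {};

template <typename T, std::size_t N>
struct arity<T, N, false> : std::integral_constant<std::size_t, N> {};

// Compact stand-in for for_each_member, limited to 3 members.
template <typename T, typename Fn>
void for_each_member(T& agg, Fn&& fn) {
    constexpr std::size_t n = arity<std::remove_cv_t<T>>::value;
    static_assert(n <= 3, "sketch only handles up to 3 members");
    if constexpr (n == 1) { auto& [m0] = agg; fn(m0); }
    else if constexpr (n == 2) { auto& [m0, m1] = agg; fn(m0); fn(m1); }
    else if constexpr (n == 3) { auto& [m0, m1, m2] = agg; fn(m0); fn(m1); fn(m2); }
}

// Hypothetical: serialize every integral member in big-endian order.
template <typename T>
auto to_big_endian_bytes(const T& data) -> std::vector<std::byte>
{
  auto result = std::vector<std::byte>{};

  for_each_member(data, [&result](const auto& v){
    using V = std::remove_cv_t<std::remove_reference_t<decltype(v)>>;
    static_assert(std::is_integral_v<V>, "sketch only handles integral members");
    using U = std::make_unsigned_t<V>;

    const auto u = static_cast<U>(v);
    // Walk from the most significant byte down to the least significant.
    for (std::size_t i = sizeof(U); i-- > 0;) {
      result.push_back(static_cast<std::byte>((u >> (8 * i)) & 0xFF));
    }
  });

  return result;
}

struct Foo
{
  char data0;
  int data1;
};
```

With this variant, `to_big_endian_bytes(Foo{'X', 0x01020304})` produces the byte for 'X' followed by the bytes of the integer from most to least significant, regardless of the host's byte order.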

This is much easier than writing a dedicated serialization function for each generated type. To support larger aggregates, all that’s needed is to raise the Cap parameter of constructor_arity and to provide a for_each_impl overload for each member count up to that limit.

Closing Thoughts

This gave us an interesting solution to “reflecting” over members of any aggregate in a generic way, all using completely standard C++17.

When I originally discovered this solution, I thought that I was the first to encounter this particular method; while doing research for this write-up, though, I discovered that the brilliant magic_get library beat me to it. Even so, this technique can still prove useful in any modern codebase, and can be used for a number of weird and wonderful things.

Outside of the serialization example that prompted this discovery, this can also be used in conjunction with other meta-programming utilities such as getting the unmangled type name at compile time to generate operator<< overloads for the purposes of printing aggregates on-the-fly.

Possible Improvements

This is just a basic outline of what can be done since this is a tutorial article. There are some possible improvements that are worth considering as well:

  • We can propagate the CV-qualifiers and value category of the type by changing T& to T&&, and auto& to auto&& in the bindings (which will then require some more std::forward-ing)

  • We could detect the existence of specializations of std::get and std::tuple_size so that this works with more than just aggregates
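
As a starting point for the second improvement, a detection trait for std::tuple_size might look like the sketch below. `has_tuple_size` is a hypothetical name introduced for illustration; a complete solution would also need to check for a usable get and dispatch on the result.

```cpp
#include <array>
#include <tuple>
#include <type_traits>
#include <utility>

// False unless std::tuple_size<T> is specialized and exposes ::value.
template <typename T, typename = void>
struct has_tuple_size : std::false_type {};

template <typename T>
struct has_tuple_size<T, std::void_t<decltype(std::tuple_size<T>::value)>>
  : std::true_type {};

template <typename T>
inline constexpr bool has_tuple_size_v = has_tuple_size<T>::value;

struct Point { int x, y; };

static_assert(has_tuple_size_v<std::pair<int, char>>);
static_assert(has_tuple_size_v<std::tuple<int, double>>);
static_assert(has_tuple_size_v<std::array<int, 3>>);
static_assert(!has_tuple_size_v<Point>); // plain aggregate: no specialization
```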
