The Equality Sledgehammer

Many test frameworks provide something equivalent to assertEquals, which is used to check the outcome once the set-up and execute stages of the test are complete.

For example, if I had an algorithm to replace spaces in strings with hyphens (-), then I would assert that replace("a b ") is equal to "a-b-". This is a very good tool to use, because it is clear and simple.
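As a minimal sketch of that check: the `SpaceReplacer` class and its `replace` implementation are assumptions made for illustration, and the hand-rolled `assertEquals` stands in for whatever your test framework provides.

```java
import java.util.Objects;

// Hypothetical helper for illustration; the real replace implementation is not shown in this post.
class SpaceReplacer {
    // Replaces every space in the input with a hyphen.
    static String replace(String input) {
        return input.replace(' ', '-');
    }

    // Stand-in for a framework's assertEquals, so the sketch needs no test library.
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Set up and execute in one step, then assert on the exact outcome.
        assertEquals("a-b-", replace("a b "));
    }
}
```

With a single small value like this, exact equality is both the simplest and the most precise thing to assert.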

However, let’s consider a more complex example. Let’s say we have a sort algorithm. We want to see that our objects get sorted. Perhaps there are only three of them, and they have a couple of fields each:

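Data object1 = ...;
Data object2 = ...;
Data object3 = ...;

Data[] wrongOrder = { object1, object2, object3 };
Data[] correctOrder = { object2, object1, object3 };

// deep equality: relies on Data having an equals method
assertArrayEquals(correctOrder, sort(wrongOrder));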


It’s not hard to compose the above test. If the objects are small, it’s easy to see what’s going on, and it’s probably no hardship to have an equals operator on each object.

As we’ve discussed before, sometimes equality isn’t really possible. IDs and timestamps can be generated at run time, so the test cannot predict them. Similarly, if the test set were 1200 rows, would we really go to the trouble of a deep equality on all 1200 rows and all of their fields after the sort?

Assert Equals is a Sledgehammer

If you can find a way to predict every field in a huge operation, then you can rightly declare that asserting 100% equality of the outcome is an unarguable test pass.

However, it can be a lot of hard work to achieve it, and it may not, semantically, be what you’re trying to prove.

In our sort example, we sorted according to a particular field. The most important outcome of the sort was that the data was in the order dictated by that field. Perhaps an alternative test would be:

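Data[] wrongOrder = { object1, object2, object3 };
String[] correctOrderOfFields = { "A", "B", "C" };

// let's just assert on the sortable field - `category`
String[] actualOrderOfFields = Arrays.stream(sort(wrongOrder))
    .map(Data::getCategory)
    .toArray(String[]::new);
assertArrayEquals(correctOrderOfFields, actualOrderOfFields);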

This saves us having to do a deep equals.

However, even then perhaps the more accurate assertion should be:

var result = sort(wrongOrder);
assertOrder(result, Data::getCategory);

The assertOrder assertion above doesn’t exist… perhaps it should. It checks exactly the thing that matters, rather than forcing the use of a precise equality operation.
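A helper along those lines could be sketched as follows. Everything here is hypothetical: `OrderAssertions`, the `assertOrder` signature and the tiny nested `Data` class are invented for illustration and do not come from any real assertion library.

```java
import java.util.function.Function;

// Hypothetical sketch: neither OrderAssertions nor assertOrder is part of a real library.
class OrderAssertions {
    // Minimal stand-in for the post's Data type, with only the sortable field.
    static class Data {
        private final String category;

        Data(String category) {
            this.category = category;
        }

        String getCategory() {
            return category;
        }
    }

    // Fails if any adjacent pair of items is out of ascending order by the extracted key.
    // Equal keys are allowed to sit next to each other.
    static <T, K extends Comparable<K>> void assertOrder(T[] items, Function<T, K> key) {
        for (int i = 1; i < items.length; i++) {
            K previous = key.apply(items[i - 1]);
            K current = key.apply(items[i]);
            if (previous.compareTo(current) > 0) {
                throw new AssertionError(
                    "Out of order at index " + i + ": <" + previous + "> appears before <" + current + ">");
            }
        }
    }

    public static void main(String[] args) {
        Data[] result = { new Data("A"), new Data("B"), new Data("C") };
        assertOrder(result, Data::getCategory); // passes without any equals method on Data
    }
}
```

Because it only compares neighbouring keys, the check costs the same to write for three rows as for 1200, and it never needs to know the generated IDs or timestamps.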

There must always be some equality

While I would definitely advocate for lighter-weight, more semantic assertions, these don’t necessarily provide perfect cast-iron evidence that data isn’t being corrupted somewhere. However, all tests, when taken together, provide triangulation around the code under test, so if there is some use of equality too, then it becomes harder for the code to pass the leaner tests by accident.
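To illustrate how code can pass a lean test by accident, here is a small sketch. The `brokenSort` and `isOrdered` names are invented for this example: the deliberately broken sort produces output that is in order but has lost data, so only the equality check on a small case exposes it.

```java
import java.util.Arrays;

// Hypothetical illustration: brokenSort and isOrdered are invented for this sketch.
class Triangulation {
    // A deliberately broken "sort": the output is ordered, but every slot holds the first input.
    static String[] brokenSort(String[] input) {
        String[] result = new String[input.length];
        if (input.length > 0) {
            Arrays.fill(result, input[0]);
        }
        return result;
    }

    // Lean, semantic check: no element is smaller than the one before it.
    static boolean isOrdered(String[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i - 1].compareTo(values[i]) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] result = brokenSort(new String[] { "B", "A", "C" });

        // The lean check alone is satisfied by the corrupted output.
        System.out.println("ordered: " + isOrdered(result));

        // A small equality check alongside it catches the lost data.
        System.out.println("equal: " + Arrays.equals(new String[] { "A", "B", "C" }, result));
    }
}
```

One small equality test next to the ordering test is enough to rule out this kind of accidental pass, without deep equality on every large data set.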
