Stop fixing it already…

Over in Fix it twice? I discussed why a second attempt at fixing an issue, using the hindsight of an actual fix as a way to improve the software, was a great idea. In this piece, I’d like to discuss the aim of bug fixing.

Why do we fix bugs?

Is it to make the error go away?

Well, kind of, but that’s only part of the story. The aim of fixing a bug is:

  • Make the software work
  • Fix the thing which caused the bug (which can be software, or process, or communication etc)
  • Make it harder to repeat the mistake

I don’t want software with fixes in, I want software without issues. I don’t want issue resolution, I want it to be hard to make that issue occur again.

Removing the source of a mistake and being mistake-proof are two sides of the same coin. However, it seems to go against human nature to fix the cause of a bug rather than its side-effect, and it can be hard, sometimes, to identify the cause as something that can be mistake-proofed in future.

Here’s an example:

public class RouteSoFar {
    private final List<String> streetsVisited = new ArrayList<>();

    // adds the street to the route
    public void visitStreet(String street) {
        streetsVisited.add(street);
    }

    // get a fresh copy
    public RouteSoFar makeCopy() {
        RouteSoFar copy = new RouteSoFar();
        copy.streetsVisited.addAll(this.streetsVisited);
        return copy;
    }

    // show route
    public String getRoute() { ... }
}

In the above class we have an object that’s tracking a route by accumulating streets. Maybe the idea is to print out some routes to interesting stores radiating out from an initial street. In fact, that’s a nice algorithm to write.


    StreetMap streetMap = new StreetMap("Bordeaux");
    Street firstStreet = streetMap.get("Rue De Winston Churchill");
    RouteSoFar routeSoFar = new RouteSoFar();
    printRoutesToStores(firstStreet, routeSoFar);

    ...

void printRoutesToStores(Street currentStreet, RouteSoFar routeSoFar) {
    // we're now on this street
    routeSoFar.visitStreet(currentStreet.getName());

    // print interesting stores on this street
    currentStreet.getStores().stream()
        .filter(Store::isInteresting)
        .map(store -> store.getName() + " is at " + routeSoFar.getRoute())
        .forEach(System.out::println);

    // and recurse to all neighbouring streets
    currentStreet.getNeighbours().stream()
        .forEach(neighbour -> printRoutesToStores(neighbour, routeSoFar));
}

Ignoring for a moment the fact that the neighbouring streets could easily include one of the streets we came from (let’s pretend it’s rigged not to), the above code looks like it will work, but it won’t. If you ran it, you’d find that the routeSoFar doesn’t contain the route to the store; instead it’s polluted with every street it’s visited. This is because it’s a mutable object, shared between layers in the hierarchy of the program.

Ah. So that’s why there’s that makeCopy function, a kind of clone method, then? Do we just change one line?

    // replace this
    currentStreet.getNeighbours().stream()
        .forEach(neighbour -> printRoutesToStores(neighbour, routeSoFar));

    // with this
    currentStreet.getNeighbours().stream()
        .forEach(neighbour -> printRoutesToStores(neighbour, routeSoFar.makeCopy()));

}

It’s probably clear from context that the answer I’m aiming for is no. If you’re pedantic enough to debate that it’s yes, then I applaud you. In this instance, as written, yes would be fine… but then what?

On a real-life project, I got into the habit of fixing bugs like this, in a class with a similar purpose, by remembering to do that thing: if you’ve got a mutable object and it might travel beyond the method that owns it, it might be better to send a copy, not the real thing.

How many times do you need to apply that fix to a piece of code before you realise you need to make it mistake proof?

The problem above is a mutable object being used naively by multiple parts of the program. If it were immutable, then you would have to knowingly change it, and you couldn’t accidentally have your copy changed. It’s a reversal of the paradigm: the thing you’re interested in doing becomes slightly more conscious an act, and the thing you don’t want to happen becomes impossible to do accidentally.

Here’s the same Route class made as an immutable object. I think you can see how it might be used.

// immutable pojo - no setters, but can be transformed into a new immutable pojo
public class RouteSoFar {
    private final List<String> streetsVisited = new ArrayList<>();

    // adds the street to the route and returns a new object
    // so you store the reference to that if you want to keep it
    public RouteSoFar havingVisitedStreet(String street) {
        // rather than have a clone method, we innately create a copy 
        // of this, modify that and return IT
        RouteSoFar newRoute = new RouteSoFar();
        newRoute.streetsVisited.addAll(this.streetsVisited);
        newRoute.streetsVisited.add(street);
        return newRoute;
    }

    // no need for a copy/clone method

    // show route
    public String getRoute() { ... }
}
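To see what this buys us, here’s a self-contained sketch of the immutable version in use (with a simplified getRoute and made-up street names): two branches of a recursion extend the same parent route, and neither pollutes the other.

```java
import java.util.ArrayList;
import java.util.List;

public class ImmutableRouteDemo {
    // simplified immutable route: every "mutation" returns a new object
    static class RouteSoFar {
        private final List<String> streetsVisited = new ArrayList<>();

        RouteSoFar havingVisitedStreet(String street) {
            RouteSoFar newRoute = new RouteSoFar();
            newRoute.streetsVisited.addAll(this.streetsVisited);
            newRoute.streetsVisited.add(street);
            return newRoute;
        }

        String getRoute() {
            return String.join(" -> ", streetsVisited);
        }
    }

    public static void main(String[] args) {
        RouteSoFar start = new RouteSoFar().havingVisitedStreet("Rue A");

        // two "branches" each extend the same parent route...
        RouteSoFar left = start.havingVisitedStreet("Rue B");
        RouteSoFar right = start.havingVisitedStreet("Rue C");

        // ...and neither pollutes the other, nor the parent
        System.out.println(start.getRoute());
        System.out.println(left.getRoute());
        System.out.println(right.getRoute());
    }
}
```

The recursive printRoutesToStores would simply pass routeSoFar.havingVisitedStreet(currentStreet.getName()) down to its neighbours; there’s no copy to forget.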

Philosophically, this is what bug fixing is – finding a mistake and eradicating it. However, spotting an opportunity to remove the opportunity for future errors of the same sort is a ninja trick. Use it!

Posted in Uncategorized

Fix it twice?

One of the many catchphrases I use with my team is Fix it twice. This relates to the Red – Green – Refactor cycle. For clarity, that’s usually used during TDD, where you go Red – a failing test. Then you do whatever it takes to make the test pass – Green. Finally, you review the code you’re left with and, with the security of the unit test around you, refactor it to the simplest possible design/structure for its current feature set.

Where I say Fix it twice, I’m usually referring to the rare case of a bug-fix or similar. With TDD you don’t get bugs so much, because your tests kind of prevent them. You do, however, get some surprises at later points. Tests are never perfect, and some issues only rear their heads when you add more features. So at some point, you need to apply a fix to existing code to make it do what it ought to have done all along.

We still use TDD for bug fixes: you can’t be sure you’ve fixed something until you’ve found a test that needs it to be fixed in order to pass. The problem can sometimes be that you don’t know the scope or severity of the fix until you’ve had a crack at solving the problem, after which you can be left with code that’s not at your usual standard.

Refactoring alone may not make the root cause of your problem better. But, once you’ve fixed the code once, to get something that now works again, you’re usually in a better position to judge what you should have done all along. This is the Fix it twice of which the team and I speak.

Maybe you revisit a root cause of the bug. Maybe you generalise the code to avoid edge cases that need resolving individually. Maybe you add more tests to prevent a naive change causing a surprising regression. The philosophy of Fix it twice is the solution to the classic case of L’esprit de l’escalier that we encounter in life sometimes, where you wish you’d said something different in a conversation that’s now over. In software, you’ve always got the chance for a better do-over.

Posted in Uncategorized

So Let’s Test The POJOs

So the code coverage is mediocre and there are loads of getters and setters and equals methods out there sitting untested. Let’s impress our lords and masters by running some tests against them. Assert that when you set the value, the getter gets the new value and so on.

Shall we?

Is POJO coverage important?

Here are some thoughts.

Code Coverage is less important than Code-that’s-used

Writing placebo tests is just busy work. Asking why the methods aren’t already used by tests that matter for your business scenarios will reveal that either:

  • Not all scenarios are covered – so do that
  • Well, this method isn’t really needed – so delete it

And in a few cases, that’ll still leave some stuff behind that’s kind of needed structurally, but isn’t that important right now.

Absence of code coverage creates a fog

While working on Spectrum, we maintained a rule of 100% code coverage (until some new coverage tool came along and lost us 0.01% for some reason). The idea was that you have a rigid feedback loop if your change does anything to a 100% coverage statistic. We reckoned you could test every scenario, so there shouldn’t be a line of code out of place.

When you have lower code coverage, it’s hard to tell from the statistics whether this means that important or unimportant stuff is not getting tested. When it’s higher – 95%+ – you get the impression that every uncovered line represents a testing misfire and a potential bug.

So find an easy way to cover your minor lines. Use an equals/hashcode test helper to wash away doubts about those methods. EqualsVerifier is one such option.
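A helper like EqualsVerifier does this far more thoroughly, but as a sketch of the kind of checks such a helper automates (using a hypothetical Point POJO):

```java
import java.util.Objects;

public class EqualsContractDemo {
    // a typical POJO with equals/hashCode - the kind of code left untested
    static class Point {
        private final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        Point c = new Point(3, 4);

        // the core of the equals/hashCode contract
        if (!a.equals(a)) throw new AssertionError("not reflexive");
        if (!a.equals(b) || !b.equals(a)) throw new AssertionError("not symmetric");
        if (a.hashCode() != b.hashCode()) throw new AssertionError("equal objects, unequal hashes");
        if (a.equals(c)) throw new AssertionError("unequal objects compare equal");
        if (a.equals(null)) throw new AssertionError("equals(null) must be false");
        System.out.println("equals/hashCode contract holds");
    }
}
```

One reusable helper like this (or EqualsVerifier itself) washes away a whole category of uncovered lines in one go.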

Testing things in context leads to high coverage

We’re not really here to make code coverage. We’re here to make code that’s guaranteed to do what the business process expects. This means you need high scenario coverage. Maybe your getter/setter combination is only used when you serialize the object, so perhaps you need to serialize some instances, in the context of the flows which need serialization to work.

Once you focus on the real-life usage of your classes, you can use code coverage as feedback that the intended code paths are tested and that all your code is actually needed.

Caveat Tester!

Despite all of this, code coverage is only a surrogate statistic. Alone it doesn’t prove anything. If it’s low it proves SOMETHING, but if it’s high, it only provides a number in response to what should be a genuine heartfelt attempt to try out the behaviour of your software and show it working as intended.

Posted in Java, tdd

How Mocks Ruin Everything

Despite being an ardent advocate of TDD, mocking and Mockito (which is totally awesome), I’m going to argue against mocks.

It had to happen. I’ve found really clear situations where mocks have made my work harder, and it’s got to stop!

Why?

When used incorrectly, mocks do not test your code

Yep. Mocks can be the fool’s gold of TDD.

What do good tests achieve?

Back to basics then. What’s it all about?

  • Really exercise the components under test
  • Specify what the component is expected to do
  • Speed up development and fault resolution
  • Prevent waste – because developing without the right tests is probably slower and less efficient overall

Why do we mock?

  • Create boundaries in tests
  • Speed of execution
  • Simulate the hard-to-simulate, especially those pesky boundary cases and errors

What is a mock?

For the purposes of this discussion, let’s bundle in all non-real test objects:

  • Pretend test data
  • Stubs – which receive calls and do nothing much
  • Mocks – which can employ pre-programmed behaviours and responses to inputs, despite being just test objects
  • Fakes – implementations that simulate the real thing, but are made for testing – e.g. an in-memory repository

How can mocks go wrong?

Your mock starts to work against you if:

  • It makes your test implementation focused, rather than behaviour focused
  • Mocking requires some obscure practices
  • Mocking becomes a boilerplate ritual performed before testing

Implementation focused?

You’ll know if you’re doing this. If you do it before writing the code, it feels like trying to write the code backwards, via tests that predict the lines of code you need. If you do it after the fact, you’ll find yourself trying to contrive situations to exercise individual lines or statements, or you’ll find yourself pasting production code into your tests to make a mock behave the way you want.

Obscure

Mocks have the power to bypass the real code, so we may find ourselves using the mocks to generate an alternative reality where things kind of work because the mocks happen to behave in a way which gives a sort of an answer. This seems to happen when the thing you’re mocking is quite complex.

Ritual

If all tests begin with the same pasted code, then there’s something odd about your test refactoring and mocking.

So What’s The Solution?

  • You ARE allowed to use real objects in tests
    • Mock at heavy interface boundaries
    • Refactor your code so more of your real algorithms and services can be used at test time
  • You SHOULD test with real-life data
    • Your fancy date algorithm may work fine with 4th July 2006, but if that’s not the sort of date your users will use, come up with more real life ones
    • Make the test data describe real-world use cases so people can better understand the intent of the module
  • Add mocks when you must
    • Add spies to real objects to simulate hard-to-reach places as a preference to total mocking
  • Consider using fakes
    • Complex behaviour of simple things tests best if you can produce a fake implementation – this might even allow for changes in how the code under test uses the services it depends on
  • Test-first means that you delay thinking about the implementation
  • Test behaviour, not lines of code

In Conclusion

Test what you mean to test. Write tests first about important behaviour. Try to forget any knowledge about the implementation in the tests. Within reason, be proud to weave together a handful of collaborating objects to spin up a unit test.

Posted in Java, tdd

Mockito Argument Matching in Java 8

When using mocks, we often want to check the inputs to a function that was called on the mock. It’s probably a subject for another post whether you should rely on doing this, or whether you should make more code that just returns something, rather than calls something, but let’s agree that you will, at some point, want to check how a mock’s function was called.

With Mockito as our mocking framework of choice, here’s the hard way:

// given some test execution has happened

// construct a fully-blown replica of what you think will have
// been passed into your function under test
SomeObject expectedInput = new SomeObject( ... );

verify(myMock).myMethod(eq(expectedInput));

Why is that hard?

Well, you need to predict a perfect replica of the input, which requires you to bind your test to the exact implementation. In some cases, certain fields which are less interesting to the behaviour you’re testing have to be specified so that the equals comparison works. Worse still, some fields that get arbitrary values at runtime have to have specific values pushed into them to make the test repeatable – timestamps, for instance.

In short, the above doesn’t always work well, so we often resort to argument captors.

// given some test execution has happened

// Find out how the method got called
ArgumentCaptor<SomeObject> captor = ArgumentCaptor.forClass(SomeObject.class);
verify(myMock).myMethod(captor.capture());

// read the thing you're interested in from the captor
assertThat(captor.getValue().getInterestingProperty(), is("expected"));

Put mildly, this sucks! You have three activities to do with the captor, all of which exist to let you check one fact: you have to construct the captor, use it to capture the value, and then read the value from it after the Mockito verify method is called. The test also reads oddly. It’s become:

  • When the test was executed
  • Given this captor
  • Check the method was called and capture
  • And check the captured value was expected

Since Mockito 2.1, used with Java 8, there has been a neater technique. You can use the argThat comparator with a verify call to inline your check in the verify method:

// given some test execution has happened

// Find out how the method got called
verify(myMock).myMethod(argThat(
        someObject -> someObject.getInterestingProperty().equals("expected")
    ));

This, I think, makes for a more straightforward test. Verify that the method was called with an argument that matches a certain filter.

The reason this is possible is that Mockito moved away from using Hamcrest internally, replacing it with its own strongly-typed ArgumentMatcher interface, which is essentially a functional interface. This means you can replace it with a lambda.
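To illustrate why that matters, here’s a rough sketch of the shape of such an interface (not Mockito’s actual source): because there’s a single abstract method, a lambda slots straight in.

```java
public class FunctionalMatcherDemo {
    // roughly the shape of a strongly-typed matcher: one abstract method
    interface ArgumentMatcher<T> {
        boolean matches(T argument);
    }

    // a stand-in for what a verify call does with the matcher internally
    static <T> boolean verifyCall(T actualArgument, ArgumentMatcher<T> matcher) {
        return matcher.matches(actualArgument);
    }

    public static void main(String[] args) {
        // a single-method interface means a lambda is a valid implementation
        ArgumentMatcher<String> startsWithRue = s -> s.startsWith("Rue");
        System.out.println(verifyCall("Rue De Winston Churchill", startsWithRue));
        System.out.println(verifyCall("High Street", startsWithRue));
    }
}
```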

An extra bonus of this technique is that you can also use argThat within the when or given constructs in Mockito. This means you can neatly specify how your mock will behave based on nuances of the input:

when(myMock.myMethod(argThat(
        someObject -> someObject.getInterestingProperty().equals("expected")
    ))).thenReturn(42);

For more information on this please see the Mockito JavaDoc.

Posted in Java, tdd

Automated Testing is a Broad Church

I’ve had the pleasure to be working on an automated testing framework recently. This seems to solve a problem that we’ve been having with Cucumber. I will write a more detailed piece on this in the near future, but here’s the elevator pitch.

JUnit and Mockito tests tend to be too technical so we can’t use them as an acceptance test framework directly.

Cucumber is the go-to technology for BDD/ATDD but implementing it can be cumbersome (cucumbersome perhaps? or an encucumbrance? – who knows!?).

In short, if you want the Given/When/Then and documentation friendly features of Cucumber you have to pay the price of using Cucumber.

What’s Wrong with Cucumber?

Nothing necessarily. Once you’re dealing with lots of similarly documented specs, especially if they’re simple, Cucumber can be a real boon.

However, for Cucumber to work you have to phrase your Gherkin right, implying but not implementing the test script. Then you need to write your glue code just right, and then you need one or two tiers of test execution code. This means you may have to cross 3 or 4 layers of software/script, in various languages/styles, to get to the code which reaches out to the system under test.

This is usually good, until you need to remember the outcome of one step in order to use it to verify a later one. At this point, you’ve no way of clearly putting that into the software layers. It ends up somewhere in the Orchestration or World code. It’s hinted at by the Gherkin and glue-code. It’s obscure and it’s caused entirely by the Achilles heel of Cucumber.

Cucumber’s Achilles Heel

To connect your spec with test execution code you have two degrees of separation. Plaintext Gherkin, used at runtime, plus whichever glue-code is kicking about. For tricky cases, this often obscures the intent of the spec/test implementation.

What can we do?

How about we write tests in Java but use the BDD syntax to structure them and report on them? With this in mind there are a few frameworks that offer just this:

Oleaster and Ginkgo4j both try to be equivalents of Jasmine and Mocha. Please see my post for more on these and other BDD frameworks in Java.

I have been working on Spectrum with its founder, Greg Haskins. In the current live release, there’s support for Jasmine/Mocha/RSpec like tests. In the next release (soon) there will be decent support for Gherkin syntax, and some rather neat ways of weaving in your favourite JUnit test frameworks (Spring, Mockito etc) via JUnit Rules.

Have a look.

The Right Test for the Right Job

Success comes not from finding the right tool, but from using the right tool for the job at hand. Where Cucumber succeeds, you should use it. It’s very helpful. Once it gets hard, change tactics.

Posted in Java, tdd

Obvious Mistakes

As a team leader who also writes code I have to worry about code several times over.

  • The coding standards we adhere to – they must be disciplined but not overbearing or pernickety
  • Every line of code the team writes – the objective is a decent product made a different way. While peer review is the way to ensure everyone takes collective pride in the work, the tech lead doesn’t get to stop worrying.
  • Every line of code I write – what kind of a person doesn’t act in the way they demand in others?

Recently, I’ve been noticing issues in the way we’ve been working. It only takes a few minor cases of letting our standards or techniques slip for our efforts to become counterproductive. As team lead, I could point the finger at the individuals who happened to write the lines of code that I came to worry about. I don’t need to. As a contributor to that code, I can find recent examples where, for no reason other than a combination of bad luck and time pressure, I dropped some balls. I then discovered the effect these minor slips had, and I’d like to confess to them.

Making mistakes is no big issue; learning from them is a great opportunity. I hope others will find this useful.

Name it after the implementation

We had the need for a hashing key which could be easily predicted. We settled on a date with two random digits after it. This would hash well, but you could, for any given date, predict the hash key to within 100, which is easily searchable. So we did a lot of talking about random.

Two bad things happened, both a consequence of the word random getting stuck in our discussion when we were really making a predictable hash key.

Firstly, the code got filled with the word random, which others were asking about – why is it random? How will that work?

Secondly, we made the code depend on a random number generator. Given that we were trying to make a hashing function which was going to be used for persistence, the ad-hoc random number generator, coupled with whatever Java happens to do with hashCode implementations, could best be described as something which works that way for now.

All of this was a consequence of thinking random first, rather than seeing that, statistically, those two digits would be random, while what we were really trying to make was a stable hash key. In the end, I switched it to a two-digit sample from an MD5 hash, which I tested for statistical variability; it was fine.

Human-friendly complexities

What’s the best way to represent a day code for computing? We ended up with YYYYMMDD as an 8-digit number. That’s definitely a day. Isn’t it? My mistake was to try to process this as a number. Given all the days between 20161101 and 20161202, you can just increment the day number, right?

Clearly not.

20161130 leads to 20161131 (?) and then 20161142 and 20161199 – these are valid numbers but they’re not valid days.

Where a later part of this article argues the opposite, it was clear at this point that the unnecessarily human-friendly, harder-to-constrain number would, if kept, have led to more code around it to manipulate it. Totally misleading. I zapped it and replaced it with epoch day.
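Java 8’s java.time makes epoch day a first-class concept, so the increment trap disappears. A small sketch contrasting the two:

```java
import java.time.LocalDate;

public class EpochDayDemo {
    public static void main(String[] args) {
        // naive arithmetic on a YYYYMMDD number produces non-days
        int dayCode = 20161130;
        System.out.println("naive increment: " + (dayCode + 1)); // not a real date

        // epoch day: a plain count of days since 1970-01-01, safe to increment
        LocalDate date = LocalDate.of(2016, 11, 30);
        long epochDay = date.toEpochDay();
        LocalDate next = LocalDate.ofEpochDay(epochDay + 1);
        System.out.println("epoch day increment: " + next);
    }
}
```

With epoch day, ordering, difference and iteration are all ordinary integer operations, and LocalDate converts to and from it losslessly.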

What unit tests?

I promise you. I practice test-first development. I do it a lot. I teach it. I believe in it. I’ve used it to help me out of situations where I couldn’t get something to work, and the high discipline incremental nature of it has given me revelations.

I jumped into a system wide refactor, algorithm change, and rework of some core code. Did I run the tests? Apparently not. I didn’t feel I had time, or wanted to make time, to run the tests – surely what I was doing was just going to work, right? I had all the answers in my head.

In fairness to me, I didn’t get too far out… but then I actually ran the tests. That’s when I discovered the problem below about typing. My code worked in seconds, other code worked in milliseconds. They don’t match. It didn’t work; it never would have worked.

At this point, I felt very privileged to work with a team who have been trying to adopt the principle that I forgot for a couple of rabid hours. One of them had written some brilliant step by step unit tests for each feature. I got them to work one after the other by fixing the code and they guided me to my destination perfectly. Just think how much use they would have been 4 hours sooner!

Strong typing replaced by…

I wanted to represent a search time range. I had an object which happened to store the time internally as a long. This is fine. It serialises. Long is a common way to store time. It’s milliseconds, right? Or seconds? Epoch seconds? In UTC? Or epoch milliseconds. Surely it’s encapsulated…?

A friend of mine complained that we weren’t always using strong typing; we were sometimes using String typing – where the info is just sitting there in the string, in the right form, if you know how to interpret it. What of abstraction and encapsulation? In the above situation, we didn’t have strong types, we had Long types.

Why was this an issue? Just choose milliseconds or seconds and it would work again? Two other things:

  • Occasional use of Joda time to help with the object
  • A module which manipulated the inputs to put into a time range object because it knew how to do things in milliseconds

I felt shocked. Here we are with Java 8 and we’re trying to operate on raw numbers from outside a class which is all about time in an environment where there’s the finest time library in the world baked into the language!

This was a poor design decision of mine, taken literally by the team around me. I refactored my way out of it by introducing an external interface entirely composed of Java 8 time classes. The internals of the class remained as Long, because that serialises in a compact way. The outside world was not allowed to push Long values in any more, and a number of helper methods were absorbed into the time range class itself, following the tell, don’t ask principle.

Users of this weakly typed object are left now with no doubt how to interpret what it means.
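As a sketch of that design (all names here are hypothetical, not the project’s actual code): keep the compact long internally, but accept and return only java.time types at the boundary, so callers never have to guess the units.

```java
import java.time.Duration;
import java.time.Instant;

public class TimeRangeDemo {
    // hypothetical SearchTimeRange: compact long internals, java.time externals
    static class SearchTimeRange {
        private final long startMillis; // serialises compactly
        private final long endMillis;

        SearchTimeRange(Instant start, Instant end) {
            this.startMillis = start.toEpochMilli();
            this.endMillis = end.toEpochMilli();
        }

        // "tell, don't ask": the range answers questions itself
        boolean contains(Instant instant) {
            long m = instant.toEpochMilli();
            return m >= startMillis && m < endMillis;
        }

        Duration duration() {
            return Duration.ofMillis(endMillis - startMillis);
        }
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2016-11-30T00:00:00Z");
        SearchTimeRange range = new SearchTimeRange(start, start.plus(Duration.ofDays(1)));

        // callers never see a raw long, so there is no seconds/millis ambiguity
        System.out.println(range.contains(start.plus(Duration.ofHours(12))));
        System.out.println(range.duration().toHours());
    }
}
```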

Unrealistic tests

The above weak typing is ameliorated if there are some decent examples of real world usage in the unit tests. If, however, you’re just dealing with simple values, you may be tempted to write tests in terms of unrealistic numbers like 1234, or 0. These can test for things like equality and comparison, but they also give no indication of what sorts of real-world usage might happen and whether the code would work predictably with real world numbers.

For example, if I tested my YYYYMMDD day calculator purely in terms of numbers like 11111111, I would not notice that there appear to be 70 days between the end of November and early December. 

In many cases, there’s no such thing as a good or bad input to an algorithm (I do a lot of property bag tests using the String Jim). However, if you have an abstraction and you don’t test it with real-world inputs, you’re missing the tests-document-the-code opportunity and may be sitting on it-only-works-in-theory problems.

Code bomb

There are two ways to interpret it when someone makes a lot of code and passes it to you. On the one hand, they’ve just delivered something of value that you should be able to use. On the other hand, it can feel like a hit and run. The code may or may not be fit for purpose. It may or may not have nuances that you can understand right away. It may be a boost or a few hours of head scratching waiting to happen.

On the whole, parachuting your code onto someone else’s to-do list is definitely something to do with caution. I know what it’s like to see people really get a productivity boost from having an answer handed to them. I know what it’s like for some unimportant detail in that to steal time. I know how it feels to receive a batch of someone else’s incomplete work. In short, it’s a thing I’d like to see less of.

Documentation is for Wimps

My firmly held view is that documentation is not to be an input to development, but more of an output. I don’t think that technical documentation is innately valuable, especially where it can get out of date with the code. However, you need to leave something for the next person who needs to be able to use what you’ve made. That next person may well be you!

I value:

  • JavaDoc – public APIs to have their semantics described
  • Code review – just accounting for your changes to someone else in a discussion, especially where you comment on your own code to explain why you did it that way, it can really help you see your work from the outside and make some last minute refinements or simplifications
  • High level diagrams – if you can’t draw a diagram of your system on the back of a beermat you don’t understand it – if you can create high level diagrams as part of development then that is very helpful. They seldom get that far out of date. Extra points if you can get the diagrams to be generated automatically as an output of your work.
  • User guide – if you make a feature but it relies on the developer knowing exactly where to find it, and the exact semantics of using it… well, you’ve failed. There should be a human-friendly interface; some of that may be a start-up script, a how-to document, or just a well documented public API entry point.

I’ve not been strict enough regarding documentation. I really don’t want to force people into being authors… but the definition of done has documentation mentioned and I’ve been less focused on where the minimum threshold actually is. One of my mistakes came about because I had no documentation to guide me.

Summary

In the cut and thrust of development, it’s no surprise that sometimes one’s standards slip. The aim should be to commit to work which can be achieved at a sustainable pace. That’s no guarantee that there won’t be blips. The simple provable truth, though, is that dropping discipline when under pressure more often results in a spiral of rework as the poorer techniques appear to be less effort, but result in more confusion and rework.

I’ve reminded myself of a few useful points here. I hope others find this useful too.

Posted in Uncategorized