I thought I’d put together a catalogue of testing smells (having discussed making tests difficult in the past). There may be others; please feel free to suggest more in the comments. While I may focus on Java (and possibly the occasional JS) example, these should be pretty universal.
Here are some smells I’ve found:
- The First and Last Rites – where there’s some ritual/boilerplate at the start and end of most test bodies, suggesting a lack of common setup/teardown code
- Oversharing on setup – where every test sets up a lot of shared data which only some tests need
- Everything is a property – where a test class keeps what should be temporary variables in instance variables
- Missing parameterised test – when you did it the long way round because you didn’t bring in parameterisation
- Test body is somewhere else – when the test method calls another method entirely with no other implementation in the test method – often a sign of missing parameterised test
- Test setup is somewhere else – where the test method just does the assertions, not the given/when part; this can be acceptable in the case of several tests on a single shared expensive resource setup, but seldom is at other times
- Two for the price of one – sometimes a sign of missing parameterised tests – where one test covers two use cases with the same setup.
- Integration test, masquerading as unit test – where there are too many layers involved in making a unit test, so it runs too long
- The Parasite – a test which should be written stand-alone, but depends on the running of a previous test
- Herp Derp – words and comments in test code or names that add nothing
- Hidden Meaning – where something that should be part of the execution of the test, and appear in a test report, is hidden in a comment – essentially comment instead of name
- Over refactoring of tests – where you can’t read them because they’ve been DRYed out to death
- Boilerplate hell – where you can’t read the test because there’s so much code, perhaps a case of missing test data factory
- Half a helper method – where there’s a utility method to help a test do its job, yet all calls to it are immediately followed by the exact same code. This is because the method is only doing half the job it should, so your test has more boilerplate in it.
- What are we Testing? – where the test data, or the way we produce it, is not self-explanatory for the use case. This is the general case of many of the smells below, and also covers how test data is represented in the code.
- Second guess the calculation – where rather than using concrete test data, we use something that needs us to calculate the correct answer ahead of assertion
- Missing test data factory – where every test has its own way of making the same test example data
- Unworldly test data – where the test data is in a different style to real-world data e.g. time processing based on epoch milliseconds near 0, rather than on sensible timestamps that would be used in the real world
- Invalid test data – when the test data would not be valid if used in real life – does this make the test invalid or not?
- Wheel of fortune – where random values in the test can lead to error – see also It Passed Yesterday
- Chatty logging – often a substitute for self-explanatory assertions or well-defined test names: the test writes lots of data to the console or logs in order to explain test failures outside of the assertions.
- Over exertion assertion – where the implementation of an assertion is heavy and in the body of the test, rather than in an assertion library
- Bumbling assertions – where there was a more articulate assertion available, but we chose a less sophisticated one and kind of got the message across. E.g. testing exceptions the hard way, or using equality check on list size, rather than a list size assertion.
- Assertion diversion – where the wrong sort of assert is used, thus making a test failure harder to understand
- Celery data – usually quite Stringy – where the data read from the system under test is in a format which is hard to make meaningful assertions on – for example raw JSON Strings.
- Conditional assertions – potentially a case of over exertion or diversion – the choice of assertion in a test appears to be a runtime choice, leading to tests whose objectives are harder to understand/guarantee.
- Fuzzy assertions – where a lack of control over the system under test means we cannot predict the exact outcome, leading to fuzzy or partial matching in our assertions
- Badly reimplementing a bit of test framework – related to over exertion assertions, where there’s an ad-hoc implementation of what should come from a test library. This also includes home-made shallow implementations of deep problems, like managing resources such as databases or files, and manually pumping framework primitives rather than using the framework as a whole.
- Assertion Chorus – aka missing custom assertion method – where a series of assertions repetitively perform a long-winded routine to test something.
- The True Believer – just enough tests to convince the author that the code must surely be right, not that it most likely isn’t wrong
- Assert the world – where the assertions prove everything, even uninteresting stuff.
- Blinkered assertions – where the assertions are blind to the fact that the whole answer is wrong, because they’re focusing on a subset of the detail.
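Several of the smells above concern repetition and home-grown assertion plumbing. As a minimal sketch, here is a "missing parameterised test" in plain Java, without a test framework; all of the names (`Tax`, `rate`, the thresholds) are invented for illustration, and in JUnit 5 the loop would be a `@ParameterizedTest` with a `@CsvSource` of cases:

```java
// Hypothetical code under test: a toy tax-rate rule (all names invented).
class Tax {
    static double rate(double income) {
        return income > 50_000 ? 0.40 : 0.20;
    }
}

public class TaxRateTest {
    // Smell: "the long way round" -- three near-identical test methods.
    static void testLowIncome()  { check(Tax.rate(10_000), 0.20); }
    static void testBoundary()   { check(Tax.rate(50_000), 0.20); }
    static void testHighIncome() { check(Tax.rate(60_000), 0.40); }

    // One data-driven loop replaces all three; the cases table makes the
    // interesting inputs (especially the 50k boundary) visible at a glance.
    static void parameterisedTest() {
        double[][] cases = {
            {10_000, 0.20},
            {50_000, 0.20},
            {60_000, 0.40},
        };
        for (double[] c : cases) {
            check(Tax.rate(c[0]), c[1]);
        }
    }

    static void check(double actual, double expected) {
        if (Math.abs(actual - expected) > 1e-9) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        testLowIncome();
        testBoundary();
        testHighIncome();
        parameterisedTest();
        System.out.println("all tests passed");
    }
}
```

The hand-rolled `check` helper is itself a whiff of "badly reimplementing a bit of test framework" – in real code you would reach for your framework's assertion library rather than growing one of these.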
Mocks and Hooks Madness
- Overmocking – where tests are testing situations that are guaranteed to pass as they’re whitebox tested against perfect mocks that do not indicate anything to do with reality. See also How Mocks Ruin Everything.
- Mock madness – where even near-primitive values like POJOs are being mocked, just because.
- Making a mockery of design – where pure functions have to be dependency injected so they can be mocked.
- Remote Control Mocking – where a class that depends on a service is tested with that service’s complex dependencies mocked, rather than the service itself being mocked.
- Hooks everywhere – where the production code has awkward backdoors in it to enable tests to perform test-time rewiring or intercepting.
- The telltale heart – where the production code is repeatedly calculating and returning values that are only used at test time.
- Is There Anybody There? – the flickering test that occasionally breaks a build – bad test or bad code?
- It was like that when I got here – ignoring the preparation of pre- and post-test state, leading to all manner of shenanigans.
- Repeatedly re-reading the inputs – where some data that could be made immutable and loaded once is read for every instance of a test
- The painful clean-up – where every test needs to build or clean up an expensive resource, like a database, as the separation of tests is weak, or the test is too large
- I wrote it like this – testing the known implementation rather than the outcome of that implementation.
- Contortionist testing – this is really a design smell. You’re probably adding tests after the code was written, and are forced to bend over backwards to construct those tests owing to poorly designed code – especially when you find yourself NEEDing to mock static functions or types.
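To make the overmocking smell concrete, here is a minimal sketch with a hand-rolled stub standing in for a mocking library; `UserRepository`, `ReportService` and all other names are invented for illustration:

```java
import java.util.List;

// Hypothetical collaborator: a repository the service depends on.
interface UserRepository {
    List<String> findActiveUsers();
}

// Hypothetical code under test.
class ReportService {
    private final UserRepository repo;
    ReportService(UserRepository repo) { this.repo = repo; }
    String headline() {
        return repo.findActiveUsers().size() + " active users";
    }
}

public class OvermockingSketch {
    public static void main(String[] args) {
        // A "perfect mock": it answers exactly what the test told it to.
        UserRepository perfectMock = () -> List.of("alice", "bob");

        String headline = new ReportService(perfectMock).headline();

        // Overmocking smell: the assertion largely echoes back what the
        // mock was primed with -- it proves the wiring, not the behaviour.
        if (!headline.equals("2 active users")) {
            throw new AssertionError("got: " + headline);
        }
        System.out.println("passed: " + headline);
    }
}
```

A more valuable test would exercise `ReportService` against a real (or realistic in-memory) repository, or assert behaviour the mock does not trivially dictate – the edge case of zero users, say – so that a passing test tells you something about reality.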
Please feel free to complain about your own testing smells in the comments below. I plan to flesh out examples of the above in due course.
Other Test Smells resources: