Test Smells


I thought I’d put together a catalogue of testing smells (having discussed what makes tests difficult in the past). There may be others; please feel free to add them in the comments. While I may focus on Java (and possibly occasional JS) examples, these should be pretty universal.

Here are some smells I’ve found:


  • The First and Last Rites – where there’s some ritual/boilerplate at the start and end of most test bodies, suggesting a lack of common setup/teardown code
  • Everything is a property – where a test class keeps values in member fields that should be temporary variables inside the test methods
  • Missing parameterised test – when you did it the long way round because you didn’t bring in parameterisation
  • Test body is somewhere else – when the test method calls another method entirely with no other implementation in the test method – often a sign of missing parameterised test
  • Two for the price of one – where a single test covers two use cases with the same setup – sometimes a sign of a missing parameterised test
  • Integration test, masquerading as unit test – where there are too many layers involved in making a unit test, so it runs too long
  • The Parasite – a test which should be written stand-alone, but depends on the running of a previous test
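
As a rough illustration of what fixes Missing parameterised test and Two for the price of one, here is a minimal table-driven sketch. It uses plain Java rather than a test framework so it stays self-contained; with JUnit 5 you would more likely reach for `@ParameterizedTest`. The `Slugs.slugify` function and the `SlugifyCases` class are hypothetical, invented purely for this example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical code under test, invented for this illustration.
final class Slugs {
    static String slugify(String title) {
        return title.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")   // collapse each non-alphanumeric run into one dash
                .replaceAll("^-|-$", "");        // drop a leading or trailing dash
    }
}

final class SlugifyCases {
    // One row per use case: input on the left, expected slug on the right.
    static Map<String, String> cases() {
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("Hello World", "hello-world");
        cases.put("  Test Smells!  ", "test-smells");
        cases.put("already-a-slug", "already-a-slug");
        return cases;
    }

    // Runs every row through the same check and names the row that failed.
    static void checkAll() {
        for (Map.Entry<String, String> c : cases().entrySet()) {
            String actual = Slugs.slugify(c.getKey());
            if (!actual.equals(c.getValue())) {
                throw new AssertionError("slugify(\"" + c.getKey() + "\") gave \""
                        + actual + "\", expected \"" + c.getValue() + "\"");
            }
        }
    }
}
```

Adding a new use case becomes one more row in the table, rather than a copied test body or a second scenario squeezed into an existing test.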


  • Herp Derp – words and comments in test code or names that add nothing, like simple or test or //given
  • Hidden Meaning – where something that should be part of the execution of the test, and appear in a test report, is hidden in a comment – essentially comment instead of name
  • Over refactoring of tests – where you can’t read them because they’ve been DRYed out to death
  • Boilerplate hell – where you can’t read the test because there’s so much code, perhaps a case of missing test data factory
  • Half a helper method – where there’s a utility method to help a test do its job, yet all calls to it are immediately followed by the exact same code. This is because the method is only doing half the job it should, so your test has more boilerplate in it.
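
To make Half a helper method concrete, here is a small sketch in plain Java. All the names (`Member`, `MemberRepository`, `MemberFixtures`) are hypothetical. The assumption is that every test which calls `newMember` then activates and saves the member, so the helper is only doing half the job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain types, invented for this illustration.
final class Member {
    final String name;
    boolean active;

    Member(String name) {
        this.name = name;
    }
}

final class MemberRepository {
    private final List<Member> members = new ArrayList<>();

    void save(Member member) {
        members.add(member);
    }

    long countActive() {
        return members.stream().filter(m -> m.active).count();
    }
}

final class MemberFixtures {
    // Half a helper: each caller still has to activate and save the member.
    static Member newMember(String name) {
        return new Member(name);
    }

    // The whole helper: it absorbs the lines that always followed the call.
    static Member savedActiveMember(MemberRepository repository, String name) {
        Member member = newMember(name);
        member.active = true;
        repository.save(member);
        return member;
    }
}
```

A test then needs a single line such as `MemberFixtures.savedActiveMember(repository, "ada")` to get to its starting state.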

Test Data

  • Missing test data factory – where every test has its own way of making the same test example data
  • Unworldly test data – where the test data is in a different style to real-world data e.g. time processing based on epoch milliseconds near 0, rather than on sensible timestamps that would be used in the real world
  • Invalid test data – when the test data would not be valid if used in real life – does this make the test invalid or not?
  • Wheel of fortune – where random values in the test can lead to error – see also It Passed Yesterday
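
One possible shape for a test data factory, which also sidesteps Unworldly test data by using a plausible timestamp, is sketched below. The `Order` type, its fields, and the default values are all assumptions made up for the example.

```java
import java.time.Instant;

// Hypothetical domain type, invented for this illustration.
final class Order {
    final String customerId;
    final long amountPence;
    final Instant placedAt;

    Order(String customerId, long amountPence, Instant placedAt) {
        this.customerId = customerId;
        this.amountPence = amountPence;
        this.placedAt = placedAt;
    }
}

final class OrderFactory {
    // A plausible real-world timestamp rather than something near epoch 0.
    static final Instant TYPICAL_TIME = Instant.parse("2018-10-20T18:33:58Z");

    // The default example: every test that just needs "an order" shares this.
    static Order anOrder() {
        return new Order("customer-1001", 2499, TYPICAL_TIME);
    }

    // A variation that changes only the detail the test cares about.
    static Order anOrderWorth(long amountPence) {
        return new Order("customer-1001", amountPence, TYPICAL_TIME);
    }
}
```

Because the values are fixed rather than random, the same test run gives the same result every time, which also keeps Wheel of fortune at bay.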


  • Chatty logging – often a substitute for self-explanatory assertions or well defined test names, the test writes lots of data to the console or logs in order to explain test failures outside of the assertions.
  • Second guess the calculation – where rather than using concrete test data, we use something that needs us to calculate the correct answer ahead of assertion
  • Over exertion assertion – where the implementation of an assertion is heavy and in the body of the test, rather than in an assertion library
  • Bumbling assertions – where there was a more articulate assertion available, but we chose a less sophisticated one and kind of got the message across. E.g. testing exceptions the hard way, or using equality check on list size, rather than a list size assertion.
  • Assertion diversion – where the wrong sort of assert is used, thus making a test failure harder to understand
  • Celery data – usually quite Stringy – where the data read from the system under test is in a format which is hard to make meaningful assertions on – for example raw JSON Strings.
  • Conditional assertions – potentially a case of over exertion or diversion – the choice of assertion in a test appears to be a runtime choice, leading to tests whose objectives are harder to understand/guarantee.
  • Badly reimplementing a bit of test framework – related to over exertion assertion, where there’s an ad-hoc bit of what should be a test library. This also includes home-made shallow implementations for deep problems, like managing resources such as a database or file, and manually pumping framework primitives rather than using the framework as a whole.
  • Assertion Chorus – aka missing custom assertion method – where a series of assertions repetitively perform a long winded routine to test something.
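
Here is a hedged sketch of the difference between a bumbling assertion and a more articulate one, written in plain Java so that it is self-contained. `ListAssertions.assertHasSize` is a hypothetical helper; an assertion library such as AssertJ offers the equivalent as `assertThat(list).hasSize(3)`.

```java
import java.util.List;

final class ListAssertions {
    // Bumbling version, shown for contrast: on failure it can only say
    // "check failed", with no hint of the actual size or contents.
    static void assertTrue(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }

    // Articulate version: the failure message explains what was wrong.
    static void assertHasSize(List<?> actual, int expected) {
        if (actual.size() != expected) {
            throw new AssertionError("Expected size " + expected
                    + " but was " + actual.size() + ": " + actual);
        }
    }
}
```

Calling `assertTrue(list.size() == 3)` gets the message across, just about, while `assertHasSize(list, 3)` reports the actual size and the list contents when it fails. The same idea, pulled into a named method, is what cures an Assertion Chorus.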


  • The True Believer – just enough tests to convince the author that the code must surely be right, not that it most likely isn’t wrong
  • Assert the world – where the assertions prove everything, even uninteresting stuff.
  • Blinkered assertions – where the assertions are blind to the fact that the whole answer is wrong, because they’re focusing on a subset of the detail.
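
Blinkered assertions are easiest to see with a deliberately broken example. In this sketch the hypothetical `BuggySorter` loses duplicates, yet a check that only looks at the first element is perfectly happy. All names here are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class BuggySorter {
    // Hypothetical and deliberately buggy: a TreeSet sorts, but it also
    // silently drops duplicate values.
    static List<Integer> sortAscending(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }
}

final class BlinkeredDemo {
    static final List<Integer> INPUT = Arrays.asList(3, 1, 3, 2);

    // Blinkered: only checks that the smallest value comes first.
    static boolean blinkeredCheck() {
        return BuggySorter.sortAscending(INPUT).get(0) == 1;
    }

    // Whole answer: compares the complete result with the expected list.
    static boolean wholeAnswerCheck() {
        return BuggySorter.sortAscending(INPUT).equals(Arrays.asList(1, 2, 3, 3));
    }
}
```

The blinkered check passes while the whole-answer check fails, which is exactly the situation where a green build tells you nothing useful.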

Mocks and Hooks Madness

  • Overmocking – where tests are testing situations that are guaranteed to pass, as they’re whitebox tested against perfect mocks that bear no relation to reality
  • Mock madness – where even near-primitive values like POJOs are being mocked, just because.
  • Making a mockery of design – where pure functions have to be dependency injected so they can be mocked.
  • Remote Control Mocking – where a class that depends on a service is tested with that service’s complex dependencies mocked, rather than the service itself being mocked.
  • Hooks everywhere – where the production code has awkward backdoors in it to enable tests to perform test-time rewiring or intercepting.
  • The telltale heart – where the production code is repeatedly calculating and returning values that are only used at test time.
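
A small sketch of Remote Control Mocking and its alternative, using hand-rolled stubs (lambdas) instead of a mocking library so that it runs on its own. `Greeter`, `NameService`, `Database` and `DbNameService` are hypothetical names made up for this example.

```java
// Hypothetical collaborators, invented for this illustration.
interface Database {
    String query(String sql);
}

interface NameService {
    String nameFor(int userId);
}

// A service implementation with its own dependency on a database.
final class DbNameService implements NameService {
    private final Database database;

    DbNameService(Database database) {
        this.database = database;
    }

    @Override
    public String nameFor(int userId) {
        return database.query("SELECT name FROM users WHERE id = " + userId);
    }
}

// The class we actually want to test: it only knows about NameService.
final class Greeter {
    private final NameService names;

    Greeter(NameService names) {
        this.names = names;
    }

    String greet(int userId) {
        return "Hello, " + names.nameFor(userId) + "!";
    }
}
```

The remote-controlled version of the test builds `new Greeter(new DbNameService(sql -> "Ada"))` and so has to know how the service talks to its database. Stubbing the service directly, with `new Greeter(id -> "Ada")`, tests the same behaviour of `Greeter` without that knowledge.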


  • Is There Anybody There? – the flickering test that occasionally breaks a build – bad test or bad code?
  • It was like that when I got here – ignoring the preparation of pre and post-test state, leading to all manner of shenanigans.
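
It was like that when I got here can be sketched with a shared, mutable fixture. In this hypothetical example the `InMemoryInbox` is shared between scenarios, so a scenario that skips its preparation gets an answer that depends on whatever ran before it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared resource, invented for this illustration.
final class InMemoryInbox {
    private final List<String> messages = new ArrayList<>();

    void deliver(String message) {
        messages.add(message);
    }

    int size() {
        return messages.size();
    }

    void clear() {
        messages.clear();
    }
}

final class InboxScenario {
    // Shared between scenarios, which is where the risk comes from.
    static final InMemoryInbox SHARED = new InMemoryInbox();

    // Prepares its own starting state, so the result never depends on history.
    static int deliverOneAfterReset() {
        SHARED.clear();
        SHARED.deliver("hi");
        return SHARED.size();
    }

    // Trusts whatever state was left behind, so the result depends on run order.
    static int deliverOneWithoutReset() {
        SHARED.deliver("hi");
        return SHARED.size();
    }
}
```

With a test framework, the reset would normally live in a before-each hook rather than in the scenario itself; the point is only that something has to own the starting state.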


  • Repeatedly re-reading the inputs – where some data that could be made immutable and loaded once is read for every instance of a test
  • The painful clean-up – where every test needs to build or clean up an expensive resource, like a database, as the separation of tests is weak, or the test is too large
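
For Repeatedly re-reading the inputs, one option is to load the data once and hand out an immutable view. The sketch below stands in a literal list for the expensive read, and `SharedFixture` and its load counter are hypothetical names added for the illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SharedFixture {
    private static List<String> cached;
    private static int loads = 0;

    // Loads the data on first use only; later calls reuse the same list.
    static synchronized List<String> lines() {
        if (cached == null) {
            loads++;
            // Stand-in for an expensive read, such as parsing a large file.
            cached = Collections.unmodifiableList(Arrays.asList("alpha", "beta", "gamma"));
        }
        return cached;
    }

    // Exposed only so the example can show how many loads happened.
    static synchronized int loadCount() {
        return loads;
    }
}
```

Because the list is unmodifiable, no test can change what the next test sees, which is what makes sharing it safe.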

Test Last

  • I wrote it like this – testing the known implementation rather than the outcome of that implementation.
  • Contortionist testing – this is really a design smell. You’re probably adding tests after the code was written and are required to bend over backwards to construct those tests owing to poorly designed code. This especially involves NEEDing to use mocking of static functions or types.
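
As one example of designing away the need for Contortionist testing, code that reads the time can accept a `java.time.Clock` instead of calling a static time function, so no static mocking is needed. The `PartOfDay` class and its noon cut-off are hypothetical, made up for this sketch.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

final class PartOfDay {
    private final Clock clock;

    // Production code would pass in something like Clock.systemDefaultZone().
    PartOfDay(Clock clock) {
        this.clock = clock;
    }

    String describe() {
        return LocalTime.now(clock).getHour() < 12 ? "morning" : "afternoon";
    }

    // Test-side convenience: freeze the clock at an ISO-8601 UTC instant.
    static PartOfDay at(String isoInstant) {
        return new PartOfDay(Clock.fixed(Instant.parse(isoInstant), ZoneOffset.UTC));
    }
}
```

A test can then say `PartOfDay.at("2018-10-20T09:00:00Z").describe()` and assert on the outcome, without bending over backwards to intercept the system clock.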

Please feel free to complain about your own testing smells in the comments below. I plan to flesh out examples of the above in due course.
