Why engineers suck at writing integration tests
In a perfect world, we can all agree that having automated tests is a good thing. They improve reliability, prevent bugs from making it to production, and reduce the load on manual QA, in addition to a whole host of less obvious but crucial benefits.
But we don't live in a perfect world.
The real downside to automated tests is writing them in the first place. While tests are important for any CI/CD development environment, writing and maintaining them is often the least favorite part of a developer's job. As we all know, if we don't like doing something, we probably won't do it well.
In short, engineers suck at writing integration tests because writing integration tests sucks.
What's so bad about writing integration tests?
✏ Writing integration tests is too hard
Writing a test is fundamentally expressing a user story. You want to make sure that your customer can take a set of actions in a particular order, and your application responds the way you would expect based on those actions. For example, if you're Amazon, you want to make sure that your customer can add an item to the cart, and the item actually loads in the cart when it's opened.
While easy to express in English, writing a test for that story in something like Selenium requires making complicated assertions on every step of the story (make sure a button loads containing some text, make sure it's clickable, make sure a page loads, that an image of the item loads, etc.).
Furthermore, whatever framework you use to test is probably not the framework you use to do most of your job, which means every time you go to write tests, you have to dust off your test-writing skills before you can really get productive.
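To make the add-to-cart story concrete, here is a minimal sketch of how one sentence of English expands into step-by-step checks. It does not use Selenium itself: `Element`, `FakeStorefront`, and `check_add_item_to_cart` are hypothetical names, and the storefront is an in-memory stand-in for a real browser session, so treat it as an illustration of the shape of the problem rather than a real test.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Hypothetical stand-in for a browser element (not a Selenium class)."""
    text: str = ""
    displayed: bool = True
    enabled: bool = True


@dataclass
class FakeStorefront:
    """Hypothetical in-memory storefront that plays the role of the browser."""
    cart: List[str] = field(default_factory=list)

    def find(self, name: str) -> Element:
        # A real browser test would locate these elements with selectors.
        if name == "add-to-cart-button":
            return Element(text="Add to Cart")
        if name == "cart-item-image":
            # The image is only displayed once something is in the cart.
            return Element(displayed=bool(self.cart))
        raise LookupError(f"no element named {name!r}")

    def click_add_to_cart(self, item: str) -> None:
        self.cart.append(item)

    def open_cart(self) -> List[str]:
        return list(self.cart)


def check_add_item_to_cart(store: FakeStorefront, item: str) -> None:
    """'A customer can add an item to the cart' as explicit checks."""
    button = store.find("add-to-cart-button")
    assert button.displayed, "button did not load"
    assert "Add to Cart" in button.text, "button has unexpected text"
    assert button.enabled, "button is not clickable"

    store.click_add_to_cart(item)

    contents = store.open_cart()
    assert item in contents, "item missing from cart"
    assert store.find("cart-item-image").displayed, "item image did not load"
```

Calling `check_add_item_to_cart(FakeStorefront(), "paperback")` passes silently. The point is that a single sentence of user story already needs five assertions, each tied to a detail of the page that can change.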
🛠 Maintaining them is too costly
Test maintenance comes in many forms:
The cost of updating tests every time you make changes to your application
Writing tests is important when you're making changes to your application. At the same time, those changes often conflict with the assertions inside the tests you've already written. This circularity means that every time you make big changes to your site, you not only need to write new tests for the new functionality, you probably also need to update the tests you've already written so they don't break.
The cost of investigating and fixing flaky tests
A flaky test fails sometimes and passes other times, despite no changes to the user story it is testing. Every time one of these flakes occurs, you need to investigate whether it is a true failure of the user story or a false alarm. The more tests you add, the more investigation you need to do when your test suite flakes.
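One common way to tell a flake from a true failure is to rerun the same test against unchanged code and compare the results. The sketch below assumes a test is any zero-argument callable that raises `AssertionError` on failure. `classify` and `make_intermittent_test` are hypothetical helpers written for this illustration, not part of any testing framework.

```python
from typing import Callable


def classify(test: Callable[[], None], runs: int = 5) -> str:
    """Run the same test several times and label the outcome.

    "pass"  -> every run passed
    "fail"  -> every run failed
    "flaky" -> results differed with no change to the code under test
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add(True)
        except AssertionError:
            outcomes.add(False)
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"


def make_intermittent_test() -> Callable[[], None]:
    """Build a test that fails on every second call, simulating a flake."""
    calls = {"n": 0}

    def test() -> None:
        calls["n"] += 1
        assert calls["n"] % 2 == 1, "simulated timing failure"

    return test
```

Here `classify(make_intermittent_test())` returns `"flaky"`, because the first run passes and the second fails. Reruns like this cost time on every suspicious failure, which is exactly the investigation overhead described above.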
The cost of held-up deploys
When tests fail, even falsely in the event of flakes, CI/CD pipelines will fail, which means your important changes won't make it to production. The whole purpose of CI/CD is to speed up the process of making continuous improvements, but flaky or brittle tests defeat that purpose entirely.
Every test that you write doesn't only carry the cost of authoring it; it also carries the cost of all future maintenance you'll be responsible for related to that test. As such, developers tend to view tests as a burden, and who wants to voluntarily take on a burden?
❓ Results aren't trustworthy
The whole point of writing tests is to verify that any changes you make to your application don't break the user experience. If you have to investigate every failure due to the flaky nature of most testing frameworks, it defeats the whole purpose of testing (to improve reliability, prevent bugs, and move faster). This untrustworthiness makes writing tests feel pointless, and nobody wants to do pointless work.
✅ Full coverage is difficult to achieve
Browser testing technologies have not kept up with the increasing complexity of web applications. Tools like Selenium or Cypress don't have easy ways to handle user experiences like third-party integrations, file uploads or downloads, interacting with maps, or sending/receiving emails.
If those types of flows are important to your product, then writing tests for the functionality around them will feel pointless, because those tests represent only a partial solution and won't exercise the true user experience.
Nobody wants to build a partial solution. When faced with a partial solution, it's easier to build nothing at all.
What needs to change?
Getting engineers to write better tests means solving the problems that make them bad at writing tests in the first place. So to get engineers to write better tests:
- We need to make tests easier to write
- We need to make tests easier to maintain by eliminating flakiness
- We need to eliminate false negatives and false positives from our test results
- We need to make it possible to test the full functionality of our applications
Luckily, walrus.ai is here to help!
- End-to-end tests can be written in plain English in minutes, without handling any mapping on your end
- Tests are maintained entirely by walrus.ai, so you don't need to update tests when you change your application
- Results are verified by walrus.ai, so you only get true failures or true passes
- walrus.ai can test the most complicated flows, including third-party integrations, file uploads or downloads, 2FA, or email-based flows