Why Selenium Sucks for End-To-End Testing in 2021

selenium Feb 21, 2021

Let's get to the basics, shall we?

What is an end-to-end test? We define it as a test that potentially spans multiple UIs and performs testing from an end user's perspective.

Well, Selenium is a good fit for neither cross-system testing nor testing from an end user's perspective.

Let me explain my point.

How does Selenium work?

Selenium was created in 2004, well before Single Page Apps came into favor, when pages looked like this:

and HTML at the time looked like this:

Compare with the modern version of a similar part of Amazon's page:

As you probably noticed, the complexity has grown dramatically: where there used to be just one simple table, there are now 10+ levels of nested divs.

Selenium by design encourages its users to stick to XPaths. This approach might have worked in 2005, when pages had simple, stable structures. In 2021, pages have insanely complex, barely human-readable structures that change constantly (yeah, HTML was NOT designed for the fancy UI we use it for now). In an actively developed application, it is practically impossible to rely on technical information like XPaths to keep references to elements stable. And things like ids and data-test-ids do not really work for list and table elements, not to mention the lack of ids altogether in React.

Look at the XPath for the image in the example above, taken from the latest Amazon page: /html/body/div[4]/div[2]/div/div[1]/div/div[2]/div/div[1]/div/div[1]/div[2]/div/div[2]/a/div[1]/img

And this is the best that Google Chrome could come up with:


This is absolutely unreadable and would be a nightmare to maintain!
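To make the brittleness concrete, here is a minimal sketch using only Python's standard library. The two page snippets, the path, and the function names are hypothetical and are not taken from Amazon's markup; the only difference between the two versions is one extra wrapper div, which an end user would never notice.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page. V2 only adds a wrapper <div>;
# what the end user sees stays the same.
PAGE_V1 = """
<html><body>
  <div><span>Header</span></div>
  <div><a href="/item"><img alt="Product photo" src="p.jpg"/></a></div>
</body></html>
""".strip()

PAGE_V2 = """
<html><body>
  <div><span>Header</span></div>
  <div><div><a href="/item"><img alt="Product photo" src="p.jpg"/></a></div></div>
</body></html>
""".strip()

# A positional path, in the spirit of the absolute XPath shown above.
POSITIONAL_PATH = "./body/div[2]/a/img"


def find_by_position(page):
    """Locate the image by its exact position in the element tree."""
    return ET.fromstring(page).find(POSITIONAL_PATH)


def find_by_alt_text(page, alt):
    """Locate the image by its alternative text, wherever it sits in the tree."""
    return ET.fromstring(page).find(f".//img[@alt='{alt}']")
```

With these definitions, find_by_position(PAGE_V1) returns the image element, find_by_position(PAGE_V2) returns None, and find_by_alt_text(PAGE_V2, "Product photo") still finds the image.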

Basically, the current way of working with the page has the following issues:

  1. It is very hard, if not impossible, to understand which element is being referred to unless your Selenium code is heavily documented;
  2. Only developers can understand test failures, since they are cryptic;
  3. The structure was not designed to properly handle modern apps with forms and tables: it lacks a stable, reliable way to refer to elements.

The end result? Instead of creating more tests, you have to spend an increasingly large amount of time on test maintenance! We have often seen that after 1 or 2 years of developing tests, people spend 50%+ of their time on test maintenance instead of doing something productive.

Now compound that with cross-system testing, where you don't control the HTML of the systems you test. No amount of BDD or shift-left will reduce the test maintenance required to constantly catch up with someone else's changes in 3rd-party apps (think Salesforce).

How should end-to-end testing work?

Think about it. What are end-to-end tests supposed to do? They are supposed to help you validate that your functionality works from an end user's perspective.

Therefore, the way you refer to elements should be from an end user's perspective, and there should be an easy and stable way to work with forms and tables from that same perspective.

Users only care that they can enter data into the same field or click the link in the table row that contains their unique reference.

Forms. Let's see an example on Amazon again:

with HTML:

You can clearly see both the id and the name of the element! Great! But is it?

The moment you change your UI framework to React, your fancy ids are gone, and when you migrate to some rigid, back-end-hooked framework (or a new version of it), your names will probably have to change as well (think ASP.NET). And this is EXACTLY when you want your end-to-end tests to work, because you have just migrated to a new framework!

Therefore, a proper end-to-end framework would never hook into the internals of your application but, rather, into how it looks from an end user's perspective! Look at the City input. I'd argue that it will always have either a placeholder saying "City" or whatever an end user perceives as a "label".

Again, based on our experience (don't trust us, check for yourself), not everyone has a proper HTML structure like Amazon's, with the <label for> structure in place. So, unfortunately, you can't rely on just that either.

Therefore, there should be a way to specify an input from the end user's perspective, relying on whatever the end user considers the "label" or on the placeholder, whichever is appropriate.

And it should look something like this: enter "San Francisco" into "City". Shouldn't it?
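As a rough illustration of that idea, the sketch below resolves a field by what the user sees: first a label with matching text, then a matching placeholder. It uses only Python's standard library on a small, well-formed snippet. The form markup, the ids, and the find_input and enter helpers are all hypothetical; this is not the API of Selenium or of any other tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical form: one field is labelled, the other only has a placeholder.
FORM = """
<form>
  <label for="city-field-1">City</label>
  <input id="city-field-1" name="city-field-1" type="text"/>
  <input name="zip" placeholder="ZIP Code" type="text"/>
</form>
""".strip()


def _visible_text(element):
    """Text of an element as a user would read it, lowercased for comparison."""
    return " ".join("".join(element.itertext()).split()).lower()


def find_input(form, visible_text):
    """Find an <input> by its label text or, failing that, by its placeholder."""
    wanted = visible_text.strip().lower()
    inputs = list(form.iter("input"))

    # 1. A <label> whose text matches: follow its "for" attribute,
    #    or take an <input> nested inside the label.
    for label in form.iter("label"):
        if _visible_text(label) != wanted:
            continue
        target_id = label.get("for")
        for field in inputs:
            if target_id and field.get("id") == target_id:
                return field
        nested = label.find(".//input")
        if nested is not None:
            return nested

    # 2. No usable label: fall back to the placeholder.
    for field in inputs:
        if (field.get("placeholder") or "").strip().lower() == wanted:
            return field

    raise LookupError(f"no input labelled {visible_text!r}")


def enter(form, value, into):
    """Simulate typing by setting the value attribute of the matching input."""
    find_input(form, into).set("value", value)
```

A call then reads close to the sentence above: enter(form, "San Francisco", into="City"), where form = ET.fromstring(FORM). Renaming the id or name attributes does not affect the lookup as long as the label's for attribute still points at the input.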

Let's talk about tables for a moment, shall we?

Here is one of the most widely used examples from Salesforce:

What a user cares about is being able to validate that the row containing ProperUniqueCompany has a certain status, or to click the down icon in the last column of that row.

So, ideally, it should look something like: validate that table at row containing "ProperUniqueCompany" and column "Lead Status" contains "Open - Not Contacted" or click on the table at the row containing "ProperUniqueCompany" and the last column. This should work regardless of how the table is rendered: as an HTML <table> like in Salesforce or using <div>-based rendering like in Amazon.

What users certainly don't care about are the ids, names, and data-test-ids of those elements. Moreover, relying on them often leads to situations where the ids/names/etc. changed and a test failed even though, from an end user's perspective, everything is perfectly fine. This is what determines test stability as well! Think about it: if you only needed to maintain your tests when the application actually changes, as opposed to whenever the HTML code changes, wouldn't it be wonderful?
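One way to sketch this is to normalize both renderings into the same grid of visible cell texts and then address a cell by row content and column header. The example below uses only Python's standard library. The sample markup, the role attributes on the div version, the row data, and the to_grid and cell helpers are assumptions for illustration; real Salesforce or Amazon markup will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical data rendered twice: as a <table> and as role-annotated <div>s.
TABLE = """
<table>
  <tr><th>Name</th><th>Company</th><th>Lead Status</th><th>Actions</th></tr>
  <tr><td>Jane Doe</td><td>ProperUniqueCompany</td><td>Open - Not Contacted</td><td>Show actions</td></tr>
  <tr><td>John Roe</td><td>OtherCompany</td><td>Working</td><td>Show actions</td></tr>
</table>
""".strip()

DIVS = """
<div role="table">
  <div role="row"><div role="columnheader">Name</div><div role="columnheader">Company</div><div role="columnheader">Lead Status</div><div role="columnheader">Actions</div></div>
  <div role="row"><div role="cell">Jane Doe</div><div role="cell">ProperUniqueCompany</div><div role="cell">Open - Not Contacted</div><div role="cell">Show actions</div></div>
  <div role="row"><div role="cell">John Roe</div><div role="cell">OtherCompany</div><div role="cell">Working</div><div role="cell">Show actions</div></div>
</div>
""".strip()


def _text(element):
    """Visible text of an element with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def to_grid(root):
    """Reduce either rendering to a list of rows, each a list of cell texts."""
    rows = list(root.iter("tr"))
    if rows:
        return [[_text(c) for c in row if c.tag in ("th", "td")] for row in rows]
    rows = [el for el in root.iter() if el.get("role") == "row"]
    return [
        [_text(c) for c in row if c.get("role") in ("columnheader", "cell")]
        for row in rows
    ]


def cell(root, row_containing, column):
    """Text of the cell in the unique row containing `row_containing`.

    `column` is a header text, or an integer index (-1 for the last column).
    """
    header, *body = to_grid(root)
    index = column if isinstance(column, int) else header.index(column)
    matches = [row for row in body if any(row_containing in c for c in row)]
    if len(matches) != 1:
        raise LookupError(f"expected one row containing {row_containing!r}, found {len(matches)}")
    return matches[0][index]
```

Both cell(ET.fromstring(TABLE), "ProperUniqueCompany", "Lead Status") and the same call on ET.fromstring(DIVS) return "Open - Not Contacted", so a check written this way does not depend on which rendering the application uses.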
