Why Selenium Sucks for End-To-End Testing in 2024

Heather Brooks

Let’s first get to the basics.

What is an end-to-end test? We can define it as a test that can potentially span multiple UIs and perform testing from an end user’s perspective of a complete workflow from start to end.

Well, Selenium is not a good fit for cross-system testing or for emulating a user’s real-world experience.

Let me explain why.

How does Selenium work?

Selenium was created in 2004, long before single-page apps came into favor, when pages looked like this:

and HTML at that time looked like this:

Now let’s compare this with the modern version of a similar part of Amazon’s page:

Did you notice how much more complex the code has become? Where it used to be just one simple table, there are now 10+ levels of nested div elements!

The question is: Why?

Why does Selenium encourage the use of XPath?

Let’s look back to the roots of XPath. HTML was created in 1993 as HyperText Markup Language to reflect a document’s structure. XPath first appeared a few years later, in 1998, to describe a path within a structured document (much as a URL does for the web). Selenium’s design embraced that paradigm, and relying on XPaths made total sense at the time.

Unfortunately for Selenium, though, a lot has changed since then. HTML is being used differently now – mainly to position elements on the screen, often with a large combination of nested divs.

Selenium WebDriver encourages its users to stick to XPath locators by design. This approach worked well a decade ago when pages had a relatively simple structure, but it doesn’t quite work anymore. Nowadays, pages have insanely complex, barely human-readable structures; what’s more, these structures are constantly changing. HTML was NOT designed to render fancy UI as we do now.

In an actively developed application, technical details like XPaths are simply not stable enough to serve as references to elements. Additionally, things like ids and data-test-ids do not really work for list and table elements. I’m not even talking about the lack of ids in React.

Let’s look at the XPath from the example above for an Amazon a-tag:

/html/body/div[4]/div[2]/div/div[1]/div/div[2]/div/div[1]/div/div[1]/div[2]/div/div[2]/a

And this is the best Google Chrome DevTools inspector could come up with:

//*[@id="zg_left_col1"]/div[1]/div[2]/div/div[2]/a

Even the SelectorsHub extension could only come up with this:

//div[@id='8mNf9lO2-mC1H7sJJMcE_g']//a[@class='a-link-normal']

This is unreadable, and it piles up technical debt that turns into a maintenance nightmare!
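To make the fragility concrete, here is a minimal sketch using only Python’s standard library rather than Selenium itself. The markup is hypothetical and heavily simplified (it is not Amazon’s real HTML), and the helper names `find_by_position` and `find_by_text` are made up for illustration. It compares a positional path with a lookup by what the user actually sees, before and after a redesign that adds a banner and one wrapper div.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (not Amazon's real HTML): the same link
# before and after a redesign that adds a banner and one wrapper <div>.
BEFORE = """
<html><body>
  <div id="nav"/>
  <div id="content">
    <div><a href="/bestsellers">Best Sellers</a></div>
  </div>
</body></html>
"""

AFTER = """
<html><body>
  <div id="nav"/>
  <div id="banner"/>
  <div id="content">
    <div class="wrapper">
      <div><a href="/bestsellers">Best Sellers</a></div>
    </div>
  </div>
</body></html>
"""

# A positional path in the style of a copied absolute XPath.
POSITIONAL = "./body/div[2]/div/a"


def find_by_position(page):
    """Follow the positional path from the root; None if it no longer matches."""
    return ET.fromstring(page).find(POSITIONAL)


def find_by_text(page, text):
    """Return the first link whose visible text equals `text`, wherever it sits."""
    for link in ET.fromstring(page).iter("a"):
        if "".join(link.itertext()).strip() == text:
            return link
    return None


for name, page in (("before", BEFORE), ("after", AFTER)):
    print(name,
          "| positional path matches:", find_by_position(page) is not None,
          "| text lookup matches:", find_by_text(page, "Best Sellers") is not None)
```

In this sketch the positional path stops matching after the redesign, while the lookup by visible text keeps working, even though the user sees the same link in both versions.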

Debugging Issues in Selenium

Another issue with Selenium is the pain of debugging a test script. Imagine yourself investigating why a particular test failed, and finding out that it failed because an XPath could not be found. Your next step will likely be to copy this XPath into the browser, only to confirm that no such element exists on the page. Now you have to play a guessing game to work out what this element was supposed to be. What if the person who wrote the test is no longer with the company? How do you solve this XPath mystery now?

The same applies not just to XPaths, but also to CSS selectors, data-testids, ids, etc. As soon as the reference to an element is not made from an end user’s point of view, it can break while the application still works perfectly well for users.

Basically, the current way of working with the page has the following issues:

Difficult element identification: It is nearly impossible to understand what element is being referenced unless your Selenium code is heavily documented and that documentation is not out of sync with the code.
Error and exception handling: Only developers can understand test failures since the error descriptions are cryptic.
Unstable code: The structure was not designed to properly handle modern apps with forms and tables. It lacks a stable and reliable way to refer to elements.

What is the end result? Instead of creating new tests, you spend more and more of your time maintaining existing ones. Our experience shows this to be a widespread issue among teams that have been developing tests for a year or two and whose test suites have grown past a certain size. They often have to spend up to 50% of their day on test maintenance rather than doing something more productive.

Now combine that with cross-system testing, where you don’t control the HTML of the system under test. No amount of BDD/Shift-left will help you reduce the maintenance required to constantly catch up with someone else’s changes in third-party apps (think Salesforce).

How should end-to-end testing work?

Think about it. What are end-to-end tests supposed to do? They are supposed to help you validate that your functionality works from the end-user perspective according to the real-world user flow.

Therefore, you should refer to elements from the end user’s perspective, i.e., how they see things and not how the developer sees the application. The only things that matter to an actual user are finding the right input to type into or locating the correct button. So there should be an easy, stable way to work with forms and tables that emulates a user interacting with a browser or a device.

Example: Forms

Let’s talk about forms. Here’s another example from amazon.com:

Here is the HTML:

Did you notice how an element’s ‘id’ and ‘name’ are clear and descriptive? Simply great! Problem solved then, but is it really so?

The moment you change your UI framework to React, all your fancy ids are gone. When you migrate to some back-end-hooked rigid framework (or a new version), your name attributes would probably have to change as well (think ASP.NET). Interestingly, this is EXACTLY when you want your end-to-end tests to work. Because you just migrated to a new framework, you need to run the end-to-end tests to see if everything is working.

Therefore, a proper end-to-end testing tool should never hook into the internals of your application, but rather into how it looks from the end user’s perspective. Look at the “City” input on the screenshot above. Its underlying structure might change. However, I’d argue that this form will always have either a placeholder saying “City” or whatever an end user perceives as a “label” for it.

Again, based on our experience (don’t trust us, check for yourself), not everyone has as proper an HTML structure as Amazon, with a ‘label for’ structure in place. So, unfortunately, you can’t rely on that either.

Therefore, there should be a way to describe or identify this input from an end user’s perspective, relying on what is considered a “label” or a placeholder and not based on the code structure.

So, it should look something like this:

enter "San Francisco" into "City"

Right?
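To picture what resolving "City" from the user’s side might involve, here is a minimal sketch in standard-library Python. It is an illustration under our own assumptions, not how any particular tool is implemented: the form markup is hypothetical, `find_input` is a made-up helper, and only two of the possible signals are checked (a placeholder, then a ‘label for’ pairing).

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup, written as well-formed XML so the standard
# library can parse it. The two inputs are labelled in two different ways,
# and their ids are deliberately meaningless.
FORM = """
<form>
  <div><label for="f1">Full name</label><input id="f1" type="text"/></div>
  <div><input id="x9" type="text" placeholder="City"/></div>
</form>
"""


def visible_text(element):
    """Concatenate all text inside an element, trimmed."""
    return "".join(element.itertext()).strip()


def find_input(root, visible_label):
    """Return the <input> a user would call `visible_label`, or None."""
    # 1. A placeholder shown inside the field itself.
    for field in root.iter("input"):
        if field.get("placeholder", "").strip() == visible_label:
            return field
    # 2. A <label for="..."> whose text matches and whose target id exists.
    for label in root.iter("label"):
        target = label.get("for")
        if target is None or visible_text(label) != visible_label:
            continue
        for field in root.iter("input"):
            if field.get("id") == target:
                return field
    return None


root = ET.fromstring(FORM)
city = find_input(root, "City")
print(city.attrib)
```

Nothing in the lookup depends on the ids being descriptive, so renaming `x9` or `f1` would not affect it. A real tool would need more fallbacks (nearby text, aria attributes, and so on), which this sketch leaves out.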

Example: Tables

Next, let’s talk about tables.

Here is one of the most widely used examples from Salesforce:

What matters from the end user’s point of view is that the row containing ProperUniqueCompany has a certain status. Another example is that the down icon in the last column of that row can be clicked.

So, ideally, it should look something like this:

validate that table at row containing "ProperUniqueCompany" and column "Lead Status" contains "Open - Not Contacted"

click on the table at the row containing "ProperUniqueCompany" and the last column

This should work regardless of how the table is rendered, whether it’s an HTML <table> (as in the Salesforce example) or <div>-based rendering (as in the Amazon example). See here for a better way to work with tables.
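The idea of a rendering-independent table lookup can be sketched in a few lines of standard-library Python. Everything here is an assumption made for illustration: the markup is hypothetical (the names in the first column are invented), `to_grid` and `cell_at` are made-up helpers, and the sketch only handles the simple case where rows and cells are exactly two levels of nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the same data rendered as a <table> and as nested <div>s.
TABLE_HTML = """
<table>
  <tr><th>Name</th><th>Company</th><th>Lead Status</th></tr>
  <tr><td>Ann Lee</td><td>OtherCompany</td><td>Working - Contacted</td></tr>
  <tr><td>Bob Ray</td><td>ProperUniqueCompany</td><td>Open - Not Contacted</td></tr>
</table>
"""

DIV_HTML = """
<div class="grid">
  <div class="row"><div>Name</div><div>Company</div><div>Lead Status</div></div>
  <div class="row"><div>Ann Lee</div><div>OtherCompany</div><div>Working - Contacted</div></div>
  <div class="row"><div>Bob Ray</div><div>ProperUniqueCompany</div><div>Open - Not Contacted</div></div>
</div>
"""


def to_grid(markup):
    """Reduce two levels of nesting to rows of visible cell text, ignoring tag names."""
    root = ET.fromstring(markup)
    return [["".join(cell.itertext()).strip() for cell in row] for row in root]


def cell_at(grid, row_text, column):
    """Return the cell under `column` in the first row that contains `row_text`."""
    header, *body = grid
    col = header.index(column)
    for row in body:
        if any(row_text in cell for cell in row):
            return row[col]
    raise LookupError(f'no row containing "{row_text}"')


for markup in (TABLE_HTML, DIV_HTML):
    print(cell_at(to_grid(markup), "ProperUniqueCompany", "Lead Status"))
```

Both renderings reduce to the same grid of visible text, so the same row-and-column description finds the same cell in each.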

Hopefully, you’re now on the same page with us that using XPath in 2024 has many disadvantages. Here are the 11 reasons why not to use Selenium.

Users certainly don’t care about the ids, names, or data-test-ids of those elements. Moreover, those ids, names, etc. often change and cause the test to fail even though, from an end user’s perspective, everything is fine. Such changes inevitably degrade test stability.

testRigor for end-to-end testing

Think about it: wouldn’t it be wonderful if you only needed to maintain your tests when the application actually changes, rather than whenever the HTML code changes?

Fortunately, there is a way now!

The examples in this article are actually executable code from testRigor. It is an AI agent that uses generative AI to let you generate, record, or write test cases in plain English.

Through testRigor, UI changes are easily incorporated into the test scripts without human intervention. For example, if an ‘Add to Cart’ button is changed from <button> tag to <a> tag, testRigor will learn how the button is rendered and understand the button’s role within the application context.
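As a rough illustration of why a tag change need not matter, here is a small standard-library Python sketch. It does not represent testRigor’s implementation. The markup is hypothetical, `find_clickable` is a made-up helper, and the set of clickable tags is a deliberately tiny assumption.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: only these two tags count as clickable.
CLICKABLE_TAGS = {"button", "a"}

# Hypothetical markup for the same control before and after a UI change.
OLD = '<div><button id="add">Add to Cart</button></div>'
NEW = '<div><a href="#" class="btn"><span>Add to Cart</span></a></div>'


def find_clickable(markup, visible_text):
    """Return the first clickable element whose visible text matches, or None."""
    for element in ET.fromstring(markup).iter():
        if element.tag not in CLICKABLE_TAGS:
            continue
        if "".join(element.itertext()).strip() == visible_text:
            return element
    return None


for markup in (OLD, NEW):
    print(find_clickable(markup, "Add to Cart").tag)
```

The description "Add to Cart" resolves in both versions, while a locator written against `button` or the `add` id would only resolve in the first.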

You can easily write test scripts in plain English for scenarios such as 2FA, QR code, file, database, geolocation, email, phone call, SMS, video, audio, accessibility testing, and many more. Here is the list of testRigor’s top features.

Read how to perform end-to-end testing using testRigor.

Conclusion

Times are changing rapidly, and any business or application can only survive with the use of AI and test automation. It is wiser to ride the wave and stay ahead than to regret it later. Intelligent tools like testRigor are here to help you drastically reduce test creation and maintenance time. Using its plain English commands, anyone, from manual testers and BAs to SMEs and other stakeholders, can create, run, and edit test scripts quickly. Read here how testRigor is a test automation tool for manual testers.

If better options are available, why not use them and save valuable time, effort, and money? Let the experts dedicate their energy to creating more robust tests rather than debugging and exception handling of erratic test scripts. Make an informed decision to provide excellent quality and test coverage within deadlines.
