JSoup defence for Selenium

Selenium problems

Selenium can cause problems. At least 2 major ones come up:

  • a Selenium test case takes very long to run

A Selenium request is really expensive, especially when tests are executed through a hub-node (Selenium Grid) infrastructure. The request is sent from the build machine to the Selenium hub, then to a Selenium node, and from the node to the browser. The response travels all the way back. When a performance problem shows up during test case execution, the cause is often that Selenium sends too many requests. We are often not aware of how frequently requests are sent.

  • the page under test is refreshed often

Sometimes we need to assert on a page which is refreshed automatically at a specific time interval. This causes the notorious StaleElementReferenceException: it surfaces when a refresh happens after Selenium grabs a WebElement but before a method is invoked on it.

The real-life problem

Recently I was dealing with both of these problems and was looking for a solution.

In my case I was iterating through a table to assert a specific cell in each row.

So the code was more or less like this:


Performance was very poor, and the staleness problem showed up in almost every run.

Notice that Java sends a request to the web browser at every marked line. Surprisingly, this also happens on every iteration of the loop!

If the page refreshes in the middle of a chained driver call, you get a stale element reference exception. The same happens if the page refreshes at any point during the loop, because the list of web elements used by the test cannot be re-fetched until the loop completes!

How do we solve such a problem? Either catch the exception so that processing restarts at the beginning of a refresh interval, or reduce the number of Selenium requests to a minimum and move as much processing as possible into memory. The first option turned out to be impossible, as the loop took 3 times longer than the page refresh interval…

Here comes the cavalry

JSoup (https://jsoup.org) is the ultimate solution for all such problems. Not only is it a great library that is extremely easy to use, with great documentation and intuitive methods, but it also makes the refactoring extremely smooth thanks to a fantastic feature it supports: CSS selectors.

Just take a look:

The table is extracted using Selenium, and then processing is handed over to JSoup entirely for the duration of the loop:

  • JSoup creates a document from the HTML table; it is a snapshot of the data present when the document was created, which ensures data consistency
  • the document is then queried using CSS selectors, completely offline from Selenium’s point of view and entirely in memory
  • the result is converted back to a Selenium WebElement so that Selenium methods can be used again

Now, the web browser interaction is reduced to only 2 places.
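To see why the number of round trips matters, here is a small plain-Java simulation. It does not use Selenium or JSoup at all: FakeBrowser is a hypothetical stand-in for a remote browser that simply counts the calls it receives. In a real test, the snapshot step would correspond to fetching the table’s HTML once and parsing it with JSoup.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulation only: counts how many "remote" calls each reading strategy makes. */
class RoundTripDemo {

    /** Hypothetical stand-in for a browser reached over the network. */
    static class FakeBrowser {
        private final String[][] cells;
        int requests = 0; // every simulated remote call increments this counter

        FakeBrowser(String[][] cells) {
            this.cells = cells;
        }

        int rowCount() {
            requests++;
            return cells.length;
        }

        String cellText(int row, int col) {
            requests++;
            return cells[row][col];
        }

        /** One call returns a copy of the whole table, like grabbing its HTML once. */
        String[][] snapshot() {
            requests++;
            String[][] copy = new String[cells.length][];
            for (int i = 0; i < cells.length; i++) {
                copy[i] = cells[i].clone();
            }
            return copy;
        }
    }

    /** Reads one column cell by cell: one call for the row count plus one per row. */
    static List<String> readColumnRemotely(FakeBrowser browser, int col) {
        List<String> values = new ArrayList<>();
        int rows = browser.rowCount();
        for (int row = 0; row < rows; row++) {
            values.add(browser.cellText(row, col));
        }
        return values;
    }

    /** Reads the same column from a single in-memory snapshot. */
    static List<String> readColumnFromSnapshot(FakeBrowser browser, int col) {
        List<String> values = new ArrayList<>();
        for (String[] row : browser.snapshot()) {
            values.add(row[col]);
        }
        return values;
    }

    public static void main(String[] args) {
        String[][] table = {{"1", "open"}, {"2", "closed"}, {"3", "open"}};
        FakeBrowser cellByCell = new FakeBrowser(table);
        FakeBrowser snapshotOnly = new FakeBrowser(table);
        System.out.println(readColumnRemotely(cellByCell, 1) + " requests=" + cellByCell.requests);
        System.out.println(readColumnFromSnapshot(snapshotOnly, 1) + " requests=" + snapshotOnly.requests);
    }
}
```

With the 3-row table in main, the cell-by-cell version makes 4 calls and the snapshot version makes 1; the gap grows with every additional row.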

The looping part is now staleness-proof and execution is significantly faster; one only needs to catch StaleElementReferenceException in the places where Selenium is still in play:
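One possible shape for such a guard is sketched below in plain Java, so it runs without a browser. StaleException is a hypothetical stand-in for Selenium’s StaleElementReferenceException, and retryOnStale is an illustrative helper, not part of Selenium. The idea is only that the short Selenium part is repeated from scratch when the page refreshes underneath it.

```java
import java.util.function.Supplier;

class StaleRetry {

    /** Hypothetical stand-in for Selenium's StaleElementReferenceException. */
    static class StaleException extends RuntimeException {
    }

    /**
     * Runs the action and retries it when it fails with StaleException.
     * Rethrows the last StaleException once maxAttempts tries are used up.
     */
    static <T> T retryOnStale(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StaleException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (StaleException e) {
                last = e; // the page refreshed mid-call, so start the action again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated action: fails twice as if the page had refreshed, then succeeds.
        String html = retryOnStale(() -> {
            if (++calls[0] < 3) {
                throw new StaleException();
            }
            return "<table>...</table>";
        }, 5);
        System.out.println(html + " after " + calls[0] + " attempts");
    }
}
```

The action passed in should re-locate its elements on every attempt; reusing a previously found element would only hit the same stale reference again.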

The only thing to decide in this specific example is whether we can accept the case where the page is refreshed after grabbing the table but before sending the getLocation request. Note that it is perfectly safe if there is no page refresh at all.

As for performance, the difference is noticeable even with a local web browser and a very small table (on Selenium Grid the difference is really huge, believe me!):

Sum up

If multiple Selenium requests cause performance issues or make tests unreliable because of StaleElementReferenceException, switch to offline processing with JSoup. Just remember that you need to understand the number of Selenium requests in your code, the exact cause of staleness, and the impact offline processing has on the consistency of your test case.


Automatic test case generation for state transition diagrams (approach 1.0)

Approach 1.0

This article is left here for historical reasons. Please read the newest version of the idea, which is described HERE.

Increase automatic test case generation

I have written about 2 things in the past: state transition based testing and automatic testcase generation. These are actually 2 complementary test design techniques: state transition diagrams and decision tables respectively (I do not want to go into the details of these techniques now; that is a subject for a separate post I hope to write in the future). In the latter post I showed how to automate test case generation for decision tables; the goal for today is to show how to start automation when a diagram is the starting point.

The combinatorial nature of a problem can be expressed as a decision table and translated into XML for the TCases application, which processes it and produces an optimal set of test cases (automatic testcase generation). However, the most general way to analyze an application under test is the state transition diagram. I have already shown how to use this technique to achieve coverage, but only with a manual approach. Still, we need automatic test case generation!

When a diagram is used, the trouble begins: how to process it automatically? How to generate a set of test cases from a diagram? It took me quite a while to come up with a reasonable solution.

I recently thought I could try TCases for this purpose. Although it is meant for identifying variables and their values, if the transitions of the diagram were treated as variables and their dependencies were described in the TCases XML input file, I could get a valid set of transitions, and each transition would be used at least once with the basic coverage setting.
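The idea of treating transitions as TRUE/FALSE variables with dependencies can be sketched in plain Java. This is not the TCases XML format; the transition names and the REQUIRES map below are hypothetical and only show what a valid combination means: a transition may be TRUE only if the transition it depends on is TRUE as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class TransitionModel {

    // Hypothetical transitions, each mapped to the transition that must be taken first
    // (null means there is no prerequisite).
    static final Map<String, String> REQUIRES = new LinkedHashMap<>();

    static {
        REQUIRES.put("OPEN_TAB_1", null);
        REQUIRES.put("WRITE_TEXT_IN_TAB_1", "OPEN_TAB_1");
        REQUIRES.put("SELECT_TAB_2", "OPEN_TAB_1");
        REQUIRES.put("WRITE_TEXT_IN_TAB_2", "SELECT_TAB_2");
    }

    /** A set of taken (TRUE) transitions is valid when every prerequisite is taken too. */
    static boolean isValid(Set<String> taken) {
        for (String transition : taken) {
            if (!REQUIRES.containsKey(transition)) {
                return false; // unknown transition name
            }
            String prerequisite = REQUIRES.get(transition);
            if (prerequisite != null && !taken.contains(prerequisite)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(Set.of("OPEN_TAB_1", "SELECT_TAB_2")));
        System.out.println(isValid(Set.of("WRITE_TEXT_IN_TAB_2")));
    }
}
```

In this toy model the first set is accepted and the second is rejected, because writing in the second tab is listed without the transitions that lead to it.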

Practical example

Create model

Let’s use the same problem as in state transition based testing. We want to test whether Notepad works when switching between tabs, changing the text direction inside each of them, and writing text in each of them. This is a very simplified model, but it is enough to illustrate the concept. The state transition diagram looks like this:


Now it needs to be translated into an XML representation that TCases can parse (I was writing about TCases HERE). This is it:

INPUT is the state name, VAR is the transition.

COMMAND in HAS elements contains domain language sentences which become executable after simple processing by the domain language generator.

WHEN elements describe the dependencies needed to allow only valid combinations of transitions.

EXPECTED in HAS elements shows that we just assert that the Notepad GUI is visible after each set of transitions is run.

There is one problem with this file: in line 16 we need to give the whole sequence of transitions needed to reach SELECT2TAB, as the TAB_1_IS_SELECTED state has 2 outgoing transitions. This shows a disadvantage of modeling the diagram this way when there are states with very many transitions.

Generate executable test case

After the generator is run, a set of test cases is produced. Generator reference is

The link to the source code is shown at the end of this post if you are interested.

With basic coverage, which is 1-tuple coverage, each transition is used at least once. Because each transition is marked as TRUE or FALSE (a TRUE decision is valid only when its dependencies are met), the generated set contains both TRUE and FALSE values: a generated test case may contain all valid transitions, but also only some of them. This is 1-tuple coverage:
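What 1-tuple coverage asks of a generated set can be checked with a few lines of plain Java. The sketch below is an illustration under simple assumptions, not the generator’s code: each test case is a boolean array with one entry per transition, the set must show every transition as TRUE and as FALSE at least once, and test cases that are FALSE everywhere are dropped because they would execute nothing.

```java
import java.util.ArrayList;
import java.util.List;

class OneTupleCoverage {

    /** True when every variable takes both TRUE and FALSE somewhere in the set. */
    static boolean coversEachValue(List<boolean[]> testCases, int variables) {
        for (int v = 0; v < variables; v++) {
            boolean seenTrue = false;
            boolean seenFalse = false;
            for (boolean[] testCase : testCases) {
                if (testCase[v]) {
                    seenTrue = true;
                } else {
                    seenFalse = true;
                }
            }
            if (!seenTrue || !seenFalse) {
                return false;
            }
        }
        return true;
    }

    /** Drops test cases in which no transition is taken at all. */
    static List<boolean[]> dropAllFalse(List<boolean[]> testCases) {
        List<boolean[]> kept = new ArrayList<>();
        for (boolean[] testCase : testCases) {
            boolean anyTaken = false;
            for (boolean taken : testCase) {
                anyTaken |= taken;
            }
            if (anyTaken) {
                kept.add(testCase);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<boolean[]> set = List.of(new boolean[] {true, true}, new boolean[] {false, false});
        System.out.println("covers each value: " + coversEachValue(set, 2));
        System.out.println("executable test cases: " + dropAllFalse(set).size());
    }
}
```

Note that coverage is checked on the full set, before the all-FALSE test cases are removed.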

With generated test cases (tc3 is missing as it consists of FALSE values only and the generator wisely skips such test cases):

Now that the creation of test cases is automated, it is very easy to increase the coverage. This is 2-tuple coverage:

With generated test cases:

Running the testcases

It is time to run the test cases. The generated test cases are just pasted into a JUnit class:

And the class is run as shown here:

If you are curious, you can view all the code HERE, on the automatic-tc-generation-from-diagram branch.

Sum up

Even if not perfect, this is a way to automatically generate test cases from a state transition diagram. Together with automatic test case generation for combinatorial problems described by decision tables, it is a very solid approach to quickly achieving optimal coverage and thus assuring the quality of the application under test.

More about the coverage – let’s get it right and fast automatically


I once wrote about the superb TCases software, which makes it possible to put pair-wise testing into practice. I said it was a giant leap towards the right coverage. In this short article I would like to go much further. This is not going to be another leap, it is going to be a flight 🙂

The problem

A problem in QA practice often has a combinatorial nature. There are many possible combinations of data, actions or other “inputs” which constitute the testing space. The testing space tends to be infinite, or at least extremely large, because one must take into account not only that all the inputs have to be used at test runtime but also all the relations between them. Such relations can only be tested when specific combinations of inputs are used. Thus, the main problem to solve is how to choose those combinations. Well, this one is already solved here.

Once the combinations to test are known, the next problem arises quickly. There are always many combinations to test. It takes a huge amount of work to create automated tests out of the generated combinations, not to mention manual testing. And there is even more work when the system under test changes, as the tests have to be adjusted accordingly. Let’s solve this problem.

The solution

Automation is a process which started quite a long time ago. We no longer test manually; we have a set of automated tests to run. We can use them on all testing levels: small, medium and large tests are automated and used continuously in the development process. However, before we can run a test we have to write it. That takes time, as it is still a manual process nowadays. It is now time to move on and start creating test cases automatically!

It looks more or less like this:


old process

It is important to move to a process like this:


new process

For combinatorial testing problems we have TCases at hand, which uses the pair-wise testing concept and generates the set of test cases to be created. After that, these raw test cases need to be translated into executable test cases:


automatic test cases design time


Domain specific languages are in use at present. I have already written briefly about them here. I am going to use an internal domain language, that is, a language crafted from Java itself.

Let’s assume we want to test the behaviour of the Notepad++ application in terms of new document settings with regard to format, encoding and default language. These settings are available at: Settings->Preferences->New Document.

I am going to use all the available formats but only 5 languages and 5 encodings for the sake of simplicity (in reality, one needs to use all of them in the testing space of course).
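To get a feeling for the size of this testing space, the arithmetic can be sketched in Java. The value counts are partly assumptions: 5 encodings and 5 languages as stated above, and 3 formats (Windows, Unix, Mac), which is an assumption about the Notepad++ dialog rather than something taken from the input file.

```java
class TestingSpaceSize {

    /** Number of full combinations: the product of all value counts. */
    static long fullCombinations(int... valueCounts) {
        long total = 1;
        for (int count : valueCounts) {
            total *= count;
        }
        return total;
    }

    /** Number of distinct value pairs, summed over every pair of variables. */
    static long valuePairs(int... valueCounts) {
        long pairs = 0;
        for (int i = 0; i < valueCounts.length; i++) {
            for (int j = i + 1; j < valueCounts.length; j++) {
                pairs += (long) valueCounts[i] * valueCounts[j];
            }
        }
        return pairs;
    }

    /** Lower bound on the size of a pairwise set: the largest product of two value counts. */
    static long pairwiseLowerBound(int... valueCounts) {
        if (valueCounts.length < 2) {
            return fullCombinations(valueCounts);
        }
        long best = 0;
        for (int i = 0; i < valueCounts.length; i++) {
            for (int j = i + 1; j < valueCounts.length; j++) {
                best = Math.max(best, (long) valueCounts[i] * valueCounts[j]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] counts = {3, 5, 5}; // formats (assumed), encodings, languages
        System.out.println("full combinations: " + fullCombinations(counts));
        System.out.println("value pairs to cover: " + valuePairs(counts));
        System.out.println("pairwise lower bound: " + pairwiseLowerBound(counts));
    }
}
```

For counts 3, 5 and 5 this gives 75 full combinations and 55 distinct value pairs, and no pairwise set can be smaller than 25 test cases, since every encoding has to meet every language at least once.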

We have 3 variables: format, encoding and default language. Each variable carries information about the testing space (which values are possible) and, at the same time, information about the domain language (the Has attributes). The domain language information is of two kinds: one related to the GIVEN and WHEN clauses (command) and one related to the THEN clause (expected).

The testing space looks like this (input.xml):

So, for example, when the “format_isChecked” variable has the FORMAT_WINDOWS value, it means:

translated later into:

All the hashes and asterisks are used as placeholders only. It is much easier to write DSL-related information when using an external domain language like Cucumber. Dealing with an internal one, as in this example (Java), is more problematic because of braces and dots.

Anyway, each variable value is translated into GIVEN/WHEN and THEN sentences, and the values that belong to the same testcase are concatenated to create a single DSL test case.
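The concatenation step can be pictured with a small Java sketch. Everything in it is an assumption made for illustration: the Value class, the build helper, the generated method layout and the DSL sentences (such as notepad.selectFormat) are hypothetical, and placing all commands before all expectations is just one possible ordering.

```java
import java.util.List;
import java.util.stream.Collectors;

class DslTestCaseBuilder {

    /** One chosen variable value together with its two DSL fragments (both hypothetical). */
    static final class Value {
        final String command;  // GIVEN/WHEN part
        final String expected; // THEN part

        Value(String command, String expected) {
            this.command = command;
            this.expected = expected;
        }
    }

    /** Joins all commands first, then all expectations, into one test method as text. */
    static String build(String name, List<Value> values) {
        String commands = values.stream()
                .map(v -> "    " + v.command + ";")
                .collect(Collectors.joining("\n"));
        String expectations = values.stream()
                .map(v -> "    " + v.expected + ";")
                .collect(Collectors.joining("\n"));
        return "@Test\npublic void " + name + "() {\n" + commands + "\n" + expectations + "\n}";
    }

    public static void main(String[] args) {
        List<Value> chosen = List.of(
                new Value("notepad.selectFormat(\"WINDOWS\")", "notepad.assertFormat(\"WINDOWS\")"),
                new Value("notepad.selectEncoding(\"UTF-8\")", "notepad.assertEncoding(\"UTF-8\")"));
        System.out.println(build("tc1", chosen));
    }
}
```

Each raw test case produced by TCases would supply one such list of values, so one generated method corresponds to one combination of inputs.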

To achieve this, we need to create an output XML file with raw testcases which are not executable yet (they are manually executable though). I use 2-tuple coverage:

After TCases is run we get (output.xml):