
Test recording or test coding
Which one is more efficient?


The purpose of this document is to compare two approaches to Java UI testing: recording (a.k.a. record & replay) vs. coding (manual test creation). The article is built in a free form – my apologies. On one hand, I've seen (and been in) so many silly arguments as to which approach is better that I can't be too serious about it myself any longer. On the other hand, I've tried to make it strict and structured too many times – I finally gave up.

Enjoy, if you consider it funny. If not, still, check it out – there could be something mentioned which you never gave a thought to.

“It is much faster to record a test! Isn't that obvious?”

At first glance, recording seems to be much more time-efficient. Indeed, you do not have to study any new library or, God save ya, any new scripting language; you only have to run some tool and perform UI tests like (or almost like) you would do regularly.

Let me tell you this: the impression is right! It is much easier to create a test using some recording tool! And the test will work right away – you can verify that – the tool will probably provide you with the functionality to do that.

OK, so you've got that working test and you add it to the list of other tests you want to perform on the product you're testing. One by one, you get all your testing automated. Good.

Now, time passes, and you want to rerun the tests ...

Why on earth would you want to do that? They worked before – why run them again?

Product UI is changed every now and then

Ah! That's right! The product has changed – that's why!

Something in the product has been changed by the product development team for the purpose of bug fixing or improvement or new requirements or whatever reason, so the new product UI is a little bit different here and there. So, what you want to verify with the tests you created earlier is that the product is still working. So, you run the tests.

And, guess what, now some tests fail!

Don't panic – that's natural!

Some product UI has been changed, so, those tests which tested that UI (or used that UI for testing some other UI) fail.

Tests have a tendency to fail every now and then

They do! If tests do not fail, it could only mean you're not testing that UI which is changing! Well, either that or the product is not being changed at all. But, if it is not being changed, why do you bother rerunning the tests? Why did you create them in the first place?

So, ok, some tests fail. Now what? Now, well ...

You have to fix the tests

to be able to run them again the next time you need to.

How do you fix them? Well, that's the trouble: there is no general answer to this question. Let's consider several scenarios.

Let's talk about the tool. Or, rather, about the way that recording tool stores your tests.

Tests could be stored in different formats

Some of them are human readable (and editable), some not.

If the format is not human readable, there is no way for you to fix the tests other than to rerecord them from scratch.

But, wait! Why did you record the test, if it only worked for one release? Why didn't you just do the tests manually that time? Wouldn't that save you the trouble of managing the recorded test, etc.?

You may say that some tests will likely live for several releases and also you may say that you did not have to rerecord that many tests this time, and also that, using the tests, you, at least, know where the product changed ...

True – all true. Not arguing here.

But, think about this example:

Suppose you test a text editor (the classic example I always use). And now, all of a sudden, the “File/Open” menu is renamed to “File/Load”. Silly? Perhaps. But still ... you will have to rerecord ... all your tests! Well, almost all, perhaps.

Bottom line: if the tests are not stored in a human readable format, you can't really tell how much time you need to fix your tests. Thus, you can't tell how long the test cycle will take. Hence you can't really tell when the product will be ready to be released! Does not sound too good, does it?

Now, seriously.


The approach could just work for you. Not so big a UI, not so many UI changes, not so many releases planned, etc. Great if it does – you're in a good situation.

Ours is the opposite – huge UI, massive changes (up to 90% of the UI changed in one release), a huge number of tested releases and configurations ... the approach just does not work.



The tests are better to be stored in human readable format

It had better be in a form that lets you understand what the code means. Something like (this is a test script):

  • push “File/Open” menu
  • find “Choose the damn file” dialog for me
  • select some file, for God's sake
  • hit the “OK” button, please

or something like that – perhaps without my bad jokes.

This is good! Much better indeed!

What's good about that?!

Here: in case a menu gets renamed, you just change every occurrence of one string to another in all the tests – and you're done.
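To make that concrete, here is a minimal sketch in Java. It assumes the script is kept as plain text lines with menu paths in straight double quotes; that format, and the names `ScriptRename` and `renameMenu`, are made up for illustration and do not belong to any particular tool.

```java
import java.util.List;
import java.util.stream.Collectors;

class ScriptRename {
    // Replace a quoted menu path in every script line that mentions it.
    // Matching the surrounding quotes keeps us from touching unrelated text.
    static List<String> renameMenu(List<String> script, String oldPath, String newPath) {
        String oldQuoted = "\"" + oldPath + "\"";
        String newQuoted = "\"" + newPath + "\"";
        return script.stream()
                .map(line -> line.replace(oldQuoted, newQuoted))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical script in the assumed line-based format.
        List<String> script = List.of(
                "push \"File/Open\" menu",
                "find \"Choose the file\" dialog",
                "hit the \"OK\" button");
        renameMenu(script, "File/Open", "File/Load").forEach(System.out::println);
    }
}
```

Run something like this over every stored script and the rename is handled, though you still touch every test file to do it.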


Such tools do exist. Some generate XML files, some use special scripting languages. XML is perhaps even better: if you need contextual replacements, you can use XSLT.

This approach could be used for some applications, but, <sigh>, not for us – same reasons as before.


But, you know what?

The tests are better to be stored in some programming language

Preferably, in the form of some high-level language. A scripting language – no less. 'Cause, if you have the slightest idea about any programming language, what you want to do right away is to move common code to subroutines / procedures / functions / methods. Into a thing called a “library”.


Such tools exist too – see above.


So, you move the code responsible for file opening into some library and you do not have to change hundreds of your tests any longer – you only change that one library function if the menu gets renamed.
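A minimal sketch of that idea in Java. Everything here is hypothetical: `UiDriver` stands in for whatever API the generated test code would call, `LoggingDriver` is a fake that only logs calls so the sketch runs without a real UI, and `EditorLibrary`, `openFile` and the dialog title are invented for illustration. The only point is that the menu path lives in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over the UI under test; not a real tool's API.
interface UiDriver {
    void pushMenu(String path);
    void findDialog(String title);
    void selectFile(String fileName);
    void pushButton(String label);
}

// Test double that just logs the calls instead of driving a UI.
class LoggingDriver implements UiDriver {
    final List<String> log = new ArrayList<>();
    public void pushMenu(String path) { log.add("menu:" + path); }
    public void findDialog(String title) { log.add("dialog:" + title); }
    public void selectFile(String fileName) { log.add("select:" + fileName); }
    public void pushButton(String label) { log.add("button:" + label); }
}

// The "library": the only place that knows how a file gets opened.
class EditorLibrary {
    // If "File/Open" is renamed to "File/Load", this is the one line to change.
    static final String OPEN_MENU = "File/Open";
    static final String CHOOSER_TITLE = "Choose the file"; // assumed title

    static void openFile(UiDriver ui, String fileName) {
        ui.pushMenu(OPEN_MENU);
        ui.findDialog(CHOOSER_TITLE);
        ui.selectFile(fileName);
        ui.pushButton("OK");
    }
}

class OpenFileDemo {
    public static void main(String[] args) {
        LoggingDriver ui = new LoggingDriver();
        // A test calls the library instead of repeating the four UI steps.
        EditorLibrary.openFile(ui, "notes.txt");
        ui.log.forEach(System.out::println);
    }
}
```

Hundreds of tests can call `EditorLibrary.openFile`, and none of them mentions the menu name.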

But, wait! Wait-wait-wait!

There is some contradiction, isn't there?

The code generated by the recording tool knows nothing about your procedures! So how would you go about using the library methods?

Here is the only way:

The tests are better generated in terms of your own library

Well, now, we've climbed pretty high, haven't we?

At this point, what I would suggest to an unprepared reader is to take a deep breath, relax, and read one other document: “Concept oriented testing”. That document will give you an idea of what kind of libraries we're talking about. It will also help you understand why we are talking about a library in the first place.


Let me rephrase.

In order for a recording tool to be successful for big, long-lived, constantly changing, cross-platform products, the tool should be able to generate test code in terms of the product you're testing.

If you do not know what the hell I am talking about, go back a little and follow the link to the other document – it is all explained well there – no need to repeat the same stuff.


There is only one such tool I have ever heard about. It is our internal tool, which, with a little bit of luck, will be available as open source some time soon.


That's an

Ideal recording tool

In order to be able to do so, the tool should be able to use some external code (we call these “plugins”). Those plugins then specify exactly how to generate the code.

I will give just a few technical details for those of you who are curious. (This is neither the right document for more details, nor am I the right person. With a little bit of luck, separate documents will be available some time soon.)


In our version of the recorder we define two phases: an action interpretation phase and a code generation phase.

In the first phase, plugins are asked to interpret actions performed on the UI. The result of such interpretation is a data structure which describes what the actions mean in terms of the plugin.


Pressing the mouse over the “OK” button within the “File Chooser” dialog could mean different things:

  • pressing the mouse over the “OK” button
  • pushing the “OK” button
  • pushing the default button in a modal dialog
  • accepting the dialog selection
  • opening a file for editing
  • a number of other things

In the second phase, plugins are asked to generate the code to the best of their knowledge.

There can be a number of plugins registered with the recorder, one on top of another. There is always a default plugin, which works in terms of Swing components.
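The two phases and the plugin stack could be sketched in Java roughly as follows. This is not our recorder's actual API: every type and method name here (`UiAction`, `Interpretation`, `RecorderPlugin`, `Recorder`, and so on) is invented for illustration. The rule that the most recently registered plugin is asked first, and that a plugin may decline an action, is also my assumption, and the default plugin is reduced to echoing the raw action rather than emitting real Swing-level calls.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// A raw UI event as the recorder might see it (hypothetical shape).
final class UiAction {
    final String kind, component, window;
    UiAction(String kind, String component, String window) {
        this.kind = kind; this.component = component; this.window = window;
    }
}

// Result of phase one: what the action means in the plugin's own terms.
final class Interpretation {
    final String meaning, argument;
    Interpretation(String meaning, String argument) {
        this.meaning = meaning; this.argument = argument;
    }
}

// Hypothetical plugin contract with one method per phase.
interface RecorderPlugin {
    Optional<Interpretation> interpret(UiAction action); // phase 1
    String generate(Interpretation interpretation);      // phase 2
}

// Default plugin: always answers, in terms of the plain component.
class DefaultPlugin implements RecorderPlugin {
    public Optional<Interpretation> interpret(UiAction a) {
        return Optional.of(new Interpretation(a.kind, a.component));
    }
    public String generate(Interpretation i) {
        return i.meaning + "(\"" + i.argument + "\");";
    }
}

// Product-specific plugin: knows that "OK" in the file chooser opens a file.
class EditorPlugin implements RecorderPlugin {
    public Optional<Interpretation> interpret(UiAction a) {
        if (a.kind.equals("click") && a.component.equals("OK")
                && a.window.equals("File Chooser")) {
            return Optional.of(new Interpretation("openFile", "selectedFile"));
        }
        return Optional.empty(); // not ours; let a lower plugin handle it
    }
    public String generate(Interpretation i) {
        return "editor." + i.meaning + "(" + i.argument + ");";
    }
}

class Recorder {
    private final Deque<RecorderPlugin> plugins = new ArrayDeque<>();

    Recorder() { plugins.push(new DefaultPlugin()); } // always at the bottom

    void register(RecorderPlugin p) { plugins.push(p); } // goes on top

    String record(UiAction action) {
        for (RecorderPlugin p : plugins) { // most recently registered first
            Optional<Interpretation> result = p.interpret(action); // phase 1
            if (result.isPresent()) {
                return p.generate(result.get());                   // phase 2
            }
        }
        throw new IllegalStateException("default plugin should always interpret");
    }
}

class RecorderDemo {
    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        recorder.register(new EditorPlugin());
        // prints: editor.openFile(selectedFile);
        System.out.println(recorder.record(new UiAction("click", "OK", "File Chooser")));
        // prints: click("Cancel");
        System.out.println(recorder.record(new UiAction("click", "Cancel", "File Chooser")));
    }
}
```

The same click ends up as a product-level call when a plugin recognizes it, and as a plain component-level call when only the default plugin does.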


So, which one to choose?

Not that you have a choice, really ... there are not that many of them out there.

If you have a choice, however, consider another document, which will help you compare different approaches for your particular needs: Automation Effectiveness Formula.

And, please, keep in mind that anything more complicated than just using a plain recorder likely requires some additional work. That's fine – just don't forget about it while using the formula.

What else ... I wish you luck in finding the right approach for your needs. We found ours – we do not use recording. :) It is just slightly less effective than manual test creation for us (there is some overhead in supporting plugins).

Finally ... do not hesitate to get into an argument if you do not agree with me – I am more than open to it! There are good forums for such discussions.

Thank you for your attention.

Shura (a.k.a Alexandre Iline)

