© Copyright 2005 Bolour Computing.
Azad Bolour
Bolour Computing
Recently I started experimenting with an embedded browser to develop tests for web applications. By using an embedded browser with an in-process web server, I can now write web application tests that run in their entirety in a single process. Such end-to-end in-process tests provide a seamless client-server execution context for a web application under test, which simplifies the debugging of failing tests. As a bonus, end-to-end in-process tests with embedded browsers also exercise any Javascript associated with loading pages and with clicking their buttons and links. In this note, I'll discuss the rationale for end-to-end in-process tests, and demonstrate simple examples of such tests.
Failures in a web application may occur at three different layers: below the server-side controller, in the HTTP plumbing and its invocation of the controller code, and in the browser. To know that a sample use case works for a web application, a test should exercise all three of these layers for that use case, including browser scripting associated with user interactions to initiate web requests, and browser scripting associated with loading pages returned by web requests. A test that mimics user interactions with a browser, and exercises an entire vertical slice of an application, is called an end-to-end web application test.
Until very recently, support for end-to-end web application testing was largely limited to quite expensive GUI testing frameworks such as those from Mercury (maker of WinRunner) and Segue (maker of Silk Test). These products generally target QA organizations. In this article I am more interested in developer testing of web applications, and for this audience more affordable options are now becoming available. I'll focus on one of these options here.
Of course, end-to-end tests are not the only way to test the functionality of web applications. It is also useful to test web applications at lower levels than the level of user interactions with the browser. An HTTP request test makes HTTP requests to a web application and verifies the returned pages, without rendering the pages in a browser, and without executing browser scripts. A controller test makes direct calls to the server-side request controller of a web application and verifies the correct functioning of the application's business logic. These lower-level tests make it possible to test an application in a modular fashion, and to restrict the blame for a failure to the lower layers of the application software.
But while these lower-level tests are important in a sound testing strategy for web applications, they do not exercise the browser-side of an application. The only way to verify that the entire processing of a request initiated by user interactions with a browser runs correctly is to pass the request through that browser, and to make sure that the returned web page is rendered as expected in that browser. In order to automate this process, tests must have the ability to trigger browser events as if these events were caused by users, and then to make assertions about the pages returned as rendered by the browser, that is, to make assertions on the document object model of rendered pages.
Unfortunately, an HTML page mixes core semantic content with text copy and formatting directives, which complicates the construction of good page-level assertions about a page. Making assertions about core content is simpler within the HTTP controller after the core content has been computed but before it has been merged into the page. But the merging of content into a page, and the parsing and rendering of that page in the browser, are also a part of what is going on in the application, and need to be tested. To test these parts of the application, we have to do the best we can to make reasonable assertions about the document object model of rendered pages.
Tests can assert the existence of certain expected elements, such as forms, fields, and links, in a page by using the generic HTML element attribute id to uniquely identify them. And once such elements are found, further assertions can be made on their attributes. Because the id attribute is applicable to any HTML element, a good practice in making web pages testable is to mark important elements of a page by unique ids. That way, as these elements are moved around and reformatted in a page, the test code that finds them by using their ids remains the same.
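As a minimal sketch of id-based lookup, the following uses only the JDK's XML parser and XPath support rather than a browser, and so assumes the page is well-formed XHTML. The class IdLookup and its methods are hypothetical names introduced for illustration; an embedded browser would supply its own way of querying the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; it is not part of the accompanying project.
class IdLookup {

    // Parses a well-formed (XHTML-style) page into a DOM tree.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("page is not well-formed", e);
        }
    }

    // Returns the first element whose id attribute equals the given id, or null if none.
    // Assumes the id contains no quote characters, since it is spliced into the query.
    static Element findById(Document page, String id) {
        try {
            String query = "//*[@id='" + id + "']";
            return (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(query, page, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("lookup failed for id " + id, e);
        }
    }
}
```

A test would first assert that findById returns a non-null element, and then make further assertions on that element's attributes, for example on the href of a link. Because the lookup depends only on the id, it keeps working when the element is moved or restyled.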
To be sure, there are limitations to this strategy, as in asserting the existence of some row or some cell of a particular column with certain properties in an HTML table. In such cases, it is a matter of judgement as to how specifically discriminating the assertion should be: the more assumptions made on the structure of the table, the more discriminating the assertion, but the harder it is to write the test, and the more fragile the test becomes to cosmetic changes in reordering, nesting, and so on. In these cases, I often end up opting for simplicity, perhaps even going as far as asserting only that an expected string exists in the page as a whole, and leaving it at that.
Testing a web application above the level of HTTP calls is an accepted practice in many development projects today. In the Java world, which is the context for this article, HTTPUnit (and its derivative JWebUnit) are quite popular and are generally used for HTTP request testing. HTTPUnit also has limited support for Javascript, with its own Javascript scripting engine. So while it is possible to exercise some application level Javascript by using HTTPUnit, end-to-end tests in HTTPUnit do not use a real browser.
Selenium is a testing framework based on Javascript that works in real browsers and allows arbitrary Javascript to be exercised in tests. (Other Javascript-based XUnit-type testing frameworks such as Berlios JsUnit, and Hieatt JsUnit, were developed primarily for testing Javascript code as opposed to testing entire vertical slices of applications.)
But in this article I'd like to concentrate on a different approach to web application testing: the use of an embedded browser. An embedded browser for a Java web application is a real browser that is controlled by a test application written in Java, and can be directed to simulate user interactions. The advantage of an embedded browser over HTTPUnit is that it has a real user interface, and exercises a real browser. The advantage of an embedded browser over Javascript-based testing is that both the test steps and the application logic being tested are written in the same programming language and, as we shall see shortly, both can be made accessible in the same process, and examined in the same debugging session when troubleshooting a failed test. By contrast, in Javascript-based testing, the test itself is only debuggable in a Javascript debugger.
To simplify the deployment and debugging of web applications exercised by tests, a common practice is to include an in-process web server/servlet engine within the same test program. The server is started in the test's setup, and shut down in the test's teardown. Jetty is a well-known open-source HTTP server/servlet engine, which allows itself to be embedded easily in the Java VM of a client application. By using an in-process Jetty engine in tests, both the test application and the server-side application logic being tested can be made to reside in the same Java VM. I call a test that includes both a browser and a web server/servlet engine in the same process an end-to-end in-process (web application) test, or simply an in-process test for short. Although in-process testing is a general concept and of interest in any environment for web application development, I am restricting attention here to web applications developed in Java.
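The setup and teardown lifecycle of an in-process server can be sketched without any third-party library. The example below deliberately swaps in the JDK's built-in com.sun.net.httpserver.HttpServer in place of Jetty, so it serves a fixed page rather than a servlet. The class name, the /test/hello path, and the page content are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for an embedded servlet engine, built on the JDK's own HTTP server.
class InProcessServerSketch {
    private HttpServer server;

    // Plays the role of a test's setUp: start the server inside the test's own JVM.
    void setUp() {
        try {
            // Port 0 asks the operating system for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new IllegalStateException("could not start in-process server", e);
        }
        server.createContext("/test/hello", exchange -> {
            byte[] body = "<html><body><p id='greeting'>hello</p></body></html>"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    // Plays the role of a test's tearDown: stop the server immediately.
    void tearDown() {
        server.stop(0);
    }

    int port() {
        return server.getAddress().getPort();
    }

    // Makes an HTTP request to the in-process server and returns the response body.
    String fetch(String path) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://127.0.0.1:" + port() + path).toURL().openConnection();
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new IllegalStateException("request failed: " + path, e);
        }
    }
}
```

Because the handler runs in the same JVM as the code that calls fetch, a breakpoint set inside the handler is hit in the same debugging session as the test, which is the property the in-process approach relies on.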
In-process tests provide major advantages in debugging an application when a test fails. First, having the test itself, and both the web browser and the web server, in a single process, means that a single execution and debugging context can be used to troubleshoot the test. Second, having a graphical user interface that is automatically associated with a running test means that developers can easily glean important information from the GUI about what the application is doing en route to a failure.
Consider, for example, a use case test for a tax preparation wizard in which multiple screens are used to obtain information about an individual's income and deductions. A failure in the final computation of the tax may in fact be caused by failures in earlier screens, which, for whatever reason, did not surface until the tax computation screen. When debugging a failure of the final tax computation, it is quite useful to visually check the contents of earlier screens for inconsistencies that may have been neglected in assertions about those screens, and to understand the context in which the final failure occurred. It is also useful to single-step through the test steps in a debugger, and at the same time be able to debug the business logic for the application in the same debugging session.
Of course, one can always manually enter the sequence of interactions leading to a test failure into a regular browser and track them in a server debugging session. But the manual replay of a test is inconvenient and error-prone, given that the exact play-by-play for the test has already been automated in the test code. Similarly, when using an HTTPUnit-type test, it is possible but inconvenient to view the returned pages for the test by printing them out and manually loading them into the browser.
In this section, I am going to consider Javascript testing a bit further, and expand on the differences between Javascript testing and embedded browser testing.
An end-to-end Javascript test is a Javascript program that simulates a series of user interactions with the browser (including form submits and link clicks), and includes assertions about the contents of the document being displayed.
Selenium is a fairly extensive Javascript testing framework that supports end-to-end testing. It is an open source system from ThoughtWorks. Selenium has a test runner, provides access above the level of Javascript to many browser functions, and allows the declarative specification of a sequence of test commands against a browser frame.
Selenium supports a number of browsers, and tests written in Selenium are browser-independent. Under the covers, of course, the Javascript for the test infrastructure itself has to be browser-aware, since controls for simulating many browser functions through Javascript are not standardized. Browser-independence is a compelling advantage of Javascript testing.
Theoretically, of course, embedded browser testing can also be made to work with different browsers, by using specific embedding packages for each browser. I have not looked into embedding packages for browsers other than Internet Explorer, nor into providing a browser-independent abstraction layer for embedded browsers.
From the point of view of debugging test failures, the key structural difference between in-process tests and Javascript tests is in where the control flow of tests resides. In an in-process test, the test resides in a Java program that also houses the server. In a Javascript test, the test resides in a Javascript program that is interpreted by the browser. When a test fails, stepping through the test logic of an in-process test is possible within a Java debugger that also includes the execution context for the server (where breakpoints can be active in the application's business logic). In Javascript testing, on the other hand, this type of debugging generally requires two debugging sessions: one for the client Javascript that is executing the test steps, and one for the server-side Java.
Whenever the test and its runtime framework are written in a different language or require a different process than the server-side logic of a web application, a debugging seam develops between the test program and the server-side program, a seam that I would rather avoid, all else being equal. So similar remarks apply to frameworks such as Web Testing with Ruby (WATIR), and the Perl HTTP::Recorder, when used for testing a Java web application.
Of course, troubleshooting a failure in an application's Javascript may require a Javascript debugging session anyway. But in my experience, such failures are rare relative to business logic failures. So most of the time, I get away without having to enter a Javascript debugger in tracking failures.
Easy debuggability, and programmatically flexible test construction in the same familiar language as the server-side business logic, are compelling advantages of testing with embedded browsers. I don't mean to suggest, though, that these advantages necessarily trump the advantages of other testing frameworks, only that embedded browser testing deserves serious consideration as an option for end-to-end testing of web applications. For my own work, I have decided to use an embedded browser, and to see how it stands up to its promises over time.
An example of an embedded browser for Java is Jexplorer, a commercial product that wraps the exposed functions of Internet Explorer in a set of Java classes. I will use this package for my examples in this document (without passing judgement on it, since my exposure to it is limited so far, and I have not compared it with other products). Jexplorer provides both a GUI browser, and a headless browser, which is a browser object without an attached UI component. Headless browsers make it possible to run web application tests in the background. When a test fails and requires debugging, you can switch to a version of the test with a GUI browser for a better understanding of what is going on.
In this section I'll outline how to start writing in-process tests by using Jexplorer and Jetty. The tests discussed are available in an accompanying Eclipse project.
To get started, you will need an evaluation version of Jexplorer, and Jetty. My tests currently use Jexplorer Professional version 1.0, and Jetty version 5.1.2pre0. The accompanying project also uses the Velocity template engine to produce dynamic web pages in servlets. The tests use Velocity version 1.4.
EmptyBrowserTest is basically a hello world for
Jexplorer. It shows how to load an HTML page into a (headless) browser
object and check its content. You can use it first to test the installation
and execution environment of Jexplorer.
EmptyBrowserTest.java
package com.bolour.sample.embeddedbrowser.browser;
This test runs in the background without a GUI. We'll subclass it next to add in a GUI browser window. For now, let's try and get this basic test to run. Jexplorer uses native libraries to make calls on Internet Explorer, and the test's Java VM needs to know the locations of these libraries. On my machine, I use the following JVM argument to make the required libraries available to a test:
-Djava.library.path=D:\jexplorer\bin;C:\WINNT\system32
By using the equivalent library path on your machine, you should be able to get the test to run and to pass.
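If the native libraries are not found, it can help to check what the JVM actually sees. The following diagnostic is a sketch that uses only standard system properties; the class name LibraryPathCheck is hypothetical, and the check says nothing about whether the directories contain the right libraries.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic; it only checks that each library path entry is a directory.
class LibraryPathCheck {

    // Returns the entries of a path list that are not existing directories.
    static List<String> missingEntries(String libraryPath) {
        List<String> missing = new ArrayList<>();
        for (String entry : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty() && !new File(entry).isDirectory()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + path);
        for (String entry : missingEntries(path)) {
            System.out.println("not a directory: " + entry);
        }
    }
}
```

Running this class with the same -Djava.library.path argument as the test lists any entries that do not exist on the machine.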
The equivalent GUI test, GuiEmptyBrowserTest, is derived from
the headless test by using a Jexplorer browser object that is a Java UI
Component, and by placing this component in a window frame, as shown
below. In this sample, the code for the browser GUI frame is abstracted to
its own class, BrowserFrame, so that it can be reused in other
tests. Note also that the delay after loading the page is increased to
1000 milliseconds so the page can be observed by the user.
GuiEmptyBrowserTest.java
package com.bolour.sample.embeddedbrowser.browser;
BrowserFrame.java
package com.bolour.sample.embeddedbrowser.util;
We are going to be testing a single web application on the local machine.
For Jetty to be able to serve this web application, it needs to know three
things: the server's listener port number, the web application's
context path specification, and the web application's
location. JettyServerStartTest below shows how to
provide this information to the Jetty server and to start and stop that
server. In our case, the port number is 8002, the context path
is /test/, and the location of the web application is
samplewebapp. These settings yield the URL pattern
http://localhost:8002/test/* for our web application
resources, and tell Jetty to look for these resources under the (relative)
directory samplewebapp.
JettyServerStartTest.java
package com.bolour.sample.embeddedbrowser.server;
You can use this test to make sure your Jetty environment is set up correctly.
For convenience of reuse, I have isolated the required code to configure,
start, and stop the server into a separate class,
ServerWrapper, whose constructor takes the three pieces of
information needed by the server as parameters, and has the following
signature:
public ServerWrapper(int port, String contextPath, String webapp)
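To illustrate how a wrapper with these three parameters might hang together, here is a sketch with the same parameter list. It is not the ServerWrapper from the accompanying project: it is a hypothetical StaticServerWrapper that swaps in the JDK's built-in HTTP server for Jetty, and it serves only static files from the webapp directory, not servlets.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical analogue of a server wrapper; it serves static files only.
class StaticServerWrapper {
    private final int port;
    private final String contextPath;
    private final Path webapp;
    private HttpServer server;

    StaticServerWrapper(int port, String contextPath, String webapp) {
        this.port = port;
        this.contextPath = contextPath;
        this.webapp = Paths.get(webapp).toAbsolutePath().normalize();
    }

    void start() {
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        } catch (IOException e) {
            throw new IllegalStateException("cannot bind port " + port, e);
        }
        server.createContext(contextPath, exchange -> {
            // Map the part of the request path after the context path onto the webapp directory.
            String relative = exchange.getRequestURI().getPath().substring(contextPath.length());
            Path file = webapp.resolve(relative).normalize();
            // Refuse anything that escapes the webapp directory or is not a plain file.
            if (!file.startsWith(webapp) || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
                exchange.close();
                return;
            }
            byte[] body = Files.readAllBytes(file);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    void stop() {
        server.stop(0);
    }

    // The port actually bound, which differs from the requested port when 0 was requested.
    int boundPort() {
        return server.getAddress().getPort();
    }
}
```

With a port of 8002, a context path of /test/, and a webapp directory of samplewebapp, a request for http://localhost:8002/test/index.html would be answered from samplewebapp/index.html.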
As we go further into test development with an embedded browser, we find recurring patterns of calls to browser functions that are ripe for abstraction. For example, tests need to click links programmatically in browser documents. Clicking a link requires the following four steps: trying to find the link via its id in the current document, asserting the existence of the link, programmatically clicking that link, and waiting for the referenced page to load. This sequence of steps, and others like it, can be abstracted out of individual tests and made reusable by all tests.
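The four steps for clicking a link can be sketched against a deliberately small, hypothetical browser interface. PageDriver, ClickSteps, and FakeDriver below are assumptions made for illustration and do not reflect the Jexplorer API; the in-memory fake exists only so the sketch runs without a browser.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal view of a browser; a real embedded browser API will differ.
interface PageDriver {
    boolean hasElement(String id);  // is there an element with this id in the current document?
    void clickElement(String id);   // dispatch a click to that element
    void waitForLoad();             // block until the resulting page has loaded
    String currentPage();           // name of the page currently shown
}

class ClickSteps {
    // The four steps from the text: find the element, assert it exists, click it, wait.
    static void click(PageDriver browser, String id) {
        boolean found = browser.hasElement(id);
        if (!found) {
            throw new AssertionError(
                    "no element with id '" + id + "' in " + browser.currentPage());
        }
        browser.clickElement(id);
        browser.waitForLoad();
    }
}

// In-memory fake: each page maps link ids to the page they lead to.
class FakeDriver implements PageDriver {
    private final Map<String, Map<String, String>> links = new HashMap<>();
    private String current;
    private String pending;

    FakeDriver(String startPage) {
        current = startPage;
    }

    void addLink(String page, String id, String target) {
        links.computeIfAbsent(page, p -> new HashMap<>()).put(id, target);
    }

    public boolean hasElement(String id) {
        Map<String, String> pageLinks = links.getOrDefault(current, Collections.emptyMap());
        return pageLinks.containsKey(id);
    }

    public void clickElement(String id) {
        pending = links.get(current).get(id);
    }

    public void waitForLoad() {
        if (pending != null) {
            current = pending;
            pending = null;
        }
    }

    public String currentPage() {
        return current;
    }
}
```

Keeping the assertion inside the shared click step means that a missing link produces a failure naming the id and the page, rather than a less informative error later in the test.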
In the tests that follow, such abstractions are provided at two different
levels: in a browser utility class, BrowserUtil, and in a test
base class, BaseBrowserTest. The utility class packages a
number of generally useful abstractions for interacting with Jexplorer.
The base test class allows the convenient invocation of browser tasks from
tests with minimal verbiage.
For brevity, I'll skip the utility class in this article. But below is a snippet of the base test class.
BaseBrowserTest-Snipped.java
package com.bolour.sample.embeddedbrowser.util;
Note that in order that the abstracted methods in the base class have
access to the current document in the browser, the reference to the
browser object in tests has been lifted to the base class as a
protected field. Note also that the contract for tests derived from this
base class requires a call to some version of setUpBrowser to
construct the required browser object and to initialize other test fixtures
used to control test behavior.
With our basic testing infrastructure for both the embedded browser and the
embedded web server/servlet engine in place, we are now ready to look at
the combined embedding of the browser and the server in a web application
test. The application is very simple, and has two pages: a welcome page,
and a user list page that presents an HTML table listing a set of
users known to the application. The user list is supplied by a servlet
called SampleServlet that uses a Velocity template to
construct the user list page. The welcome page of the application is
accessed via http://localhost:8002/test/index.html, and contains a
link to the servlet, as follows:
<a id='list' href='SampleServlet'>get user list</a>
The test JettyBrowserTest (shown below) loads the welcome page
into a browser object, programmatically clicks the above link, and asserts
that the returned page includes an expected user.
Here is the resulting combined embedded browser and server test, as a subclass
of BaseBrowserTest:
JettyBrowserTest.java
package com.bolour.sample.embeddedbrowser.servlet;
Note that the click method, which takes the id of the element to click, is
inherited from the base class BaseBrowserTest, and encodes the
required sequence of steps to find and click elements such as the
link to SampleServlet in the welcome page.
As before, the GUI version of this test is a simple subclass of the headless test shown above.
Note that for simplicity, or for laziness, depending on how you look at it, this test asserts the existence of an expected user name anywhere in the page, even though that user name must appear as a member of a particular list representation within the HTML page. If the string representing the user name is not present in the user list (represented in this test by a table) but just happens to appear somewhere else in the page, the test would pass. We can tighten up this assertion, of course, by introducing a dependency on the existence of a particular table on the page. But then if we decide to change the user list representation to something other than a table, we would have to change the test.
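The trade-off between the loose and the tightened assertion can be made concrete with a small sketch. It works on page source with the JDK's XML and XPath support, so it assumes well-formed XHTML; the class UserListAssertions, the table id users, and the user names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper contrasting a loose and a strict assertion on the same page.
class UserListAssertions {

    // Loose check: the name appears anywhere in the page source.
    static boolean mentions(String pageSource, String userName) {
        return pageSource.contains(userName);
    }

    // Strict check: the name is the text of a cell in the table with the given id.
    // Assumes neither argument contains quote characters.
    static boolean listedInTable(String pageSource, String tableId, String userName) {
        try {
            Document page = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pageSource)));
            String query = "//table[@id='" + tableId + "']"
                    + "//td[normalize-space(.)='" + userName + "']";
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(query, page, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalArgumentException("page could not be checked", e);
        }
    }
}
```

For a page that greets a user by name but omits that user from the table, the loose check passes while the strict one fails. The strict check, in turn, breaks as soon as the list is no longer rendered as a table with that id.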
An automated use case test or an acceptance test for an
application is a test that mimics the user interactions required to
accomplish a particular task by using the application. This section
presents an example of such a test for a tax preparation wizard. The test,
TaxWizardTest, simulates a user going forwards in the wizard
doing data entry, and then going backwards to review the entered data. The
section concludes by restating the benefits of end-to-end in-process
testing for such larger tests.
For obvious reasons, the application is an exceedingly simplified
income tax wizard that nevertheless provides a realistic view of the use
of embedded browsers for testing web applications. The wizard has two data
entry pages: the incomes page and the deductions page, and a final output
page for the computed tax. The user goes forwards and backwards between
these pages by using next and back buttons. And data entered
in these pages is saved in the HTTP session, and reproduced when the pages
are revisited. See the class TaxServlet and its collaborators
in the accompanying project for more details about the application.
Among other things, TaxWizardTest looks for expected fields in
the web pages being traversed, and verifies that values entered by the user
for incomes and deductions on the forward pass are reproduced by the
application in the subsequent backward pass. For brevity, the code for
TaxWizardTest has been redacted somewhat in this writeup.
(See the accompanying project for the full source.)
TaxWizardTest-Snipped.java
package com.bolour.sample.embeddedbrowser.taxservlet;
As in earlier examples, the GUI version of this test is a simple subclass of the headless test.
Note that a number of methods such as submit(String
inputSubmitId) for clicking a submit button, and enter(String
inputName, double value) for entering a numeric value into a field
are inherited from the base class BaseBrowserTest.
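The shape of a forward-then-backward wizard test can be sketched independently of any browser. Everything in the following example is hypothetical: FakeWizard is an in-memory stand-in for a three-page wizard that keeps entered values in a session map, and the field names and amounts are invented. It shows only the pattern of entering data on the way forward and checking that it is reproduced on the way back.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a three-page wizard backed by a session map.
class FakeWizard {
    private static final String[] PAGES = {"incomes", "deductions", "tax"};
    private final Map<String, Double> session = new HashMap<>();
    private int page = 0;

    String currentPage() {
        return PAGES[page];
    }

    // Stores a value for a field of the current page in the session.
    void enter(String field, double value) {
        session.put(currentPage() + "." + field, value);
    }

    // The value the current page would redisplay for a field, or null if never entered.
    Double shown(String field) {
        return session.get(currentPage() + "." + field);
    }

    void next() {
        if (page < PAGES.length - 1) page++;
    }

    void back() {
        if (page > 0) page--;
    }
}

class WizardRoundTrip {
    // Forward pass: enter data on each page. Backward pass: check it is reproduced.
    static void run(FakeWizard wizard) {
        wizard.enter("wages", 50000.0);
        wizard.next();
        wizard.enter("mortgageInterest", 8000.0);
        wizard.next();
        require("tax".equals(wizard.currentPage()), "expected to reach the tax page");

        wizard.back();
        require(Double.valueOf(8000.0).equals(wizard.shown("mortgageInterest")),
                "deductions page did not reproduce the entered value");
        wizard.back();
        require(Double.valueOf(50000.0).equals(wizard.shown("wages")),
                "incomes page did not reproduce the entered value");
    }

    private static void require(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}
```

In a browser-driven version of this pattern, the calls on the fake would be replaced by inherited helpers for entering values and submitting forms, and the checks would read field values from the rendered document.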
Now that we have a real example of a larger use case test, let's retrace the benefits of end-to-end in-process testing in the context of this example.
Suppose that a particular set of input values makes the tax go negative
in TaxWizardTest and the test fails. Perhaps the application
does not check for negative income, or perhaps the deductions logic does a
straight subtraction of deductions from income and total deductions are
larger than total income, or perhaps there is a bug in the tax computation
code. The problem may be revealed when, for example, constants used in the
test are changed, or when some server-side computation changes.
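The straight-subtraction scenario can be illustrated with a few lines of arithmetic. The flat rate and the method names below are assumptions made for the example; they are not taken from TaxServlet or its collaborators.

```java
// Hypothetical tax arithmetic; the flat rate is invented for illustration.
class TaxSketch {
    static final double RATE = 0.2;

    // Straight subtraction: goes negative when deductions exceed income.
    static double naiveTax(double income, double deductions) {
        return RATE * (income - deductions);
    }

    // One possible guard: taxable income is never allowed to fall below zero.
    static double clampedTax(double income, double deductions) {
        return RATE * Math.max(0.0, income - deductions);
    }
}
```

With an income of 30000 and deductions of 40000, naiveTax returns a negative amount while clampedTax returns zero. A use case test that asserts a non-negative tax would catch the first version only for inputs where deductions exceed income.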
Maybe we are lucky and there is a simpler unit test (perhaps at a lower level) that reveals the bug. But maybe we are not so lucky. Perhaps we were not far-sighted enough to provide the required unit test. Or perhaps the failure condition is too complicated and dependent on actions taken at several steps. For whatever reason, we now have the more difficult task of debugging a long sequence of actions initiated from the test that affect the server-side state. And that is when an in-process test with an embedded browser becomes compelling. The burden of our detective work in these most time-consuming types of debugging scenarios is eased considerably by the ability to see what is going on in a real browser, and by the ability to debug both the test itself and the server-side business logic in the same debugging session.
End-to-end in-process testing with embedded browsers is a very promising approach to web application testing. And the available technology for it seems mature enough to be really useful. With a browser-independent abstraction layer (not yet available to my knowledge) such tests would, of course, be more compelling. But as a Java developer, I am quite taken by the debugging ease afforded by tests that embed both the browser and the server in a single process, and by having complete programmatic control over my tests in Java.
For the use of an embedded browser (without an embedded server) in the .NET world, see, for example, Writing Automated Browser Tests With NUnit and IE. What I don't know is whether IIS can be embedded easily in a test program (the way we have been using Jetty in our examples) to enable the debugging of the test code and the business logic of the application in the same debugging session.
My thanks to David Vydra of testdriven.com and to Bill Venners of Artima for many useful discussions on testing. Thanks also to David for references to other testing frameworks for web applications.