
Wednesday, June 4, 2008

Why I am making defects public

I am liking the concept of making "charity" defect reports. Rather than ranting and raving about the ineptitude of developer X because of problem Y, I will raise a defect, make a post and hope that it gets fixed. I call it a charity because I won't charge for this service; it's more about making software better. Every second post won't be a defect report either; more likely no more than one a week, depending on how many bugs are inhibiting my current work. I won't go looking for defects; I get paid to do that already and have much better things to do with my spare time.


My justification and reasoning for making them public and not just silently reporting the defect are as follows:


Firstly, one can never be sure that a defect will be fixed once it has been reported. The vendor may just ignore the defect, thinking it only affects a minority. Making it public allows other users to become aware of the defect and therefore add their weight to the "urgency factor".


Secondly, inexperienced users often suffer from a lack of self-confidence with computers. When something doesn't work, they blame themselves. If they read about a defect that I or someone else has raised, they realise it wasn't their fault. This may strengthen their resolve to continue using the application.


Next, some bugs block a workflow entirely. Making the issue apparent may not provide the workaround, but it may enable someone else to uncover one. That workaround can be appended to the initial defect, which then becomes the common source of knowledge on the problem. This maximises the dissemination of knowledge to users and provides the problem and solution in a single place. This leads me to my next point.


This is all indexed by Google. When I have a problem with an application I search for a solution, then a workaround and finally, if I can't find either, a different application. By making a defect report public I can help others in their defect resolution quests.


Defect resolution
With the Nokia defect, someone posted a comment saying "thanks". I don't expect that. What I do expect, as I'll raise a defect directly with the vendor/developer, is that when the defect is resolved, they will notify me. If such an event occurs, I will edit and top-post my original message with the updated information, saying something like: "Defect has been fixed in version X.Y. Upgrade to solve this problem".

This now provides the ideal scenario. Anyone who hasn't upgraded yet, and searches for the problem, will find the defect and solution posted together.


Note: I don't expect every defect I raise to be solved post-haste. Some may never be. But I've done what I can to help the situation, short of forking my own branch of the code and fixing it myself.

Monday, May 12, 2008

Testing Tip #1 - Boundary testing business objects

When testing business objects, for example an object that represents a Person or a Customer, you often have a minimum set of data requirements (e.g. a customer must have a last name) as well as a set of optional data attributes (first name, date of birth, gender, etc.).

To provide full test coverage you would mathematically need to test every combination of optional attributes being present or absent: 2^(N-M) combinations, where N is the number of attributes and M is the number of mandatory attributes. I don't think I have ever had enough time for that much, incredibly boring, testing, and nor would I want it.
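To see how quickly that grows, here is a quick sketch (the attribute names are invented for illustration) that enumerates every present/absent combination of a handful of optional attributes:

from itertools import combinations

# Hypothetical customer attributes, purely for illustration.
mandatory = ["last_name"]
optional = ["first_name", "date_of_birth", "gender", "phone", "email"]

# Every subset of the optional attributes is a distinct present/absent combination.
subsets = [subset for size in range(len(optional) + 1)
           for subset in combinations(optional, size)]

print(len(subsets))        # 32 combinations for just five optional attributes
print(2 ** len(optional))  # the same figure, computed directly as 2^(N-M)

And that is before you start varying the values within each attribute.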

In my experience the best pass through is to test the two bounds. The minimum set, where you specify the object with the bare minimum of attributes (the mandatory ones), each holding as little data as possible. In our customer example this would be a last name a single character in length.

The next test case includes every single attribute populated to its maximum extent. So if you have a twenty character first name, you specify all twenty characters. I document this as my all set.

These two test cases have been enough for every system I've tested. Using them you prove every attribute supplied to its maximum, and every optional attribute absent with the mandatory attributes at their bare minimum.
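For illustration, here is roughly what those two test cases might look like in code, assuming a hypothetical Customer class with a mandatory last name and invented length limits:

import unittest

LAST_NAME_MAX = 30   # invented limits for the example
FIRST_NAME_MAX = 20

class Customer:
    """Stand-in business object: last name mandatory, everything else optional."""
    def __init__(self, last_name, first_name=None, date_of_birth=None, gender=None):
        if not last_name:
            raise ValueError("last name is mandatory")
        self.last_name = last_name
        self.first_name = first_name
        self.date_of_birth = date_of_birth
        self.gender = gender

class CustomerBoundaryTests(unittest.TestCase):
    def test_minimum_set(self):
        # Mandatory attributes only, each holding as little data as possible.
        customer = Customer(last_name="X")
        self.assertEqual(customer.last_name, "X")
        self.assertIsNone(customer.first_name)

    def test_all_set(self):
        # Every attribute populated to its maximum extent.
        customer = Customer(last_name="Z" * LAST_NAME_MAX,
                            first_name="A" * FIRST_NAME_MAX,
                            date_of_birth="1970-01-01",
                            gender="female")
        self.assertEqual(len(customer.first_name), FIRST_NAME_MAX)

if __name__ == "__main__":
    unittest.main()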


I don't include out-of-bounds testing here. That is, I don't check for 21 character first names, nor do I test zero length last names. That specific boundary testing I document as separate tests against the user interface. This achieves two things. Firstly, my test cases are granular and relate to a specific business requirement, which means that defects raised are very specific and generally easier to track down. Secondly, it cuts the amount of testing I have to do down to what is most likely to cause defects.

Additional:
If you have more complex data rules, for instance a business rule that states that attribute-x is only provided when attribute-z is set, then that specific combination is already covered by your all set (attribute-z is set) and your minimum set (attribute-z is not set). I would additionally include test cases to ensure that any user interface validation of these attributes occurs.

Tuesday, April 29, 2008

Controlling Testing Environments

Why Should You Care?
Testing environments are fundamental to successful testing. The test environment is where testing occurs and without a controlled, regulated, stable testing environment you are undermining your entire testing foundation. Scary stuff!

What do I mean by controlling a testing environment? I mean ensuring:
  • that you know that each environment has the correct code,
  • that the various integrating applications have compatible versions,
  • that the correct hardware and software configuration exists,
  • that the data is legitimate and in the right quantities,
  • that access to the environment is restricted, and
  • that security policies mimic production.
All of the above items combine to make a stable, controlled test environment.

Without proper management of testing environments, whenever a defect is identified you have to:
  1. identify the software build,
  2. determine how long that build has been there,
  3. determine if there is a later build available,
  4. ensure that the data is valid,
  5. review the hardware to ensure it matches production, and
  6. review the additional software components to ensure they match production.

Beyond environmental stability there are particular test scenarios that you can now perform. You can engage in deployment testing. Every release, the software package is deployed into production. How often is this deployment process tested?

Other benefits: when you receive a "bad" build you can uninstall it and re-install the previous one until it gets fixed. Or you can get two competing builds from the development team and compare them for performance. I am doing this one next week.


So how do we go about doing this?
The first step is to identify how many test environments you have / need. In summary, I like to see at least the following:
  • Development - one per developer, usually the development box but ideally should be a VM or similar that matches production architecture/operating system/software configuration. Developers may call it a build box, but they do unit testing here, so it is a test environment.
  • Development integration - one per project/release. Here the development team works on integrating their individual components together.
  • Test - where the brunt of the tester's work is done. There should be a dedicated environment for each project.
The following environments can usually be shared between project teams depending on the number and types of projects being developed concurrently.
  • User acceptance testing - can be done in other environments if the resources are not available. Ideally should be a dedicated environment that looks like prod + all code between now and project release. This is an optional environment in my opinion as there are lots of good places to do UAT and it really depends on the maturity of your project and your organisation's available infrastructure.
  • Non-functional - performance, stress, load, robustness - should be identical infrastructure to production, the data requirements can exceed production quantities but must match it in authenticity.
More environments are possible. I didn't cover integration or release candidate environments (you may have duplicate environments or subsets for prod-1, prod and prod+1) and it really depends on the number of software products being developed concurrently. I won't be discussing the logistics of establishing test environments here nor how to acquire them cheaply.

To actually gain control, first talk to the development team about your requirements for a stable testing environment. Explain your reasons and get their support. The next step is not always necessary but can give you good peace of mind: remove developer access to the test environments. I am talking about everywhere: web servers, databases, terminal services, virtual machines. If it's a part of the testing environment, they should stay out.

It isn't because you don't trust them. After deployment you probably shouldn't be on those machines either. Sure, there are some testing scenarios where getting into the nitty gritty is required, but not always, and certainly not when testing from the user's perspective. The bottom line is that the fewer people who have access to these machines, the smaller the chance of accidental environment compromise.


So what aspects do we control?
Primarily we need to control the Entry and Exit criteria to each environment. The first step is the development environment. Entry is entirely up to the developer and exit should be achieved when unit tests pass. As the next step is the development integration environment, the development lead should control code entry.

Entry into the test environment: regardless of the development methodology the delivery to test should be scheduled. Development completes a build that delivers "N chunks" of functionality. Unit tests have passed and they are good to go.

Developers should then prepare a deployment package (like they will for the eventual production release) and place it in a shared location that the deployment testers can access. It is now up to the deployment testers to deploy the code at the request of the project testing team (these are quite often the same team). Once a build has been deployed, some build verification tests are executed (preferably automated) and the testers can continue their work.
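Build verification tests don't need to be elaborate. Here is a minimal sketch of automated post-deployment checks (the endpoints and expected responses are placeholders, not a real system):

import urllib.request

# Invented endpoints; substitute whatever "is this build basically alive?" means for you.
CHECKS = [
    ("http://test-env.example.com/health", 200),
    ("http://test-env.example.com/login", 200),
]

def verify_build():
    """Fail fast if the freshly deployed build is not basically alive."""
    for url, expected in CHECKS:
        with urllib.request.urlopen(url, timeout=10) as response:
            assert response.status == expected, (
                "%s returned %s, expected %s" % (url, response.status, expected))
    print("Build verification passed; testing can continue.")

if __name__ == "__main__":
    verify_build()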

Moving from test into any environment afterwards (release candidate integration, pre-production, etc) depends on the organisation, but usually requires the following: testing has been completed, defects resolved, user documentation produced and, most importantly, user sign-off acquired.

The final environments (pre-production, etc) are usually (should be) managed by a release manager who controls the entry and exit gates from each environment after test and on into production. I won't cover these here.


Evidence or it never happened!
Example A: About a month ago we had a problem where one of our test environments wasn't working as expected. It took the developer over a week to find the problem. Turns out another developer had promoted some code without letting anyone else know. The code didn't work and he left it there.

This could have been avoided if the developer didn't have access to deploy in our environment. Unfortunately he does, but it is something that we are working towards rectifying.

Example B: I once worked on a project that had five development teams: two database groups and three code cutters. Had they been able to deploy when they wanted, our test environment would have been useless. None of the teams were ever ready at the same time, so we would have had code without appropriate database support, and components that were meant to integrate would not have matched because the latest build of application X wasn't ready yet.

By waiting until all builds were ready and running through the deployment ourselves we ensured that our test environment was stable and had the same level of development progression all the way through.


Too much information, summarise before I stop caring!
  1. Controlling Test Environments = Good
  2. Focus on developing entry and exit criteria
  3. Build up to production-like environments - each successive environment should be closer and closer to production.
  4. Evolve towards the goal of environmental control rather than a big bang approach. Some transitions will take longer than others (i.e. getting the right hardware) so pick a level of control for each release, get everyone involved and implement it.
  5. Get team buy in (developers, testers) - education is the key
  6. Don't make the entry into the test environment documentation heavy.

It all looks too easy, how could this go wrong?
Get development buy-in. This is important: you don't want to alienate the development team. Not all developers or development teams are inconsiderate, nor do they have ulterior motives. Usually it's a simple lack of awareness, and discussing with them the direction you want to take with the testing environments will achieve two things. Firstly, they get greater visibility into the testing arena, and secondly they often realise that they can help improve quality by doing less. Who doesn't like doing that?


Don't make it complicated: The goal of this is to achieve a high quality test environment to facilitate high quality testing. Don't produce a set of forms and a series of hoops that you need to force various developers and teams to fill out whilst jumping through. They won't like it and they probably won't like you.

When I first tried locking down an environment, I asked the developers to fill out a handover to test document that listed the build, implemented task items, resolved defects and similar items. I had buy in and for the first few cycles it worked ok. It wasn't great though. All I was doing was accumulating bits of paper and wasting their time by making them fill it out.

All I do these days is discuss with the developers the reasons why the environment needs to be locked down and to let me know when a new build is ready. I'm usually involved in iteration planning meetings so I know what is coming anyway. All that waffle they had to fill out is automatically generated from defect management, task management and source control software.

My testing environments are generally stable, developers are happy to hand me deployment packages, and they consider deployment defects just as important as normal defects. After all, deployment is the first chance a piece of software has to fail in production. It is also the first place users will see your application.

It takes time to move towards a controlled environment and as you read in my examples, my employer is not there yet either, but we are getting closer.


One other note: You may not have the ability (whether technical or organisational) to perform deployment testing. See if you can organise to sit with the technical team that does deployments for you.

Wednesday, April 23, 2008

Usability: What is the ideal date control?

Mace is not the right answer, effective though it may be. As exciting as date controls are, they are pretty easy to get wrong. Date controls have the unfortunate problem of trying to represent a large period of time whilst providing a good level of granularity over that time.

Users want to be able to specify a date, and if the date control resembles something they are familiar with (i.e. a calendar) then even better. From what I've experienced, being able to supply a date as quickly as possible doesn't come into the equation with most non-technical users. That being said, users do not want to waste time trying to operate an unwieldy date control. They want to supply a value in as few steps as they can conceive.

Developers and experienced computer operators (data entry personnel, power users) tend to want to do things as quickly as possible. This is at odds with user experience. This is also me. I can type over 120 words per minute. Most users can't. Remember that.

Application Platform

Where the application is running has a big impact on how the user is going to interact with the date control. Console applications obviously are going to use a keyboard mechanism. GUI Applications are a mixture of keyboard input and mouse control depending on how many fields need to be supplied. Web applications are mouse driven, and mobile/pda applications are pen driven (essentially mouse), arrow-navigation driven or touch pad.

The type of mouse also impacts the usability of a date control. Touch pads and nub mice provide a different user experience from the standard hand-held mouse.

Date Selection
The types of dates the user selects are also important. How far in the past will the user be selecting a date? If it is a web-site where someone must be 18 years or older, or a student records application that requires a date of birth, the date control is going to require the user to select a year at least 18 years in the past, up to something old like 30 (kidding, I mean 40… no, 90).

Are dates in the past even allowed? Travel booking sites would rather you booked in the future. Does the user need to select a second date based on the first? Can the user supply a partial date? Is the user required to include a range of dates? Date range selection is non-trivial to implement whilst keeping it simple, intuitive and aesthetic.


The Options
There are many examples of date controls out there, so I’ll try to provide a real live example of each one.

Drop Down City:

To be honest I had a worse example available, but thankfully I can't seem to find it on Google (if I can find it again I will post it, as it was atrocious. Note to self: bookmark epic failures under "hindenburg"). It's one of the few times I have been glad to not find an example. A friend told me Immigration Australia had a drop down city control but I couldn't find it either. FYI, some aspects of their site are rubbish and I'll post about that under a separate topic in a day or two once I collect my thoughts.


Above, the wordpress one isn't great. Personally I don't like it. Too many controls for something that could be done visually. Furthermore, the month selector doesn't always respond to user input forcing me into mouse input. Some may claim that this is a browser or OS issue. News Just In: The user does not care what you think, only what they experience.

Anyway, back to my original impressions: for starters, lots of drop downs look hideous. Secondly, to select a specific date the mouse user has to click on a variety of drop down controls, and the keyboard user must still supply some values with the mouse. A date control should support either all keyboard or all mouse input, and preferably both.

Single Box Date Control
Examples: (seriously, I had several examples planned. Perhaps people learn; I know I do, so it's entirely plausible. When I find a relevant example again I will post it.) Anyway, you have all seen it: the 12 character input control with '/' for dividers!

This option is the equal fastest date control for keyboard users and not usable at all for mouse users. However if you have a number of text fields before this date control then having a keyboard only field, I feel, is OK. The user will already have two hands on the keyboard at this point and it is only a jump away from the date control.

If the user has to take their hands off the mouse just to use the date control then this option is not a good choice.

This option is not aesthetically pleasing unless there are similar looking input controls near by.

Triple Box Date Control

The other fastest date control. Depending on the look and feel of the user interface this option can be aesthetically pleasing. It can also be ugly [here]. This date control is functionally similar to the single box control, except that completing a box should skip the user to the next one. If your users can specify partial dates then this option beats a single box hands down. It provides the easiest way of allowing the user to select which aspect of the date they wish to supply (month, for instance) whilst still displaying that the other two fields are empty.

One caveat with the three box date control is that you will need to make it clear which box is for the month and which is for the day. US date controls tend to be month first while day first is common in countries like the UK and Australia.
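For what it's worth, here is a sketch of the parsing logic behind a triple box control that respects field order (the function and its behaviour are invented for illustration):

from datetime import date

def parse_triple_box(box1, box2, year, day_first=True):
    """Combine the three boxes into a date, respecting locale field order.

    Day-first locales (UK, Australia) treat the left box as the day;
    month-first locales (US) treat it as the month.
    """
    first, second = int(box1), int(box2)
    day, month = (first, second) if day_first else (second, first)
    return date(int(year), month, day)  # raises ValueError for impossible dates

print(parse_triple_box("3", "7", "2008"))                   # 2008-07-03
print(parse_triple_box("3", "7", "2008", day_first=False))  # 2008-03-07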


Calendar
The most visually appealing date control, the calendar gives the user a date selection control akin to the calendar on their desk or on the wall in their garage. This is a good thing. Your average non-computer-literate user can work a calendar with ease. Unless implemented correctly, however, a calendar is a nightmare for selecting dates more than a few months or a few years away.

This website (Dynarch) has a calendar control on the right hand side. It works fairly well, but it's an odd concept to work in even years on the right, odd years on the left. Personally I don't like it. To find 2012 I have to go to 2013 and then back a year. Too much work; I would rather scroll than have to click on one side of the dialogue and then click on the other side to fine-tune my search.

Still, kudos to them, because the calendar control is trying. My only other criticism of Dynarch is the word "Today" in the middle of the control. It implies that the currently selected date is today, like it is on my physical desktop calendar. It doesn't mean that. It means go back to today. This threw me off. With anything there is a usability vs education trade-off; personally, that text should show a shorthand notation of today's date. Quick links should be included below the control with other concepts like "next week" and "next month", depending on your content (always, of course).


To make a calendar usable for larger date ranges the following features need to be supported:
  • Click on the month to activate a dropdown control to select a month. The month should open on June/July so that there is minimal distance to move to the desired month.
  • Click on the year should allow the user to control the year more easily. I have seen a couple of different options implemented here.
    • Up Arrow / Down Arrow – this is a bad idea. Your date control should already contain double arrows for going forward or back one year
    • Enter a year, this is useful if the year is a long way away from the current year
    • A year slide. Looks like a drop down menu with about 5-10 years visible above and below the current year. At the top and bottom are two buttons for scrolling. These should be on-mouse-over activate or click-and-hold activated.

Other usability requirements that are a must:
  • Let the user know that clicking on the month lets them select a different month. The same applies to the year. Users rely on your visual feedback to let them know what is wrong and what is right within the confines of their education. Step outside those bounds and you lose. I've done it, you've done it. Don't be ashamed; realise the mistake and move towards the user.

What isn’t good for calendars? Scroll bars are a bad idea. It is far too difficult to find a specific year using a scroll bar. On one application I tested each scroll click moved 50 years.

Here is an example of someone that is pretty darn close: http://www.basicdatepicker.com/
Their date control only misses the year auto-scroll, and it supplies forward one year and back one year options.

Some Other Thoughts
Do you need a date control? Seriously. For the beta registration page for No Horizons we used an age attribute. We don't really care about the specifics of someone's age as long as we could gather their age. This is very valid for websites where someone must be 18 years or over. Seriously, making someone work a series of dropdowns and input controls to state they're 18 is ridiculous. A simple "I am N years of age", where N is a supplied value, is just as effective. The lesson to memorise: understand your demographics.

Much like piracy "cautionary messages", the only people you hurt are the people who do the right thing. Pirates leave such messages off when duplicating films, and people who enter your site underage are lying anyway.

Conclusion
So what is the best date control? It really depends on the application environment and whether dates are optional, but generally you should try to provide both. A set of input boxes for keyboard users, especially relevant if there are a number of other input fields the user must supply. For visual applications, and especially web or mobile applications, you should also provide calendar functionality.

Bonus Content: Date Ranges

Asking the user to specify date ranges is a non-trivial task. Often I am expected to supply a start and an end date. This is fine if I know the specific range I am searching; if I don't then it's trial and error until I do. A better solution is to provide that functionality for users who do know the date range, as well as functionality in terms users are actually familiar with. "It was last week" or "I am pretty sure it was in January" are concepts the user understands. Hell, as a developer/tester it's how I think too.

Being able to search one to four weeks ago, and then in terms of months or years, is better. Show me all emails sent between June and August last year. That's closer to how users think and it's a lot easier than: show me all emails from 01/06/2007 till 31/08/2007. It's a subtle difference but one users prefer.

How would you implement this? You can allow partial dates in your input boxes for keyboard users. Secondly, you can let people double click on the month header to default to the first of that month and then close the window. Furthermore, quick options, like “last week”, “this week”, “this month” are handy shortcuts that make life just a little bit easier.
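Here is a rough sketch of turning those human concepts into concrete date ranges (the helper functions are invented for illustration):

from datetime import date, timedelta

def weeks_ago(n, today=None):
    """Return the (Monday, Sunday) range for the week n weeks ago."""
    today = today or date.today()
    monday = today - timedelta(days=today.weekday())
    start = monday - timedelta(weeks=n)
    return start, start + timedelta(days=6)

def month_range(year, month):
    """Return the first and last day of a given month."""
    start = date(year, month, 1)
    following = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, following - timedelta(days=1)

print(weeks_ago(1))          # "it was last week"
print(month_range(2007, 1))  # "I am pretty sure it was in January"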

Saturday, April 19, 2008

Usability Case Study: Facebook Advertising

A short case study on usability. It is all of a sudden relevant to me (because everyone is doing it wrong) and because I myself have learnt more about usability over the past six weeks than I thought I would. Therefore I feel I have an obligation to share my knowledge with you.

The other day I was investigating Facebook’s social advertising mechanism. Half of it was business related, the other half curiosity. I clicked the "advertise" link and was taken to a screen that pitched their advertising. After scanning that I clicked the “create ad now” button.

This took me to stage 1 of 4 of creating an ad.



This is also where I started to groan. All I wanted to know was how much it would cost to advertise on Facebook. I am not ready to start advertising; I am not even convinced that Facebook advertising is a viable alternative.

The last step is called “Set Budget”. So I figured I could skip past the first three steps to find out. No Dice! I had to supply my URL, name and a brief description before I could continue.

Page 2 wanted my expected audience and the third, images and the advertising text. Luckily you can supply an image later, but dummy values in the advertising text were rejected as illegal words. At this point I quit. Seriously I said, fuck off.

Some advice to people who are looking to solicit services via the Internet. I don’t care who you are or what you think you know about what I want. You don’t. Secondly, I am not going to give you real information if I am looking for a quote. I just want a quote, be thankful I even know who you are and that I am interested in your services.

In my opinion this application would fail usability testing. Its target audience is businesses. What do businesses want from your service?
  • They want to know what it is you do
  • They want to know how much it costs
  • They want to be able to purchase the services when the time has come
  • They don't want to wait to find out an answer.
The last point holds true for anyone, anywhere. It’s not like I can’t just click somewhere else. Seriously, realise how far your competitor is away in Internet terms and then meet them.

So, how should the Facebook advertising web-app have been written?

If prices are fixed based on simple conditions, put them on the second page or have a link called “prices”. If the prices are based on a complex algorithm, then place all the controls on a single page and let the user tweak the setup to match their needs. It is not that hard.

My reasoning is as follows. Firstly, from a business perspective, you are being open about your pricing policies. Honesty is good. Dishonesty equals zero business.

Secondly, as a user I want what I want, when I want it. I don't want to have to jump through your hoops to achieve my goal. I don't care about your hoops. I only care about my goals.

Thirdly, if you only display one price at the end of four pages of clicking one of three things will occur.
  • The price is less than the budget for the advertiser. They may sign-up, but you may not be maximising your business potential. Secondly, they may have been willing to spend more but don’t know how to get those services from you.
  • The price matches their budget. Not likely.
  • The price is over their budget. Without a means to scale their expectations down to match, they will go somewhere else.
If you display all the controls on one page then the following can occur:
  • The user will tweak the control to what they wanted from you at a service level and the cost comes under their budget. This would have happened anyway using your existing setup.
  • The user will tweak the controls to the absolute maximum of their budget. This is a win for both parties.
  • The user will play with the figures and may end up spending just a little bit more if they can justify a potential benefit from it. This is a double win for you and a win for them.
  • The user can’t find anything they like and they leave. However, they now know all about what you offer and what you cost. This is in their mind now and they won’t forget. There is a chance they will come back in the future.
By tailoring the application to the user’s needs rather than your own you do the following:
  • You increase your chance of doing business.
  • You appear to be a honest business entity.
  • Your application won’t piss people off and build a burning resentment within them.
Usability starts at the very beginning of software development and flows all the way through to User Acceptance Testing. You have to have in mind who you are developing the software for (the end users) before you write a line of code, before you write down a single requirement. For without a user, you have no need for an application.

Happy to discuss this delightful topic further, just drop me a line,

Friday, March 7, 2008

Project Development Iterations

One of the things I like about where I work is the freedom different development teams have in trying new things with respect to work practices. We are currently starting version 1.5 of a project and one of the developers wanted to try a different way of structuring our iterations to minimise the issues we had in version 1.0 and to improve the overall quality of the project being developed.

The main issue we wanted to fix was developers getting too far ahead of testers. We had a two week iterative process where the developers would code for two weeks straight and then on day 10 they would provide a build to test. Then they would continue on as fast as they could.

This was a problem: they kept getting further and further away from the testers because new code ranked higher in their eyes than old code did. Some bug fixes therefore took a long time to arrive, which hampered testing.

So one of the developers suggested a different iteration configuration. An iteration still lasts two weeks but there are five days for development of new code and five days for testing. The final day of development involves preparing the deployment for test. The final day of testing involves preparing the deployment for User Acceptance Testing.


Today was the first day of the first testing phase and so far so good. Defects were raised (not that many which impressed me for day one code) and fixes are already being prepared. Beyond that developers have realised they've made a mistake in one place and as they have the time, they are checking for similar problems in other areas. Proactive bug fixing, another plus.


Now, you may be wondering how complex code can be with five days of development including unit testing. Well, for starters, the developers outnumber the testers two to one, but this is a small iteration task-wise. Not all will be, though, so we are breaking our iteration tasks up into two groups: single iteration tasks and double iteration tasks.

Every second iteration will see the deployment of more complex functionality that cannot be implemented in 5 days (effectively 15 days). As testers I am happy with this, we know these larger tasks are coming in advance and can plan for their arrival. We also still get five days of dedicated developer bug fixing after its delivery.

Both the development and testing teams are aware of the potential for the number of defects to be greater than what the developers can fix in five days (especially if there are some doozies). To combat this (as should always have been done), defects have priority one in the iteration following their discovery, and as testers we're obliged to ensure that all new code gets at least a once-over to gauge the quality of the build.

Time will tell if this policy works, but at least we are working together to solve our past mistakes.

Wednesday, February 27, 2008

An unusual bug

I gave a deploy of a test application to a friend the other day to test. Whenever he ran the application it would shut down (not crash) almost immediately. The logs reported no issues, the very same deploy works on every other computer I have tried, and as a matter of fact this runtime code works on just about every platform I have tried to date. Just not his.

I created a debug build, turned on every scrap of instrumentation I have and sent it back to my friend. Nothing in the instrumentation logs. No exceptions, no errors being thrown. Standard execution paths, it shuts down early, but gracefully.

I went around to my friend's place and checked out the code base, installed Visual Studio and built the lot from scratch on his machine. Still no luck, so I traced through every step the program makes before it shuts down and found the "problem", or more accurately the cause.

The application is receiving a WM_QUIT message before it should. It handles it appropriately and performs the standard shutdown. The user did not close the application and no exceptions were thrown, so I am at a bit of a loss as to why the message was added to my application's message pump.

The only thing I can think of is a personal firewall application that has its heart set on blocking my application. So I checked, and there were a few installed. I uninstalled them and killed any dormant processes that were left lying around (don't get me started on how offensively intrusive virus scanners and software firewalls are; the only reason they work is because there are no system resources left for the user to use their system, let alone install a virus).


Back to my story: fresh reboot and no processes running on the OS that I didn't want (the OS happens to be Windows XP Home SP2). No Windows firewall, no random processes, just a bare-bones execution. Ran my test application (in debug mode again) and the same problem occurred. Now I really don't know why the quit message is being sent.

So far the interwub has borne no answers on this problem. The code is lightweight, vanilla Win32, and the only major difference I can think of on this box is that it is running XP Home rather than XP Pro. I have an XP Home CD around here somewhere that came with my laptop. While I do remember testing this code on XP Home previously, it has been a while.


If it also fails on my XP Home install then that is the problem (no solution though). If it doesn't then I suspect I'll be slowly deconstructing the application until the solution reveals itself. As painful as all this will be, it's a boon in some regards (not enough to make it fun though). There is a problem somewhere and I am just glad it came up in test rather than in prod.

Monday, February 4, 2008

Test Case Structure

Wherever I go, I see test cases named after an aspect of a system (logging on, or a module, like tasking). All the test cases are stored in a single word-doc or excel spreadsheet containing, in some cases, 10-15 test cases written line after line, or a 100+ step work-flow that covers potentially 25+ different test cases. It's hard to tell and it's even harder to maintain. In both cases what is being documented is misleading and wasteful.

I think we as testers need to review how we write our test cases. This review should emphasise how we name our test cases and how to minimise the amount of work performed every time a test case is written within the team.


The primary scope, or more accurately the goals, of this post are to cover:

  • how we name test cases
  • what we document as test steps
  • what kind of supplementary information is required with a test case to ensure understanding whilst assuming minimal amounts of prior knowledge

I won't be talking about what test cases need to exist to cover a requirement, nor the impact a test case registry (like Quality Center) has on test case structure.


Harkening back to my tale of woe, in the first method described the document doesn’t accurately represent the tests covered within. In the second method the work-flow doesn’t let you know if you are correctly covering all the possible scenarios. Both methods are difficult to extend as new functionality is written. In such cases new documents are often written to cover all the new test cases.

This happened with a system I looked into recently, where a search of the past twelve months revealed about 15 documents all containing test cases. It was unclear how many of them were still relevant.




Singular of Purpose

To me a test case should attempt to achieve only one thing: the testing of a specific requirement or a specific aspect of a requirement.

Example: There is a requirement that states when creating a task, a title is required. I would have the following test cases:

  • Creating a task without a title
  • Creating a task where title is maximum allowed length

Whether a title is required is tested separately under the two scenarios: creating a task without a title, and creating one with a title at the maximum allowed length (bounds-checking).

What this gives you is a very clear understanding of what the objective of the test case is. What this will also give you is a lot more test cases. This is not a bad thing. It just means you know how many test cases you have.
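To illustrate, here is a sketch of those two cases as separate, singular-purpose tests, assuming a hypothetical create_task function and an invented title length limit:

import unittest

TITLE_MAX_LENGTH = 50  # invented limit for the example

class TaskError(Exception):
    pass

def create_task(title):
    """Stand-in for the system under test."""
    if not title:
        raise TaskError("Tasks require a title before they can be created")
    if len(title) > TITLE_MAX_LENGTH:
        raise TaskError("Title exceeds the maximum allowed length")
    return {"title": title}

class CreateTaskTests(unittest.TestCase):
    def test_create_task_without_title(self):
        # Singular purpose: the mandatory-title rule, nothing else.
        with self.assertRaises(TaskError):
            create_task("")

    def test_create_task_title_maximum_allowed_length(self):
        # Singular purpose: the upper bound; the title must not be truncated.
        task = create_task("T" * TITLE_MAX_LENGTH)
        self.assertEqual(len(task["title"]), TITLE_MAX_LENGTH)

if __name__ == "__main__":
    unittest.main()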




It’s All In The Name

Now that the number of distinct test cases is going to increase, we need to ensure that our test cases are named appropriately. It is my firm belief that the name of the test case should describe what you are attempting to do. It shouldn’t include pass or fail conditions and should be as succinct as possible, but not necessarily as short as possible.

I call this the Action Target Scenario method

Action – a verb that describes what you are doing, some good examples are: create, delete, ensure, edit, open, populate, observe, login, etc.

Target – the focus of your test, usually a screen, object entity, program, etc

Scenario – The rest of what your test is about and how you distinguish multiple test cases for the same Action and Target.

So using this method, my test cases would be:

  • Create – Task – title is not supplied
  • Create – Task – title is the maximum allowable length


A useful way of formulating your test-case names is to ask yourself questions like:

What happens when a user …

And onto the end of that open ended question you derive your test case name. So using the above, we get the following questions:

  • What happens when a user creates a Task and the title is not supplied?
  • What happens when a user creates a Task and the title is the maximum allowable length?


You will find that your test case names will get much longer than what you are used to. This is also ok, it just means that you need to use more characters to accurately represent what you are trying to do.


Note: The inclusion of decimal points and hyphens in test case names is a personal style thing that should be moderately consistent within a team. The important thing is the spelling of the action and the target are identical across test cases that do the same thing or hit the same targets. This will ensure that your test cases can be easily searched upon.




Preconditions

Often when testing, the preconditions required to achieve a test cannot (and should not) be included in the test case name. However, these preconditions still need to be documented somewhere.

So for the first step of your test case, include your pre-conditions. These can be as informal or formal as required and the use of business language is advised.

Examples:

  • Task exists
  • Task exists with a state of [Expired]
  • A User existing that does not have any Tasks created against them

How the pre-conditions are established is outside the scope of the test case. I'll get to where these are documented later in the article, but for the time being, pre-conditions are established as the first few steps of your test case. I suggest using one test step per pre-condition to aid clarity.

Test Step 1 [Precondition] User exists in the tasking system

OR

Test Step 1 [Precondition] Task exists in the system that has expired but is not complete



Acceptance Criteria

How do we know when a test case has passed or failed? All test cases require acceptance criteria, and these should be included in your test case. As the test case is only targeting a single objective, your acceptance criteria should consist of only a single criterion, or a few.

Create – Task – title not supplied

  • A message is displayed to the user: "Tasks require a title before they can be created"

Create – Task – title is maximum allowable length

  • The Task is created as per standard business flow. The title must not be truncated.

While the acceptance criterion of the second test case seems a little wishy-washy, that is exactly what the acceptance criterion is. If anything else occurs, then the test fails. I.e. if a database error occurs then fail, if the user receives a message saying that the input is too long, fail.

How do you define standard business flow? That belongs to the realm of supplementary documentation.




Supplementary Documentation

Supplementary documentation is all the documentation that exists to support the system. This can be help-documentation, requirement specifications, technical design documents or how-to documents written on a team-wiki or shared drive.

Use this documentation so that you don’t have to write out for every single test case all of your pre-condition work-flows, test-steps and acceptance criteria. Have it written once, put it in a common location for everyone to use and refer to it in your test cases.

When writing your preconditions, work-flows or acceptance criteria, refer to the particular help document, specification or how-to so that users know where to look for it.

Examples:

  • [Precondition] User exists in the tasking system {Use: Help > Tasking > Creating Tasks}

OR

  • [Precondition] Task exists in the system that has expired but is not complete {Help > Tasking > Creating Tasks, Wiki > Task Expiration Job}

Standardise the way you refer to help documentation, where it lives, etc., so that when team members see a supplementary documentation reference they know where to find it.



Detailed Test Steps

If, after following all of the above, you feel the user-flow that needs to be undertaken to complete the test case is not documented well enough for a competent tester, inexperienced with the system being tested, to complete the test without seeking assistance, then I propose the following:

If the work-flow information that you need to document is more than likely going to be used by more than one test case, then write a how-to document and store it in the how-to section on a wiki, or on the shared drive.

An example of this is the test case: Create – Task – title is not supplied. This test case and others require knowledge of:

  • Where does the Task system reside, and how do I log in?
  • What user access levels are required?
  • How do I create a Task under normal conditions?

I would include a test step in the test-case after the pre-conditions and before the acceptance criteria that specifies each of the additional documents that should be used and where that document resides.

If the work-flow information is never going to be reused then include the test-steps within the test case as you currently would do now.



Conclusion

The guidelines I've discussed above are very similar to the process a Business Analyst uses to ensure that each requirement in a specification states only one thing. It is also similar to the process software developers use to ensure their methods are singular of purpose, well named and easy to maintain. The reason for this is simple: that is where I drew my inspiration.

I couldn't work out why I spend time ensuring well named, modular class methods exist in the software I develop but I don't spend the same time ensuring well named, modular test cases exist in the software I test.

Enjoy

Tuesday, January 15, 2008

Testing SOA: Service dependencies and their impacts on testing

A few days ago, I asked a series of questions regarding testing services: whose responsibility it is, how it should be tested, etc. Today I'll provide a solution, but to a different question: service dependencies and how they can impact testing.

Whilst talking about service dependencies, I will be ignoring the impact of service-deployment-platforms, service implementations, who does the testing and any tools that could be used to assist in testing. I’ll be looking at the services from a testing organisation viewpoint.

From the aforementioned testing viewpoint there are two types of services: simple services (including CRUD services), being those that do not have dependencies on other services, and orchestration services, being those that do.

As we all know, the only acceptable way to test a service is in isolation. Reducing a service to the infrastructure dependencies that it will have in production will allow a testing team to prove the service for use amongst a bigger solution.

Prove: to validate the functionality of a service through the act of testing

Now, for illustrative purposes I’ll define some example services:

Person-Service – CRUD service for a Person Object. Has a dependency on persistent-storage
Time-Service – Simple service that returns the time in a particular time zone. Has no external dependencies.
Searching-Service – an Orchestration Service that utilises the Person service for searching. Is expected to support other services in the future.


In our example the searching-service is dependent upon the person-service. To test the searching-service, the person-service must be proved before testing on the searching-service can commence. Failing to do so can invalidate the testing undertaken on the searching-service and provide a false level of confidence.

Conditional Proof: the validation of a subset of service interfaces to facilitate concurrent development and testing

If timeframes are restricted, or concurrent development and testing of services is required, then a conditional proof can be organised. In our example this is where the interfaces of the person-service that are required by the searching-service are proved, but the service as a whole has not been.

Conditional Proofs will allow testing on the searching-service to commence. The only issue is if a different interface on the person-service fails test. In such a scenario the conditional proof is revoked until person-service is fixed and retested once again establishing the original conditional proof or proving the service as a whole.

This may seem implicit to an experienced tester or developer out there. When an aspect of an application is proven (add new-customer address), other dependent code (update customer address) can be tested. However, I believe that when testing services, a more regimented approach to the organisation of testing is required to ensure that all service interfaces are tested against proven dependencies.

To facilitate this I feel a proof-register is required. This can be used within a project team to manage their development and testing responsibilities, or it can be used among a development shop to map which services are stable and what their interfaces are.

A simple spreadsheet listing services and their interfaces and the current state of testing for that interface is all that is required. As can be seen from the example, the searching-service testing has begun on an interface that has a conditional proof.



A database or similar construct may be more appropriate, considering this spreadsheet can become unwieldy over time.

Now, for 3rd party services, things get a little more complicated. Your organisation, for example, may have purchased a simple off-the-shelf Customer Relationship Management (CRM) service from Company-XYZ, and your organisation wants your searching service expanded to include the CRM service in the list of services it searches.

Now, hopefully your company received a test report from Company-XYZ for the CRM service and you can see what testing has been performed on each interface.

Back in the real world, you have some tough choices. (a) Assume the service is proven and hope it works as written on the brochure or (b) spend precious resources retesting a product that might be sound. I prefer option (c) add the service to your proof-register and set the status for every interface to conditional-proof.

If no bugs are ever raised against the service, then your testing is sound. If a bug is raised, then you can determine which services are reliant on the unproven interface. This will bubble up the service dependency chain and you would be able to identify which interfaces will need to be retested after the fix is applied to your CRM service.
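Here is a minimal sketch of what such a proof-register and its bubbling behaviour might look like (the service and interface names reuse the examples from this post; the structure itself is illustrative only):

# Status values: "unproven", "conditional", "proven".
proof_register = {
    ("time-service", "get_time"): "proven",
    ("person-service", "search_person"): "conditional",
    ("person-service", "update_person"): "unproven",
    ("crm-service", "search"): "conditional",   # 3rd party, option (c)
    ("searching-service", "search"): "conditional",
}

# Which interfaces each interface depends upon.
dependencies = {
    ("searching-service", "search"): [("person-service", "search_person"),
                                      ("crm-service", "search")],
}

def revoke(interface):
    """Revoke a proof and bubble the revocation up the dependency chain."""
    proof_register[interface] = "unproven"
    for dependent, needs in dependencies.items():
        if interface in needs and proof_register[dependent] != "unproven":
            print("retest required:", dependent)
            revoke(dependent)

# A bug is raised against the purchased CRM service:
revoke(("crm-service", "search"))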


In summary: when planning for the testing of services, utilise a process like the one described above to minimise the impacts of defects whilst maximising testing and development concurrency.
  • Services must be tested in isolation.
  • A service's dependencies must be proved before the service is tested.
  • Conditional proofs may allow for some degree of concurrent development and testing.
  • Use a proof-register to manage the status of proved services and conditional proofs granted to interfaces.

Oh and before I go, it’s my firm belief that if you purchase an off-the-shelf service, a test-report is a mandatory part of the package. Off-the-shelf applications are generally speaking self-contained and as such any bugs in the application don’t result in the end of the world. Off-the-shelf services that are going to be integrated into your enterprise architecture may just cause the end of your world.

Friday, January 11, 2008

Testing SOA - an introduction

I might as well start with a fairly large topic and one that puts me in, I feel, an interesting position. The testing of Web services is inherently a technical task in the realm of generally non-technical testers.

There are a number of products out there that attempt to simplify the process of testing services for testing personnel: HP's Service Test, Parasoft's SOAtest, soapUI, etc. In reality they are just UI wrappers that generate code facilitating communication with the service. Sure, they make it easier than cutting your own code to do the same thing, and I'm sure some of them auto-generate boundary tests (I don't know if they do, but it's not that hard or unreasonable, so I'll assume it), but deep down there are a couple of issues nagging at me.


Should testers be testing services in the first place?

I've heard some good arguments for both and I've yet to make up my mind. This is really the basis for all my questions: can I justify who is going to test service code? My decision will impact the direction our organisation takes, so this question I take very seriously. Currently I see five paths:

  • Developers test all service code (unit, functional, performance, etc, etc)
  • Developers perform unit and functionality testing and testers perform performance, scalability, availability, etc.
  • Testers produce a test-specification which they hand to the developer who will code the tests
  • Testers cut the code themselves (requires a technical-tester)
  • Testers use a service testing tool

How should the testing of services be performed?

Using an SOA test tool? Cutting code? I'll get onto this in more detail once I've given the various tools a solid workout, but they haven't convinced me yet. While that isn't a complete write-off, I feel if you can't prove your worth immediately, in the one task you do, you're not effective. Then again, cutting code is prone to human error, and having to write repetitive code to test a service is arduous. Code generation, anyone?


How will the testing environment be structured?

Testing services from any perspective is hardly trivial. There are deployment and configuration issues and these impact the viability of testing. I know that it is very easy to spawn a thread at the start of a MS-Test object that creates a service. It then notifies the main test-thread it is ready and testing can commence. The unit-test cases are executed and then everything is cleaned up. This setup would allow a developer to prove a service in the comfort of his own box. It also allows the test cases to be placed on the end of a continuous integration run. However, that deployment of the service will not match how the service is deployed in production. Therefore our testing is only good enough to prove functionality. This leads me to the next question:
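The MS-Test pattern described above translates to most test frameworks. Here is a rough Python analogue that hosts a stand-in service on a background thread for the duration of the test cases (the handler is a stub, not a real service):

import threading
import unittest
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubServiceHandler(BaseHTTPRequestHandler):
    """Stand-in for the real service implementation."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"pong")

    def log_message(self, *args):
        pass  # keep the test output clean

class ServiceFunctionalTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Spin the service up on a background thread before testing commences.
        cls.server = HTTPServer(("localhost", 0), StubServiceHandler)
        cls.port = cls.server.server_address[1]
        threading.Thread(target=cls.server.serve_forever, daemon=True).start()

    @classmethod
    def tearDownClass(cls):
        cls.server.shutdown()  # clean everything up once the cases have executed

    def test_ping(self):
        url = "http://localhost:%d/ping" % self.port
        with urllib.request.urlopen(url) as response:
            self.assertEqual(response.read(), b"pong")

if __name__ == "__main__":
    unittest.main()

As noted above, though, this only proves functionality; the in-process deployment looks nothing like production.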


Who, and how are we going to test services for Performance, Scalability, Robustness, Load/Stress, Availability, Configuration and Deployment?

These testing scenarios are complex and often time consuming. Will this move into the realm of the developer, or will it remain the world of testers? I feel that testing most of the above scenarios requires prod-like configuration and deployment, which automatically moves them out of the development environment, where stability is mandated.


What defines a service from a documentation perspective?

Services are an IT solution to a Business problem and because of that a requirement specification is not going to detail the interfaces on a service. That belongs in a technical specification. Enter Agile and my chances of seeing such specifications are reduced (apologies to all those that use and like Agile. I like it too but in a world where specifications are thin on the ground Agile doesn't appear to make it any easier).


How do you determine if a service is fit for production?

This may seem like a trivial question but if we were to remove the testers from the equation (and replace them with developer-testers) how do we know that the testing performed by a developer is adequate and proves the service? Years of testing have given me trust issues and organisations generally don't have in place code-promotion paths that bypass the test area.

I have more questions, but for the sake of brevity I'll save them for specific topics. Furthermore, I don't plan to answer these questions right away. I'm tackling them all at once at work, and when I come up with what I feel is the answer to a single question, I will put it up here to get some feedback and to share my decision.

Aside from all this I'll be writing and testing some services from a developer perspective. I'll trial the various SOA testing tools to gauge their effectiveness and I'll be having many discussions with my colleagues who are a mix of testers and developers.

Stay tuned