Error Handling

by Kevin Garwood

The ability of a code base to identify and recover from errors is an important aspect of design that influences both the way validation is supported and the way test cases are designed. In this section, we develop an approach for error handling that meets three objectives:

it allows the middleware and client applications to recover gracefully from an error;
it provides client applications with a human-readable explanation of the error that contains minimal sensitive data;
it provides automated test suites with a machine-readable code that identifies the specific cause of an error.

Limiting the scope of concern of error handling

Our first task is to limit the scenarios we need to consider as sources of errors. We turn to two general design decisions:

General Design-4 : Wherever possible, limit the paths of execution that are likely to occur.

General Design-7 : Encapsulate business concept and data storage layers of the architecture through service APIs. Do not allow clients to know which class is implementing the service interfaces.

These two decisions mean that the only execution paths we need to be concerned about are those that begin by using the service methods. Therefore, we can assume that all exceptions that are thrown or caught will occur along these paths.

If an error occurs, client applications should be informed by a thrown exception. They should not rely on interpreting errors based on the nature of returned results. Therefore, all methods specified in service interfaces should be able to throw an exception.

Error Handling-1 : Error handling will be designed to pass exceptions back to client applications via the service methods. The signatures of all methods for a service interface will allow them to throw an exception.

Designing an application exception class that appeals to human and machine users

The service methods need to appeal both to client applications that serve end users and to automated test suites that simply compare expected and actual outcomes. When a test suite tries to induce the middleware to throw exceptions, it may not be sufficient to know that an exception did or did not occur. Test cases need to verify that an exception was thrown for a specific reason.

One way to provide test cases with a reason is to make a subclass of java.lang.Exception for every type of problem we can envision. When the service method throws an exception, the test code can catch the exception and verify its class. However, a hierarchy of exception classes may prove unreliable: as new subclasses are added, the class of an exception can become an ambiguous indicator of the cause of a problem.

A better way is to provide a test with an error code. Rather than returning an arbitrary number, we can return a more meaningful value defined in an enumerated type. The enumeration rifServices.system.RIFServiceError lists a large number of specific errors that allow test cases to be very specific in the kinds of exception cases they test.

The general java.lang.Exception class allows one error message to be passed. However, it may be important to include a distinct message for each problem that was detected. For example, a call to investigation.checkErrors() may reveal multiple blank fields. It is useful for client applications to be able to display each discrete cause of the exception from which they are trying to recover.

In order to appeal to both human and machine users, we developed the class rifServices.system.RIFServiceException. The following diagram illustrates how its properties are used by client applications.

Error Handling-2 : All service methods will be able to throw a checked exception rifServices.system.RIFServiceException. This checked exception will support two features:

an error code that provides a machine-readable cause for the error. The error codes will come from some enumerated type.

a collection of human-readable error messages that client applications can display or log for the benefit of end-users.
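The two features above can be sketched as follows. This is a minimal illustration, not the actual RIF source: SketchServiceError, SketchServiceException and SketchInvestigation are hypothetical stand-ins for RIFServiceError, RIFServiceException and a business class with a checkErrors() method, and the message texts are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for RIFServiceError; the real enumeration lists many more codes.
enum SketchServiceError {
    INVALID_INVESTIGATION,
    DATABASE_QUERY_FAILED
}

// Hypothetical stand-in for RIFServiceException:
// one machine-readable code, any number of human-readable messages.
class SketchServiceException extends Exception {
    private static final long serialVersionUID = 1L;

    private final SketchServiceError error;
    private final List<String> errorMessages;

    SketchServiceException(SketchServiceError error, List<String> errorMessages) {
        // The first message (if any) doubles as the standard Exception message.
        super(errorMessages.isEmpty() ? error.name() : errorMessages.get(0));
        this.error = error;
        this.errorMessages = Collections.unmodifiableList(new ArrayList<>(errorMessages));
    }

    SketchServiceError getError() {
        return error;
    }

    List<String> getErrorMessages() {
        return errorMessages;
    }
}

// Hypothetical business class whose checkErrors() reports every blank field at once.
class SketchInvestigation {
    private final String title;
    private final String description;

    SketchInvestigation(String title, String description) {
        this.title = title;
        this.description = description;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    void checkErrors() throws SketchServiceException {
        List<String> messages = new ArrayList<>();
        if (isBlank(title)) {
            messages.add("The investigation title is blank.");
        }
        if (isBlank(description)) {
            messages.add("The investigation description is blank.");
        }
        if (!messages.isEmpty()) {
            throw new SketchServiceException(
                SketchServiceError.INVALID_INVESTIGATION, messages);
        }
    }
}
```

A test case can compare getError() against the expected enumeration value, while a client application can iterate over getErrorMessages() to show each problem to the end user.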

Logging exceptions before throwing them

To help ensure that RIFServiceException is the only kind of checked exception a client can expect, the middleware needs to trap and log all other checked exceptions. For example, suppose executing a query produces an SQLException. The exception should be caught and logged so that the original stack trace for the error is preserved. A new instance of RIFServiceException that describes the error for a client should then be created and thrown. Its error messages will provide useful information for end users and, wherever possible, will contain a minimum of sensitive information.

Error Handling-3 : All checked exceptions should be caught and logged before being re-thrown using a RIFServiceException instead.

Not every instance of RIFServiceException will be created as a way of masking another exception. We expect that most instances will be generated by the checkErrors() and checkSecurityViolations() methods that appear in business classes. To ensure that all exceptions are captured for auditing, we need to log RIFServiceException instances as well. However, they should all be logged and re-thrown to the client application at the same point in the code. We decided that the service classes will be responsible for catching and re-throwing them.

Error Handling-4 : All instances of RIFServiceException that are generated in the code base will be thrown until they are caught by service classes. The service classes will then log these exceptions before returning them to client applications.
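A minimal sketch of this single logging point follows. All names in it are hypothetical: LoggedServiceException stands in for RIFServiceException, StudyManager for a manager or business class, and StudyService for a service class. An in-memory list is used in place of a real audit log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for RIFServiceException, reduced to a single message.
class LoggedServiceException extends Exception {
    private static final long serialVersionUID = 1L;

    LoggedServiceException(String message) {
        super(message);
    }
}

// Hypothetical manager class: it throws the exception but does not log it.
class StudyManager {
    String getStudyName(String studyID) throws LoggedServiceException {
        if (studyID == null || studyID.isEmpty()) {
            throw new LoggedServiceException("The study identifier is blank.");
        }
        return "Study " + studyID;
    }
}

// Hypothetical service class: the one place where exceptions are logged
// before they are re-thrown to the client application.
class StudyService {
    private final StudyManager manager = new StudyManager();

    // An in-memory list stands in for a real audit log.
    final List<String> auditLog = new ArrayList<>();

    String getStudyName(String studyID) throws LoggedServiceException {
        try {
            return manager.getStudyName(studyID);
        } catch (LoggedServiceException exception) {
            auditLog.add("getStudyName: " + exception.getMessage());
            throw exception;
        }
    }
}
```

Because only the service class writes to the log, each exception is recorded once, however many layers it passed through on its way up.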

Supporting graceful recovery through `finally {...}` code blocks

In order to promote graceful recovery for the middleware, we make use of the finally part of the try {...} catch {...} finally {...} exception handling mechanism that Java provides. We use the finally {...} block to ensure that persistent resources are reclaimed or closed, whether a method returns normally or throws an exception. The two examples that follow show the importance of cleaning up resources using this mechanism.

Gracefully closing `PreparedStatement` and `ResultSet` resources

All methods in the manager classes that execute SQL queries use the following code template:

	PreparedStatement statement = null;
	ResultSet resultSet = null;
	try {
	
	   //do query
	   statement = connection.prepareStatement([[queryText]]);
	   
	   ...
	   ...
	
	   resultSet = statement.executeQuery();
	   ...
	   ...
	   //return results;
	}
	catch(SQLException sqlException) {
	   logSQLException(sqlException);

	   String errorMessage
	      = RIFServiceMessages.getMessage(
	         "...",
	         paramA,
	         paramB,
	         ...);

	   RIFServiceException rifServiceException
	      = new RIFServiceException(
	         [[[some error code, eg: RIFServiceError.INVALID_INVESTIGATION]]],
	         errorMessage);

	   throw rifServiceException;
	}
	finally {
	   //close resources in the reverse order of their creation
	   SQLQueryUtility.close(resultSet);
	   SQLQueryUtility.close(statement);
	}

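In this example, if an exception is thrown after statement or resultSet have been assigned values, they might not be closed properly without the finally block. Placing the close calls there ensures that they run whether the method returns results or throws an exception.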
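Note that the finally block in this template can run while statement or resultSet is still null, for instance when preparing the statement itself fails. The close helper therefore needs to tolerate null arguments. The implementation of SQLQueryUtility.close(...) is not shown in this document; the sketch below uses a hypothetical class, QuietCloser, to illustrate one way such a helper could behave. It relies on the fact that java.sql.Statement and java.sql.ResultSet both extend java.lang.AutoCloseable.

```java
// Hypothetical helper in the spirit of SQLQueryUtility.close(...);
// the behaviour of the real method is an assumption here.
final class QuietCloser {

    private QuietCloser() {
        // utility class, not meant to be instantiated
    }

    // Statement and ResultSet both extend AutoCloseable,
    // so a single method can serve both.
    static void close(AutoCloseable resource) {
        if (resource == null) {
            // The resource was never assigned, e.g. the failure happened earlier.
            return;
        }
        try {
            resource.close();
        } catch (Exception exception) {
            // A failure while closing should not mask the exception
            // that is already propagating to the caller.
            System.err.println("Could not close resource: " + exception.getMessage());
        }
    }
}
```

Swallowing the exception raised by close() is a design choice made for this sketch: it keeps the original error as the one the client sees.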

Gracefully recovering pooled database connections

The other major use of the finally {...} block arises when the service classes assign and reclaim database connections.

   Connection connection = null;
   try {
      connection
         = sqlConnectionManager.assignPooledReadConnection(user);
      ...
      //do something
      ...
   
   }
   catch(RIFServiceException rifServiceException) {
       logException(
          user,
          [[method name]],
          rifServiceException); 
   }
   finally {
      sqlConnectionManager.reclaimPooledReadConnection(
         user,
         connection); 
   }

This code block ensures that when exceptions occur, the connection manager is still able to reclaim the connection that was used. If the finally block were not there, each exception could leave a connection unreturned, and repeated exceptions could eventually exhaust the pool of available database connections.
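The effect can be reproduced with a small simulation. Everything in this sketch is hypothetical: TinyConnectionPool is a toy fixed-size pool in which strings stand in for database connections, and PoolDemo contrasts a method that reclaims in a finally block with one that reclaims only on the success path.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fixed-size pool; strings stand in for real database connections.
class TinyConnectionPool {
    private final Deque<String> available = new ArrayDeque<>();

    TinyConnectionPool(int size) {
        for (int i = 0; i < size; i++) {
            available.push("connection-" + i);
        }
    }

    String assign() {
        if (available.isEmpty()) {
            throw new IllegalStateException("No connections left in the pool.");
        }
        return available.pop();
    }

    void reclaim(String connection) {
        // Ignore null so that callers can reclaim safely from a finally block.
        if (connection != null) {
            available.push(connection);
        }
    }

    int availableCount() {
        return available.size();
    }
}

class PoolDemo {

    // Always reclaims the connection, even when the work fails.
    static void runWithFinally(TinyConnectionPool pool, boolean fail) {
        String connection = null;
        try {
            connection = pool.assign();
            if (fail) {
                throw new RuntimeException("query failed");
            }
        } catch (RuntimeException exception) {
            // A real service class would log the exception here.
        } finally {
            pool.reclaim(connection);
        }
    }

    // Reclaims only on the success path, so every failure leaks a connection.
    static void runWithoutFinally(TinyConnectionPool pool, boolean fail) {
        try {
            String connection = pool.assign();
            if (fail) {
                throw new RuntimeException("query failed");
            }
            pool.reclaim(connection);
        } catch (RuntimeException exception) {
            // The connection is never returned to the pool on this path.
        }
    }
}
```

With a pool of two connections, two failing calls to runWithoutFinally leave nothing available, whereas any number of failing calls to runWithFinally leave the pool at full size.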