RIF Architecture: Data Storage Layer

by Kevin Garwood

The most important source files in the data storage layer are the classes which implement the service APIs. For example, rifServices.dataStorageLayer.ProductionRIFStudySubmissionService implements the service API rifServices.businessConceptLayer.RIFStudySubmissionAPI.

For reasons explained in the section on Designing for Testability, most of the code for rifServices.dataStorageLayer.ProductionRIFStudySubmissionService is actually provided by its superclass rifServices.dataStorageLayer.AbstractRIFStudySubmissionService.

One design challenge in keeping this layer maintainable is that as the service APIs grow to include more methods, so do the classes that implement them. There is a risk of producing a "God class" that attempts to do everything. There are at least three ways of simplifying these classes:

The first way was to avoid having a single service API that serves all the tools. Instead, we developed a separate service API for each tool in the suite. For example, we developed the separate interfaces rifServices.businessConceptLayer.RIFStudySubmissionAPI and rifDataLoaderTool.businessConceptLayer.RIFDataLoaderServiceAPI to separate the concerns of submitting studies and loading health data sets.

Maintenance-6 : Instead of having one service API with many methods, develop a service API for each tool.

The second way we simplified the service implementation classes was to look for service APIs with substantial overlap. The best examples are the APIs for study submission and for viewing study results: both need methods that let client applications manipulate maps at different resolutions. We have tried to maintain a correspondence between the inheritance hierarchy of the interfaces and the inheritance hierarchy of the classes that implement them.

Maintenance-7 : Develop a hierarchy of superclasses which can reduce repetitive coding efforts and hold the bulk of implementation code.
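The mirrored hierarchies described above can be sketched roughly as follows. This is only an illustration of the shape: every interface, class and method name below is hypothetical and is not taken from the RIF code base, and the returned values are placeholder data rather than query results.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical super-interface: operations that both tools need.
interface MapOperationsAPISketch {
    List<String> getGeoLevelNames(String geographyName);
}

// Each tool-specific API extends the shared interface.
interface StudySubmissionAPISketch extends MapOperationsAPISketch {
    String submitStudy(String studyName);
}

interface StudyResultRetrievalAPISketch extends MapOperationsAPISketch {
    String getStudySummary(String studyID);
}

// The class hierarchy mirrors the interface hierarchy: the shared
// methods are implemented once, in an abstract superclass.
abstract class AbstractMapOperationsServiceSketch implements MapOperationsAPISketch {
    @Override
    public List<String> getGeoLevelNames(String geographyName) {
        // Placeholder data; a real service would query the database.
        return Arrays.asList(geographyName + ":region", geographyName + ":district");
    }
}

class StudySubmissionServiceSketch
        extends AbstractMapOperationsServiceSketch
        implements StudySubmissionAPISketch {
    @Override
    public String submitStudy(String studyName) {
        return "submitted:" + studyName;
    }
}

class StudyResultRetrievalServiceSketch
        extends AbstractMapOperationsServiceSketch
        implements StudyResultRetrievalAPISketch {
    @Override
    public String getStudySummary(String studyID) {
        return "summary:" + studyID;
    }
}
```

With this arrangement, a method added to the shared interface needs to be implemented only once, in the abstract superclass, and both tool-specific services inherit it.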

The most important way of minimising the complexity of service implementation classes is to have them delegate to other classes for the bulk of the implementation code. Typically, a service class will delegate to a method with the same name in a Manager class.

Maintenance-8 : Make service classes invoke delegation classes to support most of the implementation for a business task. Ensure that the service class and the manager classes have a clearly defined separation of concerns.

For delegation to work, the service and the manager classes need a clear definition of roles and a clear separation of concerns.

A service class will have the following responsibilities:

Data Storage Layer-1 : Classes that implement service APIs are responsible for: safe-copying parameter values; checking for empty or malicious values in parameters; checking for invalid users; and acquiring and relinquishing database connections from a pool of connections.

A typical delegation Manager class will have the following responsibilities:

Data Storage Layer-2 : A Manager class is responsible for: calling the checkErrors() method of parameter values that are business objects; constructing and executing SQL queries using parameter values; packaging results as business objects; recording SQL exceptions and returning human-readable exception messages; closing database resources associated with obtaining results.
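The division of responsibilities between a service class and a Manager class might look roughly like the sketch below. All names are hypothetical and are not the RIF's own classes. To keep the example runnable without a database, a placeholder type stands in for a JDBC connection, and the Manager returns placeholder data where real code would build and execute a SQL query. The check for malicious values is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Stand-in for a JDBC connection so the sketch runs without a database.
final class FakeConnectionSketch { }

// Hypothetical pool that hands out and takes back connections.
final class ConnectionPoolSketch {
    private final Deque<FakeConnectionSketch> idle = new ArrayDeque<>();

    ConnectionPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            idle.push(new FakeConnectionSketch());
        }
    }

    FakeConnectionSketch acquire() {
        if (idle.isEmpty()) {
            throw new IllegalStateException("No connections available");
        }
        return idle.pop();
    }

    void release(FakeConnectionSketch connection) { idle.push(connection); }

    int idleCount() { return idle.size(); }
}

// Hypothetical business object with a checkErrors() method.
final class GeographySketch {
    private final String name;

    GeographySketch(String name) { this.name = name; }

    static GeographySketch createCopy(GeographySketch original) {
        return new GeographySketch(original.name);
    }

    String getName() { return name; }

    void checkErrors() {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Geography name is empty");
        }
    }
}

// Manager: validates business objects, runs the query, packages results.
final class GeographyManagerSketch {
    List<String> getGeoLevelNames(FakeConnectionSketch connection, GeographySketch geography) {
        geography.checkErrors();
        // Real code would build and execute a SQL query here, then close
        // the statement and result set. Placeholder data is returned instead.
        List<String> results = new ArrayList<>();
        results.add(geography.getName() + ":region");
        return results;
    }
}

// Service: guards parameters, checks the user, manages the connection,
// then delegates to the manager method of the same name.
final class GeographyServiceSketch {
    private final ConnectionPoolSketch pool;
    private final Set<String> knownUserIDs;
    private final GeographyManagerSketch manager = new GeographyManagerSketch();

    GeographyServiceSketch(ConnectionPoolSketch pool, Set<String> knownUserIDs) {
        this.pool = pool;
        this.knownUserIDs = knownUserIDs;
    }

    List<String> getGeoLevelNames(String userID, GeographySketch geography) {
        if (userID == null || geography == null) {
            throw new IllegalArgumentException("Missing parameter");
        }
        // Safe copy: later changes made by the caller cannot affect this call.
        GeographySketch safeCopy = GeographySketch.createCopy(geography);
        if (!knownUserIDs.contains(userID)) {
            throw new SecurityException("Unknown user");
        }
        FakeConnectionSketch connection = pool.acquire();
        try {
            return manager.getGeoLevelNames(connection, safeCopy);
        } finally {
            pool.release(connection); // always relinquish the connection
        }
    }
}
```

Releasing the connection in a finally block means it is returned to the pool even when the Manager rejects a business object, so a failed call does not leak a connection.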

The roles of each class will be described in greater detail in the various aspects of design that follow. The important point here is that in order to allow the RIF code base to cope with more service methods, we must rely on a class delegation chain that spreads the implementation code over a number of classes that can be separately maintained.

The requirement for the RIF to support both PostgreSQL and SQL Server is perhaps the greatest source of complexity in the middleware design. We anticipate that as code we've written for PostgreSQL is ported to SQL Server, there will be a need to rewrite database queries.

Originally, the SQL queries that appeared in the SQL*Manager classes used a StringBuilder class to concatenate SQL phrases with the table and field names that suited a particular operation. However, this task seemed repetitive and error-prone. It also produced queries that were difficult to read and that might have to be reworked so they could be executed on both PostgreSQL and SQL Server databases.
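For illustration, hand-built concatenation of this kind tends to look like the sketch below. The class, method, table and field names are hypothetical and are not taken from the RIF code base.

```java
// Hypothetical example of building a query by hand with StringBuilder.
final class HandBuiltQuerySketch {
    static String buildQuery(String tableName, String fieldName) {
        StringBuilder query = new StringBuilder();
        query.append("SELECT ");  // trailing space is easy to forget
        query.append(fieldName);
        query.append(" FROM ");   // so are leading and trailing spaces here
        query.append(tableName);
        query.append(" WHERE ");
        query.append(fieldName);
        query.append("=?");       // value is bound later as a parameter
        return query.toString();
    }
}
```

Every query repeats the same keyword and spacing details, and a single missing space yields invalid SQL that is only discovered when the query is executed.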

We later developed a set of query formatter classes, which are meant to simplify and standardise the way the text for SQL queries is constructed. The classes encapsulate details such as whether SQL keywords should be in upper or lower case, and how much each line should be indented. They also have methods that allow clauses such as SELECT, FROM and WHERE to be built up in a systematic way. During porting activities, it may become useful to display the SQL query that is being executed in order to study how it should be modified to support execution on SQL Server. Because the construction of different types of queries has been standardised, it may be easy to make queries more portable by changing the query formatter classes.

Maintenance-9 : Create query formatter classes that standardise the way common types of SQL queries are constructed. Allow the query formatters to support changes of case and have them adopt a consistent way of indenting lines of SQL code to make them more readable.
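A query formatter along these lines could be sketched as follows. This is a minimal illustration under assumptions, not the RIF's actual formatter API: the class name, its methods, the three-space indent and the example table and field names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical formatter for simple SELECT queries.
final class SelectQueryFormatterSketch {
    private static final String INDENT = "   ";

    private final boolean upperCaseKeywords;
    private final List<String> selectFields = new ArrayList<>();
    private final List<String> whereConditions = new ArrayList<>();
    private String fromTable;

    SelectQueryFormatterSketch(boolean upperCaseKeywords) {
        this.upperCaseKeywords = upperCaseKeywords;
    }

    SelectQueryFormatterSketch addSelectField(String fieldName) {
        selectFields.add(fieldName);
        return this;
    }

    SelectQueryFormatterSketch setFromTable(String tableName) {
        fromTable = tableName;
        return this;
    }

    // Adds a "field=?" condition; the value would be bound later,
    // for example through a PreparedStatement.
    SelectQueryFormatterSketch addWhereParameter(String fieldName) {
        whereConditions.add(fieldName + "=?");
        return this;
    }

    String generateQuery() {
        if (selectFields.isEmpty() || fromTable == null) {
            throw new IllegalStateException("A SELECT field and a FROM table are required");
        }
        StringBuilder query = new StringBuilder();
        query.append(keyword("select")).append('\n');
        appendIndentedLines(query, selectFields, ",");
        query.append('\n').append(keyword("from")).append('\n');
        query.append(INDENT).append(fromTable);
        if (!whereConditions.isEmpty()) {
            query.append('\n').append(keyword("where")).append('\n');
            appendIndentedLines(query, whereConditions, " " + keyword("and"));
        }
        return query.append(';').toString();
    }

    // Writes one indented item per line, ending all but the last with the separator.
    private void appendIndentedLines(StringBuilder query, List<String> items, String separator) {
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                query.append(separator).append('\n');
            }
            query.append(INDENT).append(items.get(i));
        }
    }

    // The single place where keyword case is decided.
    private String keyword(String word) {
        return upperCaseKeywords
            ? word.toUpperCase(Locale.ROOT)
            : word.toLowerCase(Locale.ROOT);
    }
}
```

A Manager class would then describe what it wants rather than how the text is laid out, for example new SelectQueryFormatterSketch(true).addSelectField("identifier").setFromTable("rif_studies").addWhereParameter("geography").generateQuery(). Because keyword case and indentation are decided in one place, a change needed for SQL Server could in principle be made in the formatter rather than in every query.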

Coding Conventions

Coding conventions for this layer are described here.