Domain Analysis Using Textual Analysis Approach
The analysis stage can be broken down into two sub-stages. This article covers the first:
Domain Analysis: construct a domain model that shows the static structure of the system. It contains real-world classes and their relationships to each other. The information for the domain model comes from the problem statement, general knowledge of the real-world domain, the expert knowledge of subject-matter experts (SMEs) in the application domain, and similar applications previously developed in the same domain.
We will explore the steps of domain analysis in the context of the following system:
Case Study – ATM System
Here’s the problem statement for the ATM network system that is to be implemented:
Design the software to support a computerized banking network including both human cashiers and automatic teller machines (ATMs) to be shared by a consortium of banks. Each bank provides its own computer to maintain its own accounts and process transactions against them. Cashier stations are owned by individual banks and communicate directly with their own bank’s computers. Human cashiers enter account and transaction data.
Automatic teller machines communicate with a central computer that clears transactions with the appropriate banks. An automatic teller machine accepts a cash card, interacts with the user, communicates with the central system to carry out the transaction, and dispenses cash and prints receipts. The system requires appropriate record keeping and security provisions. The system must handle concurrent accesses to the same account correctly.
The banks will provide their own software for their own computers; you are to design the software for the ATMs and the network. The cost of the shared system will be apportioned to the banks according to the number of customers with cash cards.
Using this problem statement, we aim to devise a precise, concise, understandable and correct model of the real world. The analysis model addresses three aspects of objects: the static structure of objects (class model), the interactions between objects (interaction model) and the life-cycle histories of objects (state model). To determine the domain model for an application, we take the following steps:
Find classes.
Prepare a data dictionary.
Find associations.
Find attributes of objects and links.
Organize and simplify classes using inheritance.
Verify all access paths.
Iterate and refine the model.
Reconsider the level of abstraction in the model.
Group classes into packages.
Finding Classes
This is the first step in coming up with a domain model: finding relevant classes for objects from the application domain. To find classes, we extract nouns that the user can readily understand, not computer implementation constructs (such as hash tables or lists). These nouns are the candidate classes for the system being built, and each should make sense in the application domain. Start from the problem statement to determine the initial set of candidate classes, then draw on common knowledge of the problem domain and SME knowledge.
Below is a simple example of this technique:
Problem statement: “A reservation system to sell tickets to performances at various theatres”. Nouns: Reservation, Ticket, Performance, Theatre and System.
First just find classes using nouns in the problem statement. Don’t worry about inheritance or high-level classes. In this step, we are only trying to find specific classes.
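As a minimal sketch of this noun-extraction step, the Python snippet below pulls candidate class names out of the reservation-system sentence. Real part-of-speech tagging is out of scope here, so the snippet assumes a small hand-written noun lexicon (KNOWN_NOUNS, a hypothetical name) and a naive plural-stripping rule. Both are illustrative assumptions, not part of the technique itself.

```python
import re

# Hypothetical hand-written lexicon: singular noun -> candidate class name.
KNOWN_NOUNS = {
    "reservation": "Reservation",
    "system": "System",
    "ticket": "Ticket",
    "performance": "Performance",
    "theatre": "Theatre",
}


def singular(word: str) -> str:
    """Naive singularization: drop a trailing 's' if that yields a known noun."""
    if word.endswith("s") and word[:-1] in KNOWN_NOUNS:
        return word[:-1]
    return word


def candidate_classes(statement: str) -> list[str]:
    """Return candidate class names in order of first appearance."""
    found: list[str] = []
    for token in re.findall(r"[a-z]+", statement.lower()):
        name = KNOWN_NOUNS.get(singular(token))
        if name and name not in found:
            found.append(name)
    return found


statement = "A reservation system to sell tickets to performances at various theatres"
print(candidate_classes(statement))
# ['Reservation', 'System', 'Ticket', 'Performance', 'Theatre']
```

In practice the nouns are usually picked out by hand or with a proper tagger; the point is only that every noun becomes a candidate before any filtering happens.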
For the ATM example, we find the following classes from the problem statement:
Software, Computerized Banking Network, Human Cashiers, ATM, Consortium, Bank, Computer, Account, Transaction, Cashier Station, Bank Computer, Account Data, Transaction Data, Central Computer, Cash Card, User, Central System, Cash, Receipt, System, RecordKeeping Provision, Security Provision, Access, Cost, Customer.
Using our knowledge of the problem domain, we add: Communication Line, Transaction Log.
We now discard unnecessary and incorrect classes. Classes that are vague (not very specific), implementation level, irrelevant (class has nothing to do with the problem) or redundant (two classes represent the same concept) should be eliminated.
For the ATM example, the following classes should be eliminated: implementation classes (Access, Software, Transaction Log, Communication Line), vague classes (System, Security Provision, RecordKeeping Provision, Banking Network), redundant classes (User, which serves the same purpose as Customer) and irrelevant classes (Cost).
Final set of candidate classes for the ATM example: Account, ATM, Bank, Bank Computer, Cash Card, Cashier, Cashier Station, Central Computer, Consortium, Customer, Transaction.
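The elimination step can be sketched as a simple filter. The snippet below records the rejections named above together with their reasons and drops those names from a candidate list. The names Reason, REJECTED and keep_candidates are hypothetical, and the sample list is a small subset chosen for illustration, not the full candidate set.

```python
from enum import Enum


class Reason(Enum):
    """Why a candidate class is discarded."""
    VAGUE = "vague"
    IMPLEMENTATION = "implementation construct"
    IRRELEVANT = "irrelevant"
    REDUNDANT = "redundant"


# Rejections stated in the text above (class name -> reason).
REJECTED = {
    "Access": Reason.IMPLEMENTATION,
    "Software": Reason.IMPLEMENTATION,
    "Transaction Log": Reason.IMPLEMENTATION,
    "Communication Line": Reason.IMPLEMENTATION,
    "System": Reason.VAGUE,
    "Security Provision": Reason.VAGUE,
    "RecordKeeping Provision": Reason.VAGUE,
    "Banking Network": Reason.VAGUE,
    "User": Reason.REDUNDANT,  # same concept as Customer
    "Cost": Reason.IRRELEVANT,
}


def keep_candidates(candidates):
    """Drop every candidate that has a recorded rejection reason."""
    return [name for name in candidates if name not in REJECTED]


# A subset of the candidates, chosen for illustration only.
sample = ["Software", "ATM", "Bank", "Account", "User", "Customer", "System", "Cost"]
print(keep_candidates(sample))  # ['ATM', 'Bank', 'Account', 'Customer']
```

Keeping the reason next to each rejected name makes the decision easy to revisit when the model is refined later.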
Preparing a Data Dictionary
A data dictionary gives the scope and meaning of each class in the final set from the previous step, including any assumptions or restrictions on its use. It also describes the attributes and operations of each class. Following is a sample data dictionary for the ATM example.
Account: a single account at a bank against which transactions can be applied. Accounts may be of various types, such as checking and savings. A customer can hold more than one account.
ATM: a station that allows customers to enter their own transactions using cash cards as identification. The ATM interacts with the customer to gather transaction information, sends the transaction information to the central computer for validation and processing, and dispenses cash to the user. We assume that the ATM need not operate independently of the network.
Bank: a financial institution that holds accounts for customers and issues cash cards authorizing access to accounts over the ATM network.
And so on. As an exercise, you should extend this data dictionary to cover the other classes identified above (Cash Card, Bank Computer, Cashier, Cashier Station, Central Computer, Consortium, Customer, Transaction).
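One lightweight way to keep a data dictionary honest is to store it as a mapping and check it against the final class list. The sketch below does that for the three entries given above, with definitions abbreviated. The names FINAL_CLASSES, DATA_DICTIONARY and undefined_classes are hypothetical.

```python
FINAL_CLASSES = [
    "Account", "ATM", "Bank", "Bank Computer", "Cash Card", "Cashier",
    "Cashier Station", "Central Computer", "Consortium", "Customer",
    "Transaction",
]

# Class name -> definition (abbreviated from the sample entries above).
DATA_DICTIONARY = {
    "Account": "A single account at a bank against which transactions "
               "can be applied.",
    "ATM": "A station that allows customers to enter their own "
           "transactions using cash cards as identification.",
    "Bank": "A financial institution that holds accounts for customers "
            "and issues cash cards.",
}


def undefined_classes(classes, dictionary):
    """Return the classes that still lack a data dictionary entry."""
    return [name for name in classes if name not in dictionary]


for name in undefined_classes(FINAL_CLASSES, DATA_DICTIONARY):
    print("missing entry:", name)
```

Run as written, the check reports the eight classes left as an exercise.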
Finding Associations
Associations show structural relationships between classes. Attributes should not be used to refer from one class to another; in such cases, an association is the better choice. For example, class Person should not have an attribute called employer; instead, relate class Person and class Company with an association WorksFor. During the analysis stage, we are not concerned with how associations are implemented in a programming language. We are just trying to capture all the important relationships between the classes that make up the application domain model.
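To make the Person and Company example concrete, here is a minimal sketch in which WorksFor is a first-class link rather than an employer attribute on Person. It is only one possible rendering, since analysis deliberately leaves the implementation open. The helper employers_of and the sample names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str  # note: no 'employer' attribute here


@dataclass(frozen=True)
class Company:
    name: str


@dataclass(frozen=True)
class WorksFor:
    """One link of the WorksFor association between a Person and a Company."""
    employee: Person
    employer: Company


def employers_of(person, links):
    """Follow WorksFor links from a person to the companies they work for."""
    return [link.employer for link in links if link.employee == person]


alice = Person("Alice")
acme, globex = Company("Acme"), Company("Globex")
links = [WorksFor(alice, acme), WorksFor(alice, globex)]
print([c.name for c in employers_of(alice, links)])  # ['Acme', 'Globex']
```

Because the relationship lives outside both classes, a person with several employers needs no change to Person or Company.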
To identify associations, look for stative verbs or verb phrases. These include physical location (NextTo, PartOf), directed actions (Drives), communication (TalksTo), ownership (Has, PartOf), or satisfaction of some condition. Remove associations between classes that were eliminated earlier, and remove associations that fall outside the problem domain. An association should describe a structural property of the problem domain, not a transient event. Decompose ternary associations into binary associations or rephrase them as qualified associations. Omit redundant or derived associations.
Give each association a proper name. Use role names to label association ends whenever possible, and use qualified associations wherever possible to add more information about an association. Specify the multiplicity at each end of the association; start with a good guess, since multiplicity often changes as the analysis proceeds. Add any missing associations. Use aggregation wherever it fits to indicate a part-of relationship.
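As a rough sketch of what naming, role names and multiplicity add to an association, the snippet below models an association with two labelled ends. The class names Multiplicity, AssociationEnd and Association, the role names, and the multiplicities chosen for Consortium and Bank are all assumptions made for illustration (a "good guess" in the sense above), not values given by the problem statement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Multiplicity:
    lower: int
    upper: Optional[int]  # None means "many" (no upper bound)

    def allows(self, count: int) -> bool:
        """True if 'count' linked objects satisfies this multiplicity."""
        return count >= self.lower and (self.upper is None or count <= self.upper)


@dataclass(frozen=True)
class AssociationEnd:
    class_name: str
    role: str
    multiplicity: Multiplicity


@dataclass(frozen=True)
class Association:
    name: str
    source: AssociationEnd
    target: AssociationEnd


# Initial guess: a consortium consists of one or more banks,
# and each bank belongs to exactly one consortium.
consists_of = Association(
    name="ConsistsOf",
    source=AssociationEnd("Consortium", "owner", Multiplicity(1, 1)),
    target=AssociationEnd("Bank", "member", Multiplicity(1, None)),
)

print(consists_of.target.multiplicity.allows(3))  # True
print(consists_of.source.multiplicity.allows(2))  # False
```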
For the ATM example, the following verb phrases can be captured from the problem statement and are potential associations:
Banking network includes cashier stations and ATMs (involves eliminated classes)
Consortium shares ATMs (derived association using two primary associations: Consortium owns central computer and Central computer communicates with ATMs)
Bank provides bank computer
Bank computer maintains accounts (a statement of action. Can be changed to Bank computer has accounts)
Bank computer processes transaction against account (ternary relationship: can be broken into Bank computer processes transaction and Transaction concerns account)
Bank owns cashier station
Cashier station communicates with bank computer
Cashier enters transaction for account (ternary relationship: can be broken into Cashier enters transaction and Transaction concerns account)
ATM communicates with central computer about transaction (ternary relationship: can be broken into ATM communicates with central computer and Transaction entered on ATM)
Central computer clears transaction with bank (shows interaction relationship and implies a communication relationship between the central computer and the bank)
ATM accepts cash card (shows interaction relationship and not a permanent relationship)
ATM interacts with the user (shows interaction relationship and not a permanent relationship)
ATM dispenses cash
ATM prints receipts
System handles concurrent access (implementation construct)
Bank provides software (involves eliminated classes)
Cost apportioned to banks (involves eliminated classes)
Implicit Verb phrases
Consortium consists of banks
Bank holds accounts
Consortium owns central computer
System provides recordkeeping (involves eliminated classes)
System provides security (involves eliminated classes)
Customer has cash cards
Knowledge of problem domain
Cash card accesses accounts
Bank employs cashiers
Transactions entered on cashier stations
Customers have accounts
Transactions authorized by cash cards
Bank is a part of Consortium (Aggregation)
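The pruning applied to the list above can be sketched in a few lines: split a ternary phrase into two binary associations, then discard any association that touches an eliminated class. Only a handful of the phrases are encoded here. The names Assoc, decompose_ternary and retained are hypothetical.

```python
from typing import NamedTuple


class Assoc(NamedTuple):
    """A binary association written as 'source verb target'."""
    source: str
    verb: str
    target: str


# Classes discarded during class selection (subset used by this example).
ELIMINATED = {"Banking Network", "System", "Software", "Cost"}


def decompose_ternary(actor, verb, middle, link_verb, target):
    """Split 'actor verb middle ... target' into two binary associations."""
    return [Assoc(actor, verb, middle), Assoc(middle, link_verb, target)]


def retained(assocs):
    """Keep only associations whose two ends both survived elimination."""
    return [a for a in assocs
            if a.source not in ELIMINATED and a.target not in ELIMINATED]


candidates = [
    Assoc("Banking Network", "includes", "ATM"),
    Assoc("Bank", "owns", "Cashier Station"),
    Assoc("Consortium", "consists of", "Bank"),
    Assoc("Bank", "provides", "Software"),
    *decompose_ternary("Bank Computer", "processes", "Transaction",
                       "concerns", "Account"),
]

for a in retained(candidates):
    print(a.source, a.verb, a.target)
# Bank owns Cashier Station
# Consortium consists of Bank
# Bank Computer processes Transaction
# Transaction concerns Account
```

Transient interactions such as "ATM accepts cash card" and derived associations still need a human judgment call; a filter like this only handles the mechanical part.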
Finding Attributes
Attribute values should not be objects; use an association to show any relationship between two objects. Attributes usually correspond to nouns followed by possessive phrases, such as “the color of the car” or “the position of the cursor”. We usually depend on our real-world knowledge and SME knowledge of the problem domain to discover attributes for the domain model. The analysis model should capture only attributes that relate directly to the problem domain, and we should not worry about derived attributes in this phase.
If the value of an attribute depends on a particular context, consider restating the attribute as a qualifier. For example, if a Person works for multiple companies, the person will have a unique employee number for each job. If we use an association to capture the WorksFor relationship between a Company and a Person, then employeeNumber is a qualifier on this association. Avoid using object identifiers or implementation-dependent information to identify objects.
Sometimes a value requires the presence of a link; the property is then an attribute of the association and not of either related class (especially in m:n relationships). For example, in an association between Person and Club, the attribute membershipDate belongs to the association, because a person can belong to many clubs and a club can have many members.
Avoid implementation-detail attributes. Avoid boolean attributes; use enumerations instead.
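The three ideas in this step (a qualifier, a link attribute and an enumeration in place of a boolean) can be sketched as follows. This is an illustrative rendering under stated assumptions: the class and method names, the sample people, clubs, dates and employee numbers are all hypothetical, and people are represented by plain name strings only to keep the example short.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class AccountType(Enum):
    """An enumeration used instead of a boolean such as 'is_savings'."""
    CHECKING = "checking"
    SAVINGS = "savings"


@dataclass(frozen=True)
class Membership:
    """Link between a Person and a Club; membership_date belongs to the link."""
    person: str
    club: str
    membership_date: date


class Company:
    """WorksFor qualified by employee number: (company, number) -> one person."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._employees: dict[int, str] = {}  # employee number -> person name

    def hire(self, employee_number: int, person: str) -> None:
        self._employees[employee_number] = person

    def employee(self, employee_number: int) -> str:
        return self._employees[employee_number]


# The same person holds a different employee number at each company.
acme, globex = Company("Acme"), Company("Globex")
acme.hire(17, "Alice")
globex.hire(4, "Alice")

# The same person has a separate membership date for each club.
memberships = [
    Membership("Alice", "Chess Club", date(2020, 1, 15)),
    Membership("Alice", "Rowing Club", date(2021, 6, 1)),
]

print(acme.employee(17), globex.employee(4))
print(sorted(m.club for m in memberships if m.person == "Alice"))
print(AccountType("savings"))
```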
Refining with Inheritance
Organize classes by using inheritance to share common structure. For example, RemoteTransaction and CashierTransaction are similar except in how they are initiated, and can be generalized by Transaction. Top-down generalizations are often apparent from the application domain. Look for noun phrases composed of various adjectives applied to the class name: fluorescent lamp, incandescent lamp; fixed menu, pop-up menu, sliding menu.
For similar associations (i.e. associations that appear more than once with substantially the same meaning), try to generalize the associated classes.
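A minimal sketch of the Transaction generalization is shown below. The shared structure sits in the superclass and the subclasses differ only in where the transaction is entered. The amount attribute and the entered_on method are hypothetical and added purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Structure shared by every kind of transaction."""
    amount: float  # hypothetical attribute, for illustration only

    def entered_on(self) -> str:
        """Where the transaction is initiated; the only point of difference."""
        raise NotImplementedError


class RemoteTransaction(Transaction):
    def entered_on(self) -> str:
        return "ATM"


class CashierTransaction(Transaction):
    def entered_on(self) -> str:
        return "Cashier Station"


transactions = [RemoteTransaction(50.0), CashierTransaction(120.0)]
for t in transactions:
    print(type(t).__name__, t.amount, t.entered_on())
# RemoteTransaction 50.0 ATM
# CashierTransaction 120.0 Cashier Station
```

Associations that apply to every transaction, such as "Transaction concerns account", then attach once to the superclass instead of being repeated for each subclass.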
Refine the Class Model
Further refine the class model using the additional information obtained from the interaction model of the system. Add new classes by analogy where needed. Split a class in two when doing so keeps each set of closely related attributes together in one class. If a class plays multiple roles, break it into two or more classes so that each new class plays only one specific role. Look for missing associations. Remove redundant associations, that is, those that add no information or can be derived from others.
Figure 1 – ATM network system Class Diagram