Lab 3 : Database Design

Database design process has three steps, namely, conceptual design, logical design and physical design. Lab 3 focuses on the first two steps. Lab. 3 focuses on the first two steps. Study the handout containing the requirements for the databases and prepare the following:

conceptual models (E R diagram)
logical data models (table design, normalization)
SQL DDL statements.

CASE Tools: CASE tools (ER Assistant, Microsoft Visio, ERwin etc.) can be used but are not essential. You may carry out the process by hand since the end product is relatively small ( 6 to 12 entities in ER diagram, 6-12 tables in SQL).

Step I. Requirements

Understanding the requirements is crucial to successful design of databases and software systems. A popular approach is based on use cases as detailed below:

List major categories of human users of the proposed database system.
List application programs (i.e. software systems) interacting with the the proposed database system.
What do different user categories and application programs need from the database? Limit your answer to a few use cases per user category and application program.
Prioritize the use cases. Briefly justify the priority level of the use cases with high priority.
Examine various forms, reports and legacy codes against the list of high priority use cases. Identify relevant parts of forms, reports and legacy codes data declaration. Also list irrelevant parts of forms, reports and legacy codes data declaration.
Design a few test cases (e.g. queries) for the database. The test cases may be based on high priority use cases.

Deliverables include

A summary statements limited to about 1 page (500 words) identifying user categories, application programs, high priority use cases and sample test queries.

Conceptual Data Model - First draft

The goal of this step is to prepare a draft of the conceptual model using the process discussed in the class.

First identify the entities and their identifying attributes. Entities can come from Nouns in the text. They also come from the groups of related items in forms, reports and legacy code. First draft may include about half a dozen major entities. You can include additional entities in next draft after reviewing the first draft against requirements.
Next add relationships between the entities. Sources for relationships include verbs in sentences connecting multiple entities. Legacy code may include foreign keys associating sets of records. Those may also include nesting (e.g. part-of) or generalization (e.g. inheritance) hierarchies. Do not include inheritance hierarchies for now. Do not the cardinality and participation constraints yet.

Deliverables include

First draft of the ERD with relevant entities, relationships, identifying attributes, a few other major attributes. First draft of ERD need not be perfect. Next steps evaluate the draft for possible revisions.
Brief justification of entities and relationships based on cross-referencing against the requirements (e.g. use cases, forms, reports, legacy code, conversations). Use a table to summarize the justifications.
Are all high priority use cases covered? If not, list the high priority use cases which are not covered. Are come entities covering only low priority use cases? Limit your answer to 100 words.
Brief review of the ERD for completeness and consistency using the checks listed in Table 7-4 (pp. 237) of the textbook. Limit the discussion to 100 words.

Conceptual Data Model - Revised draft

Revised draft of ERD addresses the issues uncovered from the evaluation of the first draft. It also adds details, e.g. attributes, cardinality constraints, sub-types. The revision process may be decomposed into the following steps:

First address the completeness and consistency issues listed in Table 7-4 (pp. 237) of the textbook. For example, add relationships to entities that do not participate in any relationships.
Next add additional entities and relationships to cover the high priority use cases not covered by the first draft. Consider dropping entities and relationships which only cover low priority use cases.
Review the ERD again for completeness, correctness and coverage of high priority use cases.
Add attributes, cardinality and participation constraints. Consider identifying SQL data types for each attributes.
Examine the possibilities for refinements (e.g. transforming attributes into entity types, splitting compound attributes, adding generalization hierarchies) as discussed in section 7.5.3 (pp. 2320 of textbook.
Review the ERD again for completeness, correctness and coverage of high priority use cases.

Deliverables include

A summary of revisions. Limit your answer to about 200 words.
A table listing names of all entities, their identifying attributes, other attributes.
A table listing names of all relationships, names of participating entities, cardinality constraints, and participation constraints. For generalization hierarchies list disjointness and completeness constraints for subtypes.

Logical Data Model - Table Design

A conceptual data model can be translated into a logical data model using the conversion rules discussed in section 7.6 (pp. 235-248). It is useful to evaluate the resulting relation for conformance to first few normal forms (chapter 8). The process may be decomposed into the following steps:

Translate the entities into relations. Translate multi-valued attributes into separate relations.
Translate Entities in generalization hierarchy using section 7.6.3.
Translate 1:1 and 1:N relationships into foreign keys in one of the relations for the participating entities. For 1:1 relationships, consider both options and justify the choices made. Check that the relations for Weak Entities include the primary keys from their identifying entities.
Translate M:N relationships in new relations. Include primary keys of participating entities in the relation.
Identify primary keys for every relation.
Identify foreign keys in each relation and whether null values are allowed for those by the participation constraints.
Identify SQL data type for each attribute. Ensure that the data type for foreign keys matches data type for corresponding primary keys.
Identify dependencies among attributes. Test the relation design for conformance to first few normal forms.

Deliverables include

A table listing names of all relations, with primary keys constraints and referential integrity constraints.
Result of the test checking for conformance to first few normal forms.
A table listing names of all attributes, entities they describe, SQL data types, uniqueness constraints, null-ability constraints.

SQL DDLs for tables

Deliverables include

SQL code to define the schema for relations. Include primary key and referential integrity constraints. Specify integrity rules for any attributes that need them.