Design of the data model
A model tries to represent the reality. We can compare it with a plan for a house. A data model concerns the organisation and documentation of data. It is a plan for the design of a database. It concerns
; which data are stored?
; how the data have been structured?
; how the data are joined with each other?
; how the data are stored?
We will use the entity relationship model (Peter Chen 1976) to design the data model. 1 The entity relationship model
An entity is something we want to store data about. This can be in an abstract way (e.g. an appointment) or real (e.g. a product). Examples of entities are
; the SUBJECT “Linear Algebra” st; the GROUP “1 bac industrial sciences”
the PROF “Dhoore Fernand” ;
; the BOOK “Database Processing”
; the COURSE “Relational System Design”
; the EMPLOYEE “Dekoning Carlo”
Entities are grouped in entity classes. The entity class describes the structure of the entities (lists its attributes, to see further). Each instance (or occurrence") of an entity class is called an entity and is described by the concrete values of its attributes.
We further often will use the name entity instead of entity class.
Entities have attributes or properties.
Examples of attributes are
; SubjectName, Points, HoursLecture, ... for SUBJECT
; Count, ... for GROUP
; LastName, FirstName, Km, ... for PROF
; Title, ISBN, ... for BOOK
; Length, ... for COURSE
; Name, Function, MonthSalary, ... for EMPLOYEE
All instances of a particular entity class must have the same attributes. Multivalued attributes are not allowed. E.g. if an entity CUSTOMER has an attribute PhoneNr, then this attribute can contain only 1 phone number. Also relations can have attributes.
An attribute that uniquely identifies an entity (instance) is called an identifier. E.g. for an entity
STUDENT that can be a unique identification number. The other attributes Name, FirstName, BirthDate are probably not unique. An identifier also can consist of several attributes. This is called a composed identifier. Thus the combination Name + FirstName can act as a composed
identifier if there are no two students with the same name + first name.
Remark: identifiers play the same role for entities as keys for tables (relations, to see further). Instead of identifier sometimes the term key attribute is used.
Design of the data model 1
1One are more entities can be connected with each other. Such connections are called relations.
Examples of relations are
; A PROF teaches a SUBJECT
; A READER borrows a BOOK
; An EMPLOYEE is the manager of another EMPLOYEE
Relations can be named differently depending of the viewpoint. Instead of “a PROF teaches a
SUBJECT” we can also say “a SUBJECT is teached by a PROF”. Instead of “an EMPLOYEE is the manager of another EMPLOYEE” we can say “an EMPLOYEE is managed by another EMPLOYEE”.
In an entity relationship diagram (ERD) entities, attributes and relations are graphically represented. There are several notations in use. We follow the notation of Chen. Entities are rectangles, attributes ellipses and relations diamonds.
Figure 1 entity with attributes
Figure 2 entity with attributes: alternative
Figure 1 and Figure 2 show several possibilities to represent the entity DEPARTMENT and its attributes DepartmentNr, Name and Location. We underlined the identifier.
Nouns indicate entities and attributes. The nature of a relation is reflected by a verb. An entity in one model can be an attribute in another.
2 Relations: types and maximum number of participants
The number of entities involved in a relation is called the degree of the relation. The degree of the relation "EMPLOYEE works in DEPARTMENT" is two because two entities are involved. A relation of degree two is also called a binary relation.
When a subject is teached by a PROFessor to a STUDENT, this can be considered as a relation of degree 3. Most supporting software concerning data models however works exclusively with binary relations; therefore one must try to split up a relation of degree 3 in several relations of degree 2.
1 To be even more accurate, one also uses the relation class for a connection between entity classes and the name relation for a connection between entity instances. In the future we will always use the term relation. From the context it will be clear what we mean.
Design of the data model 2
Figure 3 ternary relation
Unary relations are also possible e.g. “PERSON is the brother/sister of (another) PERSON”,
“EMPLOYEE is the manager of (another) EMPLOYEE”.
Binary relations are classified based on the maximum number of entities that can occur on both 2sides of the relation.
; “1:1” or “one to one”
; “1:n” or “one to n” or “one to many”
; “n:n” or “n to n” or “n:m” or “n to m” or “many to many”
2.1 one to one relations
Figure 4 one to one relation
Figure 5 one to one relation: from left to right Read from left to right: an EMPLOYEE is the head of 1 DEPARTMENT.
2 One also uses the terms cardinality and connectivity.
Design of the data model 3
Figure 6 one to one relation: from right to left Read from right to left: a DEPARTMENT has as head 1 EMPLOYEE. This relation implies that no EMPLOYEE is the head of several DEPARTMENTs. Also no
DEPARTMENT has 2 are more heads of DEPARTMENT.
Figure 7 one to one relation
Read from right to left:a TEAM has as professors’ room 1 ROOM.
Read from right to left: a ROOM is the professors’ room of 1 TEAM.
This relation implies that a TEAM has only one professors’ room and that a room can be the
professors’ room of only one team, in other words no 2 teams share a room.
2.2 one to many relations
Figure 8 one to many relation
Figure 9 one to many relation: from left to right Read from left to right: a DEPARTMENT has many EMPLOYEEs. Design of the data model 4
Figure 10 one to many relation: from right to left Read from right to left: an EMPLOYEE works in 1 DEPARTMENT.
Figure 11 one to many relation
A house is the home of many PERSONs. A PERSON lives in one HOUSE. 2.3 many to many relations
Figure 12 many to many relation
Figure 13 many to many relation: from left to right Read from left to right: a PROFessor teaches many SUBJECTs.
Figure 14 many to many relation: from right to left Design of the data model 5
Read from right to left: a SUBJECT is teached by many PROFessors.
Figure 15 many to many relation
A READER borrows many BOOKs and many READERs can borrow a BOOK.
3 Relations: minimum number of participants.
3For a relation we also can indicate the minimum of entities involved in the relation. The
minimum is denoted before the maximum followed by a colon. 1:n means minimum 1 and maximum n.
Figure 16 one to one relation: minima
Read from left to right: an EMPLOYEE not necessarily (0) is the head of a DEPARTMENT or not all EMPLOYEEs are head of (must be head of) a DEPARTMENT.
Read from right to left: a DEPARTMENT has (at least) one head. 1:1 thus means: every DEPARTMENT has precisely one head.
Figure 17 one to one relation: minima
Each TEAM has exactly 1 professors’ ROOM but a ROOM does not have to be a professors’ ROOM of a TEAM.
3 One also uses the term ordinality.
Design of the data model 6
Figure 18 one to many relation: minima
Every DEPARTMENT has at least 1 EMPLOYEE. Every EMPLOYEE belongs to exactly 1 DEPARTMENT.
Figure 19 one to many relation: minima
A HOUSE can be the home 0 PERSONs and thus be inhabited. A PERSON can live in 0 HOUSEs thus be homeless.
Figure 20 many to many relation: minima
A PROFessor teaches at least 1 SUBJECT and each SUBJECT is teached by at least one PROFessor.
Figure 21 many to many relation: minima
A READER can borrow 0 or more BOOKs and zero or more READERs can borrow a BOOK. If one knows exactly the value of n, then one denotes its concrete value instead of n. So 2:5 means at least 2 and maximum 5. Some authors, schemes and software change the notations to the left and the right of the relation (diamond). This can give rise to some confusion. Design of the data model 7
4 Special entities
4.1 Weak entities
A weak entity is an entity which cannot exist on its own; its existence logically depends on the existence of (an)other identifying identity(ies).
E.g. consider an entity BOOK. A library can own different copies of the same book. There exist a one to many relation between the entities book and copy. A copy of a book cannot exist without the existence of the book.
For this relation BOOK and COPY are called the parent and the child respectively.
Figure 22 weak entity
To indicate that we deal with a weak entity and that book is the identifying entity, a double rectangle is used for the weak entity (COPY) and a double diamond for the identifying relation (“has as copy”). We then can e.g. identify a copy of a book by combining the identifier of book (Title + Author) with a CopyNumber (e.g. copy nr. 3 of the book “Harry Potter and the Deathly Hallows” of “J.K. Rowling”). CopyNumber is no identifier for Copy but only a partial identifier.
In the ERD diagram we use a dashed line instead of a line. We also use the name ID-dependent entity here. The situation is depicted in Figure 23
Figure 23 weak entity (id-dependent)
Other examples of such a relation are the relation between COMPUTERMODEL and COMPUTER, between SOFTWARE en SOFTWAREVERSION, between COURSE and
For the relation between BOOK and COPY also another choice can be made for the attributes. If each copy receives a unique number within the library, then this number satisfies to identify a copy. The relation no longer is an identifying relation. We no more use a double rectangle or a double diamond. COPY remains a weak entity (cannot exist without the existence of the entity BOOK); in this situation COPY no longer is ID-dependent.
Design of the data model 8
Figure 24 weak entity - not ID-dependent
4.2 Subtypes and super types
A subtype-entity is a special case of another entity (super type).
So an entity EMPLOYEE can have as subentities the entities MANAGER, ADMIN and PROF. The super type contains all common attributes; the subtypes contain the specific attributes. The super type can contain an attribute that acts as a discriminator e.g. Sex for subtypes MAN and WOMAN.
Subtypes can be exclusive (entity belongs to maximum 1 subtype) or inclusive (entity can belong to several subtypes, e.g. an EMPLOYEE which is at the same time PROF and MANAGER). A subtype-relation also is called an “IS-A”-relation in contrast to the other relations that are
The main reason to introduce subtypes is to avoid too much NULL-values for some attributes. If a PROF e.g. teaches only one subject, this subject can be an attribute of PROF. Setting this attribute as an attribute of EMPLOYEE would lead to NULL-values for the attribute subject for all EMPLOYEEs who are not a PROFessor.
5 Replacing a ternary relation by binary relations
A ternary relation between e.g. PROF, STUDENT and SUBJECT cannot be replaced simply by 3 binary relations: one between PROF and SUBJECT, one between PROF and STUDENT and one between STUDENT and SUBJECT.
“Student Peeters follows the subject Algebra by Hellings” is not equivalent to “Student Peeters follows the subject Algebra”, “Hellings teaches Algebra” and “Student Peeters follows a course
of Hellings”. It is then possible that Peeters follows Algebra by another professor and another subject by Hellings.
The ternary relation can be correct translated using a weak entity LECTURE and 3 identifying relations (to STUDENT, PROF and SUBJECT).
6 Example: library
A library plans to automate the borrowing of books. The analysis of the problem leads us to the following synbthesis.
The library has several branch libraries scattered over the city, each with their own opening hours.
A member may be enrolled in one of the branches and must be known with full name and address to permit checking where the books are and the sending of warnings. …
Design of the data model 9
When a book is borrowed, the date is registered. When the book is returned this date is stored too. These can be used for calculating the fines.
We choose not to incorporate the summations and fines in the model. However, the model must contain the necessary data to be able to send the summations and to calculate the fines.
7 Example: internal training
7.1 The problem
2The meanwhile renowned training institute IE, Interactive E-based Educating, wishes to give in-
service training to its own employees. Moreover all data about this must be kept in a database for managing internal training.
Per trimester under the employees a brochure is distributed concerning the planned courses (see 错误；未找到引用源。Figure 25).
Systeem design with Oracle
CASEStart: 9 februari 2006 Days: 5 Prof: dagen certificate Location
Introduction SQL and SQL*Plus
Start: 19 april 2006 Days: 4 Prof: Mark Swinnen dagen Location: Interactive E-based Educating
Figure 25 brochure spring 2006
Courses are classified in three types: introductory courses (GEN), courses on application building (BLD) and courses on system analysis and -design (DSG). Courses can be arranged more often in the same trimester, where the place of the course implementation, the teacher, who itself works in the firm, and the starting date can possibly be different. The duration of the course, its type and a unique code are fixed.
The same course never starts at the same time on two different locations.
An employee can register itself for a particular course implementation. After the course the teacher assesses the activity of the course participant with a score between 1 and 10. At the moment employees’ data are still recorded on index cards (see Figure 26).
Since the first use of a time clock - against which indeed much protest had come - each employee has got a unique identification number. This number must used to identify employees. This number exists of 4 digits.
Each employee has been assigned to one department. A department always has one place as its location and one head of department. Remark that the direct manager of an employee not necessarily is the head of the department. The departments have a unique number (of 2 or 3 digits). Recently a new department was added and at this moment it still has no employees. Many employees can work in a department.
Design of the data model 10