DOCX

datamodel_to_tables_English_2008_2009

By Mario Payne,2014-01-19 23:03
9 views 0
datamodel_to_tables_English_2008_2009

From data model to tables

    1 Relations

    In the introductory chapter we also used the name relation instead of table. To be a relation, some extra conditions must hold.

     rows contain data about an entity

     columns contain data about attributes of entities

     all entries in a column are of the same kind

     each column has a unique name

     cells of the table hold a single value

     the order of the columns is unimportant

     the order of the rows is unimportant

     no two rows may be identical

    2 Functional dependency

    Consider s=f(t), z=g(x,y), total=count*price.

    s depends on t, z depends on x and y, total depends on count and price. We say s is functionally dependent on t

     z is functionally dependent on x and y

     total is functionally dependent on count and price

    and we write

     t;s

     (x,y);z

     (count,price);total

    This notation stands for the functional dependency and not the formula. What‟s to the left of the arrow is called a determinant. A determinant can be composite or not.

    A functional dependency exists when the value of one or more attributes determines the value of another attribute (or other attributes).

    Not every functional dependency can be expressed by an equation. In fact this holds true in most cases because if an attribute can be simply computed, it can be simply omitted. E.g. consider tables and chairs where tables always are white or black and chairs red or green. Based on the color we then know if we deal with a table or a chair. Thus

     color;kind (table or chair)

    As another example consider (for books)

     ISBN;(Title, Author)

    3 Keys

    A key is a combination of one or more columns to identify particular rows from a table. If a key contains more than one column, it is called a composite key.

    A candidate key is a determinant that determines all other columns.

    Candidate keys identify precisely one row of a relation (table). Given the value of a candidate key, we can find one and only one row in the relation that has this value.

    From data model to tables 1

Consider the table WHO_TEACHES from the introductory chapter.

    WHO_TEACHES

    SubjectCode GroupCode Professor

    ALG 1BACA Hellings

    ALG 1BACB Verraedt

    ALG 1BACE Vanbeveren

    ANA 1BACA Van Huyck

    ANA 1BACB Van Huyck

    ANA 1BACE Vanbeveren

    A given value for a SubjectCode identifies a number of rows. A SubjectCode equal to ALG identifies the first three rows and SubjectCode ANA identifies the last three rows. (SubjectCode, GoupCode) is a candidate key. Given a value for SubjectCode (e.g. ALG) and a value for GroupCode (e.g. 1BACE) there is only one row containing those values for the columns given namely ALG, 1BACE, Vanbeveren.

    During the design of the database one must chose one candidate key as primary key. The DBMS

    will use this primary key. Each table has only one primary key (composite or not). Because no row of our database contains identical rows, each table has at least one candidate key namely the combination of all columns.

    A surrogate key is an artificial column that is added to a table to serve as the primary key. This key has no meaning to the users and can be added automatically by the DBMS when the row is created. In case the primary key contains many columns, a surrogate key is recommended. A foreign key is a column or combination of columns that is the primary key of a table other than the one in which it appears. It represents a relation.

    Figure 1 foreign key

    Consider the situation of Figure 1. sporting club is the primary key of the second table. sportclub s added as foreign key to the first table. i

    In most cases the values of the foreign key must correspond to an existing value of the primary key. This is a referential integrity constraint.

    From data model to tables 2

4 From Entities and Relations to Tables

    4.1 Entities

    Each entity becomes a table and an identifier of an entity is chosen as primary key of the table.

    Composite attributes of an entity must be split in several attributes. One can also transform a

    composite attribute to an entity if this attribute plays a role on its own. Consider the entity EMPLOYEE.

    Figure 2 entity EMPLOYEE

    We then get the following table.

    EMPLOYEE(EmployeeNr, Name, FirstName, Function, MonthSalary, BirthDate) After the name of the table follow inside parenthesis the column names and the primary key is

    underlined.

    Consider the entity DEPARTMENT.

    Figure 3 entity DEPARTMENT

    This becomes a table

    DEPARTMENT(DepartmentNr, Name, Location)

    The relations from an ERD must also represented. We will add extra columns to the tables and/or

    extra tables.

    4.2 One to one relations

    The primary key of one table is stored in another table as a foreign key. Consider the example EMPLOYEE is manager of/has as manager DEPARTMENT.

    Figure 4 the relation "is manager of/has as manager"

    Two possibilities arise.

     The primary key of the table DEPARTMENT (thus DepartmentNr) can be stored as foreign

    key of the table EMPLOYEE. In this case the table becomes

    EMPLOYEE(EmployeeNr, Name, FirstName, Function, MonthSalary, BirthDate, HeadOf)

    From data model to tables 3

    where the column HeadOf contains the DepartmentNr of the department of which he/she is

    the head. We underlined the foreign key with a dotted line.

    This choice is not that good because most of the employees are not the head of a department

    and the column hence will contain a NULL value.

     The primary key of the table EMPLOYEE (thus EmployeeNr) can be stored as a foreign key

    of the DEPARTMENT table. In this case the table becomes

    , Name, Location, Head) where the column Head contains DEPARTMENT(DepartmentNr

    the EmployeeNr of the head of the department.

    An attribute of the relation gives rise to an extra column.

    4.3 One to many relation: the general case

    The primary key on the one side (the parent) is stored as foreign key of the table on the many side (the child).

    Consider the example EMPLOYEE works in/has as employee DEPARTMENT

    Figure 5 the relation "works in/has as employee"

    DEPARTMENT is the parent and EMPLOYEE is the child. We store the primary key of DEPARTMENT (thus DepartmentNr) as foreign key of EMPLOYEE.

    We then get

    EMPLOYEE(EmployeeNr, Name, FirstName, Function, MonthSalary, BirthDate, DepartmentNr)

    Then consider EMPLOYEE is manager of/has as manager EMPLOYEE. The same table is parent and child. The primary key of the parent (thus EmployeeNr) is stored once again in the table as foreign key. We must choose a good column name. We then get.

    EMPLOYEE(EmployeeNr, Name, FirstName, Function, Manager, MonthSalary, BirthDate,

    DepartmentNr)

    EmployeeNr and Manager both contain the number of an employee. EmployeeNr plays the role of primary key and Manager the role of foreign key (tells who is the manager of the employee given).

    4.4 One to many relation: ID-dependent entity as a special case.

    We consider the other one-to-many relations from the IEE-example. We have EMPLOYEE teaches/is taught by IMPLEMENTATION and IMPLEMENTATION is implementation of/has as implementation COURSE.

From data model to tables 4

    Figure 6 two one-to-many relations

    There are no problems handling the first relation. We just follow the general rule. The primary key of the parent (EmployeeNr from EMPLOYEE) is stored as a foreign key with name Prof to the child (IMPLEMENTATION).

    The second relation is a special one. The entity IMPLEMENTATION namely is a ID-dependent that cannot exist without the existence of the entity COURSE. In this case the primary key of the parent (thus Code from COURSE, we will use the name Course) becomes part of the primary key of the child.

    Hence we get

    IMPLEMENTATION(Course, StartDate, Location, Prof).

    4.5 Many to many relation

    In the case of a many-to-many relation a new table is constructed and the primary keys of both tables involved are part of the primary key of the new table. The attributes of the relation become new columns of this table.

    We consider the relation EMPLOYEE enrolls for/has as participator IMPLEMENTATION. From the above we already have the following tables.

    EMPLOYEE(EmployeeNr, Name, FirstName, Function, Manager, MonthSalary, BirthDate,

    DepartmentNr)

    IMPLEMENTATION(Course, StartDate, Location, Prof)

    The relation also has an attribute Evaluation.

    We construct a new table ENROLMENT

    with key EmployeeNr (we will use CourseMember as name), Course, StartDate ENROLMENT(CourseMember, Course, Evaluation, StartDate).

    As a second example we take the library example and the relation READER borrows/is borrowed by BOOK.

    Based on the tables

    READER(ReaderNumber, ReaderName, ReaderAddress, ReaderCity)

    BOOK(BookNumber, Title, Author, ISBN)

    we construct the new table BORROWING. We let ReaderNumber and BookNumber be part of the primary key. As a primary key we then can take e.g. the combination (ReaderNumber, BookNumber, BorrowDate). (ReaderNumber, BookNumber) is not sufficient because the same reader can borrow the book more than once. We then get

    BORROWING(ReaderNumber, BookNumber, BorrowDate, ReturnDate)

    Another possibility is to use a different ERD. In this ERD we add an extra BORROWING entity with a unique BorrowNumber, BorrowDate and ReturnDate. Then there will be a one to many relation between READER and BORROWING and a many to many relation between From data model to tables 5

    BORROWING and BOOK. The attributes which previously belonged to the relation, disappear; they now are attributes of the new entity. The conversion of the many to many relation gives rise to a new table with as primary key the combination of BorrowNumber and BookNumber. 5 Normalization

    5.1 Anomalies

    It is important to construct a good data model; this means a simple data model without redundancy. Simple means no multivalued attributes and attributes that logically belong to an entity.

    A good ER-diagram mostly gives rise to normalized tables. If not, tables need to be normalized. Normalization is a process that converts tables to other tables without anomalies. For an existing database the following holds: the higher the normal form (see below) of the tables, the better the database (less problems).

    Consider the table PROF_SUB_POINT(Prof, Subject, Points)

    PROF_SUB_POINT

    Prof Subject Point

    Hellings Algebra 4

    Van Huyck Analysis 6

    Verraedt Comp & Data 4

    Vanbeveren Analysis 6

    Verraedt Algebra 4

    If we delete the row “Verraedt, Comp & Data, 4”, we lose not only the fact that Verraedt teaches the subject Comp & Data but also the fact that the subject Comp & Data has a weight of 4 points. This is called a deletion anomaly: deleting data of one entity (the fact that Verraedt teaches

    Comp & Data) means that unintentionally we also delete data of another entity (the fact that Comp & Data counts for 4 points).

    If we want to add that Chemistry has a weight of 8, this is only possible if we also have a prof who teaches Chemistry.

    Data concerning a particular entity can be added only if we also have information about another entity. This is called an insertion anomaly.

    In the table above some data are recorded more than once. This is the case for the points of a subject if there are more than one professor who teaches this subject.

    We can solve the problem by splitting the table into 2 tables PROF_SUB(Prof, Subject) and

    SUB_POINT(Subject, Point).

    From data model to tables 6

PROF_SUB

    Prof Subject

    Hellings Algebra

    Van Huyck Analysis

    Verraedt Comp & Data

    Vanbeveren Analysis

    Verraedt Algebra

    SUB_POINT

    Subject Point

    Algebra 4

    Analysis 6

    Comp & Data 4

    For the two tables some referential integrity constraints can be added.

     e.g. each subject from the column Subject of SUBJECT_POINT must exist in the column

    Subject of PROF_SUBJECT (each “existent” subject must be taught)

     e.g. each column Subject of PROF_SUBJECT must exist in the column Subject of

    SUB_POINT (each subject that is taught must have a weight)

     both conditions must hold

    The real problem arises from the fact that the original table contains data of two different subjects (entities). A normalized table only can have one subject.

    Anomalies can be avoided by normalizing the tables. Several normal forms exist: the first, second, third, Boy-Codd, fourth and fifth normal form where each next one implies the previous one.

    5.2 First and second normal form

    A table is in first normal form if the table represents a relation. Hence all tables considered are in first normal form.

    The previous example contains a dependency where only part of the key is needed. (Prof, Subject) is the key and Subject ;Point. Subject is a determinant and only a part of the key or Point is partly dependent on the key.

    A table is in the second normal form if it is in the first normal form and all attributes not included in the key are dependent on the whole key.

    The table can be transformed to tables in second normal form by splitting it as explained above. If the key is simple, the table automatically is in second normal form. A table only can be not in second normal form is it has a composite key.

    From data model to tables 7

5.3 Third normal form

    Consider the following table.

    EMP_DEP_LOC

    EmployeeNr Department Location

    7499 sales Kontich

    7521 sales Kontich

    7654 sales Kontich

    7782 main Antwerpen

    7788 education Leuven

    We have EmployeeNr;Department and Department;Location because each employee works

    in one department and each department has only one location. Thus EmployeeNr;Location is a transitive dependency.

    Also in this case deletion and insertion anomalies occur. E.g. if we delete the data of employee

    nr. 7782, we also lose the fact that the main office is located in Antwerp. And how do we record

    that the department human resources is located in Heverlee?

    We can delete the transitive dependency by splitting the table in two EMP_DEP(EmployeeNr, Department) and DEP_LOC(Department, Location).

    EMP_DEP

    EmployeeNr Department

    7499 sales

    7521 sales

    7654 sales

    7782 main

    7788 human resources

    DEP_LOC

    Department Location

    sales Kontich

    main Antwerpen

    human resources Leuven

    A table is in the third normal form if it is in the second normal form and it contains no

    transitive dependencies.

    From data model to tables 8

5.4 Boyce-Codd normal form

    Consider the table GROUP_SUBJECT_PROF Group Subject Prof

    1 mathematics Cauchy

    1 physics Maxwell

    2 chemistry Curie

    2 mathematics Euler

    3 physics Einstein

    3 chemistry Curie

    3 mathematics Cauchy

    Each group studies more subjects, each subject is taught by more than one prof but every prof

    teaches only one subject.

    Group determines neither Subject nor Prof, thus Goup is no candidate key. (Group, Subject) and

    (Group, Prof) both are candidate keys and Prof;Subject.

    The table is in the third normal form. Once again anomalies occur. E.g. if we eliminate group 3, we lose the fact that Einstein teaches

    physics.

    We say that the table is in Boyce-Codd normal form (BCNF) if the table is in the third

    normal form and each determinant is a candidate key. We can transform the above table to BCNF by splitting it in two.GROUP_PROF(Group, Prof) and PROF_SUBJECT(Prof, Subject).

    GROUP_ PROF

    Group Prof

    1 Cauchy

    1 Maxwell

    2 Curie

    2 Euler

    3 Einstein

    3 Curie

    3 Cauchy

    From data model to tables 9

PROF_SUBJECT

    Prof Subject

    Cauchy mathematics

    Maxwell physics

    Curie chemistry

    Euler mathematics

    Einstein physics

    5.5 More normal forms

    As already mentioned, a fourth and a fifth normal form exist. It hereby concerns multivalued dependencies (a generalization of functional dependency) and generalizations of multivalued dependencies. We will not discuss this topic in this course.

    1We end by mentioning the result of Fagin who defines the domain/key normal form. A table is

    in domain/key normal form if every constraint is a logical consequence of candidate keys and domains (domain = set of possible values of an attribute). Fagin proves that the domain/key normal form implies all previous normal forms and excludes anomalies.

     1 Fagin R. „A Normal Form for Relational Databases that is based on Domains and Keys.‟ Transactions on Database Systems 6 (September 1981).

    From data model to tables 10

Report this document

For any questions or suggestions please email
cust-service@docsford.com