DATA MODELING USING THE ENTITY-RELATIONSHIP MODEL

Click here for audio-text lecture and feed it to the speech agent

Click here for part A of an audio lecture that can be played using RealPlayer

Click here for part B of an audio lecture that can be played using RealPlayer

Click here for part C of an audio lecture that can be played using RealPlayer

  • E-R model is a high-level conceptual model for database design
  • Example
    1. COMPANY has departments
    2. Department has name, number, manager
    Manager is an employee
    Manager has starting date
    Department has several locations
    3. Department controls projects
    Project has name, number, location
    4. Employee has name, social security, number, address, salary, sex, birthdate.
    5. Employee is in one department but may work on several projects
    6. Hours each employee works on each project
    7. Employee has supervisor
    8. Employee has dependents with name, sex, birthdate, and relationship to employee.
    Phases of database design (simplified):


    ENTITIES AND ATTRIBUTES


  • An entity is an object in the real world.
  • Each entity has properties, called attributes.
  • A particular entity has values for the attributes.
  • Attributes may be:
    simple (atomic) (ZIP)
    composite (ADDRESS)
    single-valued (AGE)
    multivalued (DEGREES)
  • Derived attribute:
    AGE from BIRTHDATE
  • Attributes may have NULL values
    unknown
    not applicable




    ENTITY TYPES


  • An entity type is the set of entities having the same attributes
  • We can describe an entity type by an entity type schema
  • The real entities in the entity type are entity instances
    entity type <-> entity instances
    object type <-> object instances
    relation <-> tuples (relation instances)
    intension <-> extension

    Two entity types and some of the member entities:


    KEY ATTRIBUTE OF AN ENTITY TYPE


  • A key attribute uniquely identifies an entity
    The entity type is
    (v1, v2)
    Suppose the entities are:
    (1, A)
    (4, A)
    (5, B)
    The key attribute is the first attribute v1.

    DOMAIN OF AN ATTRIBUTE
    The domain of v1 is {1,4,5}
    The domain of v2 is {A,B}
    CARTESIAN PRODUCT
    {1,4,5} X {A,B}
    {1,A},{1,B},{4,A},{4,B},{5,A},{5,B}

    The CAR entity type. Multivalued attributes are shown between braces {}. Components of a composite attribute are shown between ():


    INITIAL CONCEPTUAL DESIGN OF THE COMPANY DATABASE


  • First we identify the entity types
    DEPARTMENT
    PROJECT
    EMPLOYEE
    DEPENDENT
  • Then we identify the attributes for each entity type
  • Multivalued attributes, composite attributes, derived attributes, attributes with possible nulls
  • Key attributes are identified or assumed.
    For example, we assume each project has a unique project_number
  • Question: Why do we need key attributes?
  • Are there real life objects without key attributes?

    RELATIONS, ROLES AND STRUCTURAL CONSTRAINTS



  • An entity type PARTICIPATES in a relationship type
    eg. EMPLOYEE and DEPARTMENT participate in the relationship WORKS_FOR
  • DEGREE: the number of participating entity types
    eg. WORKS_FOR has degree 2
  • ROLE_NAME: the role each participating entity plays in a relationship.
    eg. EMPLOYEE plays the role "employee" in WORKS_FOR relationship
  • RECURSIVE ROLES: EMPLOYEE can play the role of both "employee" and "supervisor"
  • CONSTRAINTS: We use CARDINALITY RATIO to express a constraint on a relationship type, such as
    1:1
    1:N
    M:N
  • PARTICIPATION CONSTRAINTS: partial or total PARTICIPATION CONSTRAINT
    An employee MUST work for a department
    An employee entity can exist only if it participates in a WORKS_FOR relationship instance
    Thus its participation is TOTAL


    Only some employees manage departments
    The participation is PARTIAL


    A formal constraint: (m, n) where
    m, n are number of times

    PARTICIPATION CONSTRAINT


    An employee MUST work for a department
    An employee entity can exist only if it participates in a WORKS_FOR relationship instance
    Thus its participation is TOTAL


    Only some employees manage departments
    The participation is PARTIAL


    A formal constraint: (min,max) where m, n are min and max number of times an entity participates in a relationship instance. For example, (0,10) means partial participation, and (1,max) means total participation.
    Some instances of the WORK_FOR relationship:


    The ternary relationship SUPPLY:


    The recursive relationship SUPERVISION - EMPLOYEE plays two roles of "supervisor" and "supervisee":


    The 1:1 relationship MANAGES, with partical participation of EMPLOYEE and total participation of DEPARTMENT:


    The M:N relationship WORKS_ON:


    ATTRIBUTES OF RELATIONSHIP TYPES


    An employee entity WORKS_ON a project
    The relationship is M:N
    If we want to keep track of the number of hours an employee works for a project, it is difficult to keep it as an attribute of the employee entity
    We can keep HOURS as an attribute of the WORKS_ON relationship type

    An employee WORKS_FOR a department
    The relationship is N:1
    If we want to keep track of the starting date of an employee working for a department, where do we keep it?
    IN SCHEMA DESIGN, THE DESIGNER MUST DECIDE WHERE TO KEEP ATTRIBUTES.

    WEAK ENTITY TYPES



    John Doe is an employee. He has a dependent Cindy Doe.
    Mary Doe is an employee. She has a dependent Cindy Doe.
    The two DEPENDENT entities are IDENTICAL!
    Such entity types are called WEAK ENTITY TYPES, where entities may not be distinct.
    (They still correspond to same real-life object)
    Weak entity type must be OWNED by some owner entity type. For example, the EMPYLOEE entity type owns the DEPENDENT entity type.
    How can you eliminate weak entity types?

    INITIAL DESIGN OF COMPANY DATABASE


    DEPARTMENT
    Name, Number, {Locations}, Manager, ManagerStartDate

    PROJECT
    Name, Number, Location, ControllingDepartment

    EMPLOYEE
    Name, SSN, Sex, Address, Salary, BirthData, Department, Supervisor, {WorksOn(Project, Hours)}

    DEPENDENT
    Employee, DependentName, Sex, BirthDate, Relationship



    Note: {} indicates multivalued attributes, and () indicates component attributes of a composite attribute

    ADD RELATIONSHIP TYPES TO DATABASE SCHEMA


    MANAGES (1:1)
    EMPLOYEE partial
    DEPARTMENT total
    Attribute: StartDate

    WORKS_FOR (1:N)
    DEPARTMENT total
    EMPLOYEE total

    CONTROLS (1:N)
    PROJECT total
    DEPARTMENT partial

    SUPERVISION (1:N) EMPLOYEE partial
    EMPLOYEE partial

    WORKS_ON (M:N)
    EMPLOYEE total
    PROJECT total
    Attribute: Hours

    DEPENDENTS_OF (1:N)
    EMPLOYEE partial
    DEPENDENT total
    (Weak Entity Type)

    The ER Diagram

    ER diagram is a graphical design tool


    ER schema diagram for the COMPANY database:


    ER diaram for the COMPANY schema with all role names included and with structural constraints on relationships specified using the (min,max) notation:


    Summary of ER diagram notation:


    An example of ER diagram for the library is shown below, where a key icon indicates the relationship is linked to the key attribute, and a ring icon indicates the relationship is linked to an attribute (which could be part of the key).


    A TERNARY RELATIONSHIP TYPE IS NOT EQUIVALENT TO THREE BINARY RELATIONS



  • CAN_TEACH: relates a course to the instructors who can teach that course
    (i,c)
  • TAUGHT_DURING: relates a semester to the instructors who taught some course during that semester
    (i,s)
  • OFFERED_DURING: relates a semester to the courses offered during that semester by ANY instructor
    (s,c)
  • OFFERS: relates an instrcutor who offers a course during a semester
    (i,c,s)
    Constraint: (i,c,s) cannot exist unless (i,c),(i,s),(s,c) exist, but converse is not true.

    The following figures illustrate (a) ternary relationship type SUPPLY; (b) three binary relationship types are not equivalent to one ternary relationship type SUPPLY; (c) another example of ternary versus binary relationship type:



    A weak entitye type INTERVIEW with a ternary identifying relationship type:


    THE UML MODEL

    The Unified Modeling Language(UML) is an object-oriented analysis and design that evolved as a result of the combined work of James Rumbaugh, Grady Booch, and Ivar Jacobson. In 1997, a UML consortium(which include Microsoft, Hewlett-Packard, Oracle, MCI Systemhouse, IBM, IntelliCorp and more) submitted the 1.1 version of the UML to the Object Management Group(OMG). Since then, UML has become a de facto standard in the software industry, and it continues to evolve.

    For database modeling, the most useful part of UML is the Class Diagram and Object Diagram. We can use these diagrams for object-oriented database design.

    An example of the conceptual design of a library information system

    A library is planning to develop a library information system. This library will serve only registered students. Each book has title, author, publisher, date-of-publication, ISBN number and a set of keywords. Each student has SSN, first name, last name, address, telephone number. A student can borrow a book for up to two weeks, so it is necessary to keep track of the date-of-borrow. He can then renew the book for another two weeks. Two renewals are allowed. After two renews, the student must return a borrowed book. If a book is overdue, the student must pay a fine of $0.5 a day.

    ER Model for library information system

    Entity type BOOK(ISBN,title,author,publisher,date-of-pub,{keywords}) Notice {keywords} is a multi-valued attribute. The key is ISBN.

    Entity type STUDENT(SSN,fname,lname,address,telephone) The key is SSN.

    Relation type BORROW(SSN,ISBN,date-of-borrow,no-of-renewal,fine)

    The BORROW relation type is one-to-many. Participation of BOOK is partial Participation of STUDENT is partial

    The UML Class Diagram

                            ---------------
                            |  Borrow     |
          ------------------|-------------|-------------------------
       1  |             N   |  SSN, ISBN  |   1               1    |
    --------------          | data-borrow |                  ---------------
    |   Student  |          | no-renewal  |                  |    Book     |
    |------------|          |  telephone  |                  |-------------|
    | SSN        |          |     fine    |                  |    ISBN     |
    | fanme      |          |-------------|                  |   author    |
    | lname      |          | borrow()    |                  |    title    |
    | address    |          | renewal()   |                  | publisher   |
    | telephone  |          | cal-fine9)  |                  | date-pub    |
    |------------|          ---------------                  |  keywords   |
    | add()      |                                           |-------------|
    | delete()   |                                           |    add()    |
    --------------                                           |  delete()   |
                                                             ---------------
    
    

    The UML Object Diagram

                      -------------------
        --------------|Doe,book1: Borrow|-------------------------------
        |             -------------------                              |
        |                 -------------------                   -------------
        |    -------------|Doe,book2: Borrow|-----------        |book1: Book|
        |    |            -------------------          |        -------------
    --------------                                     |
    |Doe: Student|                              -------------
    --------------                              |book2: Book|
                                                -------------