Welcome





Monday, August 29, 2011

Limitations of E-R model

To quash one very common misconception, we emphasize that the E-R approach is not a relative, a derivative, or a generalization of the relational data model. Infact, it is not a data model at all but a design methodology, which can be applied to the relational model. The term “relationship” refers to one of the

two main components of the methodology rather than to the relational data model.

The two main components of the E-R approach are the concepts of entity and relationship.

· Entities model the objects that are involved in an enterprise.

· Relationships model the connections among the entities.

NAVIGATION TRAP:

Starting with a given entity and moving along the triangle formed by the three relationships, we might end up with a different entity of the same type. This probem is referred as navigation trap.

Consider the example about Client/Broker information.

client/broker information

There is one problem with our design . Suppose that we have the following relationships:

§ Client1, Acct1, Office1 _ HasAccount

§ Acct1, Broker1_ HandledBy

§ Broker1, Office2_ WorksIn

What is there to ensure that Office1 and Office2 are the same i.e., Account1’s office is the same as that of the broker who manages Account1? This problem

is known as a navigation trap: Starting with a given entity, Office1, and moving along the triangle formed by the three relationships HasAccount, HandledBy,and WorksIn, we might end up with a different entity, office2, of the same type.

Navigation traps of this kind are particularly difficult to avoid in the E-R model because doing so requires the use of participation constraints in combination with so-called functional dependencies but these constraints are supported by the E-R model only in a very limited way: as keys and participation

constraints.

Note that we can avoid the navigation trap by removing the relationship HasAccount completely and reintroducing the Owns relationship between clients and accounts. However, this brings back the problem that the constraint that a client cannot have more than one account in any given office is no longer represented.

CONVERSION FROM ENTITY TO ATTRIBUTE OR RELATIONSHIP TO ENTITY:

There is considerable freedom in deciding whether a particular datum should be an entity,a relationship, or an attribute. The arity of a relationship might change by demoting an entity to an attribute or by collapsing an entity into a relationship.



figure 1

Entity or attribute? In Figure1, semesters are represented as entities. However,we could as well make Transcript into a binary (rather than ternary) relation and turn Semester into one of its attributes. The obvious question is which representation

is best (and in which case).

Ø To some extent, the decision of whether a particular datum should be represented as an entity or an attribute is a matter of taste. Beyond that, the representation might depend on whether the datum has an internal structure of its own. I

Ø If the datum has no internal data structure, keeping it as a separate entity makes the E-R diagram more complex and more important, adds an extra relation to your database schema when you convert the diagram into the relational model.

Ø If the datum has attributes of its own, it is possible that these attributes cannot be represented if the datum itself is demoted to the status of an attribute.

For instance, in Figure1 the entity type Semester does not have its own attributes, so representing the semester information as an entity appears to be

overkill. However, it is entirely possible that the Requirements Document might state that the following additional information must be available for each semester:Start_date, End_date, Holidays, Enrollment. In such a case, the semester information

cannot be an attribute of the Transcript relationship.

Entity or relationship? Consider the Figure 1, where we treat transcript records as relationships between Student, Course, and Semester entities. An alternative to this design is to represent transcript records as entities and use a new relationship type, Enrolled, to connect them as shown below.

figure 2

Here we incorporate some of the attributes for the entity Semester, as discussed earlier. We also add an extra attribute, Credits, to the relationship Enrolled. Clearly, the two diagrams represent the same information. but which one is better?

· For instance, it is a good idea to keep the total number of entities and relations as small as possible because it is directly related to the number of relations that will result when the E-R diagram is converted to the relational model.

· Generally, it is not too serious a problem if two relations are lumped together at this stage because the relational design theory is geared to identifying relation schemas that must be split and to providing algorithms for doing that.

· On the other hand, it is much harder to spot the opposite problem: needlessly splitting one relation into two or more.

Coming back to Figure.2, we notice that there is a participation constraint for the entity Transcript in the relationship type Enrolled.Moreover,the arrow leading from Transcript to Enrolled indicates that the Transcript role forms a key of the Enrolled relationship. Therefore, there is a one-to-one correspondence between the relationships of type Enrolled and the entities of type Transcript.

This means that relationships of type Enrolled can be viewed as superfluous, because Transcript entities can be used instead to relate the entities of types Student,

Course, and Semester. All that is required (in order not to lose information) is to transfer the proper attributes of Enrolled to Transcript after converting the latter into a relationship.As a result of this discussion, we have the following rule:

*Consider a relationship type, R, that relates the entity types E1, . . . , En,and suppose that E1 is attached to R via a role that (by itself) forms a key

of R, and that a participation constraint exists between E1 and R. Then it might be possible to collapse E1 and R into a new relationship type that involves only the entity types E2, . . . , En.

Note that this rule is only an indication that E1 can be collapsed into R, not a guarantee that this is possible. For instance, E1 might be involved in some other

Relationship R’. In that case, collapsing E1 into R leaves an edge that connects two relationship types, R and R, which is not allowed by the construction rules for E-R diagrams.

Information loss:

The arity of a relationship might change by demoting an entity to an attribute or by collapsing an entity into a relationship. In all of these cases,however, the transformations obviously preserve the diagrams information content. There are some typical situations where seemingly innocuous transformations cause information loss; that is, they lead to diagrams with subtly different information content.

Consider the Parts/Supplier/Project diagram of Figure.3. Some designers do not like ternary relationships, preferring to deal with multiple binary relationships instead. Such a decision might lead to the diagram shown in Figure .4

figure .3

Figure.4

Although superficially the new diagram seems equivalent to the original, there are several subtle differences.

*First, the new design introduces a navigation trap. It is possible that a supplier, Acme,sells “Screw” and that Acme has sold something to project “Screw Driving.” It is even possible that the screw driving project uses screws of the kind Acme sells.

*From the relationships represented in the diagram it is not possible to conclude that it was Acme who sold these screws to the project. All we can tell is that Acme might have done so.

*The other problem with the new design is that the price attribute is now associated with the relationship Supplies. This implies that a supplier has a fixed price for each item regardless of the project to which that item is sold.

*In contrast, the original design in Figure .3, supports different pricing for different projects Similarly, the new design allows only one transaction between any supplier and project on any given day because each transaction is represented as a triple (p, s, d), so there is no way to distinguish among different transactions between the same parties on the same day.

E-R and object databases:

Some of the difficult issues involved in

translating E-R diagrams into schemas become easier for object databases

* Issues involved in representing entities with set-valued attributes in a relational database.The objects stored in an object database can have set-valued attributes, so the representation of such entities in the schema of the object database is considerably easier.

* Issues involved in representing the IsA relationship in a relational database. Object databases allow a direct representation of the IsA relationship within the schema, so, again, representation of such relationships is considerably easier..

It should be apparent that not only is it generally easier to translate E-R diagrams into schemas for object databases than into schemas for relational databases, but for many applications object databases allow a much more intuitive model of the enterprise than do relational databases.

Monday, August 22, 2011

COMPARING EXPRESSIVENESS OF DIFFERENT QUERY LANGUAGES


This paper outlines a series of exercises that relate propositional logic to various straightforward query languages as used by some popular programs. It also provides exercises which naturally motivate the use of boolean algebra to translate between equivalent formulas (including a practical application of converting a formula to conjunctive normal form). These exercises also illustrate connections between topics student see in logic/discrete math, programming topics, and topics in design.

Click here to download this paper.

Friday, August 19, 2011

Class Assignment 1: Hospital Administration

Schema :
Nurse(NID,Name,Bdate,WID)
Function(FID,Fname,Description)
Ward(WID,Wname,Location)
Services(WID,FID)
Certified(NID,FID)
Queries :
1.print names of nurses not assigned to any ward.
2.print the name of the ward for which no nurse is assigned.
3.for each ward print ward name and no.of services it offers.
4.print the ward with maximum no.of nurses assigned.
5.print the names of nurses whose functions are ensured by the ward to which they are assigned.
6.list the wards that offer all services offered by ward w1.
7.print the name of the most certified nurse.
8.print pairs of nurses assigned to same ward.
9.print the name of wards that ensure each function offered by the hospital.
10.print nurse-id of nurses certified for every function the hospital offers.
11.for each ward print ward-id and nurse-id of most certified nurse.

A Pedagogical tool for Teaching Advanced Database Systems


The designing and implementing database engines may have become a lost art. Although most standard database text books include ample coverage of algorithms for design and implementation database engines, many computer science programs seem to provide minimal coverage of file organizations, theoretical foundations, and algorithms necessary to build a database engine. The systematic removal of “file organizations and information retrieval” as a topic of study coupled with greater emphasis on the so called “practical applications” of databases, have joined hands to eliminate the coverage of theory and implementation of the underlying database engine.


This paper discusses a step by step process by which we in this advanced database course can design and construct a simple, yet fully functional database engine.

Click here to download this paper.

A Simpler (and Better) SQL Approach to Relational Division


A common type of database query requires one to find all tuples of some table that are related to each and every one of the tuples of a second group. In general those queries can be solved using the relational algebra division operator. This paper conveys that the phrasing of this operator in SQL seems to present an overwhelming challenge to novice and experienced database programmers. This paper presents an alternative solution that is not only more intuitive and easier to deliver in the classroom but also exhibits a better computational performance.

Click here to download this paper.

A Relational Model of Data for Large Shared Banks - IBM Research Laboratory

Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed or even when external representation is changed. Changes in data representation will often be needed as a result of changes in query, update etc. Here in this paper we study the inadequacies of the tree structured files or general network data models, and this paper also provides the operations on relations to solve the problems of consistency and redundancy in the user's model.

Click here to download the paper.

Basic Data model proposals


This paper provides a summary of 35 years of data model proposals, grouped into 9 different eras. It discusses the proposals of each era, and show that there are only a few basic data modeling ideas, and most have been around a long time. Later proposal inevitably bear a strong resemblance to certain earlier proposals. Hence, it is a worthwhile exercise to study previous proposals. Here in this paper we study data models in 9 historical epochs;

Click here to download the paper.