Project Guidelines, Part 1

As the course progresses you will be building a substantial database application for a real-world scenario of your choosing. This will involve designing a relational schema for the database, and creating an actual database using a relational database management system. You will populate the database with sample data, write interactive queries and modifications on the database, and develop user-friendly tools for interacting with the database.

Some Mechanics for Project Assignments

Throughout the project, your deliverables will be due on the day and time specified on Canvas; make sure that your Canvas settings will promptly notify you when announcements are made or when assignments are updated. Assignments submitted after the due date/time may not receive credit. For all submitted work, include your name (both within files and in the name of the submission) and contact information (within submission components). As always, when submitting more than one file or document, you should create a compressed tar (.tgz) or zip file that, when expanded, will create a folder having your name or username. Your archive should always include a plain-text README file that (minimally) has your name, acknowledgments, and a manifest of included files with a brief description of what each file is and the role it plays in your submission. As you make revisions to your project, you should record in the README a description of what changed and why.
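As a rough sketch of the packaging step described above, the following Python builds a .tgz that expands into a single folder named after you. The function names, the username "jdoe", and the assumption that your README is a file named exactly README are all illustrative choices, not course requirements; any tool that produces an equivalent archive is fine.

```python
import tarfile
import tempfile
from pathlib import Path
from typing import List


def make_submission_archive(source_dir: Path, username: str, out_dir: Path) -> Path:
    """Pack source_dir into <username>.tgz so that it expands into a folder named <username>.

    Assumption: the plain-text README is a file called "README" at the top
    level of source_dir. Adjust the check if yours is named differently.
    """
    if not (source_dir / "README").is_file():
        raise FileNotFoundError("submission folder must contain a plain-text README")
    archive_path = out_dir / f"{username}.tgz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname sets the top-level folder name seen when the archive is expanded.
        tar.add(source_dir, arcname=username)
    return archive_path


def list_members(archive_path: Path) -> List[str]:
    """Return the sorted member names, to double-check the layout before submitting."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Demonstration in a throwaway directory; "jdoe" is a hypothetical username.
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "work"
        work.mkdir()
        (work / "README").write_text("Name, acknowledgments, manifest go here.\n")
        out = Path(tmp) / "out"
        out.mkdir()
        archive = make_submission_archive(work, "jdoe", out)
        for name in list_members(archive):
            print(name)
```

Listing the members before you submit is a cheap way to confirm that everything sits under one folder and that the README actually made it into the archive.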

Over the course of the semester, you will be building and adding more and more functionality. Except for your data-generation program in Project 2, everything should appear at the top level of the folder created by extracting your archive. You may modify files from one project to the next; just be certain to document those updates in the README file, which must be submitted with each deliverable.

Choosing Your Database Project

For your first task, you must propose two domains you would like to manage with your database. I am having you pick two domains in order to maximize the chance that one will be acceptable (and to give you more practice in this crucial step of the design process). For your submission, keep the two candidate application domains clearly distinct. I suggest that you pick two domains that interest you, since you'll be working on one of these for the rest of the semester. I predict that if you pick something about which you are excited -- a hobby, material from another course, a research project, etc. -- you will maximize what you learn from this class (not to mention maximizing your fun in the process). You must also consider what purpose your proposed database will serve. Think about the types of queries you would want to ask of your database. What benefit could your database provide to society (if fully scaled, etc.)? If you cannot come up with interesting questions to ask of the data, then your project will probably be painful to complete.

Try to pick applications that are relatively substantial, but not too enormous. For example, after expressing it in the entity-relationship model, you might want your design to have in the range of five to seven entity sets, and a similar or slightly larger number of relationships. A reasonable design might have a total number of entity sets plus relationships in the 12-18 range; I will look askance at proposals having more or fewer. Be aware, however, that entity sets or relationships that should be represented by attributes instead (a matter we'll discuss in class) do not count. If in doubt, pick a more complex domain and then only model a sufficiently rich subset.

Entity-Relationship Diagram

Having settled on two candidate problem domains for your project, your next step is to construct an entity-relationship diagram for each. Each diagram should reflect a moderate number of entity sets and relationships -- a combined total in the 12-18 range. You should already be thinking about the types of queries you would want to run. In other words, what is the intended purpose of the database? The answer to this question should guide your modeling process.

You should certainly include different kinds of relationships (i.e., many-one, many-many, one-one) and different kinds of data (strings, integers, etc.), but your application is not required to use features such as subclassing, multiway relationships, or weak entity sets, if they are not appropriate for your application.

  1. Create two entity-relationship diagrams, one for each of your proposed database applications. As always, don't forget to underline key attributes and include arrowheads and rounded arrows indicating the multiplicity of relationships. If there are weak entity sets, indicate them by doubled shapes, as described in the text. I suggest you use a drawing tool for this part of the assignment, although a scanned pdf of a neatly hand-drawn diagram is acceptable.
  2. Use the method for translating an E/R diagram to relations that we covered in class and in the text in order to produce two sets of relations from your E/R designs. Specify your relational schema using the notation introduced in class and be sure to underline key attributes. For each relation, include a list of functional dependencies.
  3. Are there any flaws in the relational database schema you get from step 2? Are there opportunities to combine relations without introducing redundancy? If so, indicate these, and if not, indicate that you found none. Are there relation schemas that are not in BCNF, 3NF, or 4NF? If so, should you decompose them? For each opportunity to combine or decompose relations, decide whether or not to do so, and explain your reasoning briefly (e.g., explain what queries you expect will be typical for your database, and tell how the design you pick facilitates them). Is there anything you still don't like about the schema (e.g., attribute names, relation structure, duplicated information, etc.)? If so, modify the relational schema to something better. You will be working with this schema quite a bit, so it's worth spending some time at the beginning to make sure you're happy with it.
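If you want to double-check your hand analysis for steps 2 and 3, a minimal sketch of the standard attribute-closure test may help. Everything named here is hypothetical: the relation Sells(store, item, price, store_city), its functional dependencies, and the helper functions fd, closure, and bcnf_violations are illustrations only, and the check assumes every dependency you list mentions only attributes of the relation being tested. It covers BCNF only, not 3NF or 4NF.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A functional dependency is a pair (left-hand side, right-hand side) of attribute sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: str, rhs: str) -> FD:
    """Build a dependency from space-separated attribute names, e.g. fd("a b", "c")."""
    return frozenset(lhs.split()), frozenset(rhs.split())


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation: Iterable[str], fds: List[FD]) -> List[FD]:
    """Return the listed nontrivial dependencies whose left side is not a superkey."""
    all_attrs = frozenset(relation)
    violations = []
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        is_superkey = all_attrs <= closure(lhs, fds)
        if nontrivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


if __name__ == "__main__":
    # Hypothetical relation and dependencies, for illustration only.
    sells = ["store", "item", "price", "store_city"]
    sells_fds = [fd("store item", "price"), fd("store", "store_city")]

    print(sorted(closure(["store", "item"], sells_fds)))  # all four attributes, so a key
    for lhs, rhs in bcnf_violations(sells, sells_fds):
        print(sorted(lhs), "->", sorted(rhs))  # reports store -> store_city
```

In this made-up example, store alone determines store_city but is not a superkey of Sells, so the city is repeated for every item a store sells. Whether to decompose is still your design decision; the script only flags where to look.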

Project 1 Deliverable Requirements

(10pt) For each proposed project, describe the database application you propose to work with throughout the course. Your descriptions should be brief and relatively informal but should include sufficient motivation to explain why this is an interesting database proposal and sufficient description to show that your scope is neither too large nor too small. You should include examples of the kinds of questions your database will be able to answer. If there are any unique or particularly difficult aspects of your proposed application, please point them out. Your description will be graded on suitability, completeness, and conciseness.

(10pt) Include your entity-relationship diagrams as described above. Don't forget to save a copy of your E/R diagrams for later reference and revision as you work on subsequent parts of the database project.

You should submit two pdf documents, one for each proposed domain. Each document should include the components described above: project description, E/R diagram, conversion to relations and functional dependencies, and final analysis regarding design decisions and normalization.

In your zip or tar file, include a plain-text README file that has your name, acknowledgments, a manifest of included files, and a note expressing your preference between the two proposals that you are submitting.