Project 2 Deliverable

For this assignment, everyone needs their PostgreSQL account on vwardrobe. Your account name will be your Westmont email address. Your password was announced in class and you should have changed it by now.
  1. Please attach a copy of your E/R schema from Part 1 of your project. If you have modified your design because of my feedback (or any other reason), please submit the modified design instead; the new design will not be graded but will be compared with your relational design.
  2. Write an SQL database schema for your project, using the CREATE TABLE commands. Pick suitable datatypes for each attribute. Section 2.3 of your text provides examples and additional information. Specifics can be easily found in the PostgreSQL documentation. Submit a file, "createschema.sql", that contains the SQL commands you use to create your database schema (it is a good idea to keep and maintain this file for the balance of the course).
  3. Write and include another file, "destroyschema.sql", that will drop all of the relations that you created in "createschema.sql".
  4. In yet another file, "runqueries.sql", use the INSERT command to insert a few tuples into each of your relations.
  5. Develop a substantial amount of data for your database and load it into the appropriate relations using the postgres copy command. (Try '\h copy' in psql for information on how to copy data from files to tables.) Place the "copy" command inside your "runqueries.sql" file after the inserts.

    To create the actual data, write, and include in your submission, a program in your favorite programming language (that I can run on Linux) that creates large files of tuples or records in a format acceptable to the Postgres copy command. (Make sure that your code is adequately documented!) After generating and saving this data to a file, load the resulting file into your Project relations. If you are using real data for your project, your program will need to transform the data into files of records conforming to your project schema. The rest of you will write a program to fabricate data; your program will generate either random or nonrandom (e.g., sequential) records that conform to your schema. Note that it is both fine and expected for your data values -- strings especially -- to be meaningless gibberish. The point of generating large amounts of data is so that you can experiment with a database of reasonable size, rather than a "toy" database. The data you generate and load should be on the order of:

    • At least two relations with a few thousand tuples each,
    • At least one additional relation with hundreds of tuples,
    • The rest having 10 to 20 tuples.
    If the semantics of your application includes relations that are expected to be relatively small (e.g., schools within a university), it is fine to use some small relations, but please ensure that you have relations of the sizes prescribed above as well. When writing a program to fabricate data, there are two important points to keep in mind: (a) Although you have not (yet) declared keys in your database schema, you should already know what they are. (Note, you did declare keys in your E/R diagrams and relational schema.) Be sure not to generate duplicate values for these attributes when designing your data-generation program. (b) Your project certainly includes relations that are expected to join with each other. For example, you may have a Student relation with attribute courseNumber that is expected to join with attribute number in relation Course. In generating data, be sure to generate values that actually do join -- otherwise all of your interesting queries will have empty results! One way to guarantee joinability is to generate the values in one relation, then use the generated values in one relation to select joining values for the other relation. For example, you could generate course numbers first (either sequentially or randomly), then use these numbers to fill in the courseNumber values in the Student relation.
  6. Include a README file with information about each of the files you're submitting. Don't forget to save a copy of your work for reference as you proceed to subsequent deliverables of the project.
  7. Submission

    Refer as needed to the information under the heading, Some Mechanics for Project Assignments, given in Project 1. Hopefully, you are working on your project in a folder with sub-directories for each deliverable. Regardless, create a folder consisting of your name and "p3" for this deliverable. In this folder, place your E/R diagram (reflecting any changes you have made); the files createscheama.sql, destroyschema.sql and runqueries.sql; any program and/or data files needed to generate random data; and a README file that describes each of the included files. Zip or tar this folder and its contents so that when I unpack it, it will create a folder with your emailname and "p3". I expect to be able to unpack your file, visit your directory, look at your README file, load create and populate your database from psql with "\i createschema.sql" and "\i runqueries.sql", run any other ad hoc queries of my own choosing, and then remove your schema with "\i destroyschema.sql".