For this assignment, everyone needs their PostgreSQL account. Your
account name will be the same as your Westmont student email account
(typically your first initial and first seven letters of last name, but
not always). Your password will be the final six (6) digits of
student ID number. Accounts are activated as of Tuesday, Sept 30
You should submit your project
assignments from this one onward as email. When multiple files
are involved in an assignment, please bundle them with tar or zip.
a README file explaining what is in
the directory and what role the files play in answering the various
parts of the assignment. Please put your name in the README
Project 3 Deliverable
Write an SQL database schema for your project, using the CREATE
Pick suitable datatypes for each attribute.
Page 292-293 of the text gives you the principal options regarding
but PostgreSQL specifics can be easily found in the documentation.
Hand in one (1) file with the commands you use to create your database
schema (it is a good idea to keep this file for the balance of the
Show the response of psql
to a request to describe each of
For example, to see the schema for relation Foo type
Execute five INSERT commands to insert tuples into one
Show the response of psql and the relation that results
you issue a SELECT * command.
Develop a substantial amount of data for your database and load
your relations using the postgres copy command. (Try '\h
copy' in psql for
information on how to copy data from files to tables.)
To create the data,
write a program in any programming language you like that
creates large files of records in a format acceptable to the Postgres
then load the data into your Project relations. If you are using real
data for your project, your program will need to transform the data
files of records conforming to your project schema. The rest of you
write a program to fabricate data: your program will generate
either random or nonrandom (e.g., sequential) records conforming to
your schema. Note that it is both fine and expected for your data
values--strings especially--to be meaningless gibberish. The point
of generating large amounts of data is so that you can experiment with
a database of realistic size, rather than a "toy" database.
The data you generate and load should be on
the order of:
At least two relations with a few thousand tuples each.
At least one additional relation with hundreds of tuples.
If the semantics of your application includes relations that are
expected to be relatively small (e.g., schools within a university),
it is fine to use some small relations, but please ensure that you
have relations of the sizes prescribed above as well. When writing a
program to fabricate data, there are two important points to keep in
Although you have not (yet) declared keys in your
relations, in many cases you probably know that an attribute or set of
attributes in a relation will serve as a key. If so, be sure not to
generate duplicate values for these attributes.
Your project almost certainly includes relations that are
expected to join with each other. For example, you may have a
Student relation with attribute courseNo that's expected
join with attribute number in relation Course. In
generating data, be sure to generate values that actually do
join--otherwise all of your interesting queries will have empty
results! One way to guarantee joinability
is to generate the values in one
relation, then use the generated values in one relation to select
joining values for the other relation.
For example, you could generate course
numbers first (either sequentially or randomly), then use these
numbers to fill in the courseNo values in the student
Turn in your program code for generating
or transforming data, a small
sample of the records generated for each relation (5 or so records per
relation), and a script showing the loading of your data into
of the material for the project descriptions and assignments has been
taken from Professor Ullman's website.