04.04 Databases

The document discusses database concepts and designing data structures for storing customer and transaction information for a bank. It describes designing two data structures - one to store unique customer information and another to store multiple transactions for each customer. The customer data structure uses the customer ID as the primary key, while the transaction structure uses customer ID as the primary key and a sequence number as the secondary key.

Uploaded by

Mazin Mukhtar
Copyright
© © All Rights Reserved
Databases

This is a huge topic, so only the most basic concepts will be covered.

The ability to store data permanently elevates programs from being mere super-sized
calculators to the feature-rich, powerful tools that they are.

The term data structure refers to the logical layout or format of something that stores data,
whether while the program is running (e.g. an array in RAM) or after the program ends (e.g.
in a file). Using the coordinates of a point as an example, that could mean two arrays for
the x and y coordinates (the associated arrays), or a record of two numbers per line in a file.
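
As a minimal sketch of the two layouts just described, the same points can be held as two associated lists while the program runs, or as one "x y" record per line of text. The names xs, ys, points_to_lines and lines_to_points, and the sample coordinates, are illustrative assumptions, not part of the course material.

```python
# Two associated (parallel) lists: index i in both lists describes point i.
xs = [1.0, 2.5, -3.0]
ys = [4.0, 0.5, 2.0]


def points_to_lines(xs, ys):
    """Format each point as one 'x y' record, one record per line."""
    return [f"{x} {y}" for x, y in zip(xs, ys)]


def lines_to_points(lines):
    """Rebuild the two associated lists from 'x y' records."""
    new_xs, new_ys = [], []
    for line in lines:
        x_text, y_text = line.split()
        new_xs.append(float(x_text))
        new_ys.append(float(y_text))
    return new_xs, new_ys


lines = points_to_lines(xs, ys)
print(lines)  # ['1.0 4.0', '2.5 0.5', '-3.0 2.0']
```

The list of lines is what would be written to a file; reading it back with lines_to_points gives the same two lists again.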

The term database refers to the actual data, stored in an organized fashion in one or more
data structures (e.g. files). Databases can be structured simply or be very complex; they have
at least one data structure (logical organization) but can involve more.

Exercise 04.02.04 dealt with a simple data structure for a bank account. Its structure was:
customer ID balance

But a more realistic situation would be that i) the bank would hold more demographic
information about the customer than just the ID, and ii) the bank would want to hold information
about the history of transactions on the account. A transaction is a set of actions and the
information needed to complete one logical unit of processing. What counts as “one unit of
processing” depends on the circumstances.

For banking, one typical transaction (logical work unit) is “put money into account”, called a
credit transaction or just credit, while another is “take money out of account”, referred to as a
debit transaction or just debit. The detailed information for a credit transaction would be the
amount, the date/time, the account ID, the source of the money, etc. The actions associated
with the transaction can include retrieving the account information, validating the account ID,
validating the amount, validating the date/time, etc.

Considerations for designing a data structure for storing the data include whether that data is
unique or whether there are multiples in some way. Data that is unique (i.e. one record per
entity or “thing”, like 1 customer) is usually kept in one data structure (which usually means
one list and/or one file), while data that is repeated in some way (i.e. one entity has many
records, like many transactions for 1 customer) is kept in another data structure. Data structures
meant to store multiples (in some way) usually need keys made of more than 1 field.

For instance, typical customer demographic information could be:

Information                 Data type

customer ID                 positive integer, 6 digits long
customer first name         20 char
customer last name          20 char
customer marital status     1 char, M/S, for Married/Single
owns home                   Yes/No
balance                     positive float (current amount in the bank)

For each customer, this information is usually unique (i.e. each customer has only 1 ID, 1 last
name, 1 marital status, etc.). The information column describes how the data is to be used, whereas
the data type describes the physical aspect of the information. For coded information, like
“customer marital status”, any description needs to indicate what can be stored there (i.e. “M” or
“S”), and what each code means.

An example of how this data would appear could be:


100000 Sammi Chowdhury S N 400.50
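
A record like this can be split into typed fields, as the rough sketch below shows. It assumes the fields are separated by whitespace and that names contain no spaces; the function name parse_customer and the dictionary keys are illustrative choices, not something the course files are known to require.

```python
def parse_customer(line):
    """Split one whitespace-separated customer record into typed fields.

    Assumes exactly six fields, with no spaces inside the names.
    """
    cust_id, first, last, marital, owns_home, balance = line.split()
    return {
        "customer_ID": int(cust_id),
        "first_name": first,
        "last_name": last,
        "marital": marital,        # "M" = Married, "S" = Single
        "owns_home": owns_home,    # stored as the code found in the record
        "balance": float(balance),
    }


record = parse_customer("100000 Sammi Chowdhury S N 400.50")
print(record["last_name"], record["balance"])  # Chowdhury 400.5
```

Converting the ID and balance to numbers straight away means later code can compare and add them without repeating the conversion.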

Typical account transaction information could be:

customer ID                 positive integer, 6 digits long
date/time or sequence       (let’s revisit this later)
type                        1 char, C/D, for Credit/Debit
amount                      positive float

Credit means a positive change (add to the account). Debit means a negative change (take away
from account).

There would typically be zero to many of these transactions for each customer.
An example of how this data would appear could be:

100000 Sept 14, 2020, 1:00pm C 100.00
200001 Sept 14, 2020, 1:05pm D 202.11
100000 Sept 25, 2020, 4:15pm D 43.00

Notice in this database, the amounts are always positive even though some amounts are taken
away from the account balance. That’s why there is a type field – C means the amount adds to
the account, while D means the amount is taken away from the account.
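
To illustrate how the type field decides the direction of each amount, here is a small sketch that applies the three sample transactions above. The tuple layout and the function name net_change are assumptions made for this example; the date/time is kept as plain text and is not interpreted.

```python
# (customer_ID, date/time as text, type, amount): the sample rows above.
transactions = [
    (100000, "Sept 14, 2020, 1:00pm", "C", 100.00),
    (200001, "Sept 14, 2020, 1:05pm", "D", 202.11),
    (100000, "Sept 25, 2020, 4:15pm", "D", 43.00),
]


def net_change(transactions, customer_id):
    """Add the credits and subtract the debits for one customer."""
    total = 0.0
    for cust, _when, kind, amount in transactions:
        if cust != customer_id:
            continue            # record belongs to a different customer
        if kind == "C":
            total += amount     # credit: money into the account
        elif kind == "D":
            total -= amount     # debit: money out of the account
    return total


print(net_change(transactions, 100000))  # 57.0
```

The stored amounts never change sign; only the code that uses them looks at the type to decide whether to add or subtract.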

An alternative design could skip the type field, but then the amount itself needs to be positive
or negative. And instead of a date/time to help order the transactions, a sequence number is
used.
For example:

100000 002 100.00
200001 121 -202.11
100000 003 -43.00

What information is lost if using the alternative data structure?

According to these transactions, customer 100000 has 2 transactions: first they credited
$100.00 to their account (sequence 002), then later debited $43.00 from their account (sequence
003).
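
The reading above can be reproduced with a short sketch that picks out one customer's rows, orders them by sequence number, and turns each signed amount back into a C/D type with a positive amount. Storing the sequence as an integer (so 002 becomes 2) and the names history and to_typed are assumptions made for this illustration.

```python
# (customer_ID, sequence, signed amount): the sample rows above.
signed = [
    (100000, 2, 100.00),
    (200001, 121, -202.11),
    (100000, 3, -43.00),
]


def history(rows, customer_id):
    """Return one customer's rows ordered by sequence number."""
    mine = [row for row in rows if row[0] == customer_id]
    return sorted(mine, key=lambda row: row[1])


def to_typed(amount):
    """Turn a signed amount into a (type, positive amount) pair."""
    return ("C", amount) if amount >= 0 else ("D", -amount)


for _cust, seq, amount in history(signed, 100000):
    kind, positive = to_typed(amount)
    print(seq, kind, positive)
```

Sorting on the sequence number plays the role that the date/time played in the first design: it puts one customer's transactions in order.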

A final consideration is that some data could be optional. For example, a bank could have a
data structure to hold information about the properties a customer owns. But not every
customer owns property.

Designing the data structure

Since customer personal or demographic information is unique, it would go into one structure.
By contrast there could be zero to many transactions, so transaction information would go into
another structure. But there are two things to consider.

The data from different customers will be mixed together in the database files; it is clumsy to
have a separate file per customer (although this is technically possible). This means there needs
to be a way to find the records for one customer among many stored customer records.
Databases often use a key to identify or manage the records for one customer. The key can be
a single datum, or field, from the data structure, or it can be made of several fields (a field is
any one of the information elements in a structure). The key must provide a unique identifier for
each customer. For the banking situation, the clear choice is the customer ID. Each file
or array that holds customer information must have this key field, in order to work with
individual customers.
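
As a sketch of what finding one customer by key can look like in Python, the records below are searched by customer_ID, first by scanning a list and then through a dictionary built on the key. The second customer's name is made up purely for illustration, and the function name find_customer is an assumption.

```python
customers = [
    {"customer_ID": 100000, "first_name": "Sammi", "last_name": "Chowdhury"},
    # Made-up placeholder record, only here so there is more than one customer.
    {"customer_ID": 200001, "first_name": "Placeholder", "last_name": "Person"},
]


def find_customer(records, customer_id):
    """Scan the mixed records and return the one whose key matches."""
    for record in records:
        if record["customer_ID"] == customer_id:
            return record
    return None   # no customer has this key


# Alternative: build a dictionary on the key once, then look up directly.
by_id = {record["customer_ID"]: record for record in customers}

print(find_customer(customers, 100000)["last_name"])  # Chowdhury
print(by_id[200001]["first_name"])                    # Placeholder
```

Both approaches only work because the key is unique: if two customers shared an ID, the scan would return whichever came first and the dictionary would silently keep only the last one.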

For the customer demographic structure, with its many unique fields, this one key is sufficient.
However, in the transactions data structure, one customer will have many transactions. The main
use of this structure is to hold a customer’s information, so customer ID needs to be the (main)
key field. But to put the records for one customer in order, a secondary key field, which helps
identify each record uniquely, must also be used; the customer ID and the secondary key together
identify a single transaction. For bank transactions, this is often a date/time field. However, as
I haven’t gotten around to looking at date/time info in Python, we can use a sequence number
field as the secondary key field. Additional key fields can be added as needed.

So the data structures can be designed as follows:

Customer Information
customer_ID     (positive integer 6 digits long)     key field
first_name      (20 char)
last_name       (20 char)
marital         (1 char, M/S)
owns_home       (1 char, T/F)

Transactions
customer_ID     (positive integer 6 digits long)     key field
sequence        (positive integer)                   secondary key field
type            (1 char, C/D)
amount          (positive float)
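
One possible way to hold these two structures in a running Python program is sketched below, using a dictionary keyed on customer_ID for customers and a dictionary keyed on the pair (customer_ID, sequence) for transactions. The use of dataclasses, the helper add_transaction, and the "F" value for owns_home are assumptions for illustration only; field lengths are not enforced.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_ID: int   # key field
    first_name: str
    last_name: str
    marital: str       # "M" or "S"
    owns_home: str     # "T" or "F"


@dataclass
class Transaction:
    customer_ID: int   # key field
    sequence: int      # secondary key field
    type: str          # "C" or "D"
    amount: float      # always positive


customers = {}      # customer_ID -> Customer
transactions = {}   # (customer_ID, sequence) -> Transaction


def add_transaction(t):
    """Store a transaction, refusing a repeated (customer_ID, sequence)."""
    key = (t.customer_ID, t.sequence)
    if key in transactions:
        raise ValueError(f"duplicate key {key}")
    transactions[key] = t


customers[100000] = Customer(100000, "Sammi", "Chowdhury", "S", "F")
add_transaction(Transaction(100000, 2, "C", 100.00))
add_transaction(Transaction(100000, 3, "D", 43.00))

# All of one customer's transactions, in sequence order.
for key in sorted(k for k in transactions if k[0] == 100000):
    print(transactions[key])
```

Because the dictionary key is the pair of both key fields, two transactions may share a customer_ID or a sequence number, but never both.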
04.04 Databases

Validations can be skipped, but do practise modularity where possible.

04.04.01
Design the data structure for a database given the description of data to be managed by a
family doctor’s office. A patient’s basic information includes:
• their first name
• last name
• being single or part of a family
• being the prime contact for the family or secondary (if a family) or a dependent (child)
• address (unit #/apt #, building #, street, city/town, province, postal code)
• telephone number (only a single number even if the person is part of a family)
• It should be indicated whether the patient is covered by OHIP (Ontario Health Insurance
  Plan), and if covered, what the OHIP number is (must have 9 digits with no leading
  zeroes).

The database should also keep track of:

• date/time (or sequence) of a patient’s visits
• the purpose of the visit
  o whether it was for an initial check-up or
  o a follow-up or
  o whether the visit required a treatment.
• The database should store the cost of a visit based on purpose.

04.04.02
Use the database files, 04.04.02_transactions.txt and 04.04.02_customer.txt included in the
classroom, which are based on the bank account example discussed, to write a program that
can allow the user to enter a customer ID and choose to display either:
i) current summary, including current balance
ii) history of transactions, in sequence order

Note: When showing the current balance, read in all the relevant transactions, then apply the
transactions (adding the credit amounts and subtracting the debit amounts, in sequence order)
to arrive at the current balance.

Note: The current balances for each customer should be:


123456 $530.50
341900 $1200.00
230001 $475.15
200100 $9823.00
300200 $280.00
290303 $800.00

04.04.03
Modify 04.03.02 so users can update existing records.
That is, the program allows the user to do the following:
i) update the amount on hand for a specific vegetable
ii) update the price for a specific vegetable
iii) update the “Selling well?” value between Y/N/C
iv) print out the current record details of an individual vegetable.
Note: The mode that allows a file to be both read and written to is “r+”.
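
As a rough illustration of the “r+” mode, the sketch below overwrites one record in place. It assumes a hypothetical fixed-width layout (a 12-character name followed by a 7-character amount), which is not necessarily the layout of the 04.03.02 file; the file name and the names format_record and update_amount are also made up for this example. Fixed-width records matter here because writing in “r+” mode replaces characters where the file position is, without making room for longer text.

```python
import os
import tempfile

NAME_WIDTH = 12     # hypothetical layout: name padded to 12 characters
AMOUNT_WIDTH = 7    # hypothetical layout: amount right-aligned in 7


def format_record(name, amount):
    """Build one fixed-width record so every line has the same length."""
    return f"{name:<{NAME_WIDTH}}{amount:>{AMOUNT_WIDTH}}\n"


def update_amount(path, name, new_amount):
    """Overwrite the matching record in place; return True if found."""
    with open(path, "r+", newline="") as f:    # read AND write, no truncation
        while True:
            start = f.tell()                   # remember where this record begins
            line = f.readline()
            if not line:
                return False                   # reached end of file, no match
            if line[:NAME_WIDTH].strip() == name:
                f.seek(start)                  # go back to the record start
                f.write(format_record(name, new_amount))
                return True


# Build a small sample file in a temporary folder, then update one record.
path = os.path.join(tempfile.mkdtemp(), "vegetables.txt")
with open(path, "w", newline="") as f:
    f.write(format_record("carrot", 40))
    f.write(format_record("leek", 12))

update_amount(path, "leek", 7)
with open(path, newline="") as f:
    print(f.read())
```

If the records were not all the same length, a simpler and safer approach would be to read every record into a list, change the one that matters, and write the whole file out again.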
