Unit 6

Unit 6: Database Systems

Database: an organised collection of data which can be accessed in many different ways.

FLAT FILES:

All data stored in a single table
The headings are the field names
There is no sharing data
Can be created using a spreadsheet

Problems with Flat Files:

Duplication of data - data is entered more than necessary
Inconsistencies can arise - a change of address results in records having old addresses and new addresses.
Very restrictive - only suitable for small sotres of data.

1 of 12

Unit 6: Database Systems

RELATIONAL: A large collection of data items stored im tables containing links between them structured in a way that it can be accesssed by a number of different applications programs. The tables are linked together by primary and foreign keys.

Features of a Relational Database:

Data is held in two or more tables
Links between the tables
Data from any of the tables can be extracted
Greater knowledge is needed to create them

2 of 12

Unit 6: Key 'Data' Words

Data Consistency: Is where the relationship between the input data, the processed data and the output data are related. If the system works correctly the data will be correct at each stage so said to be consistent.

Data Redundancy: Data which is repeated unncessarily or stored twice in a database.

Data Integrtity: The correctness of the data. Truthful and Accurate.

Data Independance: The data and the applications programs used to access it are independant/ seperate.

3 of 12

Unit 6: Data Warehousing

Used to store all an organisations historical data and is used by management information systems to extract information which will help make decisions.
Is a corporate resource which everyone in the organisation has access to.

E.g. finding out the day of the week on which a particular store sold the greatest number of a certain product.
E.g. how employee sickness varied over the last year between Manchester and Newcastle.

Advantages:

Can then be mined
Allows the company to store all details of what is sold to each customer.
Company can see who uses a loyalty card and what they have brought.
Can compare information like sickness data from other stores.
Sotring all the historical data better equips managers to make decisions.

4 of 12

Unit 6: Data Mining

Once data has been stored in a data warehouse it then needs to be mined to discover:

patterns in the data
associations in the data (people who read the times are more likely to drink red wine)
Trends over time (person buying more healthy food and drinking less alcohol)
By mining the mass data it allows users to understand the data more
Looking for patterns presents the results in the form of tables and graphs.

Can produce information such as:

Lists of customers likely to buy a product
Comparisons with competitors
Analysis of best sites for shops
Predictions for future sales
Customer buying patterns

5 of 12

Unit 6: Applications of Data Mining

Helping fight against terrorism since 9/11 -
Government has been analysing peoples travel, spending and communications habits using data mining, in order to spot patterns of abnormal behavioural that could lead it to terrorists.

Fighting shop lifting in clothes retailer Jaeger -
Using data mining with information about transactions and position of the clothes in the stores, they found that most of the clothes stolen were situated near doors desptite them being security tagged.

Identification of customer needs -
Virgin Media who supply broadband, telephone and cable TV packages use data mining to segment the target customers most likely to buy new services or upgrade those services they already have.

6 of 12

Unit 6: Distributed Databases

A single logical database whose parts are physically located in more than one place, each location is accessible accross a network.

Why use a distributed database?

Many organisations have branches and offices located accross the country or world.
A supermarket stock system with many branches or travel company with branches off it.

A central database updated over night as a batch operation, but day to day work in the branches are carried out using the local part of the database. If the network fails then they carry on as normal until it's fixed.

Duplicated Database: A local copy of the entire database is kept at each location.

7 of 12

Unit 6: Disadv & Adv of Distributed Databases

Disadvantages:

Very heavy use of the network
Whole database may be huge requiring extra storage
Difficult to keep all the copies in synchronisation

Advantages:

If the database is fairly small, then this may be a practical solution.

8 of 12

Unit 6: Partitioned Database

The database is split into conveneient data sets depending on the specific needs of the organisation. E.g. a supermarket company that has dozens of stores dotted around the country needs an efficient stock control database so each store has its own section of the stock control database.

Disadvantages:

Not so useful if each branch needs information on other branches
The database must be carefully positioned to keep each section as local as possible

Advantages:

Each store is independent problems in one store dont affect others
Network load is less
High performance as no network bottleneck

9 of 12

Unit 6: Partitioned + Index Database

Includes an index of all the remote database records as well. Still keeps all records as local as possible but also keeps the indexes updated on a nightly batch run.

Disadvantages:

The system must now keep all indexes up to date
More complicated

Advantages:

High performance as queries and updates remain local
Can efficiently access remote records by using the index rather than a network query to a central database.

10 of 12

Unit 6: Security Issues

Multiple Entry Points - there are many data entry points to the system. Each 'node' of the system has to be kept physically and logically secure.

Encryption Keys - parts of the system exchange secret keys that are then used to encrypt network traffic. So the more the keys are spread around the more likely they are to be compromised.

Corrupted Node - if one node is compromised by a virus or hacking attempt then the rest of the system is vulnerable as well.

11 of 12

Unit 6: Adv and Disadv of Distributed Database

Advantages -

Resilient - a problem in one part will not stop the other branches from working.
Security - staff access can be limited to only their part of the database
Network traffic reduced - bandwidth costs reduced
Local database still works if the company network is broken
High performance - queries and updates are largely local so network bottleneck

Disadvantages -

Complexitiy - more comlpicated to set up and maintain compared to a central database
Security 0 there are many remote entry points in the system compared to a central database
Data Integrity - more complex to make sure data and indexes are not corrupted
Data needs to be carefully parititioned to make the system as efficient as possible
Not so efficient if there is heavy interaction between branches in which case a central database is a better option.

12 of 12

Get Revising

Unit 6

Unit 6: Database Systems

Unit 6: Database Systems

Unit 6: Key 'Data' Words

Unit 6: Data Warehousing

Unit 6: Data Mining

Unit 6: Applications of Data Mining

Unit 6: Distributed Databases

Unit 6: Disadv & Adv of Distributed Databases

Unit 6: Partitioned Database

Unit 6: Partitioned + Index Database

Unit 6: Security Issues

Unit 6: Adv and Disadv of Distributed Database

Comments

Similar ICT resources: