Normalization
Normalization is the process of efficiently organizing data in a database."Normalization" refers to the process of creating an efficient, reliable, flexible, and appropriate "relational" structure for storing information.
Goals:
- Eliminating redundant data.
- Ensuring data dependencies make sense. (only storing related data in a table).
Advantages:
- Cures the ‘SpreadSheet Syndrome’.
- Store only the minimal amount of information.
- Remove redundancies.
- Restructure Data.
- Decreased storage requirements.
- Faster search performance.
- Improved data integrity.
Types:
A series of logical steps to take to normalize data tables.- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)
First Normal Form
A table is considered to be in 1NF if all the fields contain only scalar values . i.e; All columns (fields) must be atomic : no repeating items in columns.Here, Ord no & date are non-atomic attributes,(Not in 1 NF).
Non-atomic attributes are eliminited by filling with atomic values where there are non-atomic values are present.
First Normal Form Decomposition
- Place all items that appear in the repeating group in a new table.
- Each row must be unique - Designate a primary key for each new table produced.
- Duplicate in the new table the primary key of the table from which the repeating group was extracted or vice versa.
Functional Dependencies:
If one set of attributes in a table determines another set of attributes in the table, then the second set of attributes is said to be functionally dependent on the first set of attributes.Ex: Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} {Title}
{ISBN} {Price}
Second Normal Form
For a table to be in 2NF, there are two requirements- If a data item is fully functionally dependent on only a part of the primary key, move that data item and that part of the primary key to a new table.
- If a data item is fully functionally dependent on only a part of the primary key, move that data item and that part of the primary key to a new table.
- If other data items are functionally dependent on the same part of the key, place them in the new table also.
Let x={ ord no.,prod no.} -> y={date}, {prod name}, {Qty}, {U. price}.
Therexists, z ={ord no.} c x -> {date}
Therexists, z ={prod no.} c x -> {prod name}
This suffers from functional dependencies.
- To convert this relation into 2 NF, we must eliminate the partial functional dependencies by decomposing the relation into several no. of relations.
- We must create a separate relations for the attributes which causes the partial functional dependencies.
Third Normal Form
A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. An attribute C is transitively dependent on attribute A if there exists an attribute B such that: A ® B and B ® C.Steps to convert 2 NF into 3 NF : we have unitCode as our primary key, we also have a courseName that is dependent on courseCode and courseCode, dependent on unitCode. Though couseName could be dependent on unitCode it more dependent on courseCode, therefore it is transitively dependent on unitCode.
BCNF (Boyce - Codd Normal Form)
A relation schema R is in BCNF if for every set of attributesX in R and attribute A in R, X is a superkey whenever X A.
--- no attribute is transitively dependent on a key
R1=(Prof, S#) K1=(Prof S#)
R2=(Prof, C#) K2=(Prof)
--- R1 and R2 are both in BCNF