There is no single agreed-upon definition of an anomaly, so I adopt the following one: anomalies are data patterns that do not conform to a well-defined notion of normal behavior. Anomalies in data can arise for a variety of reasons, such as malicious behavior, credit-card fraud, intrusions, and system failures. These oddities are exactly what interests the data analyst. As a result, anomaly detection is a necessary and valuable step in many decision-making systems.
TYPES OF ANOMALIES
Anomalies may be divided into three categories:
- Point anomalies: a point anomaly occurs when a single object can be identified as abnormal compared to the other objects. This is the most basic anomaly category, and it is the one most studies address.
- Contextual anomalies: an object that is unusual only in a particular context is a contextual anomaly (also known as a conditional anomaly). For example, a temperature reading that is normal in summer may be anomalous in winter.
- Collective anomalies: a collection of related objects is anomalous with respect to the rest of the data. In this case only the group as a whole is anomalous; the individual objects need not be.
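As a rough illustration of the simplest category, the sketch below flags point anomalies with a z-score rule. The function name `point_anomalies`, the threshold of 2.0, and the sample readings are all assumptions made for this example; the z-score is only one of many possible detection rules, and contextual or collective anomalies would need extra information (a context variable, or a window of related points) that this sketch does not model.

```python
from statistics import mean, stdev

def point_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
print(point_anomalies(readings))
```

Note that a single extreme value also inflates the mean and standard deviation, so on small samples a robust statistic such as the median may be a better choice.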
Types of Normalization: 1NF, 2NF, and 3NF
The database normalization procedure is further classified as follows:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF, also called 3.5NF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)
- Sixth Normal Form (6NF)
- First Normal Form (1NF) requires that a table meet the following requirements:
- The rows have no significant order.
- The columns have no significant order.
- There are no duplicate rows.
- Every row-and-column intersection contains exactly one value.
- Rows have no hidden components (such as row IDs or hidden timestamps); all columns are regular.
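A minimal sketch of the "exactly one value per intersection" requirement: the hypothetical helper `to_first_normal_form` below splits a comma-separated column into one row per value and drops duplicate rows. The table, column names, and separator are assumptions chosen for illustration.

```python
def to_first_normal_form(rows, multi_valued_column, separator=","):
    """Split a delimited column into one row per value, dropping duplicate rows."""
    seen = set()
    result = []
    for row in rows:
        for value in row[multi_valued_column].split(separator):
            new_row = dict(row)
            new_row[multi_valued_column] = value.strip()
            key = tuple(sorted(new_row.items()))  # hashable form of the row
            if key not in seen:  # 1NF also forbids duplicate rows
                seen.add(key)
                result.append(new_row)
    return result

# Hypothetical table whose "phone" column holds several values per cell.
students = [
    {"student_id": 1, "name": "Ana", "phone": "555-0100, 555-0101"},
    {"student_id": 2, "name": "Ben", "phone": "555-0200"},
]
flat = to_first_normal_form(students, "phone")
for row in flat:
    print(row)
```

After the split, `student_id` alone no longer identifies a row; the key of the flattened table would be the pair (`student_id`, `phone`).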
- Second Normal Form (2NF): an entity is in 2NF when every non-key attribute depends on the whole primary key, not on just part of it.
- The table must already be in 1NF, and every non-key column must be fully dependent on the entire PRIMARY KEY.
- Partial dependencies are eliminated by moving the affected columns to a new table.
2NF only becomes an issue when the table has a composite primary key, that is, a primary key composed of two or more columns. A 1NF table whose primary key is a single column cannot have a partial dependency on that key.
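The following sketch shows one way to remove a partial dependency. It assumes a hypothetical `enrollments` table keyed by (`student_id`, `course_id`), in which `course_name` depends on `course_id` alone; the helper `project` and all column names are illustrative.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicates (order preserved)."""
    seen = set()
    out = []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out

# Composite key: (student_id, course_id).
# course_name depends only on course_id, which is a partial dependency.
enrollments = [
    {"student_id": 1, "course_id": "DB101", "course_name": "Databases", "grade": "A"},
    {"student_id": 2, "course_id": "DB101", "course_name": "Databases", "grade": "B"},
    {"student_id": 1, "course_id": "ML201", "course_name": "Machine Learning", "grade": "B"},
]

# Move the partially dependent column into its own table.
courses = project(enrollments, ["course_id", "course_name"])
grades = project(enrollments, ["student_id", "course_id", "grade"])
print(courses)
print(grades)
```

The course name is now stored once per course instead of once per enrollment.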
- Third Normal Form (3NF): fields that do not depend directly on the key should be moved out of the table into a separate one.
- The table must already be in 2NF.
- Non-key columns should not depend on each other.
- There is no transitive functional dependency of a non-key column on the primary key.
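A small sketch of a transitive dependency and its removal, assuming a hypothetical `employees` table in which `emp_id` determines `dept_id` and `dept_id` determines `dept_name`. The helper `fd_holds` only tests whether a dependency is consistent with the sample rows; a real functional dependency is a rule about the schema, so passing this check is evidence, not proof.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

employees = [
    {"emp_id": 1, "name": "Ana", "dept_id": "D1", "dept_name": "Research"},
    {"emp_id": 2, "name": "Ben", "dept_id": "D1", "dept_name": "Research"},
    {"emp_id": 3, "name": "Cleo", "dept_id": "D2", "dept_name": "Sales"},
]

# dept_name follows from dept_id, a non-key column: emp_id -> dept_id -> dept_name.
transitive = fd_holds(employees, ["dept_id"], ["dept_name"])

# 3NF decomposition: keep dept_id as a reference, move dept_name to its own table.
employee_table = [{k: r[k] for k in ("emp_id", "name", "dept_id")} for r in employees]
department_table = {r["dept_id"]: r["dept_name"] for r in employees}
print(transitive, department_table)
```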
Normalization techniques: BCNF, 4NF, and 5NF
- Boyce-Codd Normal Form (BCNF, or 3.5NF) is a database normalization form. It is a slightly stronger version of Third Normal Form (3NF): every non-trivial functional dependency must have a superkey on its left-hand side. Raymond F. Boyce and Edgar F. Codd created BCNF in 1974 to handle certain kinds of anomalies not addressed by 3NF as originally defined.
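To make the BCNF condition concrete, here is a minimal sketch that computes attribute closures and reports every non-trivial functional dependency whose left-hand side is not a superkey. The relation `(student, course, teacher)` and its two dependencies are a common textbook-style example chosen as an assumption here; the function names are illustrative.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = set(relation) <= closure(lhs, fds)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

relation = ("student", "course", "teacher")
fds = [
    (("student", "course"), ("teacher",)),  # left side is a key
    (("teacher",), ("course",)),            # each teacher teaches one course
]
print(bcnf_violations(relation, fds))
```

This relation satisfies 3NF, because `course` is part of a candidate key, yet `teacher -> course` violates BCNF since `teacher` alone is not a superkey.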
- Fourth Normal Form (4NF): a level of database normalization in which every non-trivial multivalued dependency has a superkey as its determinant. It builds on the first three normal forms (1NF, 2NF, and 3NF) and on Boyce-Codd Normal Form (BCNF). In addition to meeting the BCNF requirements, a table must not store two or more independent multivalued facts about the same entity.
Properties: a relation R is in 4NF if and only if the following requirements are met:
- It must be in Boyce-Codd Normal Form (BCNF).
- It must not contain any non-trivial multivalued dependency whose determinant is not a superkey.
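The sketch below checks whether a multivalued dependency holds in a sample table and then splits the table accordingly. It assumes a hypothetical `skills` table in which a person's hobbies and languages are independent of each other; `mvd_holds` and the column names are illustrative, and the check only inspects the given rows.

```python
from itertools import product

def mvd_holds(rows, x, y):
    """True if the multivalued dependency x ->> y holds in the given rows."""
    columns = list(rows[0].keys())
    z = [c for c in columns if c not in x and c not in y]
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in x)
        pair = (tuple(row[c] for c in y), tuple(row[c] for c in z))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # For each x value, every y value must appear with every z value.
        if pairs != set(product(ys, zs)):
            return False
    return True

skills = [
    {"person": "Ana", "hobby": "chess", "language": "English"},
    {"person": "Ana", "hobby": "chess", "language": "Spanish"},
    {"person": "Ana", "hobby": "hiking", "language": "English"},
    {"person": "Ana", "hobby": "hiking", "language": "Spanish"},
]
print(mvd_holds(skills, ["person"], ["hobby"]))

# 4NF decomposition: one table per independent multivalued fact.
hobbies = sorted({(r["person"], r["hobby"]) for r in skills})
languages = sorted({(r["person"], r["language"]) for r in skills})
```

Four rows shrink to two rows in each of the new tables, and adding a hobby no longer requires one new row per language.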
- Fifth Normal Form (5NF), also called Project-Join Normal Form (PJNF): a relation R is in 5NF if and only if every non-trivial join dependency in R is implied by the candidate keys of R. When a relation is split into smaller relations and those relations are recombined with a natural join, the join must be lossless. This property ensures that no spurious or superfluous tuples are produced.
Properties: a relation R is in 5NF if and only if the following requirements are met:
- R should already be in 4NF.
- It has no join dependency, other than those implied by its candidate keys, that would allow a further lossless decomposition.
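The idea of a lossless join can be sketched by projecting a table onto column groups, joining the projections back together, and comparing the result with the original. The `supply` table below is a hypothetical agent/product/company example, and all function names are assumptions. In this sample, splitting into two tables produces a spurious tuple, while splitting into three does not, which is the kind of join dependency 5NF is concerned with.

```python
def project(rows, columns):
    """Project dict rows onto columns, returning distinct dict rows."""
    distinct = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in distinct]

def natural_join(left, right):
    """Natural join of two lists of dict rows on their shared column names."""
    joined = []
    for left_row in left:
        for right_row in right:
            shared = left_row.keys() & right_row.keys()
            if all(left_row[c] == right_row[c] for c in shared):
                joined.append({**left_row, **right_row})
    return joined

def as_set(rows):
    """Order-independent form of a relation, used for comparison."""
    return {frozenset(row.items()) for row in rows}

def join_is_lossless(rows, *column_groups):
    """True if joining the projections reproduces exactly the original rows."""
    parts = [project(rows, cols) for cols in column_groups]
    result = parts[0]
    for part in parts[1:]:
        result = natural_join(result, part)
    return as_set(result) == as_set(rows)

supply = [
    {"agent": "Smith", "product": "car", "company": "GM"},
    {"agent": "Smith", "product": "truck", "company": "Ford"},
    {"agent": "Jones", "product": "car", "company": "Ford"},
    {"agent": "Smith", "product": "car", "company": "Ford"},
]
two_way = join_is_lossless(supply, ["agent", "product"], ["product", "company"])
three_way = join_is_lossless(
    supply, ["agent", "product"], ["product", "company"], ["company", "agent"]
)
print(two_way, three_way)
```

As with the earlier checks, this only tests the rows at hand; whether the join dependency is a genuine rule of the data has to come from the business requirements.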