What Are Data Levels of Measurement?
From a statistical standpoint, variable types involve more than the difference between numeric and categorical data. There are four so-called levels of measurement, which define what a variable actually represents and which mathematical operations may be applied to it. The level of measurement of your variables may also influence how machine learning models treat and learn from them.
In order from lowest to highest, the four levels of data measurement are nominal, ordinal, interval, and ratio.
NOMINAL: The term “nominal” derives from Latin and means “being such merely in name.” The only information nominal data carries is the group to which an observation belongs. Color is an illustration of this. Yellow may be encoded as 1 and blue as 2, but these numbers carry no magnitude or order. Nor does such an encoding make green, a mix of yellow and blue, equal to 1.5.
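A minimal Python sketch of this point, using a made-up list of colors: the integer codes are arbitrary labels, so the mode is meaningful but the mean of the codes is not. The one-hot encoding at the end is one common way to pass nominal data to a machine learning model without implying an order; it is shown as an illustration, and all names here are hypothetical.

```python
from collections import Counter

# Hypothetical sample of a nominal variable.
colors = ["yellow", "blue", "green", "blue", "yellow", "blue"]

# Arbitrary integer codes: they identify groups and nothing else.
codes = {"yellow": 1, "blue": 2, "green": 3}
encoded = [codes[c] for c in colors]

# The mode (most frequent group) is meaningful for nominal data.
mode_color, mode_count = Counter(colors).most_common(1)[0]

# The mean of the codes can be computed, but it depends entirely on the
# arbitrary labeling, so it says nothing about the colors themselves.
mean_code = sum(encoded) / len(encoded)

# One-hot encoding: one 0/1 column per category, with no implied order.
categories = sorted(set(colors))
one_hot = [[int(c == cat) for cat in categories] for c in colors]

print(mode_color, mode_count)
print(categories)
print(one_hot[0])
```

Relabeling the groups (say, blue as 1 and yellow as 2) would change `mean_code` but leave the mode and the one-hot rows unchanged in meaning.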
ORDINAL: Ordinal data, as the name implies, has an order. It allows you to rank values, such as education level: elementary is lower than high school, which is lower than university. The ordering allows us to compute the median: if our dataset has 100 samples of each education level, the median education is high school. By the definition of the median, half of the cases have a high school education or lower, while the other half have a high school education or higher, which is correct and may be meaningful.
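The median example can be checked with a short sketch. It assumes the three education levels are ranked 0, 1, 2 purely to express their order; the rank values themselves are an assumption of this example and carry no distance information.

```python
import statistics

# Education levels listed from lowest to highest.
levels = ["elementary", "high school", "university"]
rank = {name: i for i, name in enumerate(levels)}

# 100 samples of each level, as in the example above.
samples = [lvl for lvl in levels for _ in range(100)]

# The median only needs the ordering. median_low always returns a value
# that occurs in the data, so it never lands "between" two categories.
ranks = sorted(rank[s] for s in samples)
median_rank = statistics.median_low(ranks)
median_level = levels[median_rank]

print(median_level)
```

Using `median_low` rather than `median` is a deliberate choice here: averaging the two middle ranks would assume equal spacing between levels, which ordinal data does not guarantee.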
INTERVAL: Interval data builds on ordinal data. In addition to ordering the values, it specifies that the intervals between them are equal. Temperature measured in degrees Celsius is an excellent example: the difference between 1 and 5 degrees is the same as the difference between 20 and 24 degrees, i.e., 4 degrees. This was not the case for ordinal data: we cannot state that the difference between a high school and an elementary school education is the same as the difference between a university and a high school education.
In the case of interval-type variables, calculating the arithmetic mean, in addition to the mode and median, makes sense. Interval data can also be transformed using linear transformations.
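As a small sketch of both claims, the snippet below uses the Celsius values from the example and the standard Celsius-to-Fahrenheit conversion as the linear transformation. The helper name `c_to_f` is hypothetical.

```python
import math
import statistics

celsius = [1.0, 5.0, 20.0, 24.0]

def c_to_f(c):
    """Linear transformation from Celsius to Fahrenheit."""
    return c * 9 / 5 + 32

fahrenheit = [c_to_f(c) for c in celsius]

# The arithmetic mean is meaningful on an interval scale.
mean_c = statistics.mean(celsius)
mean_f = statistics.mean(fahrenheit)

# A linear transformation commutes with the mean:
# transforming the mean gives the mean of the transformed values.
mean_is_consistent = math.isclose(mean_f, c_to_f(mean_c))

# Equal intervals stay equal after the transformation.
diff_low_f = fahrenheit[1] - fahrenheit[0]   # 1 -> 5 degrees Celsius
diff_high_f = fahrenheit[3] - fahrenheit[2]  # 20 -> 24 degrees Celsius
intervals_still_equal = math.isclose(diff_low_f, diff_high_f)

print(mean_c, mean_f, mean_is_consistent, intervals_still_equal)
```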
RATIO: Ratio data builds on interval data. The distinction is that ratio-type variables have a meaningful zero. Examples include price, length, weight, the amount of anything, or temperature measured in Kelvin. The meaningful zero lets us form ratios between two data points: four apples are twice as many as two, and $5 is half as much as $10. This wasn’t the case with interval data: we can’t say that a temperature of 10 degrees Celsius is twice as hot as one of 5 degrees. Ratios are meaningless for scales that lack a meaningful zero.
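The following sketch illustrates why the zero point matters. It compares the naive Celsius ratio with the same temperatures in Kelvin, and then shows that a price ratio survives a change of units. The prices and the helper `c_to_k` are hypothetical.

```python
def c_to_k(c):
    """Convert Celsius to Kelvin, a scale with a meaningful zero."""
    return c + 273.15

warm_c, cool_c = 10.0, 5.0

# On the Celsius scale the zero is arbitrary, so this "ratio" is misleading.
naive_ratio = warm_c / cool_c

# On the Kelvin scale the same two temperatures differ by under 2 percent.
kelvin_ratio = c_to_k(warm_c) / c_to_k(cool_c)

# A ratio-scale variable keeps its ratios under a change of units.
price_a, price_b = 10.0, 5.0  # dollars
ratio_dollars = price_a / price_b
ratio_cents = (price_a * 100) / (price_b * 100)

print(naive_ratio, round(kelvin_ratio, 3), ratio_dollars, ratio_cents)
```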
What is the significance of the measurement level?
First, you need to know the level of measurement in order to interpret the data from a variable. When you know a measure is nominal (as in the color example above), you know that the numeric values are only short codes for the longer names. Second, knowing the level of measurement helps you decide which statistical analyses are appropriate for the data you have been given. If a measure is nominal, you would never average the data values or run a t-test on them.
It is important to note that the level of measurement concept implies a hierarchy. At lower levels of measurement, assumptions tend to be less restrictive and data analyses tend to be less sensitive. Each level up the hierarchy includes all of the features of the level below it and adds something new. In general, a higher measurement level (interval or ratio) is preferable to a lower one (nominal or ordinal).
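One way to picture this hierarchy is as a cumulative lookup: each level contributes the operation described for it earlier and inherits everything from the levels below. The sketch below is only an illustration; the names `LEVELS`, `ADDED`, and `permissible` are hypothetical, and the lists are a simplified summary rather than a complete catalogue of valid statistics.

```python
# Levels of measurement, from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# What each level adds on top of the one below it (simplified).
ADDED = {
    "nominal": ["mode"],
    "ordinal": ["median"],
    "interval": ["mean"],
    "ratio": ["ratio of two values"],
}

def permissible(level):
    """Return the operations that make sense at the given level,
    including everything inherited from the lower levels."""
    idx = LEVELS.index(level)
    ops = []
    for name in LEVELS[: idx + 1]:
        ops.extend(ADDED[name])
    return ops

for level in LEVELS:
    print(level, permissible(level))
```

For example, `permissible("ordinal")` returns the mode and the median but not the mean, which matches the rule that nominal and ordinal values should not be averaged.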