In data science, a data scientist basically deals with three types of data. To those with prior knowledge of computer programming, this classification may seem strange, because it is not based on how values are stored (integer, string and so on) but on what the values mean. The classification of data in data science is as below:
- Numerical Data
- Categorical Data
- Ordinal Data
Let's have a brief explanation of each type:
1. Numerical Data: Numerical data are data that have a mathematical basis. For example, if we are working on a diabetes prediction system, then we have to deal with data like sugar = 56%, bp = 80/130 and so on. We can do calculations on these values, such as adding them or taking an average. In other words, data that can be processed with normal mathematical operations are known as numerical data.
2. Categorical Data: Sometimes we have to use data just to represent a group of objects, and those data don’t have any value that mathematical operations can work on. For example, in a group of people some may be ‘Bangladeshi’ and others may be ‘Indian’. These two values, ‘Bangladeshi’ and ‘Indian’, have no mathematical basis. They only represent the class a person belongs to. You cannot rank or order the people based on these values.
3. Ordinal Data: This type of data is like categorical data, with one important difference. The values don’t have any exact mathematical value, but they can be used for comparison. For example, on a movie review website, 3 stars and 5 stars are used for comparing the quality of movies. These 3 stars or 5 stars don’t have any exact values, but they represent two different states of quality of a movie, and we can say that one is better than the other.
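To make the difference concrete, here is a minimal Python sketch of which operations make sense for each type. It uses only the standard library. The sample readings, the `STAR_ORDER` list and the `is_better` helper are made up for illustration and are not part of any library.

```python
from collections import Counter
from statistics import mean

# Numerical data: arithmetic is meaningful.
# Hypothetical blood sugar readings, invented for this example.
sugar_readings = [90, 110, 130]
average_sugar = mean(sugar_readings)

# Categorical data: we can only check equality and count members.
nationalities = ["Bangladeshi", "Indian", "Bangladeshi"]
counts = Counter(nationalities)

# Ordinal data: the order is meaningful, but the gap between levels is not.
# STAR_ORDER is an assumed ranking from worst to best.
STAR_ORDER = ["1 star", "2 stars", "3 stars", "4 stars", "5 stars"]


def is_better(a, b):
    """Return True if rating a ranks above rating b in STAR_ORDER."""
    return STAR_ORDER.index(a) > STAR_ORDER.index(b)


print(average_sugar)
print(counts)
print(is_better("5 stars", "3 stars"))
```

Notice that the sketch averages the numerical values, only counts the categorical ones, and only compares the ordinal ones. Taking the average of nationalities would make no sense, and saying a 5 star movie is "2 better" than a 3 star movie would claim more than the ratings really tell us.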
This three-way classification reflects how we human beings treat data in our own minds, and data scientists are trying to make machines treat data in the same way.