Data Types in Machine Learning: Continuous vs Discrete

In machine learning, understanding data types is critical to choosing the right models and preprocessing techniques. This guide presents a detailed breakdown of continuous and discrete data types using a hierarchy-style explanation.

Continuous Data

Continuous data includes values that can be infinitely divided and are usually measured. These values make sense when represented in decimal format and support meaningful mathematical operations like addition, subtraction, multiplication, and division.

Characteristics:

Can be expressed in decimals
Infinite possible values
Values fall within a measurable range

Subtypes:

1. Interval Data

Data is measured on a scale with equal spacing between values.
No true zero point (zero does not mean absence).
Often subjective in interpretation.

Examples:

Temperature in Celsius: 10, 20, 30
IQ Rankings:
- 84 - 114 (Average)
- 115 - 129 (Above Average)
- 130 - 144 (Gifted)
- 145 - 159 (Highly Gifted)

Note: You can say 30 is 10 more than 20, but not that it is “50 percent hotter.”

2. Ratio Data

Like interval data, but includes a true zero point.
Objective and mathematically accurate.
Most preferred for machine learning and statistical analysis.

Examples:

Weight: 10, 20, 30, 40
Height: 5, 6, 7 feet
Age: 20, 30, 40 years

Note: You can say 40 is twice as much as 20.

Discrete Data

Discrete data consists of distinct, separate values. These are usually counted and not measured. Decimal representation does not make sense for this type of data.

Characteristics:

Finite or countably infinite values
Decimal values are invalid or meaningless
Used in classification and grouping tasks

Subtypes:

1. Categorical Data

Categorical data assigns observations to categories or labels.

a. Binary

Only two possible values
Least preferred for complex tasks due to low variability

Examples:

Gender: Male, Female
Color (simplified to yes/no): Red, Green
Jersey Number (used as identity): 1, 2, 3

b. Nominal

Multiple categories with no meaningful order

Examples:

Military Rank:
- Second Lieutenant
- First Lieutenant
- Captain
- Major
- Lieutenant Colonel

c. Multiple

More than two unordered categories

Examples:

Eye Color: Brown, Blue, Green
Animal Types: Dog, Cat, Bird

2. Ordinal Data

Categories that follow a meaningful order
Differences between values are not uniformly measurable

Examples:

Clothing Size: Small, Medium, Large, Extra Large
Class Rank: 1st, 2nd, 3rd

3. Count Data

Represents the number of items or events
Cannot have negative or decimal values

Examples:

Number of people in a room
Number of calls received

Summary

Category	Subtype	Ordered	Decimal Valid	Examples
Continuous	Interval	Yes	Yes	Temperature, IQ
	Ratio	Yes	Yes	Weight, Height, Age
Discrete	Binary	No	No	Male/Female, Yes/No
	Nominal	No	No	Eye Color, Military Rank
	Multiple	No	No	Pet Types, Color
	Ordinal	Yes	No	Clothing Size, Class Rank
	Count	Yes	No	Item Counts, Room Population

Final Thoughts

Recognizing the type of data you are working with is key to building effective machine learning models. Continuous data enables rich mathematical analysis, while discrete data supports classification, ranking, and logical segmentation. Choose preprocessing and algorithms that match the nature of your data for optimal performance.

CRISP-DM: A Practical Guide to Data Mining Projects

Deep Dive into Data Types in Machine Learning