Data Types
Lecture 3: Types of Data
Outline
- Readings: Data Types and Chapter 1
What is Data?
Cambridge defines data as:
- facts or numbers.
- in electronic form.
- stored and used by a computer.
FBS: What is a spreadsheet?
- Google Sheets: rows, columns, sheets
Finding Data
- Gemini.
- tidytuesday [in .csv]
- Kaggle
- UCI Machine Learning
Examples
What is a variable?
Something that varies. Distinguished from constants.
Classifying Data
From the above definition, anything we can put in a spreadsheet [and that may be too limiting] is data. To classify it, we will need some categories.
- The spreadsheet may impose character limits or the like; that is a programming choice and not part of the definition.
Classifying Data I: Quantitative vs. Qualitative
Literally numbers vs. non-numbers. But it is not quite that simple.
There are numbers that are not numerical, e.g. Social Security numbers. They encode some information, but not much, and there is little we can do with them arithmetically.
Your Willamette Student ID
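A minimal sketch of why ID-like numbers are better treated as text than as quantities. The ID values below are made up for illustration and do not follow any real ID format.

```python
# Hypothetical ID values, made up for illustration.
student_ids = ["00123", "00456", "01789"]

# Stored as text, the leading zeros survive.
as_text = student_ids[0]

# Converted to an integer, the leading zeros are lost.
as_number = int(student_ids[0])

# Arithmetic runs without error, but the result has no meaning:
# there is no such thing as an "average ID".
mean_id = sum(int(s) for s in student_ids) / len(student_ids)

print(as_text, as_number, mean_id)
```

The code runs either way; the point is that the software will not stop us from doing arithmetic that makes no sense for the variable.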
Quantitative Data: Levels of Measurement
Four categories:
- Nominal. (No order and no arithmetic)
- Ordinal. (Order, no arithmetic)
- Interval scale. (Addition and subtraction, no multiplication and division)
- Ratio scale. (All arithmetic)
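The four categories above can be written down as a small lookup of which operations each level supports. This is only a sketch: the names LEVELS, allows, MEDAL_ORDER, and medal_rank are hypothetical, chosen for this illustration.

```python
# Operations each level supports, following the four categories above.
LEVELS = {
    "nominal":  {"order": False, "add_subtract": False, "multiply_divide": False},
    "ordinal":  {"order": True,  "add_subtract": False, "multiply_divide": False},
    "interval": {"order": True,  "add_subtract": True,  "multiply_divide": False},
    "ratio":    {"order": True,  "add_subtract": True,  "multiply_divide": True},
}

def allows(level, operation):
    """Return True if the given level of measurement supports the operation."""
    return LEVELS[level][operation]

# Hypothetical ordinal variable: medals have an order but no arithmetic.
MEDAL_ORDER = ["bronze", "silver", "gold"]

def medal_rank(medal):
    """Position of a medal in the ordering (higher is better)."""
    return MEDAL_ORDER.index(medal)

print(allows("interval", "multiply_divide"))
print(medal_rank("gold") > medal_rank("silver"))
```

Comparing medal ranks is fine because ordinal data has an order; subtracting them would treat the gap between bronze and silver as equal to the gap between silver and gold, which the data does not promise.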
Nominal Data
- Names
- ID numbers
Ordered Data
- Ranks
- Medals
- Satisfaction
- Likert scales
Interval data
The year example in the Primer.
Ratio scale
The duration example in the Primer.
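A small sketch of the interval vs. ratio distinction using years and durations. The numbers are hypothetical and are not taken from the Primer.

```python
# Hypothetical values, not taken from the Primer.
year_a, year_b = 1000, 2000          # calendar years: interval scale
duration_a, duration_b = 30.0, 60.0  # minutes: ratio scale

# Differences are meaningful on both scales.
year_gap = year_b - year_a
duration_gap = duration_b - duration_a

# A ratio is meaningful when zero means "none of the quantity".
duration_ratio = duration_b / duration_a  # the second duration is twice as long

# The same division on years runs, but it depends on where the calendar
# starts counting. Shift the starting point and the ratio changes,
# while the difference between the two years does not.
year_ratio = year_b / year_a
shifted_ratio = (year_b + 500) / (year_a + 500)
shifted_gap = (year_b + 500) - (year_a + 500)

print(year_gap, shifted_gap)
print(year_ratio, shifted_ratio)
```

The difference survives a change of starting point and the ratio does not, which is why interval data supports addition and subtraction but not multiplication and division.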
Distinction: Unit of Analysis
What are the units?
Sometimes, this is singular
Sometimes, it is a combination
Popular units:
- cross-sectional units [people, countries, products]
- time periods: time series
- panel data is a combination of units and time. [Multiple time series or multiple cross-sections over time]
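One way to picture the three cases is by what identifies a single observation. The sketch below uses hypothetical country labels and years.

```python
from itertools import product

# Hypothetical units and time periods.
countries = ["A", "B", "C"]  # cross-sectional units
years = [2020, 2021]         # time periods

# Cross-section: many units, one time period.
cross_section = [(c, 2020) for c in countries]

# Time series: one unit, many time periods.
time_series = [("A", y) for y in years]

# Panel: every unit observed in every time period,
# so the unit of analysis is the (country, year) pair.
panel = list(product(countries, years))

print(len(cross_section), len(time_series), len(panel))
```

Here the panel can be read either as multiple time series (one per country) or as multiple cross-sections (one per year).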
A Data Taxonomy
- Generally column-centric.
- Variables in columns.
- Units in rows.
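A minimal sketch of this layout, reading a tiny .csv with Python's standard csv module. The data and column names are invented for illustration.

```python
import csv
import io

# Hypothetical data: one row per unit (a person), one column per variable.
raw = """name,medal,year,duration_minutes
Ana,gold,2020,30.5
Ben,silver,2021,42.0
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Each row is a unit; each key is a variable.
variables = list(rows[0].keys())

# A column collects one variable across all units.
# Values arrive as text, so a quantitative variable must be converted.
durations = [float(r["duration_minutes"]) for r in rows]

print(variables)
print(durations)
```

In this made-up table, name is nominal, medal is ordinal, year is interval, and duration_minutes is ratio, so one spreadsheet can hold all four levels of measurement side by side.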

