Data Storage

Lecture 4: Storing Data

Robert W. Walker

Outline

What is Data?

Cambridge defines data as:

data

  • facts or numbers.
  • in electronic form.
  • stored and used by a computer.

What is a variable?

Something that varies. Distinguished from constants.

Classifying Data

  • Quantitative vs. Qualitative.
    • Nominal. (No order and no arithmetic).
    • Ordinal. (Order, no arithmetic).
    • Interval scale. (Addition and subtraction, no multiplication and division).
    • Ratio scale. (All arithmetic).
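
A minimal sketch of the four scales, using made-up example values (eye colors, ratings, temperatures, weights) that are not from the lecture. It only illustrates which operations are meaningful at each level.

```python
from enum import IntEnum

# Nominal: labels only. Equality is the only meaningful comparison.
eye_colors = ["brown", "blue", "brown"]  # hypothetical data
same_color = eye_colors[0] == eye_colors[2]

# Ordinal: categories have an order, but the gaps between them are not numbers.
class Rating(IntEnum):  # hypothetical scale
    LOW = 1
    MEDIUM = 2
    HIGH = 3

is_higher = Rating.HIGH > Rating.LOW

# Interval: differences are meaningful, ratios are not.
# The "ratio" of two Celsius temperatures changes when the unit changes.
temp_diff_c = 20.0 - 10.0
ratio_in_celsius = 20.0 / 10.0
ratio_in_kelvin = (20.0 + 273.15) / (10.0 + 273.15)

# Ratio: a true zero exists, so "twice as heavy" is meaningful.
weight_ratio = 20.0 / 10.0  # kilograms
```

The Celsius and Kelvin ratios disagree, which is why multiplication and division are not meaningful on an interval scale.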

Data Organization in Spreadsheets

  • Google Sheets: rows, columns, sheets

Sheet

How to Organize Spreadsheets?

  • Be consistent.
  • Dates in YYYY-MM-DD.
  • No empty cells.
  • One thing per cell.
  • Single rectangle [rows as units, columns as variables, header row].
  • Create a data dictionary.
  • Do not include calculations in raw data.
  • Fonts and colors hide data.
  • Name things well.
  • Make backups.
  • Use validation.
  • Save the data in plain text.
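
Several of these rules can be sketched in code. The example below is an illustration only: the column names, the records, and the "NA" marker for missing values are assumptions, not part of the lecture. It writes a single rectangle with a header row, ISO dates, no empty cells, and plain-text (CSV) output.

```python
import csv
import io
from datetime import date

# Hypothetical records: one row per unit, one column per variable.
rows = [
    {"student_id": "S001", "exam_date": date(2024, 3, 5), "score": 88},
    {"student_id": "S002", "exam_date": date(2024, 3, 5), "score": None},
]

def to_cell(value):
    """Turn one value into cell text: ISO dates, explicit missing marker."""
    if value is None:
        return "NA"  # assumed marker, so that no cell is left empty
    if isinstance(value, date):
        return value.isoformat()  # YYYY-MM-DD
    return str(value)

def write_plain_text(records, fieldnames):
    """Return the records as CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({name: to_cell(record[name]) for name in fieldnames})
    return buffer.getvalue()

csv_text = write_plain_text(rows, ["student_id", "exam_date", "score"])
```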

FBS vs. DBMS

  • File Based System: spreadsheets.
  • Database Management Systems: an interface between files and users.
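
One way to see the "interface" idea: a DBMS can refuse bad data, while a plain file accepts whatever is typed. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS; the table name, columns, and the 0 to 100 score range are assumptions made for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for a DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scores (
        student_id TEXT PRIMARY KEY,
        score INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100)
    )
    """
)

# A valid row is accepted.
conn.execute("INSERT INTO scores VALUES (?, ?)", ("S001", 88))

# An out-of-range score is rejected by the database itself.
try:
    conn.execute("INSERT INTO scores VALUES (?, ?)", ("S002", 250))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]
```

A spreadsheet can approximate this with validation rules, but there the check is optional; in the sketch the rule is part of the table definition.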

More on DBMS

GeeksforGeeks

DBMS Architectures and Levels

  • 1 tier: Local database and application.
  • 2 tier: Database on server, client with user and application.
  • 3 tier: Application server and database server, client with user and application.

DBMS Types

Relational and non-relational
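
A rough sketch of the difference, using hypothetical student and enrollment records: the relational style keeps flat tables linked by a key, while a common non-relational (document) style nests related data inside one record.

```python
import json

# Relational style: two flat tables linked by student_id (hypothetical data).
students = [{"student_id": "S001", "name": "Ada"}]
enrollments = [
    {"student_id": "S001", "course": "DATA 101"},
    {"student_id": "S001", "course": "STAT 201"},
]

def to_document(student, all_enrollments):
    """Build one nested record holding a student and their courses."""
    courses = [
        e["course"]
        for e in all_enrollments
        if e["student_id"] == student["student_id"]
    ]
    return {**student, "courses": courses}

# Document style: everything about the student lives in one nested record.
document = to_document(students[0], enrollments)
document_json = json.dumps(document)
```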

Joins

SQL Joins
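
A small sketch of two common joins, run through Python's built-in sqlite3 module. The tables and rows are hypothetical. An inner join keeps only matching rows; a left join keeps every row from the left table and fills in missing matches with NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE enrollments (student_id TEXT, course TEXT);
    INSERT INTO students VALUES ('S001', 'Ada'), ('S002', 'Grace');
    INSERT INTO enrollments VALUES ('S001', 'DATA 101');
    """
)

# INNER JOIN: only students who have at least one enrollment.
inner_rows = conn.execute(
    """
    SELECT s.name, e.course
    FROM students AS s
    INNER JOIN enrollments AS e ON s.student_id = e.student_id
    ORDER BY s.name
    """
).fetchall()

# LEFT JOIN: every student, with None where no enrollment exists.
left_rows = conn.execute(
    """
    SELECT s.name, e.course
    FROM students AS s
    LEFT JOIN enrollments AS e ON s.student_id = e.student_id
    ORDER BY s.name
    """
).fetchall()
```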

Some Disagree

Takeaways

FBS is usually how things start; structure is expensive and must be imposed ex ante.

The challenge: When to transition?

In many ways, the question comes down to when to stop trusting organic collaboration and start imposing structure.

Musings

  • Much of this distinction may soon become moot.
  • AI makes mistakes and those could be awful.

A Class Assignment to Conclude

How do you think Willamette stores your data?

FBS or DBMS? What organizes it, and why?