Data Modeling

What is data modeling?

Modeling Real Things

Kinds of data

  • Strings (characters within quotes, "Hi my name is Liz.")
  • Numbers (numerals without quotes: 1, 3, 27, 49)
  • Booleans (true or false)
  • Arrays (lists of anything: ["Hello", 27, "Arugula"])
  • Objects (key-value pairs)

An object looks like this:

{
    "name" : "value",
    "property" : "value",
    "age" : 35,
    "weight" : 180,
    "favorite_foods" : ["Artichoke", "Alfalfa", "Anchovies"],
    "favorite_books" : [
        {"title" : "Moby Dick", "author" : "Herman Melville"},
        {"title" : "Where the Wild Things Are", "author" : "Maurice Sendak"}
    ]
}
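
As a quick sketch, here is how those kinds of data can be told apart in JavaScript. The `kindOf` helper and the `likes_arugula` property are made-up names for illustration, not built-ins.

```javascript
// Hypothetical helper: name the kind of data a value holds.
function kindOf(value) {
    if (Array.isArray(value)) return "array"; // check arrays first: typeof [] is "object"
    if (value === null) return "null";        // typeof null is also "object"
    return typeof value;                      // "string", "number", "boolean", "object", ...
}

const person = {
    "name": "Liz",
    "age": 35,
    "likes_arugula": true,
    "favorite_foods": ["Artichoke", "Alfalfa", "Anchovies"],
    "favorite_book": {"title": "Moby Dick", "author": "Herman Melville"}
};

for (const key of Object.keys(person)) {
    console.log(key + " is a " + kindOf(person[key]));
}

// A number inside quotes is a string, which changes how it behaves:
console.log("35" + 1); // "351" (text gets glued together)
console.log(35 + 1);   // 36 (numbers get added)
```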
                        

Data Comes in Hierarchies

Often, data is "nested" in "hierarchies" of objects within objects.
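
A small sketch of what nesting means in practice: each dot or bracket steps one level down the hierarchy. The `library` object and its contents are invented for illustration.

```javascript
// Made-up example data: a library holds shelves, and shelves hold books.
const library = {
    "name": "Main Street Library",
    "shelves": [
        {
            "label": "Fiction",
            "books": [
                {"title": "Moby Dick", "author": "Herman Melville"},
                {"title": "Where the Wild Things Are", "author": "Maurice Sendak"}
            ]
        }
    ]
};

// Step down the hierarchy: library -> first shelf -> second book -> author.
const author = library.shelves[0].books[1].author;
console.log(author); // "Maurice Sendak"

// Optional chaining (?.) gives undefined instead of an error when a level is missing.
const missing = library.shelves[5]?.books[0]?.title;
console.log(missing); // undefined
```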

Methods of storing

  • Plain Text (.txt)
  • CSV (.csv)
  • JSON (JavaScript Object Notation)
  • Relational Database (MySQL, Oracle; queried with SQL)
  • Non-Relational Database (MongoDB, Cassandra, etc)
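
To make the difference between two of these formats concrete, here is a rough sketch that writes the same record as JSON text and as a CSV row. The `toCsvRow` helper is a hypothetical name, and it ignores real CSV rules such as quoting values that contain commas.

```javascript
const car = {"name": "Herby", "make": "Volkswagen", "purposes": ["Love", "Driving around"]};

// JSON keeps the structure: the array survives a round trip to text and back.
const asJson = JSON.stringify(car);
const restored = JSON.parse(asJson);
console.log(restored.purposes[1]); // "Driving around"

// CSV is flat: one row, one value per column, so nested data has to be squashed.
function toCsvRow(record, columns) {
    return columns.map(function (column) {
        const value = record[column];
        // Simplification: join lists with ";" so they fit in one cell.
        return Array.isArray(value) ? value.join(";") : String(value);
    }).join(",");
}

console.log(toCsvRow(car, ["name", "make", "purposes"]));
// Herby,Volkswagen,Love;Driving around
```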

Structure

car = {
    "name": "Herby",
    "make": "Volkswagen",
    "model": "Bug",
    "purpose": "Love",
    "engineType": "Back",
    "color": "Stripes",
    "year": "1970"
}
Might not be as helpful as...
car = {
    "name": "Herby",
    "make": "Volkswagen",
    "model": "Bug",
    "purposes": ["Love", "Driving around", "Saving people?"],
    "engine": {"location": "Back", "cylinders": 4, "fuel_injected": false, "loud": true},
    "description": {"paint_profile": "Stripes", "colors": ["black", "white", "silver"], "attitude": "sassy"},
    "year": "1970"
}

This matters a lot!

Data can limit your capabilities, or expand them.

Grouping related pieces of data together keeps you organized, makes the data faster and easier for the computer to use, and makes it simpler for engineers to write code against it.

if (car.engine.loud && car.description.attitude === "sassy") {
    console.log("I think Herby the love bug is comin' down the road!");
}
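
Another sketch of why the structured version pays off: once purposes are a list and engine details are grouped, questions about many cars become short to write. The second car and the variable names are invented for the example.

```javascript
const cars = [
    {"name": "Herby", "purposes": ["Love", "Driving around"], "engine": {"location": "Back", "loud": true}},
    {"name": "Quiet Commuter", "purposes": ["Driving around"], "engine": {"location": "Front", "loud": false}}
];

// Which cars are loud? The grouped "engine" object makes this easy to ask.
const loudCars = cars.filter(function (car) {
    return car.engine.loud;
});
console.log(loudCars.map(function (car) { return car.name; })); // [ 'Herby' ]

// Which cars are for love? A list can be searched item by item.
const loveCars = cars.filter(function (car) {
    return car.purposes.includes("Love");
});
console.log(loveCars.length); // 1
```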
                    

How are things the same?

Let's try to model these books together.

What unique properties do they have? What properties do they share?

What does it mean to be a book?

{
    title: "",
    author: "",
    length: "",
    ISBN: "",
    cover: "",
    language: "",
    customer_rating: "",
    tags: "",
    amazon_link: "http://www.amazon.com/The-Power-Habit-What-Business/dp/1400069289/ref=sr_1_1?ie=UTF8&qid=1355257104&sr=8-1&keywords=power+of+habit"
}

{
    title: "",
    author: "",
    length: "",
    ISBN: "",
    cover: "",
    language: "",
    customer_rating: "",
    tags: "",
    amazon_link: "http://www.amazon.com/Hyperspace-Scientific-Parallel-Universes-Dimension/dp/0195085140/ref=tmm_hrd_title_0?ie=UTF8&qid=1355257238&sr=1-1",
    amazon_link2: "http://www.amazon.com/Hyperspace-Scientific-Odyssey-Parallel-Universes/dp/0385477058/ref=wl_it_dp_o_pC_nS_nC?ie=UTF8&colid=N55WK4E5RGPM&coliid=I39ORQQR1YUM18"
}
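
To see which properties two objects share and which are unique, one option is to compare their keys. This is only a sketch: `sharedKeys` and `uniqueKeys` are hypothetical helper names, and the two books are trimmed-down placeholders.

```javascript
const bookA = {title: "", author: "", ISBN: "", amazon_link: ""};
const bookB = {title: "", author: "", ISBN: "", amazon_link: "", amazon_link2: ""};

// Properties that appear in both objects.
function sharedKeys(a, b) {
    return Object.keys(a).filter(function (key) { return key in b; });
}

// Properties that appear in the first object but not in the second.
function uniqueKeys(a, b) {
    return Object.keys(a).filter(function (key) { return !(key in b); });
}

console.log(sharedKeys(bookA, bookB)); // [ 'title', 'author', 'ISBN', 'amazon_link' ]
console.log(uniqueKeys(bookB, bookA)); // [ 'amazon_link2' ]
```

A numbered property such as `amazon_link2` is often a hint that the model may want a list (for example, a single array of links) instead.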
                        

Baselines

Usually we work from the general (what does everything have in common?) to the specific (what makes each thing unique?).

Asking "what does something have to have at minimum to be this type of thing?" is a good way to find out a lot about your models.
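
That question translates fairly directly into code. A minimal sketch, assuming we decide a book needs at least a title and an author; the `REQUIRED_BOOK_FIELDS` list and the `missingFields` function are illustrative choices, not a standard.

```javascript
// Assumption for illustration: the bare minimum to count as a "book".
const REQUIRED_BOOK_FIELDS = ["title", "author"];

// Return the required fields that are absent or empty.
function missingFields(thing, requiredFields) {
    return requiredFields.filter(function (field) {
        return thing[field] === undefined || thing[field] === "";
    });
}

const complete = {title: "Moby Dick", author: "Herman Melville", language: "English"};
const incomplete = {title: "Mystery Book", tags: ["unknown"]};

console.log(missingFields(complete, REQUIRED_BOOK_FIELDS));   // []
console.log(missingFields(incomplete, REQUIRED_BOOK_FIELDS)); // [ 'author' ]
```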

Schema.org has done a lot of this work; check it out for some examples.

Relationships

Non-relational Systems

Modeling Relationships

Or... How does a non-relational DB work?

Types of Relationships

  • One-to-One
  • One-to-Many
  • Many-to-One
  • Many-to-Many
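
Here is one rough way each type of relationship can look in plain objects. The person, passport, and clubs are made-up examples, and linking records by `id` is just one common convention.

```javascript
// One-to-one: each person has exactly one passport, and each passport one person.
const person = {"id": 1, "name": "Liz", "passport": {"number": "X123"}};

// One-to-many: one author has many books.
// Seen from the book side, the same link is many-to-one.
const author = {"id": 10, "name": "Maurice Sendak"};
const books = [
    {"title": "Where the Wild Things Are", "author_id": 10},
    {"title": "In the Night Kitchen", "author_id": 10}
];

// Many-to-many: a person can join many clubs, and a club can have many members.
// A separate list of pairs records who belongs to what.
const clubs = [{"id": 100, "name": "Book Club"}, {"id": 101, "name": "Chess Club"}];
const memberships = [
    {"person_id": 1, "club_id": 100},
    {"person_id": 1, "club_id": 101}
];

const booksByAuthor = books.filter(function (book) { return book.author_id === author.id; });
console.log(booksByAuthor.length); // 2

const clubsForLiz = memberships
    .filter(function (m) { return m.person_id === person.id; })
    .map(function (m) {
        return clubs.find(function (c) { return c.id === m.club_id; }).name;
    });
console.log(clubsForLiz); // [ 'Book Club', 'Chess Club' ]
```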

Together!

Let's try to model recipes together.

How are Foods and Recipes related?
Foods can appear in multiple recipes, so it makes sense to store each food once and point to it from every recipe that uses it, rather than copying it into each recipe.
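
A sketch of how recipes can share foods without duplicating them: each food lives in one place and recipes refer to it by id. The foods, recipes, ids, and the `ingredientsFor` helper are all invented for illustration.

```javascript
// Each food is stored exactly once, keyed by a made-up id.
const foods = {
    "arugula": {"name": "Arugula", "vegetarian": true},
    "anchovy": {"name": "Anchovies", "vegetarian": false},
    "artichoke": {"name": "Artichoke", "vegetarian": true}
};

// Recipes refer to foods by id instead of copying the food data.
const recipes = [
    {"title": "Green Salad", "food_ids": ["arugula", "artichoke"]},
    {"title": "Salty Salad", "food_ids": ["arugula", "anchovy"]}
];

// Look up the full food objects for a recipe.
function ingredientsFor(recipe) {
    return recipe.food_ids.map(function (id) { return foods[id]; });
}

// Changing a food in one place changes it for every recipe that uses it.
foods.arugula.name = "Arugula (rocket)";
console.log(ingredientsFor(recipes[0])[0].name); // "Arugula (rocket)"
console.log(ingredientsFor(recipes[1])[0].name); // "Arugula (rocket)"

// Going the other direction: which recipes use a given food?
const withAnchovy = recipes.filter(function (r) { return r.food_ids.includes("anchovy"); });
console.log(withAnchovy.map(function (r) { return r.title; })); // [ 'Salty Salad' ]
```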

It's mostly decision-making

Questions to ask:

  • What does it mean to be an object? (duuuuuude.)
  • If I split these objects up, what does that mean for related data?
  • If I create a relationship, what does that mean for related data?
  • If I create a hierarchy, what does that mean for sub-objects? Related data?
  • Will this allow me to do more things later, or restrict what I can do later?
  • Is it worth the time right now to have an existential crisis about this?! (Usually not.)

Don't be an "Architecture Astronaut"



"It is better to have a codebase you're moderately ashamed of that's full of hacks than nothing at all."
- Ancient Native American Proverb

Resources

  • Schema.org - A bunch of geeks (one of whom I live with) decided to write down once and for all what you need at minimum to be anything.
  • A super technical essay at agiledata.org
  • A whole amazing course on Coursera.org that will teach you this in far more detail than I ever could. (I signed up for this course, if you'd like to study for it with me!)