Chapter 1 - Data Sources

Terms

| Term | Definition | Notes |
|------|------------|-------|
| API | Application Programming Interface | |
| SQL | Structured Query Language | |
| RDBMS | Relational Database Management System | |
| ODBC | Open Database Connectivity | Uses drivers to standardize the interface between software applications and databases. |
| ERD | Entity Relationship Diagram | |
| Foreign Key | Column (or set of columns) in one table that references the primary key of another table | Enforced as a constraint |

Structured vs. Unstructured Data

Unstructured:

  • Text Documents
  • Images
  • etc.

Structured:

  • Tabular
  • Spreadsheets
  • Database
  • etc.

Database vs. Database Schema

  • Database = Collection of tables
  • Database Schema = Stores table information and relationships (i.e. defines the structure)
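The distinction can be seen in a small sketch using Python's built-in sqlite3 module. The `patient` table and its columns are made up for illustration, and `sqlite_master` is SQLite's own catalog table; other database systems expose their schema differently.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO patient (name) VALUES ('Ada')")

# The database holds the data itself, stored in tables ...
rows = conn.execute("SELECT patient_id, name FROM patient").fetchall()

# ... while the schema describes the structure. SQLite keeps it in a catalog table.
schema = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows)    # [(1, 'Ada')]
print(schema)  # table name plus the CREATE TABLE statement that defines it
```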

One-to-Many Relationships

  • An entity appears exactly once in one table but can have many related rows in another
  • e.g., a patient table and an appointments table: each patient has one row but can have many appointments
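A minimal sketch of the patient/appointments example, using Python's built-in sqlite3 module. The column names and sample rows are assumptions for illustration. The foreign key on `appointment.patient_id` is what ties the "many" side back to the "one" side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE appointment (
    appointment_id INTEGER PRIMARY KEY,
    patient_id     INTEGER NOT NULL REFERENCES patient (patient_id),
    visit_date     TEXT NOT NULL
);
INSERT INTO patient VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO appointment VALUES
    (10, 1, '2022-01-05'),
    (11, 1, '2022-02-09'),
    (12, 2, '2022-01-20');
""")

# Each patient appears once in patient but can match many appointment rows.
counts = conn.execute("""
    SELECT p.name, COUNT(a.appointment_id)
    FROM patient AS p
    JOIN appointment AS a ON a.patient_id = p.patient_id
    GROUP BY p.patient_id
    ORDER BY p.patient_id
""").fetchall()
print(counts)  # [('Ada', 2), ('Grace', 1)]

# An appointment pointing at a patient that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO appointment VALUES (13, 99, '2022-03-01')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)  # True
```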

Many-to-Many Relationships

  • Connection between entities where records on each side of the relationship can connect to multiple records on the other side
  • A junction (or associative) table is needed to capture the pairs of related rows
  • Reduces the amount of redundant data stored in the database
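A junction table can be sketched as follows with Python's built-in sqlite3 module. The `student`, `course`, and `enrollment` tables are hypothetical examples, not taken from the notes; the point is that each row of the junction table stores one pair of related keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Junction table: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    course_id  INTEGER NOT NULL REFERENCES course (course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO course  VALUES (100, 'SQL'), (200, 'Statistics');
INSERT INTO enrollment VALUES (1, 100), (1, 200), (2, 100);
""")

# Resolve the pairs back to readable names by joining through the junction table.
pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enrollment AS e
    JOIN student AS s ON s.student_id = e.student_id
    JOIN course  AS c ON c.course_id  = e.course_id
    ORDER BY s.student_id, c.course_id
""").fetchall()
print(pairs)  # [('Ada', 'SQL'), ('Ada', 'Statistics'), ('Grace', 'SQL')]
```

Student names and course titles are each stored once; only the small key pairs repeat.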

Database Normalization

  • The practice of structuring tables so that each piece of information is stored only once, rather than repeated redundantly across rows
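As a rough illustration in plain Python (no database needed), the sketch below splits a flat, redundant set of rows into two normalized structures. The field names and values are invented for the example.

```python
# Denormalized: the patient's name and phone repeat on every appointment row.
flat_rows = [
    {"patient_id": 1, "name": "Ada", "phone": "555-0100", "visit_date": "2022-01-05"},
    {"patient_id": 1, "name": "Ada", "phone": "555-0100", "visit_date": "2022-02-09"},
    {"patient_id": 2, "name": "Grace", "phone": "555-0101", "visit_date": "2022-01-20"},
]

# Normalized: patient attributes are stored once, keyed by patient_id ...
patients = {
    row["patient_id"]: {"name": row["name"], "phone": row["phone"]}
    for row in flat_rows
}
# ... and each appointment keeps only the key that points back to the patient.
appointments = [
    {"patient_id": row["patient_id"], "visit_date": row["visit_date"]}
    for row in flat_rows
]

# Updating a phone number now touches one record instead of every appointment row.
patients[1]["phone"] = "555-0199"
print(len(patients), len(appointments))  # 2 3
```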

Dimensional Data Warehouses

  • Often contain data from multiple underlying sources
  • May contain row-level and summary data
  • Can contain historical data, logs, etc.
  • Star schema design (pg 7)
    • Divides data into facts and dimensions
    • Fact table = the measures recorded for an entity or event, with keys linking to its dimensions
    • Dimension table = properties of the entity that you can group or “slice and dice” the fact records by, look up for further info, etc.
  • Table grain
    • Level of detail; the set of columns that makes a row unique
  • Database roles
    • SMEs = subject matter experts
    • DBAs = database administrators
    • ETL engineers = people who extract, transform, and load data from a source system into a data warehouse
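The fact/dimension split and the idea of table grain described above can be sketched with Python's built-in sqlite3 module. The `fact_sales` and `dim_product` tables, their columns, and the sample values are all hypothetical; the grain chosen here is one row per product per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive properties to group or filter by.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    category    TEXT NOT NULL
);
-- Fact table: measures plus keys to dimensions.
-- Grain: one row per product per day, enforced by the composite primary key.
CREATE TABLE fact_sales (
    sale_date   TEXT    NOT NULL,
    product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
    units_sold  INTEGER NOT NULL,
    PRIMARY KEY (sale_date, product_key)
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games'), (3, 'Books');
INSERT INTO fact_sales VALUES
    ('2022-01-01', 1, 5),
    ('2022-01-01', 2, 3),
    ('2022-01-02', 1, 2),
    ('2022-01-02', 3, 4);
""")

# "Slice and dice": summarize the fact measures by a dimension attribute.
by_category = conn.execute("""
    SELECT d.category, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(by_category)  # [('Books', 11), ('Games', 3)]

# A second row for the same product and day would break the grain.
try:
    conn.execute("INSERT INTO fact_sales VALUES ('2022-01-01', 1, 9)")
    grain_enforced = False
except sqlite3.IntegrityError:
    grain_enforced = True
print(grain_enforced)  # True
```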


Jimmy Briggs jimmy.briggs@jimbrig.com | 2022