Intro to Databases & Data Modeling
What is a Database?
A database is an organized collection of data that is stored and accessed electronically.
Databases are designed to manage large amounts of information efficiently and allow for quick retrieval, insertion, updating, and deletion of data. They are typically managed by Database Management Systems.
In this chapter we will be learning about Structured Query Language (SQL) to interact with a database.
In SQL, data is organized into tables, which consist of rows and columns. Each row represents a single record, while each column represents a specific attribute of that record. This structure allows for efficient data storage, retrieval, and manipulation.
Important Terminologies
Table: A table is a collection of related data organized in a grid of rows and columns.
Each table in a database represents a specific entity, such as customers, products, or orders.
Fields: A field is a single piece of information or data value stored within a record or row.
Fields act as containers that hold data of a particular type, such as text, numbers, or dates.
In the context of a table, fields are synonymous with columns.
HOWEVER, we can also use the word
"field"
to represent the values within a record
Records: A record is a collection of fields that contain data about a specific entity.
In a table, a record is typically stored as a row.
Each record includes all the data for one entity instance.
Attributes: Attributes are the characteristics or properties that define and describe entities within a database.
Each attribute corresponds to a column in a table and holds specific data values for an instance of an entity.
Example
1
Alice
20
Computer Science
2
Bob
22
Mathematics
3
Carol
21
Physics
The diagram above is a table called
StudentData
There are 3
records
(rows)There are 4
attributes
called:StudentID, Name, Age, Major
The record with
StudentID
of1
has the following fields:1, Alice, 20, Computer Science
Databases vs. Spreadsheets
Structure
Highly structured with tables, rows, and columns.
Less structured, often used for individual data sets.
Data Volume/size
Can handle large volumes of data efficiently.
Best for smaller data sets.
Data Integrity
Enforces data integrity and relationships between tables.
Limited data integrity features.
Concurrency (Multiple access at once)
Supports multiple users accessing and modifying data simultaneously.
Limited support for concurrent access.
Complex Queries
Can perform complex queries and transactions.
Limited querying capabilities.
Automation
Supports automation through triggers, stored procedures, and scripts.
Basic automation through formulas and macros.
Advantages of Databases
Scalability: Databases can handle large volumes of data and scale as needed.
Data Integrity: Ensures data accuracy and consistency through constraints and relationships.
Security: Provides robust security features to protect sensitive data.
Concurrent Access: Allows multiple users to access and modify data simultaneously without conflicts.
Complex Queries: Supports complex queries and data manipulation through SQL.
Backup and Recovery: Offers advanced backup and recovery options to prevent data loss.
Advantages of Spreadsheets
Ease of Use: User-friendly interface, easy to set up and use without extensive training.
Flexibility: Great for ad-hoc analysis and small data sets.
Visualization: Built-in tools for creating charts and graphs.
Cost: Often more cost-effective for small-scale data management tasks.
Accessibility: Easily accessible and shareable, especially with cloud-based solutions like Google Sheets.
Last updated