Week 6 Worklog

Week 6 Objectives:

  • Review foundational database (DB) concepts, including RDBMS (primary key, foreign key), optimization techniques (Index, Partition), and operational concepts (Database Log, Buffer).
  • Clearly differentiate between the two main DB system types: OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing or Data Warehouse).
  • Understand the managed relational database service Amazon RDS, including core features like Multi-AZ (for high availability) and Read Replicas (for read performance).
  • Learn about Amazon Aurora, AWS’s cloud-native DB service, with its unique shared storage architecture, high performance, and superior features like Zero Replication Lag.
  • Learn about Amazon Redshift, the petabyte-scale Data Warehouse service designed for OLAP, and understand its MPP architecture and Columnar Storage technique.
  • Understand the role of Amazon ElastiCache (Redis, Memcached) as a high-speed caching layer to reduce load on the primary database.

Tasks to be carried out this week:

DayTaskStart DateCompletion DateReference Material
2Database Concepts

- Lesson: Review fundamental database concepts like Database (a structured information system) and Session (a working session).
- Learn about architecture: Relational Databases (RDBMS), including Primary Key (to uniquely identify a row) and Foreign Key (to create links between tables).
- Understand the technique: Normalization, a technique of dividing data into multiple tables (using keys) to prevent data duplication.
- Understand the technique: Performance Optimization:
+ Index: A data structure that speeds up retrieval (read) operations, but increases the cost of writes.
+ Partition: Dividing a large table into many smaller parts to speed up queries.
+ Execution Plan: The set of steps the database decides to use to access data (e.g., whether to use an Index or not).
- Understand the technique: Ensuring integrity and speed:
+ Database Log: Records all changes, important for recovery and replication.
+ Buffer: A temporary storage area in RAM, helping to speed up reads because reading from RAM is faster than reading from disk.
- Lesson: Database Classification:
+ RDBMS (ACID): Fixed structure (Schema), storage optimized (Normalization), scales vertically (Vertical Scaling).
+ NoSQL (BASE): Flexible structure (Dynamic Schema), performance optimized (Denormalization), scales horizontally (Horizontal Scaling).
- Lesson: System Classification:
+ OLTP (Online Transaction Processing): Transaction processing systems (banking, ordering), need to quickly handle read/write/update operations and ensure integrity (roll back).
+ OLAP (Online Analytical Processing): Data Warehouse systems, store historical data for complex analysis (reporting, finding trends).
13/10/202513/10/2025Module 06
3Amazon RDS

- Learn about the service: Amazon RDS (Relational Database Service), a fully managed relational database service, supporting popular engines (MySQL, PostgreSQL, Oracle, etc.).
- Lesson: The goal of RDS is to automate administrative tasks (updates, backups) so users can focus on the application.
- Understand the technique: Automated Backups of the database and transaction logs, allowing for Point-in-Time Recovery within a 35-day window.
- Learn about architecture: Multi-AZ (High Availability)
+ Automatically creates a standby replica in another AZ.
+ Uses Synchronous Replication.
+ Supports Automatic Failover if the primary database fails.
- Learn about architecture: Read Replicas (Read Performance Optimization)
+ Creates read-only copies to offload the primary database (e.g., for reporting tasks).
+ Uses Asynchronous Replication, which can cause “replication lag”.
- Lesson: RDS is often used for OLTP applications and is protected by Security Groups.
14/10/202514/10/2025Module 06
4Amazon Aurora

- Learn about the service: Amazon Aurora, a database developed by AWS, compatible with MySQL/PostgreSQL, part of the RDS service but with higher performance (3-5x faster).
- Learn about architecture: The biggest difference for Aurora is the redesigned storage layer.
- Learn about architecture: An Aurora “Cluster” consists of 1 Writer (write instance) and up to 15 Readers (read instances), all sharing a single (Cluster Volume) storage partition.
- Understand the technique: Data on the Cluster Volume is replicated 6 times across 3 AZs to ensure durability.
- Lesson: Aurora’s outstanding advantage is Zero Replication Lag because the Readers read from the same volume as the Writer.
- Understand the technique: Enterprise features like Backtrack (rewind the database without restoring) and Global Database (create read-only replicas in different Regions).
15/10/202515/10/2025Module 06
5Amazon Redshift

- Learn about the service: Amazon Redshift, a petabyte-scale Data Warehouse service, optimized for OLAP.
- Learn about architecture: Massively Parallel Processing (MPP).
+ Leader Node: Receives, parses, and coordinates queries.
+ Compute Nodes: Store and execute parts of the work in parallel.
- Understand the technique: Columnar Storage.
+ Unlike OLTP (stores by row), Redshift stores data from the same column together.
+ This technique is extremely efficient for analytical (OLAP) queries (e.g., Calculate average age only needs to read the Age column).
- Understand the technique: Redshift Spectrum, allows running SQL queries directly on data in Amazon S3 without needing to load it.

Amazon ElastiCache

- Learn about the service: Amazon ElastiCache, a high-speed in-memory caching service.
- Objective: Speed up applications and reduce the load on the primary database (like RDS).
- Learn about supported engines: Redis (supports many data types, often preferred) and Memcached.
- Lesson: It is the user’s responsibility to write and manage the Caching Logic (logic that decides what and when to cache) within their application.
16/10/202516/10/2025Module 06
6Lab: 000005 - Getting Started with Amazon RDS

1. Create a database on Amazon RDS
2. Connect the application to the DB
3. Backup and Restore

Lab: 000043 - Migrating Databases with DMS and SCT

1. Preparation steps
2. Oracle to Amazon Aurora (PostgreSQL)
2.1 Convert Schema
2.2 Migrate database.

[Supplemental Research] - Database Internals
- Document to learn how databases work internally.
Database Internals Deep Distributed Systems
[Supplemental Research] - The Data Warehouse Toolkit
- Document to learn how to design and the techniques used in building a Data-warehouse
Data Warehouse Toolkit Definitive Dimensional
17/10/202517/10/2025Module 06







Week 6 Achievements:

  • Lesson (Foundational):
    • Clearly differentiate between the two system models: OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing, data warehouse).
    • Master basic DB optimization techniques: Index (speeds up reads) and Partition (divides tables).
    • Understand the role of Database Log (for recovery/replication) and Buffer (uses RAM for speed).
  • Service (RDS):
    • Know Amazon RDS is a managed relational database (OLTP) service.
    • Clearly distinguish RDS’s 2 main features: Multi-AZ (used for High Availability - HA) and Read Replicas (used to increase read performance).
  • Technique (Replication):
    • Differentiate Synchronous Replication (used for RDS Multi-AZ) and Asynchronous Replication (used for RDS Read Replicas, can have lag).
  • Service (Aurora):
    • Know Amazon Aurora is a high-performance, cloud-native database.
    • Understand Aurora’s shared storage (Cluster Volume) architecture and its superior benefit of Zero Replication Lag.
    • Be aware of advanced features like Backtrack and Global Database.
  • Service (Redshift):
    • Know Amazon Redshift is a data warehouse (OLAP) service.
    • Understand the MPP (Massively Parallel Processing) architecture (includes Leader Node and Compute Nodes).
    • Master the core technique of OLAP: Columnar Storage, which speeds up analytical queries.
  • Service (ElastiCache):
    • Know Amazon ElastiCache (Redis/Memcached) is an in-RAM caching service.
    • Understand the role of caching is to reduce load on the primary DB.
    • Be aware of the responsibility to write the Caching Logic in the application.
  • Hands-on:
    • Know how to create and operate (backup/restore) an RDS database.
    • Know how to use DMS and SCT services to migrate a DB from Oracle to Aurora.