Unit 3: Parallel and Distributed Databases

Syllabus:

Parallel and Distributed Databases: Distributed Data Storage – Fragmentation & Replication, Location and Fragment.
Transparency Distributed Query Processing and Optimization, Distributed Transaction Modeling and concurrency Control, Distributed Deadlock, Commit Protocols, Design of Parallel Databases, Parallel Query Evaluation.

1. Distributed Data Storage

Distributed database systems store data across multiple physical locations (sites or nodes) that are connected via a network. This allows for better scalability, reliability, fault tolerance, and performance compared to traditional centralized databases.
In a distributed database system, data can be fragmented and replicated across different locations, enabling efficient access and management of large datasets.

      ┌───────────────────────────────┐
      │         Distributed DBMS      │
      ├───────────────────────────────┤
      │ Site 1   │ Site 2   │ Site 3  │
      ├───────────────────────────────┤
      │ DBMS 1   │ DBMS 2   │ DBMS 3  │
      ├───────────────────────────────┤
      │ Data 1   │ Data 2   │ Data 3  │
      └───────────────────────────────┘

Advantages of Distributed Databases

Scalability: Can handle large volumes of data and users by adding more sites.
Reliability: If one site fails, others can continue to operate.
Performance: Data can be located closer to where it is needed, reducing latency.
Flexibility: Different sites can use different database management systems (DBMS) or configurations.
Unified Access: Users can access data from multiple sites as if it were a single database.

Challenges of Distributed Databases

Complexity: Managing data across multiple sites increases complexity in design and administration.
Consistency: Ensuring data consistency across sites can be challenging, especially with concurrent updates.
Communication: Network latency and bandwidth can affect performance.
Security: Protecting data across multiple sites requires robust security measures.

1.1 Fragmentation

Fragmentation is the process of dividing a database or relation into smaller, manageable pieces called fragments. Each fragment can be stored at different sites in a distributed database system. Fragmentation allows for better data distribution, improved performance, and easier management of large datasets.

1.1.1 Types of Fragmentation

Fragmentation can be done horizontally (by rows), vertically (by columns), or in a mixed manner. The choice of fragmentation strategy depends on the access patterns and usage of the data.

Horizontal Fragmentation

Horizontal fragmentation divides a relation into smaller subsets of rows, distributing them across different sites.

Based on a selection operation: Fi = σpi(R) where pi is a predicate
Each fragment contains a subset of the rows of the original relation
All fragments have the same schema as the original relation

Example: An Employee relation might be fragmented by department or region.

-- Original Employee table
CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name VARCHAR(100),
    salary DECIMAL(10,2),
    dept_id INT,
    location VARCHAR(50)
);

-- Horizontal fragmentation by location
-- Fragment 1: New York employees
CREATE TABLE Employee_NY AS
SELECT * FROM Employee WHERE location = 'New York';

-- Fragment 2: London employees
CREATE TABLE Employee_London AS
SELECT * FROM Employee WHERE location = 'London';

-- Fragment 3: Tokyo employees
CREATE TABLE Employee_Tokyo AS
SELECT * FROM Employee WHERE location = 'Tokyo';

The reconstruction of the original relation is achieved through UNION:

Employee = Employee_NY ∪ Employee_London ∪ Employee_Tokyo

Vertical Fragmentation

Vertical fragmentation divides a relation into smaller subsets of columns, distributing them across different sites.

Based on a projection operation: Fi = πattributes(R)
Each fragment contains a subset of the columns of the original relation
The primary key must be included in all fragments to allow reconstruction

Example: An Employee relation might be fragmented into personal information and financial information.

-- Vertical fragmentation of Employee table
-- Fragment 1: Personal information
CREATE TABLE Employee_Personal AS
SELECT emp_id, name, location FROM Employee;

-- Fragment 2: Financial information
CREATE TABLE Employee_Financial AS
SELECT emp_id, salary, dept_id FROM Employee;

The reconstruction of the original relation is achieved through a JOIN operation:

Employee = Employee_Personal ⋈ Employee_Financial

Mixed/Hybrid Fragmentation

Mixed fragmentation combines both horizontal and vertical fragmentation approaches.

Can be applied as horizontal followed by vertical or vice versa
Provides more flexibility in data distribution
More complex to manage and reconstruct

Example:

-- First, horizontal fragmentation by location
CREATE TABLE Employee_NY AS
SELECT * FROM Employee WHERE location = 'New York';

-- Then, vertical fragmentation of the NY fragment
CREATE TABLE Employee_NY_Personal AS
SELECT emp_id, name FROM Employee_NY;

CREATE TABLE Employee_NY_Financial AS
SELECT emp_id, salary, dept_id FROM Employee_NY;

1.1.2 Correctness Rules for Fragmentation

To ensure that fragmentation is done correctly, the following rules must be satisfied:

Completeness - Every data item in the original relation must appear in at least one fragment.
Reconstruction - The original relation must be reconstructible from fragments using UNION (horizontal) or JOIN (vertical) operations.
Disjointness - Data items should not be duplicated across fragments (except primary keys in vertical fragmentation).

1.1.3 Design Considerations for Fragmentation

When designing a fragmentation scheme, several factors should be considered:

Access Patterns: Group frequently accessed data items and collocate data accessed by the same transactions.
Usage Frequency: Replicate frequently accessed fragments across multiple sites for better performance.
Storage Requirements: Consider fragment sizes and available storage capacity at each site.
Communication Costs: Account for network bandwidth, latency, and data transfer costs between sites.

1.1.4 Advantages and Disadvantages of Fragmentation

Advantages

Improved Performance: Queries can be executed on smaller fragments, reducing response time.
Scalability: New fragments can be added as data grows, allowing for horizontal scaling.
Load Balancing: Fragments can be distributed across multiple sites, balancing the load and improving resource utilization.
Reduced Communication Costs: Only relevant fragments are accessed, minimizing data transfer over the network.

Disadvantages

Complexity: Fragmentation adds complexity to the database design and management.
Reconstruction Overhead: Reconstructing the original relation from fragments can introduce overhead, especially for complex queries.
Data Redundancy: In some cases, fragments may contain overlapping data, leading to redundancy.
Consistency Challenges: Ensuring consistency across fragments can be difficult, especially with concurrent updates.

1.2 Replication

Replication involves creating and maintaining multiple copies of data across different sites in a distributed database system. It enhances data availability, fault tolerance, and performance. Replication can be done at the level of entire relations or fragments.

1.2.1 Types of Replication

Replication can be classified into three main types based on the extent of data replication: Full Replication, Partial Replication, and No Replication.

Full Replication

Every site has a complete copy of all data
Good: High availability, fast reads
Bad: Uses more storage, updates are complex
Example: Banking data stored at all branches

Partial Replication

Only some data is copied to multiple sites
Good: Balance between availability and storage
Bad: Some complexity in managing which data to replicate
Example: Customer profiles replicated, transaction logs are not

No Replication

Each piece of data exists at only one site
Good: Saves storage space, simple updates
Bad: If site fails, data is unavailable
Example: Archive data stored at single location

1.2.2 Replication Strategies

Replication strategies determine how updates are propagated to replicas. The choice of strategy affects consistency, performance, and fault tolerance. Three common strategies are synchronous, asynchronous, and semi-synchronous replication.

Synchronous Replication

All copies are updated at the same time
Data is always consistent everywhere
Slower because it waits for all sites to confirm
Like calling everyone to confirm they got your message before hanging up

Asynchronous Replication

Main copy is updated first, others updated later
Faster but copies might be different temporarily
Like sending an email - you don't wait for everyone to read it
Good when small delays are okay

Semi-Synchronous Replication

Updates some copies immediately, others later
Balance between speed and consistency
Like getting confirmation from half your friends before proceeding

1.2.3 Advantages and Disadvantages of Replication

Advantages

Improved Availability: If one site fails, data remains available at other sites
Enhanced Performance: Reads can be served from local replicas, reducing network traffic
Load Balancing: Queries can be distributed across multiple replicas
Fault Tolerance: Redundancy protects against data loss

Disadvantages

Storage Overhead: Multiple copies consume more storage space
Update Complexity: Ensuring consistency across replicas adds complexity
Increased Network Traffic: Propagating updates to all replicas uses network bandwidth
Consistency Challenges: Maintaining consistency becomes more difficult with more replicas

1.2.4 Replication Models

Replication models define how data is replicated and updated across different sites. The choice of model affects consistency, performance, and fault tolerance. Three common replication models are primary-copy, update-anywhere, and quorum-based.

Primary-Copy (Master-Slave) Model

One site is designated as the primary (master) for each data item
All updates go through the primary site
Read operations can be directed to any replica
Example: MySQL primary-replica replication

Update-Anywhere (Multi-Master) Model

Updates can be performed at any replica and propagated to others
Requires sophisticated conflict detection and resolution
Example: Multi-master replication in PostgreSQL BDR or Oracle Active Data Guard

Quorum-Based Model

Uses voting to control read and write operations
Requires a majority of replicas to agree for an operation to succeed
Example: Operations require majority approval in systems like Apache Cassandra

1.2.5 Differences Between Fragmentation and Replication

Feature	Fragmentation	Replication
Purpose	Divide data into smaller, manageable pieces	Create multiple copies of data for availability
Data Distribution	Data is split across sites	Data is duplicated across sites
Data Access	Access to specific fragments based on queries	Access to any replica for reads
Update Strategy	Updates apply to specific fragments	Updates apply to all replicas
Consistency	Reconstructs original relation from fragments	Ensures consistency across replicas
Storage Overhead	Lower, as data is divided	Higher, as multiple copies are stored
Complexity	Moderate, requires reconstruction logic	High, requires synchronization and conflict resolution

1.3 Location and Fragment Transparency

Location and fragment transparency are key features of distributed databases that allow users to access data without needing to know its physical location or how it is fragmented.

1.3.1 Location Transparency

Location transparency allows users to access data without knowing where it is physically stored. This is achieved through a global schema that abstracts the underlying data distribution.

Components Enabling Location Transparency

Distributed Data Catalog: Maintains metadata about the location of all data items
Name Resolution: Maps logical names to physical locations
Query Routing: Directs queries to appropriate sites

Example of how location transparency works:

-- User query (location-transparent)
SELECT * FROM Employee WHERE emp_id = 1001;

-- System translates to:
-- 1. Look up where Employee data for emp_id 1001 is located
-- 2. Route the query to the appropriate site(s)
-- 3. Execute the query
-- 4. Return results to the user

1.3.2 Fragment Transparency

Fragment transparency allows users to access data without needing to know how it is fragmented. The system handles the fragmentation and reconstruction of data.

Components Enabling Fragment Transparency

Global Schema: Presents a unified view of the database
Fragment Schema: Details how relations are fragmented
Mapping Information: Links global schema to fragment schemas

Example of query processing with fragment transparency:

-- User query (fragment-transparent)
SELECT name, salary FROM Employee WHERE location = 'New York';

-- System translates to:
-- 1. Determine that Employee is horizontally fragmented by location
-- 2. Identify that only Employee_NY fragment is needed
-- 3. Execute the query on the Employee_NY fragment
-- 4. Return results to the user

1.3.3 Combined Transparency

When both location and fragment transparency are combined:

-- User query (fully transparent)
SELECT * FROM Employee WHERE salary > 100000;

-- System process:
-- 1. Identify all fragments containing Employee data
-- 2. Determine which fragments might contain relevant data
-- 3. Locate each required fragment
-- 4. Execute the query on each site
-- 5. Combine results if necessary
-- 6. Return unified result to the user

2. Distributed Query Processing and Optimization

Distributed query processing involves executing queries across multiple sites in a distributed database system. The goal is to optimize query execution by minimizing data transfer, reducing response time, and balancing the load across sites.

2.1 Stages of Distributed Query Processing

Distributed query processing typically involves several stages:

2.1.1 Query Decomposition

The first stage breaks down a complex query into its component operations:

Parsing: Convert SQL query to an internal representation
Normalization: Standardize the query representation
Analysis: Validate query against schema information
Algebraic Conversion: Transform to relational algebra expressions

Example:

-- Original SQL query
SELECT E.name, D.dept_name
FROM Employee E, Department D
WHERE E.dept_id = D.dept_id AND E.salary > 50000;

-- Converted to relational algebra
π name, dept_name (σ salary > 50000 (Employee ⋈ Department))

2.1.2 Data Localization

Data localization transforms a distributed query into fragment queries:

Fragment Selection: Identify relevant fragments for the query
Substitution: Replace relations with appropriate fragments
Simplification: Optimize fragment queries using predicate information

Example with horizontal fragmentation:

-- Global relation query
σ location='New York' AND salary > 50000 (Employee)

-- After fragment substitution (assuming horizontal fragmentation by location)
-- Only need to access Employee_NY fragment
σ salary > 50000 (Employee_NY)

2.1.3 Global Query Optimization

Global optimization focuses on generating an efficient execution plan for the entire distributed query:

Join Order Selection: Determine the order of joining relations
Site Selection: Choose execution sites for each operation
Operation Scheduling: Decide the timing of operations

2.1.4 Distributed Execution

The final stage executes the optimized query plan:

Local Execution: Execute operations at individual sites
Data Movement: Transfer data between sites as needed
Result Composition: Combine results from different sites
Result Delivery: Return the final result to the user

2.2 Distributed Query Optimization Techniques

Distributed query optimization techniques aim to minimize the cost of executing queries across multiple sites. The cost can be measured in terms of communication, processing, and I/O costs.

Several techniques are used to optimize distributed queries:

2.2.1 Semi-Join Strategy

Semi-join reduces the amount of data transferred between sites:

Project the join column from one relation
Send the projection to the site of the other relation
Join the projection with the second relation
Send the result back to the first site
Join with the first relation

Example:

-- Need to join Employee (at Site 1) and Department (at Site 2)

-- Step 1: Project join column from Employee
π dept_id (Employee) -> Site 2

-- Step 2: Join with Department at Site 2
Result = Department ⋈ π dept_id (Employee)

-- Step 3: Send result back to Site 1
Result -> Site 1

-- Step 4: Join with Employee at Site 1
Final Result = Employee ⋈ Result

2.2.2 Join Strategies

Different join strategies can be used in distributed environments:

Nested Loop Join

Suitable when one relation is small
Can be executed at either site

Example:

for each tuple r in R:
    for each tuple s in S:
        if r.join_attr = s.join_attr:
            output (r, s)

Hash Join

Effective for larger relations
Hash the smaller relation, scan the larger one

Example:

1. Build hash table for relation R using join attribute
2. For each tuple in S, probe hash table
3. Output matching tuples

Ship-Whole Strategy

Move entire relations to a single site
Useful when network is fast or relations are small
Example: Ship Department to Employee's site if Department is small

2.2.3 Cost Models for Distribution

Cost models account for both computation and communication costs:

Communication Cost

Data transmission costs (proportional to data volume)
Network latency (fixed cost per message)

Example:

Cost = message_startup_cost + (data_size * transmission_cost)

Processing Cost

CPU time for query execution
I/O time for disk operations
Example:
```
Cost = cpu_cost + io_cost
```

Total Cost

Sum of communication and processing costs

Example:

Total Cost = communication_cost + processing_cost

2.2.4 Query Execution Strategies

Different strategies can be used to execute distributed queries:

Data Shipping

Move data to the site where the query originated
Centralized execution but high data movement
Example: Bring all Employee and Department data to the query site

Query Shipping

Move the query (or subqueries) to the data sites
Distributed execution with less data movement
Example: Execute part of the query at the Employee site and part at the Department site

Hybrid Approach

Combine data and query shipping based on costs
Example: Ship small reference tables but process large fact tables in place

2.3 Distributed Transaction Processing

Distributed transactions span multiple sites and require careful coordination to maintain ACID properties.

2.3.1 Distributed Transaction Model

A distributed transaction model involves:

Transaction Manager (TM)

Coordinates the execution of the distributed transaction
Implements the atomic commit protocol
One TM exists for each site participating in the transaction

Resource Manager (RM)

Manages access to local data resources
Provides local transaction operations (read, write)
Participates in the commit protocol

Transaction Coordinator (TC)

One site is designated as the coordinator
Initiates the commit protocol
Collects votes and makes the final decision

Example of a distributed transaction:

-- Transaction spans two sites
BEGIN TRANSACTION;
   -- At Site 1
   UPDATE Account SET balance = balance - 1000 WHERE account_id = 101;
   
   -- At Site 2
   UPDATE Account SET balance = balance + 1000 WHERE account_id = 202;
COMMIT;

2.3.2 Distributed Concurrency Control

Distributed concurrency control extends traditional methods to work across multiple sites:

Centralized Approach

One site acts as the central lock manager
All lock requests go through the central site
Simple but creates a bottleneck and single point of failure

Primary Copy Approach

For each data item, one copy is designated as the primary
Lock requests for an item go to its primary site
Reduces contention but still has single points of failure for each item

Distributed Approach

Lock management is distributed across sites
Each site manages locks for its local data
Requires protocols to prevent distributed deadlocks

Example of distributed two-phase locking:

-- Transaction T1 accessing data at Sites 1 and 2

-- At Site 1
Request and acquire lock on Account 101

-- At Site 2
Request and acquire lock on Account 202

-- Perform operations at both sites

-- Growing phase complete, begin shrinking phase

-- At commit time
Release locks at both sites

2.4 Distributed Deadlock

Distributed deadlocks occur when two or more transactions across multiple sites wait for each other in a cyclic manner.

2.4.1 Types of Distributed Deadlocks

Distributed deadlocks can be classified into two types:

Direct Deadlock

Involves a direct cycle of wait relationships
Example: T1 at Site 1 waits for T2 at Site 2, while T2 waits for T1

Indirect Deadlock

Involves multiple transactions at different sites forming a cycle
Example: T1 waits for T2, T2 waits for T3, T3 waits for T1, all at different sites

2.4.2 Distributed Deadlock Detection

Several approaches exist for detecting distributed deadlocks:

Centralized Approach

One site is responsible for deadlock detection
Sites periodically send wait-for information to the central site
Central site constructs global wait-for graph and checks for cycles
Simple but creates a bottleneck

Distributed Approach

Each site maintains its local wait-for graph
Sites exchange information about potential deadlocks
More complex but avoids central bottleneck

Path-Pushing Algorithm

When transaction T waits, site sends wait-for path to sites where blocked transactions reside
Receiving sites extend the path with their local wait-for information
If a path forms a cycle, a deadlock is detected

Example of a distributed deadlock:

-- At Site 1
Transaction T1 holds lock on A, requests lock on B (located at Site 2)

-- At Site 2
Transaction T2 holds lock on B, requests lock on C (located at Site 3)

-- At Site 3
Transaction T3 holds lock on C, requests lock on A (located at Site 1)

-- Deadlock cycle: T1 → T2 → T3 → T1

2.4.3 Distributed Deadlock Prevention

Prevention techniques avoid deadlocks by constraining how transactions request resources:

Timeouts

Abort a transaction if it waits too long for a lock
Simple but may abort transactions unnecessarily

Wait-Die

If requesting transaction is older, it waits; otherwise, it aborts ("dies")
Prevents younger transactions from making older ones wait

Example:

T1 (timestamp=10) requests resource held by T2 (timestamp=20)
Since T1 is older (smaller timestamp), T1 waits

Wound-Wait

If requesting transaction is older, the holding transaction aborts ("is wounded"); otherwise, requester waits

Example:

T1 (timestamp=10) requests resource held by T2 (timestamp=20)
Since T1 is older, T2 is "wounded" (aborted)

2.5 Commit Protocols

Commit protocols ensure that all participants in a distributed transaction either commit or abort the transaction, maintaining the atomicity property of transactions.

2.5.1 Two-Phase Commit (2PC) Protocol

The two-phase commit protocol is a widely used commit protocol for distributed transactions. It consists of two phases:

Phase 1: Prepare Phase

Coordinator sends a PREPARE message to all participants
Participants vote YES (ready to commit) or NO (abort)
If a participant votes YES, it enters a prepared state and cannot unilaterally abort

Phase 2: Commit Phase

If all participants vote YES, coordinator sends COMMIT message
If any participant votes NO, coordinator sends ABORT message
Participants acknowledge the final decision

Example of Two-Phase Commit:

-- Coordinator sends PREPARE to participants
Coordinator → Participant1: PREPARE
Coordinator → Participant2: PREPARE

-- Participants respond with votes
Participant1 → Coordinator: YES
Participant2 → Coordinator: NO

-- Coordinator decides based on votes
Coordinator → Participant1: ABORT
Coordinator → Participant2: ABORT

-- Participants acknowledge the decision
Participant1 → Coordinator: ACK
Participant2 → Coordinator: ACK

Failure Scenarios in 2PC

Participant failure before voting: Coordinator timeouts and aborts
Participant failure after voting YES: Upon recovery, participant contacts coordinator
Coordinator failure after collecting votes: Participants must wait for coordinator recovery
Network partition: May block participants until connectivity is restored

2.5.2 Three-Phase Commit (3PC) Protocol

The three-phase commit protocol addresses some blocking issues in 2PC by adding an extra phase:

Phase 1: Prepare Phase

Same as in 2PC

Phase 2: PreCommit Phase

Coordinator sends PRECOMMIT or ABORT message
Participants acknowledge and prepare to commit

Phase 3: Commit Phase

Coordinator sends final COMMIT message
Participants complete the transaction

Benefits of 3PC:

Reduces blocking in some failure scenarios
Allows participants to timeout and resolve independently

Drawbacks:

Higher message overhead
Still blocks in certain network partition scenarios
More complex implementation

3. Design of Parallel Databases

Parallel databases leverage multiple processors and storage devices to improve performance through parallel processing. They are designed to handle large volumes of data and complex queries efficiently.

3.1 Parallel Database Architectures

Various architectures exist for parallel database systems, each with its own advantages and disadvantages. The choice of architecture depends on the specific requirements of the application and the underlying hardware.

3.1.1 Shared Memory Architecture

In shared memory architecture:

Multiple processors share a common memory
All processors access the same database buffers in shared memory
Simple programming model but limited scalability due to memory contention

┌───────────────────────────────────────┐
│           Shared Memory               │
├───────┬───────┬───────┬───────┬───────┤
│ CPU 1 │ CPU 2 │ CPU 3 │ ...   │ CPU n │
└───────┴───────┴───────┴───────┴───────┘
         │       │       │       │
┌────────┴───────┴───────┴───────┴───────┐
│                 I/O                    │
└────────────────────────────────────────┘
         │       │       │       │
┌────────┴───────┴───────┴───────┴───────┐
│          Disk Subsystem                │
└────────────────────────────────────────┘

3.1.2 Shared Disk Architecture

In shared disk architecture:

Each processor has its own private memory
All processors share a common disk subsystem
Better scalability than shared memory but requires cache coherency protocols

┌───────┐ ┌───────┐ ┌───────┐     ┌───────┐
│ CPU 1 │ │ CPU 2 │ │ CPU 3 │ ... │ CPU n │
└───┬───┘ └───┬───┘ └───┬───┘     └───┬───┘
    │         │         │             │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐     ┌───▼───┐
│ Mem 1 │ │ Mem 2 │ │ Mem 3 │ ... │ Mem n │
└───────┘ └───────┘ └───────┘     └───────┘
    │         │         │             │
┌───┴─────────┴─────────┴─────────────┴──┐
│           Shared Disk System           │
└────────────────────────────────────────┘

3.1.3 Shared Nothing Architecture

In shared nothing architecture:

Each processor has its own private memory and disk
Processors communicate through a high-speed network
Excellent scalability but requires data partitioning and distributed query processing

┌───────┐ ┌───────┐ ┌───────┐     ┌───────┐
│ CPU 1 │ │ CPU 2 │ │ CPU 3 │ ... │ CPU n │
└───┬───┘ └───┬───┘ └───┬───┘     └───┬───┘
    │         │         │             │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐     ┌───▼───┐
│ Mem 1 │ │ Mem 2 │ │ Mem 3 │ ... │ Mem n │
└───┬───┘ └───┬───┘ └───┬───┘     └───┬───┘
    │         │         │             │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐     ┌───▼───┐
│Disk 1 │ │Disk 2 │ │Disk 3 │ ... │Disk n │
└───────┘ └───────┘ └───────┘     └───────┘
    │         │         │             │
┌────────────────────────────────────────┐
│        High-Speed Network              │
└────────────────────────────────────────┘

3.1.4 Hierarchical Architecture

Hierarchical architecture combines aspects of the above models:

Multiple shared memory nodes
Each node contains multiple processors with shared memory
Nodes are connected in a shared nothing configuration
Balances the advantages of different architectures

3.2 Data Partitioning in Parallel Databases

Data partitioning refers to dividing a large dataset into smaller chunks (called partitions) and distributing them across multiple nodes in a parallel database system. The goal is to improve performance by allowing multiple processors to work on different parts of the data simultaneously.

3.2.1 Round-Robin Partitioning

Tuples are assigned to partitions in a round-robin fashion
Simple and ensures even distribution of data
Example: Tuple 1 → Node 1, Tuple 2 → Node 2, ..., Tuple n → Node (n mod k)

3.2.2 Hash Partitioning

A hash function is applied to a partitioning attribute
Tuples with the same attribute value are assigned to the same partition
Good for equality queries on the partitioning attribute

Example:

Node_number = hash(employee_id) mod num_nodes

3.2.3 Range Partitioning

The range of values for a partitioning attribute is divided
Each partition is assigned a non-overlapping range
Good for range queries on the partitioning attribute

Example:

Node 1: employee_id 1-1000
Node 2: employee_id 1001-2000
etc.

3.2.4 Composite Partitioning

Combines multiple partitioning methods
Example: First hash partition, then range partition within each hash partition
Provides more flexibility for complex query patterns

3.3 Parallel Database Operations

Parallel database operations leverage the parallel architecture to execute queries more efficiently. The main operations include sorting, joining, and aggregation.

3.3.1 Parallel Sort

Sorting in parallel databases can be done using various algorithms, depending on the data distribution and the architecture.

Exchange Sort

Split data into pieces across different computers
Each computer sorts its piece
Combine all sorted pieces together

Parallel Merge Sort

Break data into small chunks
Sort each chunk at the same time on different processors
Merge the sorted chunks together to get the final result

Example: If you have 1 million records to sort:

Give 250,000 records to each of 4 computers
Each computer sorts its portion
Combine the 4 sorted portions into one final sorted list

3.3.2 Parallel Join

Joining relations in parallel databases can be done using several strategies, depending on the data distribution and the join type.

Parallel Nested Loop Join

Split the first table into pieces across different computers
Send the entire second table to each computer
Each computer joins its piece with the whole second table
Simple but slow for large tables

Parallel Hash Join

Split both tables using the same method (like grouping by first letter of name)
Each computer joins its matching pieces together
Combine all results into final answer

Example:

-- Partition R and S using hash(join_attribute)
-- At each node i:
Build hash table for R_i
For each tuple in S_i:
    Probe hash table for matches
    Output matching tuples

Parallel Sort-Merge Join

Partition both relations on join attribute
Sort partitions locally
Perform merge join on corresponding partitions

3.3.3 Parallel Aggregation

Aggregation operations can be performed in parallel to improve performance.

Local Aggregation, Global Consolidation

Perform partial aggregations locally at each node
Send partial results to a coordinator node
Consolidate partial results into final aggregations

Example:

-- Count employees by department
-- Original query
SELECT dept_id, COUNT(*) FROM Employee GROUP BY dept_id;

-- Parallel execution
-- At each node:
SELECT dept_id, COUNT(*) FROM Employee_partition GROUP BY dept_id;

-- At coordinator:
SELECT dept_id, SUM(count) FROM partial_results GROUP BY dept_id;

4. Parallel Query Evaluation

Parallel query evaluation is the process of executing queries in a parallel database system to improve performance. It involves breaking down queries into smaller tasks that can be executed concurrently across multiple processors or nodes. The goal is to minimize response time and maximize throughput by leveraging the available resources effectively.

4.1 Forms of Parallelism

Different forms of parallelism can be applied in database systems:

4.1.1 Inter-Query Parallelism

Multiple queries are executed concurrently
Each query runs on a separate processor
Increases system throughput but doesn't reduce response time for individual queries
Example: Multiple users executing different queries simultaneously

4.1.2 Intra-Query Parallelism

Single query is divided into subtasks executed in parallel
Reduces response time for individual queries
Two main types:

Inter-Operator Parallelism (Pipelined Parallelism)

Different operators in the query plan execute concurrently
Output of one operator becomes input to another operator
Example: Selection and projection executing simultaneously

┌─────────┐
│ SELECT  │
└────┬────┘
     │
     ▼
┌─────────┐
│ PROJECT │
└────┬────┘
     │
     ▼
┌─────────┐
│  JOIN   │
└─────────┘

Intra-Operator Parallelism (Partitioned Parallelism)

Single operator is divided into multiple instances
Each instance works on a partition of the data
Example: Parallel scan of a large table

┌───────────┬───────────┬───────────┬───────────┐
│ Scan      │ Scan      │ Scan      │ Scan      │
│Partition1 │ Partition2│ Partition3│ Partition4│
└─────┬─────┴─────┬─────┴─────┬─────┴─────┬─────┘
      │           │           │           │
      └───────────┼───────────┼───────────┘
                  │           │
          ┌───────┴───────────┴───────┐
          │        Merge              │
          └───────────────────────────┘

4.1.3 Bushy Parallelism

Combines aspects of both inter- and intra-operator parallelism
Multiple branches of the query tree execute in parallel
Example: Multiple joins executing concurrently

    ┌─────────┐
    │  JOIN   │
    └────┬────┘
┌────────┴────────┐
▼                 ▼
┌─────────┐   ┌─────────┐
│  JOIN   │   │  JOIN   │
└────┬────┘   └────┬────┘
┌────┴────┐   ┌────┴────┐
▼         ▼   ▼         ▼
┌───┐   ┌───┐ ┌───┐   ┌───┐
│ A │   │ B │ │ C │   │ D │
└───┘   └───┘ └───┘   └───┘

4.2 Parallel Query Optimization

Parallel query optimization involves selecting the best execution plan for a query in a parallel database system. The goal is to minimize execution time while maximizing resource utilization. The optimization process considers various factors, including the degree of parallelism, partitioning strategies, resource allocation, and skew handling.

4.2.1 Degree of Parallelism

Deciding the optimal degree of parallelism involves balancing:

Available Resources: Number of processors, memory, disk bandwidth
Query Complexity: More complex queries may benefit from higher parallelism
Data Size: Larger datasets typically benefit from more parallelism
Diminishing Returns: Beyond a certain point, adding more parallelism yields minimal benefits

Example approaches:

Static assignment: Fixed degree based on system configuration
Dynamic assignment: Adjusted based on system load and query characteristics
Cost-based: Based on estimated query cost and available resources

4.2.2 Partitioning Strategy Selection

Choosing the right partitioning strategy based on query requirements:

Join Queries: Partition on join attributes to minimize data movement
Range Queries: Range partitioning for attributes used in range predicates
Complex Queries: Possibly multiple partitioning schemes for different operations

Example:

-- Query with join and range condition
SELECT * FROM Orders O JOIN Customers C ON O.customer_id = C.customer_id
WHERE O.order_date BETWEEN '2023-01-01' AND '2023-03-31';

-- Potential partitioning:
-- 1. Hash partition both Orders and Customers on customer_id for join
-- 2. Range partition Orders on order_date for the date range condition

4.2.3 Resource Allocation

Optimal allocation of resources across parallel operations:

Memory Allocation: Buffer space for each parallel operator
CPU Allocation: Processor assignment for different operations
I/O Bandwidth: Balancing disk access across operations

4.2.4 Skew Handling

Dealing with data skew to prevent bottlenecks:

Detection: Identify potential skew in data distribution
Redistribution: Dynamically redistribute work to balance load
Adaptive Processing: Adjust execution plan based on observed skew

Example handling of skew in parallel hash join:

1. Sample data to detect skew
2. For skewed values:
   - Process in a separate phase
   - Use different join algorithm
3. For non-skewed values:
   - Process using standard parallel hash join

4.3 Parallel Query Execution

Parallel query execution involves executing the optimized query plan across multiple processors and disks. The execution of parallel queries involves several key techniques:

4.3.1 Exchange Operators

Exchange operators facilitate data movement between parallel operators:

Gather: Collect data from multiple sources to a single destination
Distribute: Send data from a single source to multiple destinations
Repartition: Redistribute data from multiple sources to multiple destinations

Example exchange operator usage:

-- Before parallel processing
TABLE(A)
    |
JOIN with B
    |
AGGREGATE

-- With exchange operators
TABLE(A) -> [DISTRIBUTE] -> Parallel JOIN instances with B
                              |
                          [GATHER] -> AGGREGATE

4.3.2 Parallel Join Execution

Execution strategies for parallel joins:

Broadcast Join

Replicate smaller relation to all nodes
Join with local partition of larger relation
Good when one relation is much smaller

Partitioned Join

Partition both relations using the same scheme
Perform local joins in parallel
Good when both relations are large

Hybrid Join

Combine broadcast and partitioned approaches
Use different strategies for different parts of the data

4.3.3 Load Balancing Techniques

Techniques to ensure balanced workload distribution:

Static Load Balancing

Distribute work evenly before execution
Simple but vulnerable to unexpected skew

Dynamic Load Balancing

Redistribute work during execution
More complex but adapts to observed workload

Work Stealing

Idle processors "steal" work from busy ones
Adaptive but requires coordination mechanism

This concludes our comprehensive examination of parallel and distributed databases in Advanced Database Management Systems. Understanding these concepts is essential for designing and implementing database systems that can efficiently handle large-scale data processing and storage requirements across distributed environments.

These notes were compiled by Deepak Modi
Visit NotesNeo for more resources.
Last updated: June 2025