Andrew IDs: svattiku, balasubs
For our final project, we will implement and analyze multiple parallel approaches to the graph coloring problem. We will develop traditional parallelization strategies alongside a novel transactional memory-inspired approach that uses optimistic execution with conflict resolution.
Graph coloring assigns colors to vertices such that no two adjacent vertices share the same color. This problem has many applications, including register allocation in compilers, scheduling problems, and frequency assignment in wireless networks. The minimum graph coloring problem (finding a coloring with the fewest colors) is NP-hard, so approximation algorithms are typically used.
The greedy approach is one of the most common approximation algorithms for graph coloring: it assigns each vertex the lowest color that does not conflict with its already-colored neighbors. While simple, this sequential approach can be slow for large graphs, as its runtime grows with graph size.
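As a concrete baseline, here is a minimal sketch of the sequential greedy algorithm in C++, assuming an adjacency-list representation (the function name and representation are our illustrative choices):

```cpp
#include <cassert>
#include <vector>

// Greedy coloring: for each vertex in order, pick the smallest color not
// already used by any colored neighbor. `adj` is an adjacency list.
std::vector<int> greedy_color(const std::vector<std::vector<int>>& adj) {
    int n = (int)adj.size();
    std::vector<int> color(n, -1);
    std::vector<bool> used(n, false);  // scratch: colors seen on neighbors
    for (int v = 0; v < n; ++v) {
        for (int u : adj[v])
            if (color[u] != -1) used[color[u]] = true;
        int c = 0;
        while (used[c]) ++c;           // smallest free color
        color[v] = c;
        for (int u : adj[v])           // reset scratch for the next vertex
            if (color[u] != -1) used[color[u]] = false;
    }
    return color;
}
```

On a triangle graph this assigns colors 0, 1, 2 in vertex order; each vertex scans only its own neighborhood, which is what makes the per-vertex work small but the overall process inherently sequential.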
We will implement the following traditional approaches using both OpenMP (for shared memory parallelism) and MPI (for distributed memory parallelism) to evaluate their performance characteristics, scalability, and effectiveness across different graph structures:
- Concept: Identifying sets of vertices that can be colored simultaneously without conflicts
- Implementation: Using OpenMP for thread-level parallelism and MPI for distributed identification and coloring of independent sets
- Challenge: Efficiently identifying maximal independent sets in parallel
- Concept: Partitioning the graph into subgraphs and coloring them in parallel
- Implementation: Using OpenMP for shared memory decomposition and MPI for distributed partitioning with special handling for boundary vertices
- Challenge: Managing cross-partition conflicts while maintaining load balance
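One possible (assumed, not final) boundary-handling scheme splits vertices into contiguous blocks and separates interior vertices, which a partition owner can color without coordination, from boundary vertices that need cross-partition handling:

```cpp
#include <cassert>
#include <vector>

// Mark vertices whose neighbors all fall in the same contiguous block;
// these "interior" vertices can be colored locally with no communication,
// while boundary vertices are deferred to a coordination phase.
std::vector<char> mark_interior(const std::vector<std::vector<int>>& adj,
                                int num_parts) {
    int n = (int)adj.size();
    int block = (n + num_parts - 1) / num_parts;  // block size per partition
    std::vector<char> interior(n, 1);
    for (int v = 0; v < n; ++v)
        for (int u : adj[v])
            if (u / block != v / block) {          // neighbor in another block
                interior[v] = 0;
                break;
            }
    return interior;
}
```

The ratio of boundary to interior vertices is exactly what drives the load-balance and conflict-management challenge noted above.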
Concept: Mimicking transactional memory systems by allowing threads to color vertices optimistically in parallel without synchronization, then detecting and resolving conflicts afterward
After initial coloring, a parallel scan over edges identifies pairs of adjacent vertices with the same color, recording conflicts in dedicated data structures. This post-processing approach uses atomic operations to mark conflicts without serializing the detection process, enabling efficient bulk conflict identification.
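The edge scan described above can be sketched as follows; `detect_conflicts`, the edge-list input, and deferring the higher-indexed endpoint are our illustrative assumptions, with `std::atomic` flags standing in for the dedicated conflict structures:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <utility>
#include <vector>

// Parallel post-processing pass: scan every edge and flag one endpoint of
// each same-colored pair. Atomic stores let threads mark conflicts without
// serializing the scan.
std::vector<int> detect_conflicts(const std::vector<std::pair<int, int>>& edges,
                                  const std::vector<int>& color) {
    std::vector<std::atomic<char>> conflicted(color.size());
    for (auto& f : conflicted) f.store(0);
    #pragma omp parallel for
    for (long i = 0; i < (long)edges.size(); ++i) {
        int u = edges[i].first, v = edges[i].second;
        if (color[u] == color[v])
            conflicted[std::max(u, v)].store(1);  // defer the later vertex
    }
    std::vector<int> to_recolor;                  // bulk conflict list
    for (int v = 0; v < (int)color.size(); ++v)
        if (conflicted[v].load()) to_recolor.push_back(v);
    return to_recolor;
}
```

Only the flagged vertices need recoloring in the next iteration, so the optimistic phase's work is preserved wherever no conflict occurred.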
Several sophisticated conflict-resolution strategies will be explored; these balance resolution quality against computational overhead.
We will explore optimizations such as fine-grained state tracking, which monitors whether vertices are in tentative, committed, or conflicted states throughout the coloring process.
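A hedged sketch of such state tracking: the three state names follow the text, while `try_commit` and the compare-and-swap transition are our assumptions, not the project's final API:

```cpp
#include <atomic>
#include <cassert>

// Per-vertex state machine for the optimistic coloring pipeline.
enum VertexState : int { TENTATIVE = 0, COMMITTED = 1, CONFLICTED = 2 };

// Atomically promote a tentatively colored vertex to COMMITTED; fails if a
// concurrent conflict-detection pass has already demoted it to CONFLICTED.
bool try_commit(std::atomic<int>& state) {
    int expected = TENTATIVE;
    return state.compare_exchange_strong(expected, COMMITTED);
}
```

The CAS makes commit-versus-conflict a single atomic decision per vertex, so no lock protects the state transitions.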
Our approach shares important conceptual foundations with Hardware Lock Elision (HLE), a feature in modern processors that lets threads execute critical sections concurrently without acquiring locks, rolling back and retrying when a conflict is detected. This connection highlights how our software approach mirrors hardware-level optimistic-execution techniques in modern processor architectures.
Graphs often have non-uniform structures, leading to load imbalance when parallelized. We will investigate dynamic load balancing techniques to distribute work evenly among processors, particularly for irregular graphs.
The primary challenge in parallelizing graph coloring is managing dependencies between vertices. Since a vertex's color depends on its neighbors, naive parallelization can lead to race conditions where neighboring vertices receive the same color.
Parallel implementations may use more colors than the sequential greedy algorithm. We will analyze this tradeoff between coloring quality and performance improvement, and explore techniques to minimize the color count while maintaining speedup.
Our approach will start with developing a C/C++ implementation of the sequential greedy graph coloring algorithm as our foundation. Once we have a working sequential version, we'll extend it to create a parallel implementation. We'll conduct our initial development on the GHC machines, but for comprehensive performance evaluation of our final implementations, we plan to utilize the PSC machines with their superior core counts.
C/C++ was selected based on our prior assignment experience with OpenMP and Open MPI, frameworks well-suited to the parallel approaches described in our background section.
| Week | To-Do |
|---|---|
| Week 1 | |
| Week 2 | |
| Week 3 | |
| Week 4 | |
| Week 5 | |
| Final Deliverables | |