A high-performance C++ implementation of solutions to the classic Vehicle Routing Problem (VRP), developed for the Supercomputing course at Insper 2024.1.
- Project Overview
- Repository Structure
- Algorithms & Approaches
- Installation & Setup
- Running the Project
- Performance Analysis
- Contributing
- Contact
The Vehicle Routing Problem (VRP) is a classic optimization problem that seeks to find the optimal routes for multiple vehicles to serve a set of customers, while minimizing the total distance traveled and respecting vehicle capacity constraints.
This project implements and compares multiple approaches to solve the VRP:
- Serial implementation - A baseline exhaustive search algorithm
- Local parallel implementation using OpenMP - Parallelization on a single machine
- Global parallel implementation using MPI - Distributed computing across multiple nodes
- Heuristic approach using Nearest Neighbor algorithm - A faster approximation
Each implementation is analyzed for performance, scalability, and solution quality across different problem sizes.
.
├── README.md                        # This documentation file
├── Supercomp_Projeto_2024.ipynb     # Jupyter notebook with project assignments
├── code/                            # Main source code directory
│   ├── base.cpp                     # Base implementation and utilities
│   ├── base.h                       # Header file with function declarations
│   ├── global.cpp                   # Serial global search implementation
│   ├── heuristic.cpp                # Heuristic (Nearest Neighbor) approach
│   ├── local_parallelization.cpp    # OpenMP parallel implementation
│   ├── global_parallelization.cpp   # MPI distributed implementation
│   ├── global_parallelization.slurm # SLURM configuration for cluster execution
│   ├── execs/                       # Directory for compiled executables
│   ├── output/                      # Output files from executions
│   └── python/                      # Python scripts for visualization
├── docs/                            # Documentation and images
│   └── imgs/                        # Performance graphs and visualizations
├── grafo.txt                        # Input graph definition file
└── global_algorithm.cpp             # Standalone implementation of global algorithm
The VRP is represented using:
- A map of nodes (map<int, int>), where the key is the node ID and the value is its demand
- A map of distances (map<pair<int, int>, int>), giving the distance between any two connected nodes
The graph is loaded from a text file with the following format:
<num_nodes>
<node_id> <demand>
...
<num_edges>
<node1> <node2> <distance>
...
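As a rough illustration, loading this format into the two maps described above could look like the sketch below. The function name readGraph is illustrative; the actual declarations live in base.h and may differ.

#include <fstream>
#include <map>
#include <string>
#include <utility>

// Hypothetical loader for grafo.txt matching the format above.
void readGraph(const std::string& path,
               std::map<int, int>& nodes,
               std::map<std::pair<int, int>, int>& distances) {
    std::ifstream in(path);
    int num_nodes;
    in >> num_nodes;
    for (int i = 0; i < num_nodes; ++i) {
        int id, demand;
        in >> id >> demand;
        nodes[id] = demand;                 // node ID -> demand
    }
    int num_edges;
    in >> num_edges;
    for (int i = 0; i < num_edges; ++i) {
        int a, b, d;
        in >> a >> b >> d;
        distances[{a, b}] = d;              // store both directions,
        distances[{b, a}] = d;              // assuming an undirected graph
    }
}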
Three approaches for generating all possible node permutations:
- Serial: Uses std::next_permutation to generate all permutations sequentially (see the sketch below)
- Parallel: Attempts to parallelize permutation generation using OpenMP
- Optimized Parallel: A hybrid approach that generates permutations serially and then distributes them in parallel
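For reference, the serial variant reduces to iterating std::next_permutation over the customer nodes. The minimal sketch below shows only the enumeration loop; the feasibility filtering and cost evaluation described next happen inside it.

#include <algorithm>
#include <vector>

// Enumerate every visiting order of the customer nodes (node 0, the depot, is excluded).
void enumerate_orders(std::vector<int> customers) {
    std::sort(customers.begin(), customers.end());   // next_permutation expects sorted input
    do {
        // For each ordering: split it into capacity-feasible routes, verify that
        // all consecutive edges exist, and compute the total distance traveled.
    } while (std::next_permutation(customers.begin(), customers.end()));
}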
The algorithm filters permutations by (see the sketch after this list):
- Ensuring routes start and end at the depot (node 0)
- Splitting routes when vehicle capacity would be exceeded
- Ensuring connections exist between consecutive nodes
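A minimal sketch of that filtering step follows, assuming a vehicle capacity and the two maps described earlier. The function and variable names are illustrative, not the ones used in the project code.

#include <map>
#include <utility>
#include <vector>

// Split one visiting order into depot-to-depot routes, opening a new route whenever
// the next customer would exceed the vehicle capacity, and reject the order if a
// required edge is missing.
bool split_into_routes(const std::vector<int>& order,
                       const std::map<int, int>& demand,
                       const std::map<std::pair<int, int>, int>& dist,
                       int capacity,
                       std::vector<std::vector<int>>& routes) {
    std::vector<int> route = {0};                       // every route starts at the depot
    int load = 0;
    for (int node : order) {
        if (demand.at(node) > capacity) return false;   // customer can never be served
        if (load + demand.at(node) > capacity) {        // capacity would be exceeded:
            if (!dist.count({route.back(), 0})) return false;
            route.push_back(0);                         // close the route at the depot
            routes.push_back(route);
            route = {0};
            load = 0;
        }
        if (!dist.count({route.back(), node}))          // no edge between consecutive nodes
            return false;
        route.push_back(node);
        load += demand.at(node);
    }
    if (!dist.count({route.back(), 0})) return false;   // must be able to return to the depot
    route.push_back(0);
    routes.push_back(route);
    return true;
}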
- Exhaustively evaluates all possible paths to find the global optimum
- Parallelized using both OpenMP (shared memory) and MPI (distributed memory); an illustrative MPI sketch follows
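One plausible way to distribute the exhaustive search over MPI ranks is a round-robin split of the permutation index, combining the best costs with MPI_Reduce. This is an assumption for illustration, not necessarily how global_parallelization.cpp divides the work, and the stand-in route_cost below replaces the real distance lookup.

#include <mpi.h>
#include <algorithm>
#include <climits>
#include <cstdlib>
#include <vector>

// Stand-in cost: the real code splits the order into routes and sums distances
// from the map loaded out of grafo.txt.
static int route_cost(const std::vector<int>& order) {
    int cost = 0, prev = 0;
    for (int node : order) { cost += std::abs(node - prev); prev = node; }
    return cost + prev;                                  // return to the depot (node 0)
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    std::vector<int> customers = {1, 2, 3, 4, 5};        // example customer IDs
    int local_best = INT_MAX;
    long long index = 0;
    do {
        if (index++ % size != rank) continue;            // round-robin split of permutations
        local_best = std::min(local_best, route_cost(customers));
    } while (std::next_permutation(customers.begin(), customers.end()));

    int global_best;
    MPI_Reduce(&local_best, &global_best, 1, MPI_INT, MPI_MIN, 0, MPI_COMM_WORLD);
    if (rank == 0) { /* global_best now holds the minimum over all ranks */ }
    MPI_Finalize();
    return 0;
}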
- Greedily selects the nearest unvisited node at each step (sketched after this list)
- Returns a fast approximate solution
- Implemented in both serial and parallel versions
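A simplified sketch of the heuristic, under the same capacity assumption as above and again with illustrative names rather than the project's actual functions:

#include <climits>
#include <map>
#include <utility>
#include <vector>

// Greedy Nearest Neighbor: from the depot, repeatedly hop to the closest unvisited
// customer that still fits in the vehicle; return to the depot and start a new
// route when nothing fits.
std::vector<int> nearest_neighbor(const std::map<int, int>& demand,
                                  const std::map<std::pair<int, int>, int>& dist,
                                  int capacity) {
    std::map<int, bool> visited;
    for (const auto& [id, d] : demand) visited[id] = (id == 0);   // the depot starts visited

    std::vector<int> tour = {0};
    int current = 0, load = 0, remaining = static_cast<int>(demand.size()) - 1;
    while (remaining > 0) {
        int best = -1, best_dist = INT_MAX;
        for (const auto& [id, d] : demand) {
            if (visited[id] || load + d > capacity) continue;
            auto it = dist.find({current, id});
            if (it != dist.end() && it->second < best_dist) {
                best = id;
                best_dist = it->second;
            }
        }
        if (best == -1) {                       // nothing reachable fits in this vehicle
            if (current == 0) break;            // not even from the depot: give up
            tour.push_back(0);                  // close the route and start a new one
            current = 0;
            load = 0;
            continue;
        }
        tour.push_back(best);
        visited[best] = true;
        load += demand.at(best);
        current = best;
        --remaining;
    }
    tour.push_back(0);                          // finish the last route at the depot
    return tour;
}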
- C++ compiler with C++20 support
- OpenMP for parallel processing
- MPI for distributed computing
- SLURM workload manager (for cluster execution)
# Compile serial implementation
c++ -std=c++20 -o execs/global_search global.cpp base.cpp

# Compile heuristic implementation
c++ -std=c++20 -o execs/heuristic_search heuristic.cpp base.cpp

# Compile with OpenMP support
c++ -std=c++20 -fopenmp -o execs/local_parallelization local_parallelization.cpp base.cpp

# Compile with MPI support
mpic++ -std=c++20 -o execs/global_parallelization global_parallelization.cpp base.cpp

# Run serial global search
./execs/global_search

# Run OpenMP parallelized version
./execs/local_parallelization

# Run with 4 processes
mpirun -np 4 ./execs/global_parallelization

# Submit the job to the SLURM scheduler
sbatch global_parallelization.slurm

The SLURM configuration is as follows:
#!/bin/bash
#SBATCH --partition=espec
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=2G
#SBATCH --job-name=global_parallelization
#SBATCH --output=output/slurm-%j.out
mpirun ./execs/global_parallelization

The results show significant performance gains with parallelization, especially as the problem size increases:
- For N=9 nodes:
  - Serial: 384 ms
  - OpenMP Parallel: 309 ms (19.53% improvement)
  - MPI Distributed: 103 ms (73.43% improvement)
- For N=11 nodes:
  - Serial: 62,351 ms
  - OpenMP Parallel: 44,155 ms (29.18% improvement)
  - MPI Distributed: 13,443 ms (78.45% improvement)
The heuristic approach provides faster solutions but with variable accuracy:
- For N=6: Global found a solution with cost 264, while Heuristic found 378 (30.30% worse)
- For N=10: Global found a solution with cost 846, while Heuristic found 1012 (19.61% worse)
- For some cases (N=5, N=11), both approaches found solutions of equal quality
For detailed performance graphs, see the docs/imgs directory.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Antônio Amaral Egydio Martins - [email protected]
Project Link: https://github.com/AntonioEgydio/Vehicle-Routing-Problem
