Skip to content

mmakrin/Rstartree_implementation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Spatial Data Indexing with R-tree (Range, kNN & Skyline Queries)

This project implements a spatial data management system based on an R-tree / R-tree* structure for efficient storage and querying of multidimensional geographic data.

It was developed as part of the course Database Technology (Τεχνολογία Βάσεων Δεδομένων) during the academic year 2023–2024.

The system supports:

  • Range queries
  • k-Nearest Neighbor (kNN) queries
  • Skyline queries
  • Insertions and deletions
  • Bulk loading (bottom-up construction)

Project Objectives

  • Organize multidimensional spatial points stored on secondary memory
  • Design and implement an efficient R-tree index
  • Support common spatial queries:
    • Range queries
    • k-nearest neighbors
    • Skyline queries
  • Compare indexed vs brute-force query performance
  • Handle dynamic updates (insert/delete)

Dataset

  • Source: OpenStreetMap (map.osm)
  • Data stored in blocks inside datafile.txt
  • Each record contains:
    • Location ID
    • Latitude
    • Longitude

Blocks contain 1365 records each to simulate secondary storage organization.


System Architecture

Core Components

LoadData

Reads spatial data from datafile.txt in blocks and supports random access using RandomAccessFile.

LoadOSM

Parses map.osm and:

  • Creates the datafile in blocks
  • Inserts points into the R-tree
  • Supports adding new locations dynamically

Location

Represents a geographic location with:

  • id, latitude, longitude
  • Manhattan distance calculations
  • Comparable by distance

Point

Low-level spatial point with:

  • Coordinates
  • Block ID & slot ID (file position)
  • Distance computation

R-tree Implementation

Tree

Supports:

  • insert_in_Rtree() – dynamic insertion with node splitting
  • delete_from_tree() – deletion with rebalancing
  • range_query_with_index()
  • knn_with_index()
  • range_query_without_index() (brute force)
  • knn_without_index() (brute force)

TreeNode

  • Leaf nodes store Point objects
  • Internal nodes store child TreeNodes
  • Each node maintains an MBR (Minimum Bounding Rectangle)
  • Maximum children per node: 5

Geometry Utilities

MBR

Handles:

  • Rectangle expansion
  • Area computation
  • Overlap calculation
  • Distance from point to rectangle

🧮 Heaps for Query Processing

MinHeap

Used in:

  • R-tree traversal
  • kNN queries

MaxHeap

Used to maintain top-k nearest neighbors

MinHeapSkyline

Specialized heap for Skyline algorithm ordering nodes by Manhattan distance from origin.


Skyline Query

The Skyline algorithm:

  • Traverses the R-tree using a MinHeap
  • Prunes dominated nodes
  • Returns all non-dominated points in 2D space

Dominance definition:

A point A dominates B if A ≤ B in all dimensions and A < B in at least one.


Bulk Loading (Bottom-Up)

The bottom-up method:

  1. Parses OSM points
  2. Sorts by latitude
  3. Groups into blocks of 5 points
  4. Builds leaf nodes
  5. Recursively groups nodes into parent nodes
  6. Continues until root is formed

This greatly improves construction performance compared to repeated insertions.


Performance Results

Average execution time (5 runs):

Operation Time
Insert ~118 ms
Delete ~5324 ms
Skyline query ~6.8 ms
Bulk loading ~933 ms

🛠 Technologies Used

  • Java
  • OpenStreetMap XML parsing
  • File-based block storage
  • Custom R-tree / R*-tree implementation

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages