Benchmarks

SyncEngine ships with a benchmark suite that measures performance across all five sync modes and a range of file-size, file-count, and failure scenarios.

Overview

The benchmark suite tests:

  • All five sync modes

  • Various file sizes (small, medium, large)

  • Different numbers of files (few, many, mixed)

  • Common operations (add, modify, delete, rename)

  • Edge cases (conflicts, partial syncs, interrupted syncs)

Running Benchmarks

Basic Usage

Run all benchmarks:

python benchmarks/run_benchmarks.py

Run a specific benchmark:

python benchmarks/benchmark_two_way.py

Run with custom parameters:

python benchmarks/run_benchmarks.py --num-files 1000 --file-size 1MB
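
The --file-size value combines a number with a unit suffix. How run_benchmarks.py parses it is not shown on this page; the sketch below is one way such a string could be converted to bytes. It assumes binary units (1MB = 1024 * 1024 bytes, as in the custom benchmark example), and parse_size is a hypothetical helper, not part of SyncEngine.

```python
import re

# Binary multipliers; an assumption, the real script may use decimal units.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size string such as '10KB' or '1MB' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


print(parse_size("1MB"))  # 1048576
```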

Benchmark Modules

benchmark_two_way.py

Tests TWO_WAY sync mode:

  • Bidirectional file changes

  • Conflict detection and resolution

  • Rename/move detection

  • Delete propagation both ways

Example results:

TWO_WAY Sync Benchmark
=====================
Files: 100 (avg size: 1MB)

Initial sync:
  Time: 2.34s
  Uploaded: 100 files (100MB)
  Downloaded: 0 files

Modify 10 files at source:
  Time: 0.45s
  Uploaded: 10 files (10MB)
  Downloaded: 0 files

Modify 10 files at destination:
  Time: 0.47s
  Uploaded: 0 files
  Downloaded: 10 files (10MB)

Modify 5 files at both sides (conflicts):
  Time: 0.28s
  Conflicts resolved: 5
  Actions taken: 5 uploads

benchmark_source_to_destination.py

Tests SOURCE_TO_DESTINATION sync mode:

  • One-way mirroring

  • Destination changes ignored

  • Source deletions propagated

Example results:

SOURCE_TO_DESTINATION Sync Benchmark
===================================
Files: 100 (avg size: 1MB)

Initial sync:
  Time: 2.31s
  Uploaded: 100 files (100MB)

Add 10 files at source:
  Time: 0.43s
  Uploaded: 10 files (10MB)

Add 10 files at destination:
  Time: 0.19s
  Deleted: 10 files (not in source)

Delete 10 files at source:
  Time: 0.15s
  Deleted: 10 files (at destination)

benchmark_source_backup.py

Tests SOURCE_BACKUP sync mode:

  • Upload-only backup

  • Source deletions not propagated

  • Destination grows over time

Example results:

SOURCE_BACKUP Sync Benchmark
============================
Files: 100 (avg size: 1MB)

Initial sync:
  Time: 2.35s
  Uploaded: 100 files (100MB)

Delete 10 files at source:
  Time: 0.05s
  Deleted: 0 files (backup preserved)

Modify 10 files at source:
  Time: 0.44s
  Uploaded: 10 files (10MB)

Destination size after deletes:
  Files: 100 (backup preserved)

benchmark_destination_to_source.py

Tests DESTINATION_TO_SOURCE sync mode:

  • One-way mirroring from destination

  • Source changes ignored

  • Destination deletions propagated

Example results:

DESTINATION_TO_SOURCE Sync Benchmark
===================================
Files: 100 (avg size: 1MB)

Initial sync:
  Time: 2.33s
  Downloaded: 100 files (100MB)

Add 10 files at destination:
  Time: 0.42s
  Downloaded: 10 files (10MB)

Add 10 files at source:
  Time: 0.18s
  Deleted: 10 files (not in destination)

benchmark_destination_backup.py

Tests DESTINATION_BACKUP sync mode:

  • Download-only backup

  • Destination deletions not propagated

  • Source grows over time

Example results:

DESTINATION_BACKUP Sync Benchmark
================================
Files: 100 (avg size: 1MB)

Initial sync:
  Time: 2.36s
  Downloaded: 100 files (100MB)

Delete 10 files at destination:
  Time: 0.05s
  Deleted: 0 files (backup preserved)

Modify 10 files at destination:
  Time: 0.45s
  Downloaded: 10 files (10MB)

benchmark_sync_modes.py

Compares all sync modes side-by-side:

  • Same test data for all modes

  • Measures time and operations

  • Highlights mode differences

Example results:

Sync Modes Comparison
=====================
Test: 100 files (1MB each), modify 10 at source

TWO_WAY:              0.45s (10 uploads)
SOURCE_TO_DEST:       0.44s (10 uploads)
SOURCE_BACKUP:        0.46s (10 uploads)
DEST_TO_SOURCE:       0.05s (0 operations)
DEST_BACKUP:          0.05s (0 operations)

Test: 100 files (1MB each), modify 10 at destination

TWO_WAY:              0.47s (10 downloads)
SOURCE_TO_DEST:       0.18s (10 deletes)
SOURCE_BACKUP:        0.48s (10 downloads)
DEST_TO_SOURCE:       0.46s (10 downloads)
DEST_BACKUP:          0.47s (10 downloads)

Performance Metrics

Key Metrics

The benchmarks measure:

  1. Time: Wall clock time for operations

  2. Throughput: MB/s for uploads/downloads

  3. Operations: Number of file operations

  4. Efficiency: Time per file operation

Example metrics output:

Performance Metrics
===================
Total time: 5.23s

Operations:
  Uploads: 100 (2.34s, 42.7 MB/s)
  Downloads: 50 (1.12s, 44.6 MB/s)
  Deletes: 20 (0.08s, 250 ops/s)
  Renames: 10 (0.05s, 200 ops/s)

Efficiency:
  Time per upload: 23.4ms
  Time per download: 22.4ms
  Time per delete: 4ms
  Time per rename: 5ms
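
The throughput and efficiency figures follow directly from the counts and timings. As a worked check of the arithmetic, assuming the 100 uploads are 1MB each as in the earlier examples (the function names here are illustrative, not SyncEngine APIs):

```python
def throughput_mb_s(total_mb: float, seconds: float) -> float:
    """Megabytes transferred per second of wall-clock time."""
    return total_mb / seconds


def ms_per_op(count: int, seconds: float) -> float:
    """Average milliseconds spent on each operation."""
    return seconds / count * 1000


# Figures from the example output: 100 uploads in 2.34s, 20 deletes in 0.08s.
print(f"{throughput_mb_s(100, 2.34):.1f} MB/s")      # 42.7 MB/s
print(f"{ms_per_op(100, 2.34):.1f} ms per upload")   # 23.4 ms per upload
print(f"{20 / 0.08:.0f} deletes/s")                  # 250 deletes/s
```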

Scaling Tests

Test performance at different scales:

Small Files (many files, small size)

python benchmarks/run_benchmarks.py --num-files 10000 --file-size 10KB

Example results:

Small Files Test
================
Files: 10,000 (10KB each)
Total size: 100MB

Initial sync: 8.45s (11.8 MB/s)
Incremental sync (100 changes): 0.89s

Large Files (few files, large size)

python benchmarks/run_benchmarks.py --num-files 10 --file-size 100MB

Example results:

Large Files Test
================
Files: 10 (100MB each)
Total size: 1GB

Initial sync: 23.4s (42.7 MB/s)
Incremental sync (1 change): 2.3s

Mixed Files (realistic mix)

python benchmarks/run_benchmarks.py --mixed

Example results:

Mixed Files Test
================
Files: 1,000 (10KB to 10MB, avg 500KB)
Total size: 500MB

Initial sync: 12.3s (40.7 MB/s)
Incremental sync (50 changes): 1.2s

Concurrency Tests

Test different concurrency levels:

python benchmarks/benchmark_concurrency.py

Results:

Concurrency Test
================
Files: 100 (1MB each)

Transfers=1:   4.56s (21.9 MB/s)
Transfers=3:   2.34s (42.7 MB/s)
Transfers=5:   1.89s (52.9 MB/s)
Transfers=10:  1.92s (52.1 MB/s)
Transfers=20:  2.01s (49.8 MB/s)

Optimal concurrency: 5-10 transfers
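
Throughput plateaus after 5 transfers and drops slightly at 20, so a reasonable rule is to take the smallest transfer count that comes close to the best rate. The sketch below applies that rule to the numbers above; the 5% tolerance is our own assumption, and how benchmark_concurrency.py actually picks its recommendation is not shown here.

```python
# Throughput (MB/s) per transfer count, copied from the example results.
RESULTS = {1: 21.9, 3: 42.7, 5: 52.9, 10: 52.1, 20: 49.8}


def pick_concurrency(results: dict, tolerance: float = 0.05) -> int:
    """Return the smallest transfer count within `tolerance` of the best throughput."""
    best = max(results.values())
    good_enough = [n for n, mbps in results.items() if mbps >= best * (1 - tolerance)]
    return min(good_enough)


print(pick_concurrency(RESULTS))  # 5
```

Preferring the smallest qualifying count keeps resource use low when extra parallel transfers no longer add throughput.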

Interpretation

Understanding Results

Good Performance

  • Upload/download: 40-60 MB/s (local/fast network)

  • Small files: 100-500 ops/s

  • Large files: 40-60 MB/s

  • Incremental sync: <1s for typical changes

Poor Performance

  • Upload/download: <10 MB/s

  • Small files: <50 ops/s

  • Large files: <20 MB/s

  • Incremental sync: >5s for small changes
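
If you post-process benchmark output, the upload/download thresholds above translate into a simple check. This is a sketch only: the "borderline" label for rates between the two listed ranges is our own addition, and rate_throughput is a hypothetical helper.

```python
def rate_throughput(mb_per_s: float) -> str:
    """Label a transfer rate using the upload/download thresholds above."""
    if mb_per_s >= 40:
        return "good"
    if mb_per_s < 10:
        return "poor"
    return "borderline"  # Our own label for the unlisted 10-40 MB/s range.


for label, rate in [("large files", 42.7), ("small files", 11.8)]:
    print(f"{label}: {rate} MB/s -> {rate_throughput(rate)}")
```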

Performance Factors

  1. Network: Biggest factor for remote storage

  2. Disk I/O: Important for local operations

  3. File count: Many small files are slower than a few large files

  4. Concurrency: Optimal level depends on network/disk

  5. State management: Enabling it makes incremental syncs faster

Optimization Tips

Based on benchmark results:

  1. Use state management for incremental syncs (10-100x faster)

  2. Optimize concurrency for your network (test 3-10 transfers)

  3. Use ignore patterns to skip unnecessary files

  4. Choose the appropriate sync mode (one-way modes skip conflict resolution and can be faster than TWO_WAY)

  5. Batch operations when possible

  6. Monitor metrics to identify bottlenecks

Continuous Benchmarking

Integration with CI/CD

Run benchmarks in CI to detect performance regressions:

# .github/workflows/benchmarks.yml
name: Benchmarks

on:
  pull_request:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.9"

      - name: Install dependencies
        run: |
          pip install -e .
          pip install pytest-benchmark

      - name: Run benchmarks
        run: |
          python benchmarks/run_benchmarks.py --output results.json

      - name: Compare results
        run: |
          python benchmarks/compare_results.py \
            --baseline baseline.json \
            --current results.json \
            --threshold 10  # Fail if >10% regression
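
The internals of compare_results.py are not shown on this page. As a rough sketch of the comparison step, assuming both JSON files map a benchmark name to its time in seconds (a hypothetical layout, as are the benchmark names below), a percentage threshold check could look like this:

```python
import json


def find_regressions(baseline: dict, current: dict, threshold_pct: float) -> dict:
    """Return {name: percent_slower} for benchmarks slower than the threshold."""
    regressions = {}
    for name, base_time in baseline.items():
        if name not in current or base_time <= 0:
            continue  # Skip benchmarks that cannot be compared.
        change_pct = (current[name] - base_time) / base_time * 100
        if change_pct > threshold_pct:
            regressions[name] = round(change_pct, 1)
    return regressions


# Hypothetical timings in seconds, standing in for baseline.json and results.json.
baseline = json.loads('{"two_way_initial": 2.34, "two_way_modify": 0.45}')
current = json.loads('{"two_way_initial": 2.40, "two_way_modify": 0.54}')
print(find_regressions(baseline, current, threshold_pct=10))
```

In a CI job, a non-empty result would typically be turned into a non-zero exit code so the step fails.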

Custom Benchmarks

Creating Custom Benchmarks

Create your own benchmarks for specific use cases:

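from pathlib import Path
from syncengine import SyncEngine, SyncPair, SyncMode
from benchmarks.test_utils import create_test_files, measure_time

NUM_FILES = 100
FILE_SIZE = 1024 * 1024  # 1MB

def benchmark_custom_scenario(source_client, dest_client, create_entries_manager):
    """Custom benchmark for a specific scenario.

    The storage clients and the entries-manager factory are passed in by the
    caller; construct them the same way your application does.
    """

    # Setup
    source_dir = Path("/tmp/bench_source")
    dest_dir = Path("/tmp/bench_dest")

    # Create test data
    create_test_files(
        source_dir,
        num_files=NUM_FILES,
        file_size=FILE_SIZE
    )

    # Create sync engine
    engine = SyncEngine(
        client=dest_client,
        entries_manager_factory=create_entries_manager
    )

    pair = SyncPair(
        source_root=str(source_dir),
        destination_root=str(dest_dir),
        source_client=source_client,
        destination_client=dest_client,
        mode=SyncMode.TWO_WAY
    )

    # Benchmark
    with measure_time() as timer:
        stats = engine.sync_pair(pair)

    # Report (files/s is derived from NUM_FILES rather than a hard-coded count)
    print(f"Time: {timer.elapsed:.2f}s")
    print(f"Throughput: {NUM_FILES / timer.elapsed:.1f} files/s")
    print(f"Stats: {stats}")

if __name__ == "__main__":
    # source_client, dest_client and create_entries_manager are placeholders:
    # define them for your storage backends before running this script.
    benchmark_custom_scenario(source_client, dest_client, create_entries_manager)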
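
The example imports create_test_files and measure_time from benchmarks.test_utils, whose source is not shown on this page. If you want to experiment outside the repository, minimal stand-ins with the same call shapes might look like the following. The file naming and the use of random bytes are assumptions; the real helpers may differ.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path
from types import SimpleNamespace


def create_test_files(directory: Path, num_files: int, file_size: int) -> None:
    """Fill `directory` with `num_files` files of `file_size` random bytes each."""
    directory.mkdir(parents=True, exist_ok=True)
    for index in range(num_files):
        (directory / f"file_{index:05d}.bin").write_bytes(os.urandom(file_size))


@contextmanager
def measure_time():
    """Yield an object whose `elapsed` attribute holds the block's wall-clock seconds."""
    timer = SimpleNamespace(elapsed=0.0)
    start = time.perf_counter()
    try:
        yield timer
    finally:
        # Set on exit, so read `elapsed` only after the `with` block ends.
        timer.elapsed = time.perf_counter() - start
```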

Benchmark Best Practices

  1. Isolate tests: Use fresh directories for each test

  2. Warm up: Run once before measuring

  3. Repeat: Run multiple times and average

  4. Clean up: Remove test files after benchmarks

  5. Consistent environment: Same hardware, network conditions

  6. Measure what matters: Focus on real-world scenarios
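
Practices 1 to 4 can be combined in a small harness. The sketch below uses only the standard library; run_repeated and write_small_files are hypothetical names, and the file-writing workload is a stand-in for a real sync call.

```python
import shutil
import statistics
import tempfile
import time
from pathlib import Path


def run_repeated(scenario, repeats: int = 5, warmup: int = 1) -> dict:
    """Time `scenario(workdir)` several times, each in a fresh temporary directory."""
    timings = []
    for run in range(warmup + repeats):
        workdir = Path(tempfile.mkdtemp(prefix="bench_"))  # Isolate: fresh directory.
        try:
            start = time.perf_counter()
            scenario(workdir)
            elapsed = time.perf_counter() - start
        finally:
            shutil.rmtree(workdir, ignore_errors=True)  # Clean up test files.
        if run >= warmup:  # Discard warm-up runs.
            timings.append(elapsed)
    return {
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings) if len(timings) > 1 else 0.0,
        "runs": len(timings),
    }


def write_small_files(workdir: Path) -> None:
    """Stand-in workload; replace with a real sync call."""
    for index in range(20):
        (workdir / f"f{index}.txt").write_text("x" * 1024)


print(run_repeated(write_small_files, repeats=3))
```

Reporting the standard deviation alongside the mean makes it easier to tell a real regression from run-to-run noise.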

Next Steps

  • Run benchmarks: python benchmarks/run_benchmarks.py

  • See Examples for usage patterns

  • Read API Reference for optimization options