Why Custom Git Management Uses an Actor System Instead of HTTP Requests

Executive Summary

A real-world project uses an actor-based architecture for git management, not primarily because of WebSocket integration, but because git operations are inherently stateful, long-running, and depend on shared mutable state that HTTP’s stateless request-response model cannot handle efficiently.

See also: DAL Architecture Overview


1. The Core Problem: Stateful Git Operations

What HTTP Gives You

Client → HTTP Request → Server → HTTP Response → Client
  • Stateless: Each request is independent
  • Fire-and-forget: No persistent connection
  • No shared context between requests

What Git Actually Requires

Client → Load Repo → Edit Files → Save → Commit → Switch Branch → ...
                ↓           ↓           ↓           ↓
              [Actor maintains working directory state]

Git operations are stateful:

  1. You load a repository once
  2. Make multiple edits over time
  3. The edits persist in a working directory
  4. You compile, stage, commit incrementally
  5. The state persists until explicitly saved or the session ends

2. Architectural Analysis of the Codebase

The Three-Layer Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│                          FRONTEND (Svelte)                               │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │  FileTree   │  │ CodeEditor  │  │  GitEditor  │  │   Canvas    │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
│         │                │                │                │             │
│         └────────────────┼────────────────┼────────────────┘             │
│                          │                                               │
│              ┌───────────▼───────────────┐                               │
│              │   WebSocket Connection    │◄──── Persistent Connection    │
│              │   (TypeScript Client)     │                               │
│              └─────────────┬─────────────┘                               │
└────────────────────────────┼─────────────────────────────────────────────┘
                             │ Binary Protocol (MessagePack)
                             ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          BACKEND (Rust)                                 │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │              WebSocket Actor (per-connection)                   │    │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │    │
│  │  │ StaticSession   │  │ DynamicSession  │  │  Ingress Router │  │    │
│  │  │ (auth, sender)  │  │ (actor sender)  │  │                 │  │    │
│  │  └────────┬────────┘  └────────┬────────┘  └─────────────────┘  │    │
│  └───────────┼────────────────────┼────────────────────────────────┘    │
│              │                    │                                     │
│              └──────────┬─────────┘                                     │
│                         │                                               │
│  ┌──────────────────────▼──────────────────────────────────────────┐    │
│  │              ALLOCATOR ACTOR (Singleton per Server)             │    │
│  │                                                                 │    │
│  │    HashMap<(project_id, branch) → (GitActorSender, JoinHandle)> │    │
│  │                                                                 │    │
│  │  Messages: Register | DeRegister | Kill | GetSender | GC        │    │
│  └─────────────────────────┬───────────────────────────────────────┘    │
│                            │                                            │
│         ┌──────────────────┼──────────────────┐                         │
│         ▼                  ▼                  ▼                         │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐                    │
│  │ Git Files   │   │ Git Files   │   │ Git Files   │   ...              │
│  │ Actor 1     │   │ Actor 2     │   │ Actor 3     │                    │
│  │ (proj:1,    │   │ (proj:2,    │   │ (proj:1,    │                    │
│  │  branch:a)  │   │  branch:x)  │   │  branch:b)  │                    │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘                    │
│         │                 │                 │                           │
│         ▼                 ▼                 ▼                           │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐                    │
│  │ Working Dir │   │ Working Dir │   │ Working Dir │                    │
│  │ (TempDir)   │   │ (TempDir)   │   │ (TempDir)   │                    │
│  │ + Compiler  │   │ + Compiler  │   │ + Compiler  │                    │
│  │   State     │   │   State     │   │   State     │                    │
│  └─────────────┘   └─────────────┘   └─────────────┘                    │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────┐        │
│  │                     DATABASE (PostgreSQL)                   │        │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────────────────┐ │        │
│  │  │ Git Blobs  │  │ Git File   │  │ Auth Sessions          │ │        │
│  │  │ (tarballs) │  │ Actor      │  │ WebSocket Sessions     │ │        │
│  │  │            │  │ Sessions   │  │                        │ │        │
│  │  └────────────┘  └────────────┘  └────────────────────────┘ │        │
│  └─────────────────────────────────────────────────────────────┘        │
└─────────────────────────────────────────────────────────────────────────┘

3. Why Actors, Not HTTP?

3.1 Stateful Working Directory

  // From git_files/actor.rs
  pub async fn git_files_actor<X: ...>(
      mut rx: mpsc::Receiver<IncomingFileActorMessage>,
      project_id: i32,
      branch: String,
      storage_handle: Arc<dyn GitDataTransfer + 'static>,
  ) -> Result<(), NanoServiceError> {
 
      // 1. Load git data from storage ONCE
      let git_data = storage_handle.load_git_data(project_id, branch.clone()).await?;
 
      // 2. Unpack to temp directory ONCE
      unpack_tar_gz(git_data, source_dir.clone())?;
 
      // 3. Cache compiler state in memory
      let mut compiler_state = EntryPointStates::new();
      let dep_graph = load_from_disk(&source_dir).unwrap_or(DependencyGraph::new());
 
      // 4. Listen for operations on this specific working directory
      while let Some(message) = rx.recv().await {
          match message {
              IncomingFileActorMessage::ReadFile(tx, path) => { ... }
              IncomingFileActorMessage::WriteFile(tx, path, data) => { ... }
              IncomingFileActorMessage::Compile(tx, path) => { ... }
              // ...
          }
      }
  }

If this were HTTP:

HTTP POST /files/read   → Must reload repo, unpack tarball, return file
HTTP POST /files/write  → Must reload repo, unpack tarball, write, repack, save
HTTP POST /compile      → Must reload repo, unpack tarball, load graph, compile

Each request would:

  • Download the entire repo from DB (expensive)
  • Unpack the tarball (slow)
  • Perform tiny operation
  • Save back to DB
  • No caching of compilation state

3.2 Compiler State Persistence

  // The actor maintains compilation state across requests
  let mut compiler_state = EntryPointStates::new();
 
  // First compile: builds dependency graph from scratch
  let outcome = compile_entry_point(source_dir.as_path(), file_path, &mut compiler_state).await;
 
  // Second compile: reuses cached graph, only recompiles changed nodes
  let outcome = compile_entry_point(source_dir.as_path(), file_path, &mut compiler_state).await;

Why this matters:

  • Dependency graphs can be megabytes for complex CAD projects
  • Incremental compilation: Change one file → only recompile affected nodes
  • HTTP cannot do this: No shared state between requests

3.3 Reference Counting & Session Management

  // From allocator/actor.rs
  pub async fn websocket_allocator_actor(...) {
      let mut allocator = AllocatorMap::new();  // In-memory state
 
      while let Some(message) = rx.recv().await {
          match message {
              IncomingAllocatorMessage::Register(tx, project_id, branch) => {
                  // Check if actor already exists
                  // If yes: increment ref_count, return existing sender
                  // If no: spawn new actor, return new sender
              }
              IncomingAllocatorMessage::DeRegister(tx, project_id, branch) => {
                  // Decrement ref_count
                  // If ref_count == 0: set time_zeroed for garbage collection
              }
          }
      }
  }
  // From state.rs
  pub type AllocatorMap = HashMap<AllocatorKey, (GitActorSender, ActorJoinHandle)>;
 
  #[derive(Debug, PartialEq, Hash, Eq)]
  pub struct AllocatorKey {
      pub project_id: i32,
      pub branch: String
  }

The Session Model:

User A opens project 1, branch "main" → ref_count = 1
User B opens project 1, branch "main" → ref_count = 2, SAME actor
User A closes → ref_count = 1, actor stays alive
User B closes → ref_count = 0, actor marked for GC

HTTP Alternative Problems:

HTTP: No persistent state. Each request is independent.
- User A opens project: start session
- User A makes 100 edits: 100 independent requests
- User B opens same project: start ANOTHER session
- Database: Two separate copies of the repo loaded
- Memory: Double memory usage
- Coherence: Two separate working directories, no shared state

3.4 Locking & Consistency

  // From git_files/actor.rs - AcquireLock message
  IncomingFileActorMessage::AcquireLock(sender, rx) => {
      let _ = sender.send(OutgoingFileActorMessage::LockAcquired);
 
      // Mutex-like behavior: hold lock until ReleaseLock
      let outcome = match rx.await {
          Ok(message) => message,
          Err(_) => continue,
      };
      match outcome {
          IncomingFileActorMessage::ReleaseLock(tx) => {
          let _: Result<(), OutgoingFileActorMessage> =
              tx.send(OutgoingFileActorMessage::LockReleased);
          },
          _ => continue,
      }
  }

Why locks matter:

  • Two users editing the same file simultaneously
  • User A’s write must complete before User B’s write
  • Actor ensures sequential processing of messages
  • No race conditions, no lost updates

3.5 Garbage Collection & Cleanup

// From garbage_collector.rs
pub async fn garbage_collector<X>(alloc_sender: AllocatorMessageSender) {
    loop {
        sleep(Duration::from_secs(20)).await;
        let _ = send_gc_request(&alloc_sender).await;
    }
}

The lifecycle:

  1. User opens project → Actor spawned, ref_count = 1
  2. Multiple users open → ref_count incremented
  3. User closes → ref_count decremented
  4. Last user closes → ref_count = 0, time_zeroed set
  5. GC runs every 20 seconds → deletes actors with time_zeroed > 120 seconds ago
  6. Grace period: If user reopens within 2 minutes, actor still exists

HTTP has nothing to clean up here, but only because it keeps no state; it therefore gets none of the caching benefits either.


4. The WebSocket Integration (Secondary Benefit)

You asked: “Is it because of its integration with WebSocket frontend?”

Partially, but it’s not the primary reason. Here’s the relationship:

WebSocket Benefits (Secondary)

┌───────────────────────────────────────────────────────────────┐
│                      HTTP vs WebSocket                        │
├───────────────────────────────────────────────────────────────┤
│  HTTP:                                                        │
│  - Open connection, send request, get response, close         │
│  - Good for: auth, project CRUD, one-off operations           │
│                                                               │
│  WebSocket:                                                   │
│  - Persistent connection, bidirectional messaging             │
│  - Good for: real-time file editing, compilation feedback,    │
│              multiplayer sync, server-initiated notifications │
└───────────────────────────────────────────────────────────────┘

But HTTP Could Also Work with Actors!

  // Hypothetical HTTP approach with actors (NOT how it's done here)
  // HTTP endpoints would still talk to actors internally:
 
  async fn read_file_handler(
      Query((project_id, branch)): Query<(i32, String)>,
  ) -> Result<impl IntoResponse, NanoServiceError> {
      // 1. Get actor sender from allocator
      let sender = get_actor_sender(project_id, &branch).await?;
 
      // 2. Send message to actor
      let path = "file.txt";
      Ok(send_read_file_request(path, &sender).await)
  }

So why WebSocket?

| Requirement              | HTTP | WebSocket | Actor System |
|--------------------------|------|-----------|--------------|
| Stateful file operations | ⚠️   | ❌        | ✅           |
| Compiler state caching   | ❌   | ❌        | ✅           |
| Session ref counting     | ⚠️   | ❌        | ✅           |
| Real-time updates        | ❌   | ✅        | ✅           |
| Locking/concurrency      | ⚠️   | ❌        | ✅           |
| Server→client push       | ❌   | ✅        | ✅           |

5. System Design Patterns Used

5.1 Actor Pattern (Erlang-style)

// Single-threaded message processing per actor
while let Some(message) = rx.recv().await {
    // Process ONE message at a time
    // No locks needed within the actor
    // Actor owns all its state
}

5.2 Resource Pool Pattern (Allocator)

// One actor per (project, branch) tuple
// Reused across multiple websocket sessions
// Ref counting prevents premature cleanup

5.3 Supervisor Pattern (implied)

// If actor panics, it's isolated
// WebSocket handler aborts ping actor
// Cleanup still runs on session end

5.4 Message Passing Concurrency

// No shared mutable state
// Communication via channels only
// Type-safe message protocols

5.5 Event Sourcing (implied)

// Changes don't modify stored data immediately
// SaveSnapshot packages entire working dir
// Stored as tarball in DB (immutable blob)

6. Comparison: Actor vs HTTP Architectures

Actor-Based (Current)

┌────────────────────────────────────────────────────────────────────┐
│                        ACTOR SYSTEM                                │
├────────────────────────────────────────────────────────────────────┤
│  Session 1 ─┬─► DynamicSession ─┬─► Allocator ──┬─► GitActor ──────┼──► TempDir
│             │                   │               │                  │     + State
│  Session 2 ─┤                   │               │                  │
│             │                   │               └─► GitActor ──────┼──► TempDir
│  Session 3 ─┘                   │                                  │     + State
│                                 │                                  │
│  Global GC ────────────────────────────────────────────────────────┼──► Cleanup
│                                                                    │
│  ✓ Single copy of repo in memory per (project, branch)             │
│  ✓ Compiler state persists across requests                         │
│  ✓ Atomic operations with locks                                    │
│  ✓ Graceful cleanup via ref counting                               │
│  ✓ WebSocket naturally maps to actor sessions                      │
└────────────────────────────────────────────────────────────────────┘

HTTP-Based Alternative (Theoretical)

┌─────────────────────────────────────────────────────────────────┐
│                     HTTP STATELESS SYSTEM                       │
├─────────────────────────────────────────────────────────────────┤
│  HTTP/1 ──► Load Project ──► Edit ──► Save ──► Compile ──► ...  │
│                    │           │        │          │            │
│                    ▼           ▼        ▼          ▼            │
│               Download    Upload   Upload   Upload              │
│               tarball     tarball  tarball  tarball             │
│                    │           │        │       │               │
│                    └───────────┴────────┴─┬─────┘               │
│                               DB (every operation)              │
│                                                                 │
│  ✗ Download entire repo on every request                        │
│  ✗ Unpack/repack tarball on every operation                     │
│  ✗ No compiler state caching                                    │
│  ✗ No incremental compilation                                   │
│  ✗ Multiple users = multiple copies of same repo                │
│  ✗ Slow response times                                          │
│  ✗ High database load                                           │
└─────────────────────────────────────────────────────────────────┘

7. Summary: Why Actor System Wins

| Factor                   | HTTP                           | Actor System                     |
|--------------------------|--------------------------------|----------------------------------|
| Stateful file operations | ❌ Would need external cache   | ✅ Actors own working directory  |
| Compiler state           | ❌ Must reload on each compile | ✅ Cached in memory              |
| Incremental compilation  | ❌ Full rebuild every time     | ✅ Only changed nodes            |
| Session management       | ⚠️ External session store      | ✅ Ref counting built-in         |
| Locking                  | ⚠️ Database locks              | ✅ Message queue serializes      |
| Memory efficiency        | ❌ N copies for N users        | ✅ 1 copy shared via sender      |
| Real-time updates        | ❌ Long-polling/comet          | ✅ WebSocket native              |
| Graceful cleanup         | ⚠️ TTL-based                   | ✅ GC with grace period          |
| WebSocket integration    | ⚠️ Request-response over WS    | ✅ Native message passing        |
| Multiplayer sync         | ❌ Complex broadcast logic     | ✅ Actors as session boundaries  |

The actor system is used because git operations are fundamentally stateful, and the actor model provides:

  1. Stateful working directories — Load once, edit many times
  2. Compiler state caching — Incremental compilation via dependency graphs
  3. Session multiplexing — One actor, many websocket connections
  4. Reference counting — Memory-efficient session management
  5. Locking — Sequential message processing, so concurrent writes cannot race or be lost

8. Reference: HashMap Allocator Details

What the HashMap Allocator Does for the Git Management Actor

The AllocatorMap (HashMap<AllocatorKey, (GitActorSender, ActorJoinHandle)>) acts as the in-memory registry for all running git file actors on this server. Here’s what it does:

Core Responsibilities

  1. Tracks Live Actors: Every time a new git file actor is spawned (on first Register), its (sender, join_handle) tuple gets inserted into the HashMap keyed by (project_id, branch).
  2. Enables Actor Reuse: When a subsequent Register comes in for the same project/branch, instead of spawning a new actor, the allocator:
    • Increments the ref_count in the DB (for tracking how many clients are using it)
    • Looks up the existing sender in the HashMap and returns it (no new actor spawned)
    • This is why both senders in your multi-client tests work — they point to the same actor.
  3. Provides Fast O(1) Sender Lookup: The get_sender process does a HashMap lookup to retrieve a cloned sender. This is a synchronous, non-DB operation — critical for low-latency websocket routing.
  4. Cleans Up on Server Restart: The actor starts by wiping all sessions in the DB for this server tag, ensuring stale state from a previous crashed instance is gone.
  5. Enables Targeted Kill: The kill process removes the entry from the HashMap, then waits on the join handle to confirm the actor stopped.

Relationship Between Websocket Actor Allocator and Git File Actor Allocator

They are the same allocator — there is only one actor managing everything. Here’s how they relate:

┌────────────────────────────────────────────────────────────────────┐
│                    WEBSOCKET CONNECTION #1                         │
│  (one per connected client browser)                                │
│                                                                    │
│  - Owns its own ping actor (health monitoring)                     │
│  - Owns DynamicSession + StaticSession state                       │
│  - Communicates with the allocator via mpsc channel                │
└────────────────────────────────────────────────────────────────────┘
                                      │
                                      │ send_register_request()
                                      ▼
┌────────────────────────────────────────────────────────────────────┐
│                   ALLOCATOR ACTOR (Single Global Actor)            │
│  ┌───────────────────────────────────────────────────────────────┐ │
│  │ AllocatorMap HashMap (in-memory state)                        │ │
│  │   Key: AllocatorKey { project_id, branch }                    │ │
│  │   Value: (GitActorSender, JoinHandle)                         │ │
│  │                                                               │ │
│  │   Example entries:                                            │ │
│  │     (project:42, "main") -> (sender_A, handle_1)              │ │
│  │     (project:42, "dev")  -> (sender_B, handle_2)              │ │
│  │     (project:99, "main") -> (sender_C, handle_3)              │ │
│  └───────────────────────────────────────────────────────────────┘ │
│                                                                    │
│  Receives messages: Register, DeRegister, Kill, GetSender, GC      │
└────────────────────────────────────────────────────────────────────┘
                                     │
                       ┌─────────────┴────────────┐
                       │                          │
                       ▼                          ▼
┌──────────────────────────────────┐   ┌──────────────────────────────────┐
│    GIT FILE ACTOR                │   │    GIT FILE ACTOR                │
│    (project:42, branch:"main")   │   │    (project:42, branch:"dev")    │
│                                  │   │                                  │
│    - Owns temp dir on disk       │   │    - Owns temp dir on disk       │
│    - Handles file operations     │   │    - Handles file operations     │
│    - Has compiler state          │   │    - Has compiler state          │
│    - Persists to DB on save      │   │    - Persists to DB on save      │
└──────────────────────────────────┘   └──────────────────────────────────┘

The Message Flow

Client Browser                          Websocket Actor
     │                                         │
     │──── websocket connect ─────────────────>│
     │                                         │
     │                                         │ 1. auth check
     │                                         │ 2. calls allocator_actor_constructor()
     │                                         │    (static singleton, runs once per server)
     │                                         │
     │                                         │ 3. send_register_request(project_id, "main")
     │                                         │    to allocator via mpsc::Sender
     │                                         │
     │                                         ▼
     │                              ┌───────────────────┐
     │                              │   ALLOCATOR ACTOR │
     │                              │                   │
     │                              │ checks DB → no existing session
     │                              │ spawns git_files_actor_constructor()
     │                              │ inserts into HashMap
     │                              │ creates DB session (ref_count=1)
     │                              └───────────────────┘
     │                                         │
     │                                         │ returns GitActorSender
     │                                         │
     │                                         ▼
     │                              ┌───────────────────┐
     │                              │   GIT FILE ACTOR  │
     │                              │ (project:42, main)│
     │                              │ - loads tarball   │
     │                              │ - extracts files  │
     │                              │ - in-memory state │
     │                              └───────────────────┘
     │                                         │
     │<────── websocket messages ──────────────┤
     │        (routed to git file actor)       │
     │                                         │
     │ ─────── disconnect ────────────────────>│
     │                                         │ cleanup() → send_deregister_request()
     │                                         │ ref_count decremented in DB
     │                                         │ HashMap entry NOT removed (actor stays alive)
     │                                         │
     │                         [if ref_count == 0, background GC sends Kill
     │                          → removes from HashMap, deletes DB session]

Key Distinction: Two Different Session Types

| Session Type           | Storage                            | Purpose                                                         | Managed By                           |
|------------------------|------------------------------------|-----------------------------------------------------------------|--------------------------------------|
| Websocket Session      | DB only (websocket_sessions table) | Track which users are connected, server tag for cleanup         | Websocket actor’s cleanup() function |
| Git File Actor Session | DB + HashMap                       | Track actor lifecycle, ref_count for sharing, server tag for cleanup | Allocator actor                 |

The websocket session exists purely in the DB to survive server restarts (so you know a user was connected before the crash). The git file actor session lives in both DB and HashMap — the DB for persistence across restarts, the HashMap for fast in-memory access.

Why One Allocator Handles Both

The name “websocket allocator” in some comments is a bit misleading — it doesn’t allocate websocket connections. It allocates git file actors on behalf of websocket connections. All websocket connections on the server share the same allocator actor because:

  1. Actor Model: The allocator is a single tokio task with an mpsc channel. All websocket actors send messages to it.
  2. Shared State: The HashMap needs to be shared across all websocket connections so they can find/reuse the same git file actor for a given project/branch.
  3. Efficiency: One allocator means one place to manage actor lifecycles, reference counting, and cleanup — avoiding distributed state synchronization issues.

The Reference Counting Mechanism

Register → ref_count++
DeRegister → ref_count-- (DB sets time_zeroed when it hits 0)
                         ↓
                   Background GC runs every 20 seconds (see garbage_collector.rs)
                         ↓
                   Finds sessions where time_zeroed IS NOT NULL
                         ↓
                   Sends Kill to allocator → removes from HashMap, deletes DB row

This is how the system handles clients disconnecting at different times — the actor only dies when the last client deregisters.