Deep dive

How it works

A practical tour of the Ferrox architecture and the mechanisms behind its guarantees—kept high‑level so you can ship, not spelunk.

Architecture overview


+-------------------+         IPC (UDS / Named Pipe)         +-------------------+
|     ferrox-cli    |  <----------------------------------->  |   ferrox-daemon   |
+-------------------+                                         +-------------------+
         |                                                              |
         | commands                                                     | supervises
         v                                                              v
  user shell / CI                                              per-app state machines
                                                              (spawn → run → reload → stop)
                                                                      |
                                                          +-----------+-----------+
                                                          |                       |
                                                   runners (process)       watchers/health
                                                          |                       |
                                                   stdout/stderr           http/tcp checks
                                                          |                       |
                                                  logs aggregator         metrics registry
  • The ferrox-daemon hosts per‑app state machines and runners.
  • The ferrox-cli connects over OS‑native IPC for commands and queries.
  • Logs and metrics are produced locally; Prometheus scrapes from the daemon.
  • Local‑first model: the HTTP API is opt‑in and off by default.

CLI ↔ Daemon IPC

The CLI talks to the daemon over UNIX domain sockets on POSIX and named pipes on Windows. The transport is local-only, with access controlled by OS file and pipe permissions.

# POSIX (Linux/macOS)
# CLI connects to a user-scoped UNIX domain socket, e.g.
~/.local/share/ferrox/run/daemon.sock

# Windows
# CLI connects to a named pipe, e.g.
\\.\pipe\ferrox-daemon
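How a client locates the endpoint can be sketched in a few lines. This is illustrative only — `endpoint_for` is not part of the Ferrox API, and the path layout simply mirrors the examples above:

```rust
// Sketch: compute the user-scoped IPC endpoint per platform.
// `endpoint_for` is an illustrative helper, not Ferrox's actual code.
fn endpoint_for(home: &str) -> String {
    if cfg!(windows) {
        // Named pipe namespace; no filesystem path involved.
        r"\\.\pipe\ferrox-daemon".to_string()
    } else {
        // User-scoped UNIX domain socket under the data dir.
        format!("{home}/.local/share/ferrox/run/daemon.sock")
    }
}

fn main() {
    println!("{}", endpoint_for("/home/alice"));
}
```

Because the socket lives under the user's own data directory, ordinary file permissions are the access control.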

Process lifecycle

Spawn

Daemon resolves env, cwd, ulimits, and user/group; then spawns the process in a platform‑native job/cgroup where supported.
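In standard-library terms, the spawn step looks roughly like this. The command, env, and cwd values are placeholders, and the user/group, ulimit, and job/cgroup handling is platform code not shown here:

```rust
use std::process::{Command, ExitStatus};

// Sketch of the spawn step: resolve env and cwd, then start the child.
// The real daemon would also switch user/group, apply ulimits, and
// place the child in a Job Object / cgroup where supported.
fn spawn_app() -> std::io::Result<ExitStatus> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg("echo ready >/dev/null")
        .current_dir("/tmp")       // resolved per-app cwd (placeholder)
        .env("PORT", "3000")       // resolved per-app env (placeholder)
        .spawn()?;                 // platform-native process creation
    child.wait()
}

fn main() {
    assert!(spawn_app().unwrap().success());
}
```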

Run

Health checks (HTTP/TCP/exec) gate readiness. Watchers coalesce FS changes to avoid thrash during builds.
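The coalescing behavior can be sketched as a pure function over timestamped events; `coalesce` and the window value are illustrative, not the daemon's actual implementation:

```rust
/// Sketch: coalesce filesystem events that arrive within `window_ms`
/// of each other into a single batch, so a build touching many files
/// triggers one reload instead of hundreds.
fn coalesce(events: &[(u64, &str)], window_ms: u64) -> Vec<Vec<String>> {
    let mut batches: Vec<Vec<String>> = Vec::new();
    let mut last_ts = None;
    for &(ts, path) in events {
        match last_ts {
            // Event falls inside the quiet window: extend current batch.
            Some(prev) if ts - prev <= window_ms => {
                batches.last_mut().unwrap().push(path.to_string())
            }
            // Gap exceeded the window: start a new batch.
            _ => batches.push(vec![path.to_string()]),
        }
        last_ts = Some(ts);
    }
    batches
}

fn main() {
    // A burst of three changes, then one much later: two batches.
    let events = [(0, "a.js"), (40, "b.js"), (60, "c.js"), (5_000, "d.js")];
    let batches = coalesce(&events, 200);
    assert_eq!(batches.len(), 2);
    println!("{batches:?}");
}
```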

Reload/Stop

Graceful signals first; enforced timeout; hard kill fallback. Strategy is configurable per app.
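The escalation can be sketched with the standard library alone. Note the hedge: std offers only the hard kill, so the graceful TERM/HUP step the daemon performs via platform APIs is assumed to have happened before this wait loop runs:

```rust
use std::process::{Child, Command};
use std::time::{Duration, Instant};

/// Sketch: after a graceful signal has been sent, wait up to `timeout`
/// for the child to exit on its own, then hard-kill as a fallback.
/// Returns true if the child exited within the grace period.
fn stop_with_timeout(child: &mut Child, timeout: Duration) -> std::io::Result<bool> {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if child.try_wait()?.is_some() {
            return Ok(true); // exited gracefully
        }
        std::thread::sleep(Duration::from_millis(20));
    }
    child.kill()?; // hard kill fallback (SIGKILL / TerminateProcess)
    child.wait()?;
    Ok(false)
}

fn main() -> std::io::Result<()> {
    let mut child = Command::new("sh").arg("-c").arg("sleep 5").spawn()?;
    let graceful = stop_with_timeout(&mut child, Duration::from_millis(200))?;
    assert!(!graceful); // the sleeper outlived the grace period
    Ok(())
}
```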

# Reload strategy per app
[apps.web]
cmd = "node server.js"
reload = "hup"          # hup | bluegreen | socket (posix)
stop_timeout_ms = 10000
healthcheck = { http = "http://127.0.0.1:3000/health", interval_ms = 2000, timeout_ms = 800 }
# Restart policy with crash-loop protection
[defaults]
restart = "on-failure"
backoff = { min_ms = 500, max_ms = 60000, factor = 2.0, jitter = true, window = 120000, threshold = 5 }
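With the `[defaults]` values above, the backoff schedule works out as follows. This is a sketch: `backoff_ms` is an illustrative helper, and jitter is omitted for determinism (with `jitter = true`, a random factor would typically be applied to each delay):

```rust
/// Sketch of the exponential backoff: delay = min * factor^attempt,
/// capped at max. Jitter omitted here so the values are deterministic.
fn backoff_ms(attempt: u32, min_ms: u64, max_ms: u64, factor: f64) -> u64 {
    let delay = min_ms as f64 * factor.powi(attempt as i32);
    delay.min(max_ms as f64) as u64
}

fn main() {
    // min_ms = 500, max_ms = 60000, factor = 2.0, as in [defaults].
    assert_eq!(backoff_ms(0, 500, 60_000, 2.0), 500);
    assert_eq!(backoff_ms(3, 500, 60_000, 2.0), 4_000);
    assert_eq!(backoff_ms(10, 500, 60_000, 2.0), 60_000); // capped
    // Crash-loop protection: more than `threshold` restarts inside
    // `window` ms stops the restart loop instead of continuing.
}
```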

Cross‑platform details

Linux

  • systemd user units via ferrox daemon install
  • Signals: TERM / HUP / KILL
  • Socket activation/handoff using sd_listen_fds
  • cgroups v2 for limits (if enabled)
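The sd_listen_fds convention itself is simple enough to sketch without libsystemd. This is illustrative, not Ferrox's actual code: systemd passes inherited sockets starting at fd 3 and describes them through the LISTEN_PID and LISTEN_FDS environment variables:

```rust
/// Sketch of the sd_listen_fds(3) convention: return the file
/// descriptors systemd handed to this process, if any.
fn listen_fds() -> Vec<i32> {
    const SD_LISTEN_FDS_START: i32 = 3; // first inherited fd

    // LISTEN_PID must name us, or the fds belong to another process.
    let pid_ok = std::env::var("LISTEN_PID")
        .ok()
        .and_then(|p| p.parse::<u32>().ok())
        .map_or(false, |p| p == std::process::id());
    if !pid_ok {
        return Vec::new();
    }

    // LISTEN_FDS is the count of consecutive fds starting at 3.
    let n: i32 = std::env::var("LISTEN_FDS")
        .ok()
        .and_then(|n| n.parse().ok())
        .unwrap_or(0);
    (SD_LISTEN_FDS_START..SD_LISTEN_FDS_START + n).collect()
}

fn main() {
    // Outside of socket activation this is empty.
    println!("activated fds: {:?}", listen_fds());
}
```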

macOS

  • Launchd agents per user; plist authored by CLI
  • POSIX signals for lifecycle
  • Descriptor inheritance for socket handoff

Windows

  • Service Control Manager install via CLI
  • Job Objects for grouped termination & limits
  • Named pipes for IPC; UTF‑16 paths handled
  • FS watch uses polling with debounce; ReadDirectoryChangesW planned

Logs & Metrics

Logging

The daemon aggregates stdout/stderr; lines are tagged with app and instance ids. Rotations are time/size‑based. Optional JSON output makes it ingest‑ready.

Metrics

A local Prometheus endpoint exposes per‑app counters, gauges, and health. Tracing via OpenTelemetry is opt‑in and per‑app.

Metric                     | Description                      | Labels
ferrox_app_uptime_seconds  | Time since last successful start | app, instance
ferrox_app_restarts_total  | Number of restarts               | app, instance, reason
ferrox_app_cpu_percent     | Smoothed CPU usage (per process) | app, instance
ferrox_app_rss_bytes       | Resident memory                  | app, instance
ferrox_app_health          | 1 healthy / 0 unhealthy          | app
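A scraped sample follows the standard Prometheus text exposition format; a minimal rendering sketch (`render_sample` is illustrative, not the daemon's code):

```rust
/// Sketch: render one sample in the Prometheus text exposition
/// format, as served by the daemon's scrape endpoint.
fn render_sample(name: &str, labels: &[(&str, &str)], value: f64) -> String {
    let labels: Vec<String> = labels
        .iter()
        .map(|(k, v)| format!("{k}=\"{v}\"")) // key="value" pairs
        .collect();
    format!("{name}{{{}}} {value}", labels.join(","))
}

fn main() {
    let line = render_sample(
        "ferrox_app_restarts_total",
        &[("app", "web"), ("instance", "0"), ("reason", "crash")],
        3.0,
    );
    assert_eq!(
        line,
        "ferrox_app_restarts_total{app=\"web\",instance=\"0\",reason=\"crash\"} 3"
    );
    println!("{line}");
}
```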

Configuration resolution

Effective config is computed as: defaults → profiles → app overrides → CLI flags (highest precedence). Env templating is applied at load time.

version = 1

[defaults]
user = "appuser"         # drop privileges where supported
log_format = "json"

[profiles.dev]
[profiles.dev.env]
RUST_LOG = "debug"

[profiles.prod]
[profiles.prod.env]
RUST_LOG = "info"

[apps.api]
cmd = "node api/index.js"
instances = "cpu"
profile = "prod"
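The "later layer wins" merge can be sketched over flat key/value maps. This is a simplification: real config is nested TOML, and the layer names here just mirror the precedence order described above:

```rust
use std::collections::HashMap;

/// Sketch: merge config layers in precedence order; later layers win,
/// mirroring defaults → profiles → app overrides → CLI flags.
fn resolve<'a>(layers: &[HashMap<&'a str, &'a str>]) -> HashMap<&'a str, &'a str> {
    let mut out = HashMap::new();
    for layer in layers {
        out.extend(layer.iter().map(|(k, v)| (*k, *v)));
    }
    out
}

fn main() {
    let defaults = HashMap::from([("log_format", "json"), ("RUST_LOG", "warn")]);
    let profile = HashMap::from([("RUST_LOG", "info")]); // [profiles.prod]
    let cli = HashMap::from([("RUST_LOG", "debug")]);    // CLI layer, highest precedence
    let effective = resolve(&[defaults, profile, cli]);
    assert_eq!(effective["RUST_LOG"], "debug");   // CLI wins
    assert_eq!(effective["log_format"], "json");  // inherited from defaults
}
```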

Security model

  • Local‑first control plane; HTTP API off by default and loopback‑bound when enabled.
  • Token‑gated write operations on the API.
  • Per‑app user/group switching; no_new_privs and capability drops where supported.
  • Minimal on‑disk state: PIDs, last exits, backoff state, and rotated logs.

Exporters & importers

  • ferrox export systemd emits unit files for long‑term operations.
  • ferrox init --from pm2 bootstraps a config from an existing PM2 ecosystem.
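For a sense of the output, a systemd export of the `[apps.api]` example above would produce a unit along these lines. The file is illustrative, not the tool's literal output, and the paths are placeholders:

```ini
# ~/.config/systemd/user/ferrox-api.service (illustrative)
[Unit]
Description=api (exported from Ferrox)

[Service]
ExecStart=/usr/bin/node api/index.js
WorkingDirectory=/srv/api          ; placeholder cwd
Environment=RUST_LOG=info          ; from [profiles.prod]
Restart=on-failure                 ; from [defaults] restart policy
RestartSec=1

[Install]
WantedBy=default.target
```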

Limitations & caveats

  • Socket handoff is not available on Windows yet.
  • File watching on Windows currently uses polling with debounce.
  • Secrets management and container orchestration are out of scope; use your preferred tools.

Apache‑2.0 / MIT • Ferrox