When a Czech ARES lookup times out, or a Shodan rate limit returns HTTP 429, or a scraped forum changes its HTML, you do not want the whole case to die. You want that adapter to die, the supervisor to log it, and every other adapter to keep working. This is OTP supervision used as intended.
# The wrong shape
The tempting shape is one big GenServer that sequentially calls every adapter:
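```elixir
# Wrong: one crash kills the pipeline
def handle_call({:run, query}, _from, state) do
  # Adapters run one after another inside the GenServer process,
  # so a raise in any a.search/1 call crashes the GenServer itself.
  results = Enum.map(@adapters, fn a -> a.search(query) end)
  {:reply, results, state}
end
```

An exception in adapter 3 aborts adapters 4 through 20. The user sees nothing.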
# The right shape
Spawn each adapter under a Task.Supervisor with async_stream_nolink:
```elixir
Task.Supervisor.async_stream_nolink(
  PrismaticOsint.TaskSup,
  adapters,
  fn a -> a.search(query) end,
  max_concurrency: 10,
  timeout: 5_000,
  # Kill a task that exceeds the timeout instead of exiting the caller
  on_timeout: :kill_task
)
|> Enum.map(fn
  {:ok, result} -> {:ok, result}
  # Crashes and timeouts ({:exit, :timeout}) both arrive as exits
  {:exit, reason} -> {:error, reason}
end)
```

Every adapter is isolated. A crash in one shows up as an {:exit, reason} entry in the result list, not as an exception in the caller. The GenServer driving the pipeline stays up.
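One thing the snippet above drops is which adapter produced which result. A minimal runnable sketch of the same fan-out could pair each outcome with its adapter. It assumes three hypothetical adapters (DemoAdapter.Ok, DemoAdapter.Crash, DemoAdapter.Slow), a shortened timeout, and an anonymous Task.Supervisor started in the script instead of PrismaticOsint.TaskSup:

```elixir
# Hypothetical adapters for illustration only; they are not part of the real code base.
defmodule DemoAdapter.Ok do
  def search(query), do: [%{source: :ok_adapter, hit: query}]
end

defmodule DemoAdapter.Crash do
  def search(_query), do: raise("upstream HTML changed")
end

defmodule DemoAdapter.Slow do
  def search(_query) do
    # Sleeps longer than the 200 ms timeout used below
    Process.sleep(1_000)
    []
  end
end

defmodule FanOutDemo do
  @adapters [DemoAdapter.Ok, DemoAdapter.Crash, DemoAdapter.Slow]

  def run(query, sup) do
    results =
      sup
      |> Task.Supervisor.async_stream_nolink(
        @adapters,
        fn adapter -> adapter.search(query) end,
        max_concurrency: 10,
        timeout: 200,
        on_timeout: :kill_task
      )
      |> Enum.to_list()

    # Results arrive in input order by default (ordered: true),
    # so zipping restores the adapter behind each entry.
    @adapters
    |> Enum.zip(results)
    |> Map.new(fn
      {adapter, {:ok, hits}} -> {adapter, {:ok, hits}}
      {adapter, {:exit, reason}} -> {adapter, {:error, reason}}
    end)
  end
end

{:ok, sup} = Task.Supervisor.start_link()
outcome = FanOutDemo.run("example.cz", sup)
```

Here outcome maps DemoAdapter.Ok to an {:ok, hits} tuple, DemoAdapter.Crash to {:error, {exception, stacktrace}}, and DemoAdapter.Slow to {:error, :timeout}. The crashing task still logs an error report, but the calling process keeps running.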
# DynamicSupervisor for long-lived monitors
Short-lived fan-outs use Task.Supervisor. Long-lived monitors (continuous OSINT, domain watch) use DynamicSupervisor:
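```elixir
DynamicSupervisor.start_child(
  PrismaticOsint.MonitorSup,
  {PrismaticOsint.Monitor, query: q, interval: :timer.minutes(15)}
)
```

Each monitor is its own process with its own state, its own restart policy, and its own failure surface. The supervisor uses the :one_for_one strategy with max_restarts: 3 and max_seconds: 60, so a crashed monitor is restarted on its own, up to three times in 60 seconds. One caveat: that limit belongs to the supervisor, not to a single child, so exceeding it shuts down the DynamicSupervisor together with every monitor under it. The next scheduled run then re-creates the monitor fresh.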
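To make the isolation concrete, here is a small sketch under stated assumptions: DemoMonitor is a hypothetical stand-in for PrismaticOsint.Monitor, the interval is shortened to milliseconds, and the supervisor is started anonymously in the script with an assumed restart intensity of 3 restarts per 60 seconds.

```elixir
# DemoMonitor is a hypothetical stand-in, not the real PrismaticOsint.Monitor.
defmodule DemoMonitor do
  # :transient means the child is restarted only after an abnormal exit
  use GenServer, restart: :transient

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    interval = Keyword.fetch!(opts, :interval)
    Process.send_after(self(), :tick, interval)
    {:ok, %{query: Keyword.fetch!(opts, :query), interval: interval, runs: 0}}
  end

  @impl true
  def handle_info(:tick, state) do
    # A real monitor would call its adapter here; this sketch only counts runs.
    Process.send_after(self(), :tick, state.interval)
    {:noreply, %{state | runs: state.runs + 1}}
  end
end

{:ok, sup} =
  DynamicSupervisor.start_link(strategy: :one_for_one, max_restarts: 3, max_seconds: 60)

{:ok, a} = DynamicSupervisor.start_child(sup, {DemoMonitor, query: "example.cz", interval: 50})
{:ok, b} = DynamicSupervisor.start_child(sup, {DemoMonitor, query: "example.org", interval: 50})

# Kill one monitor abnormally; the supervisor restarts it under a new pid
# and leaves the other monitor untouched.
Process.exit(a, :kill)
Process.sleep(100)

counts = DynamicSupervisor.count_children(sup)
```

After the kill, counts still reports two active workers: the second monitor kept its pid and state, while the first was replaced by a fresh process with runs reset to 0.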
# Telemetry on every restart
A restart you cannot see is a bug you cannot fix. Every supervisor in /hub emits telemetry on child crashes:
```elixir
:telemetry.execute(
  [:osint, :monitor, :crash],
  %{count: 1},
  %{adapter: adapter, reason: reason}
)
```

The dashboard groups crashes per adapter per hour. When a new adapter starts showing up in the top 5, it gets a ticket before users notice.
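The consuming side of that event could look roughly like the sketch below. It is an assumption about how a dashboard might aggregate crashes, not the actual dashboard code: CrashCounter, the ETS table name, and the handler id are hypothetical, hourly bucketing is left out for brevity, and Mix.install needs network access to fetch the telemetry package.

```elixir
Mix.install([{:telemetry, "~> 1.0"}])

# Hypothetical aggregator: counts crash events per adapter in a public ETS table.
defmodule CrashCounter do
  @table :osint_crash_counts

  def attach do
    :ets.new(@table, [:named_table, :public, :set])

    :telemetry.attach(
      "osint-crash-counter",
      [:osint, :monitor, :crash],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # Runs in the process that calls :telemetry.execute/3
  def handle_event([:osint, :monitor, :crash], %{count: n}, %{adapter: adapter}, _config) do
    # Insert {adapter, 0} if the key is missing, then add n to the counter
    :ets.update_counter(@table, adapter, n, {adapter, 0})
  end

  # Adapters with the most crashes first
  def top(n) do
    @table
    |> :ets.tab2list()
    |> Enum.sort_by(fn {_adapter, count} -> count end, :desc)
    |> Enum.take(n)
  end
end

:ok = CrashCounter.attach()

for {adapter, reason} <- [shodan: :timeout, ares: :timeout, shodan: :rate_limited] do
  :telemetry.execute([:osint, :monitor, :crash], %{count: 1}, %{adapter: adapter, reason: reason})
end

ranking = CrashCounter.top(5)
```

Passing the handler as a captured module function rather than an anonymous function follows the telemetry library's own recommendation for handlers.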
# The rule
Put crashes where supervision can see them; put evidence where supervision cannot erase it.
A crashed adapter is fine. A crashed pipeline is a bug. A crashed envelope is a disaster. Sealed evidence lives outside the supervision tree (in the database, stamped and immutable), so even a total supervisor restart cannot rewrite history.
# Where to go next
- Academy: OTP Fundamentals, with runnable supervision tree exercises
- Academy: {{ cross_link(path="/academy/learn/first-agent", text="First Agent") }}, to build your first supervised adapter
- Glossary: OTP, Supervision Tree, DynamicSupervisor, GenServer, Fault Tolerance
Let it crash. Just make sure the right thing crashes.