Observer in Production: The BEAM Introspection Nobody Talks About

:observer is the most underused debugging tool in the Elixir ecosystem. Attached to a running node, it shows you every process, every ETS table, every mailbox — live. Here's how to run it safely against prod.

Apr 09, 2026 · 6 min read · Tomáš Korcak (korczis)

The first time you attach observer to a running production node and see every process, every mailbox, every ETS table, every linked supervisor — live, updating, interactive — you realize how much production debugging you have been doing blind. This is BEAM introspection, and it is arguably the single biggest operational advantage Elixir has over other runtimes.

#Attaching to a remote node

# On your laptop, start a local node with the same cookie:
iex --sname debug --cookie $COOKIE

# Connect:
iex(debug@laptop)> Node.connect(:"prismatic@prod-01")
true

# Open Observer locally, then point it at the remote node
# (the remote release must include :runtime_tools):
iex(debug@laptop)> :observer.start()
# In Observer: Nodes → prismatic@prod-01

The GUI now shows the remote node's processes, applications, ETS tables, and load charts (scheduler utilization, memory, I/O). Every tab is live. Drill into a process to see its mailbox depth, stack trace, links, monitors, and current function.
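The per-process details Observer shows are also available as data. The following is a minimal sketch, not Observer's own implementation; the module name `ProcPeek` is hypothetical. It goes through `:erpc` (OTP 23 or later) because `process_info` only answers for pids that live on the node where it is called.

```elixir
defmodule ProcPeek do
  # The same fields Observer shows when you double-click a process.
  @items [
    :registered_name,
    :message_queue_len,
    :current_function,
    :current_stacktrace,
    :links,
    :monitors
  ]

  # Runs :erlang.process_info/2 on `node`, where `pid` is local.
  # Returns a keyword list, or :undefined if the process has already exited.
  def snapshot(node, pid) when is_atom(node) and is_pid(pid) do
    :erpc.call(node, :erlang, :process_info, [pid, @items])
  end
end
```

Locally, `ProcPeek.snapshot(node(), self())` returns the keyword list for the shell process. For a remote node you would first need a pid that belongs to that node, for example by calling `:erlang.whereis/1` there through `:erpc.call/4`.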

#What to look at first

Three tabs, three things:

  1. Processes — sort by reductions desc. Top 10 tells you where the CPU went. Sort by message queue desc. Top 10 tells you which GenServer is overloaded.
  2. Applications — the supervision tree visualized. Crashed children are obvious. Restart intensity is visible per supervisor.
  3. Table Viewer — every ETS table with row counts and memory. A table whose row count keeps growing between refreshes is a likely leak.

Most production problems show up in one of these three tabs in under a minute.
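The same first pass can be approximated as text. This is a rough sketch that assumes it is evaluated on the node you want to inspect (for example, pasted into a remote shell). The module `FirstLook` and its function names are illustrative and not part of Observer or OTP.

```elixir
defmodule FirstLook do
  # Top `n` processes by :reductions, :message_queue_len or :memory.
  # Reductions are cumulative since the process started.
  def top_processes(key, n \\ 10)
      when key in [:reductions, :message_queue_len, :memory] do
    Process.list()
    |> Enum.flat_map(fn pid ->
      case Process.info(pid, [key, :registered_name]) do
        # The process exited between Process.list/0 and Process.info/2.
        nil -> []
        info -> [{pid, info[key], info[:registered_name]}]
      end
    end)
    |> Enum.sort_by(fn {_pid, value, _name} -> value end, :desc)
    |> Enum.take(n)
  end

  # Top `n` ETS tables by memory, as {table, row_count, bytes}.
  def top_tables(n \\ 10) do
    word = :erlang.system_info(:wordsize)

    :ets.all()
    |> Enum.flat_map(fn tab ->
      case {:ets.info(tab, :size), :ets.info(tab, :memory)} do
        {size, words} when is_integer(size) and is_integer(words) ->
          # :ets.info/2 reports memory in machine words, not bytes.
          [{tab, size, words * word}]

        # The table was deleted while we were iterating.
        _ ->
          []
      end
    end)
    |> Enum.sort_by(fn {_tab, _size, bytes} -> bytes end, :desc)
    |> Enum.take(n)
  end
end
```

`FirstLook.top_processes(:message_queue_len)` corresponds to sorting the Processes tab by message queue, and `FirstLook.top_tables()` to the Table Viewer. Running `top_tables/1` twice a few minutes apart and comparing row counts is one way to spot a table that only grows.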

#Safety rules

Observer on prod is powerful enough to hurt you. Three rules, no exceptions:

{% callout(type=”warning”, title=”Read-only operations only”) %} Never kill a process, delete from an ETS table, or change state through the Observer UI. Clicking around in Observer is convenient, and that is exactly why it is dangerous. Open it as a reader, not an editor. If you need to change something, do it deliberately through a controlled channel (release tooling, a runbook, an MFA-protected ops endpoint). {% end %}

{% callout(type=”error”, title=”Cookie hygiene: the prod cookie is a secret”) %} The Erlang distribution cookie is effectively a root credential for the whole cluster. Anyone with the cookie and network access can run arbitrary code on every node, up to and including :os.cmd/1 with rm -rf /. Rotate the cookie whenever a team member leaves. Treat it like an SSH key or a Vault API token. {% end %}
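Rotation is easier when generating a new cookie is a one-liner. This is a small sketch, assuming a random URL-safe string is acceptable as a cookie value in your setup. The module name `CookieGen` is hypothetical, and how the value reaches your nodes (for Mix releases, typically the `RELEASE_COOKIE` environment variable) depends on your deployment.

```elixir
defmodule CookieGen do
  # 32 bytes from the OS CSPRNG, encoded as URL-safe Base64 without padding.
  # The result is a 43-character string.
  def generate do
    32
    |> :crypto.strong_rand_bytes()
    |> Base.url_encode64(padding: false)
  end
end
```

Store the output in your secret manager rather than in the repository or in shell history.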

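{% callout(type=”warning”, title=”Connect from a bastion”) %} Never expose the Erlang distribution ports (epmd on 4369 plus the dynamically assigned node ports) to the internet. The protocol assumes a trusted network. Tunnel through an SSH bastion and connect locally: ssh -L 4369:prod-01:4369 bastion, then Node.connect/1. The node's own distribution port needs a second -L forward as well; pinning it with the kernel options inet_dist_listen_min and inet_dist_listen_max keeps it predictable. {% end %}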
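To find out which dynamic port a node is actually listening on, you can ask epmd. This sketch assumes it runs on the production host itself, where epmd is local. `:net_adm.names/0` is part of OTP, while the wrapper module `EpmdPorts` is hypothetical.

```elixir
defmodule EpmdPorts do
  # Lists {node_name, distribution_port} pairs registered with the local epmd.
  # Returns [] when epmd cannot be reached.
  def list do
    case :net_adm.names() do
      {:ok, names} ->
        # epmd reports names as charlists; convert them to Elixir strings.
        Enum.map(names, fn {name, port} -> {List.to_string(name), port} end)

      {:error, _reason} ->
        []
    end
  end
end
```

The port listed next to the node's name is the one that has to be forwarded in addition to 4369.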

#The textual alternative

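If GUI over SSH is too slow, :runtime_tools and :recon give you much of the same information as text in IEx. Note that :recon is a third-party library, so it has to be a dependency of the release you attach to:

:recon.proc_count(:memory, 10)             # top 10 processes by memory
:recon.proc_count(:message_queue_len, 10)  # top 10 longest mailboxes
:recon.bin_leak(10)                        # top 10 by binary memory freed after a forced GC (garbage-collects every process)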

:recon is the tool to reach for on locked-down hosts where Observer is a non-starter. Same information, text-only, scriptable.
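As an illustration of the scriptable angle, here is a sketch of sampling over a time window using only built-in functions. It measures reductions consumed during the window rather than since process start, which is the idea behind recon's own `:recon.proc_window/3`. The module name `BusyNow` is hypothetical, and the code inspects whichever node evaluates it.

```elixir
defmodule BusyNow do
  # Top `n` processes by reductions consumed during a `window_ms` window.
  def top_by_reductions(n \\ 10, window_ms \\ 1_000) do
    before = sample()
    Process.sleep(window_ms)
    later = sample()

    later
    |> Enum.flat_map(fn {pid, reductions} ->
      case before do
        %{^pid => earlier} -> [{pid, reductions - earlier}]
        # The process started during the window, so there is no baseline.
        _ -> []
      end
    end)
    |> Enum.sort_by(fn {_pid, delta} -> delta end, :desc)
    |> Enum.take(n)
  end

  # Map of pid => cumulative reductions; processes that exit mid-scan are skipped.
  defp sample do
    for pid <- Process.list(),
        {:reductions, reductions} <- [Process.info(pid, :reductions)],
        into: %{},
        do: {pid, reductions}
  end
end
```

`BusyNow.top_by_reductions(10, 5_000)` blocks for five seconds and then returns the processes that were busiest during that interval.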

#Where to go next

{% feature_card(title=”OTP Fundamentals (Academy)”, accent=”emerald”, icon=”M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z”, href=”/academy/otp-fundamentals/“) %} The supervision trees that Observer visualizes. Without OTP intuition the Applications tab is just a colorful graph; with it, the tab is a map of what is happening in the system right now. {% end %}

{% feature_card(title=”Related glossary entries”, accent=”cyan”, icon=”M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z”) %} Observer · Introspection · BEAM · BEAM VM · Monitoring {% end %}

{% callout(type=”note”, title=”TL;DR”) %} Most production debugging in other runtimes is archaeology. In Elixir it is live inspection. Attach Observer to a real prod node at least once a week; you will see things that top, htop, and Datadog will never show you. Use it. {% end %}
