Broadway Pipelines for Continuous OSINT: Backpressure Is a Feature

When Shodan, VirusTotal, and three rate-limited registries all feed the same pipeline, backpressure stops being a nice-to-have and starts being the reason production doesn't melt. These are the Broadway patterns from prismatic_osint_monitoring.

Apr 09, 2026 · 7 min read · Tomáš Korcak (korczis)

Continuous OSINT is a pipeline, not a cron job. And pipelines without backpressure explode the first time one stage slows down. Broadway, built on GenStage, is how Prismatic’s monitoring apps absorb spiky rate-limited upstreams without dropping messages or melting the database.

#The spiky reality

A realistic monitoring load:

  • Certificate transparency firehose: ~1k events/sec peak, 50/sec off-peak
  • Shodan polled queries: 100 req/min hard ceiling
  • Czech ARES: 30 req/min polite ceiling
  • Entity enrichment (Postgres + graph write): ~500 writes/sec sustained

Without backpressure, the CT firehose drowns Shodan, Shodan gets 429’d, retries storm, and the ARES ceiling gets blown. Every stage fails for a reason caused by some other stage.
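Putting the four ceilings on one scale makes the mismatch concrete. The sketch below normalizes the figures from the list above to events per second. The module name LoadBudget is illustrative, and the backlog figure assumes every CT event triggers exactly one enrichment write, which is a simplification.

```elixir
defmodule LoadBudget do
  @moduledoc "Illustrative only: normalizes the ceilings quoted above to events per second."

  # {count, window in seconds}, taken from the list above.
  @ceilings %{
    ct_peak: {1_000, 1},
    ct_off_peak: {50, 1},
    shodan: {100, 60},
    ares: {30, 60},
    enrichment: {500, 1}
  }

  @doc "Ceiling for `stage` in events per second (always a float)."
  def per_second(stage) do
    {count, window_s} = Map.fetch!(@ceilings, stage)
    count / window_s
  end

  @doc "Backlog growth in events/sec if `source` pushes into `sink` with no backpressure."
  def backlog_growth(source, sink) do
    max(per_second(source) - per_second(sink), 0.0)
  end
end
```

LoadBudget.per_second(:ares) is 0.5, and LoadBudget.backlog_growth(:ct_peak, :enrichment) is 500.0. Under that one-write-per-event assumption, a single minute at CT peak leaves 30,000 events queued in front of Postgres.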

#The Broadway shape

defmodule PrismaticOsint.CtPipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {PrismaticOsint.CtProducer, []},
        concurrency: 1,
        rate_limiting: [allowed_messages: 1000, interval: 1_000]
      ],
      processors: [
        default: [concurrency: 20, min_demand: 5, max_demand: 20]
      ],
      batchers: [
        enrich: [concurrency: 4, batch_size: 50, batch_timeout: 500],
        index:  [concurrency: 2, batch_size: 100, batch_timeout: 1_000]
      ]
    )
  end

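  # route/1, enrich_batch/1 and index_batch/1 are app-specific helpers not shown here.
  # route/1 must tag each message with Broadway.Message.put_batcher/2 (:enrich or
  # :index), because no :default batcher is configured. Batch handlers return the messages.
  @impl true
  def handle_message(_processor, msg, _context), do: route(msg)

  @impl true
  def handle_batch(:enrich, msgs, _batch_info, _context), do: enrich_batch(msgs)
  def handle_batch(:index, msgs, _batch_info, _context), do: index_batch(msgs)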
end

Four knobs matter:

  • rate_limiting at the producer: hard ceiling on incoming messages.
  • max_demand on processors: how far the pipeline pulls ahead before waiting.
  • batch_size on batchers: amortize Postgres + graph writes.
  • batch_timeout: cap tail latency when volume drops.

Get those four right and the pipeline pulls work at a rate the slowest stage can handle. That is the whole point of backpressure: the slow stage sets the tempo.
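The knobs translate into numbers you can check on paper. This is a back-of-envelope sketch, not Broadway internals: KnobMath is a hypothetical module, and it assumes each processor holds at most max_demand messages, a steady arrival rate, all traffic going to one batcher, and a batch timer that starts with the first message of a batch.

```elixir
defmodule KnobMath do
  @moduledoc "Rough estimates for the processor and batcher settings shown above."

  @doc "Upper bound on messages the processors have pulled but not yet finished."
  def max_in_flight(concurrency, max_demand), do: concurrency * max_demand

  @doc """
  Approximate batch flush for a steady arrival rate in messages/sec.
  Returns {:full, ms_to_fill} when batch_size is reached first,
  or {:timeout, approx_messages} when batch_timeout fires first.
  """
  def flush(rate_per_s, batch_size, batch_timeout_ms) when rate_per_s > 0 do
    ms_to_fill = batch_size / rate_per_s * 1_000

    if ms_to_fill <= batch_timeout_ms do
      {:full, round(ms_to_fill)}
    else
      # The timeout wins: the batch leaves with whatever arrived in the window.
      {:timeout, floor(rate_per_s * batch_timeout_ms / 1_000)}
    end
  end
end
```

With the config above, KnobMath.max_in_flight(20, 20) gives 400 messages pulled ahead at most. At the 1k/sec peak, KnobMath.flush(1_000, 50, 500) returns {:full, 50}: the enrich batch fills in about 50 ms. At 50/sec off-peak, KnobMath.flush(50, 50, 500) returns {:timeout, 25}: the timeout fires first and the batch ships half full, which is exactly the tail-latency cap batch_timeout is there for.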

#Rate-limited adapters get their own pipeline

Shodan and ARES do not belong in the CT firehose pipeline. They get their own Broadway with its own producer-level rate limiting. Mixing them into a high-throughput pipeline means the high-throughput stages starve while the slow stages crawl.
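A separate pipeline per upstream mostly comes down to a different options list. The sketch below builds one from a per-minute ceiling. PrismaticOsint.RateLimitedOpts, the producer modules, and the processor settings are assumptions for illustration; only the option keys (rate_limiting, allowed_messages, interval, min_demand, max_demand) are Broadway's own. It also assumes one message equals one upstream request.

```elixir
defmodule PrismaticOsint.RateLimitedOpts do
  @moduledoc """
  Hypothetical helper: builds Broadway options for a pipeline that wraps a
  single rate-limited upstream. Names and concurrency values are assumptions.
  """

  @doc """
  `per_minute` is the upstream ceiling. Splitting the minute into `slices`
  equal windows avoids spending the whole allowance in one burst.
  Both the ceiling and 60_000 ms must divide evenly by `slices`.
  """
  def build(name, producer_module, per_minute, slices \\ 1)
      when is_integer(per_minute) and per_minute > 0 and
             is_integer(slices) and slices > 0 and
             rem(per_minute, slices) == 0 and rem(60_000, slices) == 0 do
    [
      name: name,
      producer: [
        module: {producer_module, []},
        concurrency: 1,
        rate_limiting: [
          allowed_messages: div(per_minute, slices),
          interval: div(60_000, slices)
        ]
      ],
      # A slow upstream needs few workers; a demand of 1 keeps the pull shallow.
      processors: [default: [concurrency: 2, min_demand: 0, max_demand: 1]]
    ]
  end
end

# Usage (module names are hypothetical):
#   opts = PrismaticOsint.RateLimitedOpts.build(
#     PrismaticOsint.ShodanPipeline, PrismaticOsint.ShodanProducer, 100, 20)
#   Broadway.start_link(PrismaticOsint.ShodanPipeline, opts)
```

For Shodan, 100 req/min split into 20 slices becomes 5 messages per 3,000 ms. For ARES, 30 req/min in 30 slices becomes 1 message per 2,000 ms. Assuming the limiter counts messages per fixed interval, two adjacent windows can still bunch at a boundary, so setting the configured rate a little under the real ceiling is the cautious choice.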

#Dead letters are evidence

Every message that fails after retries ends up in a dead-letter Postgres table, with the original payload, the failing stage, and the last error. Dead letters are not “things to fix someday.” They are the ground truth about which part of the real world your pipeline does not model. Review them weekly or they compound.
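A minimal sketch of what a dead-letter row and the weekly review could look like. The module, the field names, and the explicit stage argument are assumptions, not the actual Prismatic schema. The pattern match relies only on the :data and :status fields that Broadway.Message carries, so plain maps work too.

```elixir
defmodule PrismaticOsint.DeadLetter do
  @moduledoc """
  Hypothetical sketch: turns failed messages into dead-letter rows and
  summarizes them for review. Field names are assumptions.
  """

  @doc "Builds one row from a failed message (any map with :data and :status)."
  def to_row(%{data: payload, status: status}, stage) do
    %{
      payload: payload,
      stage: stage,
      last_error: describe(status),
      failed_at: DateTime.utc_now()
    }
  end

  @doc "Counts rows per {stage, error}, most frequent first."
  def summarize(rows) do
    rows
    |> Enum.frequencies_by(&{&1.stage, &1.last_error})
    |> Enum.sort_by(fn {_key, count} -> count end, :desc)
  end

  # Explicit failure, e.g. set via Broadway.Message.failed/2.
  defp describe({:failed, reason}), do: inspect(reason)
  # Raised, thrown or exited inside a callback; the stacktrace is dropped here.
  defp describe({kind, reason, _stack}), do: inspect({kind, reason})
end
```

In a Broadway module, the natural place to call to_row/2 is the handle_failed/2 callback, followed by a bulk insert into the dead-letter table. The summary is the weekly review in one line: the top entries show which stage and which error the pipeline keeps failing to model.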

#Where to go next

The slowest stage sets the tempo. Plan for it. The pipeline will thank you at 3am.
