Hackathon submission · Track 3 — Multimodal Geospatial Workloads

Mission Impact Brief

One-page operational value summary

Helion · Investigative Console
Multi-source intelligence fusion for video-evidence-heavy workflows

Submission for the Geospatial Video Intelligence Hackathon — Track 3 (Multimodal Geospatial Workloads).


The problem

Intelligence analysts and investigators spend hours — sometimes days — manually correlating video evidence with policy text, structured records, geospatial movement, and statements.

A typical officer-involved-shooting review requires watching 5–7 body-cam feeds frame by frame, transcribing audio by hand, plotting movement on a map, building a timeline, cross-referencing every act against department policy, and producing a written narrative for the case file. That is 4–6 hours of focused human attention per case, before any analytical question can be answered.

The same constraint hits IC analysts working with drone, satellite, and ground-team feeds: most of the data is in the video, but no system synthesizes the video alongside the structured intelligence, geospatial context, and unstructured documents that make it actionable.

"Most geospatial data exists as video — satellite feeds, aerial reconnaissance, surveillance streams — yet 90% of it remains manually analyzed or underutilized." — hackathon brief.


Helion's answer

A console that ingests any video evidence (body-cam, dashcam, drone, interview, traffic, witness phone) and fuses it with structured records, geospatial overlays, and document text — then answers the analyst's question with citations grounded in every modality.

Demonstrated end-to-end on a 7-camera Houston Police Department officer-involved-shooting reconstruction (9/10/2022):

  • Pegasus auto-extracted 63 timeline events + 33 transcribed utterances + 26 categorized key statements across all 7 feeds with zero human review
  • Marengo generated 91 multimodal embeddings for cross-feed semantic search
  • Mapbox road-snapped the 15-minute pursuit corridor across I-45 and Mt Houston Rd, with toggleable satellite and street basemaps
  • The platform graded the action against 9 clauses of HPD General Order 600-17 with clickable evidence citations — and the single "flagged for review" finding (verbal warning before deadly force) is independently confirmed when Pegasus re-grades it live from the footage
  • The Helion Agent answers cross-feed investigator questions ("Did anyone call out shots fired?") in 5–13 seconds with quoted transcript evidence from the correct feed, not just the obvious one
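The cross-feed routing in that last bullet can be sketched as a simple transcript-hit scorer. This is a minimal illustration, not Helion's actual agent code; the data shapes (feed id → list of utterances) and stopword list are assumptions:

```python
# Minimal sketch of transcript-hit routing: score each feed's transcript
# against the investigator's question and answer from the best-matching
# feed, not just the first or most obvious one.
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

def route_question(question: str, feeds: dict[str, list[str]]) -> tuple[str, str]:
    """Return (feed_id, best_utterance) whose transcript best matches the question."""
    q_tokens = set(tokenize(question)) - {"did", "anyone", "the", "a", "of"}
    best = ("", "", 0)
    for feed_id, utterances in feeds.items():
        for utt in utterances:
            overlap = len(q_tokens & set(tokenize(utt)))
            if overlap > best[2]:
                best = (feed_id, utt, overlap)
    return best[0], best[1]

feeds = {
    "bwc_officer_1": ["suspect heading north on foot", "cover me"],
    "bwc_officer_4": ["shots fired, shots fired", "get down"],
}
feed, quote = route_question("Did anyone call out shots fired?", feeds)
# feed == "bwc_officer_4"; the quoted evidence comes from that feed's transcript
```

In production the routing would run over Pegasus-extracted utterances rather than hand-written strings, but the shape of the decision is the same: the evidence quote is always tied back to the feed it came from.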

Quantified impact

| Task | Manual investigator workflow | Helion |
| --- | --- | --- |
| Multi-angle synchronized timeline | 4–6 hours | < 90 seconds (parallel Pegasus calls) |
| Per-feed transcript + key statements | 1–2 hours | ~30 seconds per feed |
| Geospatial pursuit corridor | 30 min (manual map plotting) | One Mapbox API call |
| Cross-feed Q&A on a witness statement | 15–30 min | 5–13 seconds |
| Full case-report draft | 2 hours | Instant (server-composed markdown from structured fusion) |

Net speed-up: roughly 100× on the time-sensitive analytical loop. What an investigator does in the half-day after an OIS, Helion produces in under three minutes — leaving human attention for what actually requires human judgment.


Why this matters for defense / IC

The same architecture extends directly:

  • Drone overwatch + ground-team body-cam. Pegasus already ingests both. Marengo embeds both. The agent already routes a question to the right feed via transcript-hit search. Swap the Houston BWCs for drone + dashcam + dismounted body-cams in a counter-narcotics operation; the platform doesn't change.
  • Satellite imagery + interview transcripts + intercepted text. Helion's "structured + unstructured + geospatial + video" fusion contract is exactly what an IC analyst needs to correlate a high-value-target movement track with a confiscated phone's chat history.
  • Policy compliance graded against doctrine. The HPD GO 600-17 review pattern is identical to ROE compliance review for an active engagement. Same plumbing, different policy document.

Every Helion feature shipped today maps cleanly to Track 3's three problem areas:

  • Video + structured data fusion — body-cam ↔ policy clauses ↔ officer roster
  • Video + document intelligence — body-cam ↔ HPD public notice (PDF) ↔ HPD GO 600-17 (paraphrased)
  • Multi-source intelligence applications — agent fuses all four modalities to produce grounded, cited answers

Specific use case — who uses this, for what, under what conditions

Primary user: Internal Affairs / Special Investigations sergeant assigned to an officer-involved-shooting review. Day-of-incident, the case file lands on her desk: 5–7 body-cam feeds, dashcam, suspect's recovered phone video, a stack of officer narrative reports, and the department's ~80-page use-of-force policy. She has 14 days to produce a use-of-force report for the chief.

The workflow Helion targets: the first 24 hours, where the IA officer reconstructs the incident, identifies what was said and shown, and produces the working timeline that everything else (charging decision, presser, civil-rights review) depends on.

Conditions: dark / loud / fast-moving body-cam audio, multiple speakers overlapping, officers yelling commands during stress events. The same audio characteristics that make this hard for humans (reviewing 7 muddy feeds is exhausting) are exactly where Pegasus's noise-tolerant transcription + key-statement extraction earn their keep.

Secondary users: detectives building a case file for prosecutorial submission, prosecutors evaluating discovery, defense attorneys preparing cross-examination, civil-rights reviewers, journalists FOIA-ing the case.

IC analog: the same architecture serves an analyst correlating a drone overwatch feed with ground-team body-cams during an active engagement, against ROE doctrine (substituted for HPD GO 600-17). No code change required — just different policy data and source documents.


Scaling assumptions

| Dimension | Demo scale | Department scale (assumption) | Cost driver |
| --- | --- | --- | --- |
| Cases per day | 1 (Houston demo) | 50 OIS-equivalent reviews/day across a 5,000-officer dept | Pegasus video-minutes |
| Feeds per case | 7 | 3–10 (depends on incident scope) | Linear in feed count |
| Avg feed duration | ~3 min | 5–15 min in production (full-incident BWC, not the released subclips) | Linear in video-min |
| Compute cost per case (auto-extract) | ~$10.50 | ~$25–$60 with longer feeds | Pegasus pricing |
| Investigator time saved per case | ~5 hours @ $80/hr ≈ $400 | $400+ per case | Replaces baseline manual review |
| Net savings per case | ~$390 | ~$340–$375 | |
| At 50 cases/day, annualized | | ~$6.2M/year of analyst time displaced, ~$0.5M/year of model spend | Net ~$5.7M/yr per dept |
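As a sanity check, the annualized row reconciles if the department operates on roughly 310 case-days per year; that figure is an assumption (the brief does not state it), as is the $32.50 midpoint of the per-case compute range:

```python
# Back-of-envelope check on the annualized savings row. The 310
# operating-days figure and $32.50 per-case model spend are assumptions
# chosen to reconcile with the table; they are not stated in the brief.
CASES_PER_DAY = 50
OPERATING_DAYS = 310
TIME_SAVED_PER_CASE = 400.0   # ~5 hours @ $80/hr
MODEL_SPEND_PER_CASE = 32.5   # midpoint of the ~$25-$60 Pegasus range

analyst_time = CASES_PER_DAY * OPERATING_DAYS * TIME_SAVED_PER_CASE
model_spend = CASES_PER_DAY * OPERATING_DAYS * MODEL_SPEND_PER_CASE
net = analyst_time - model_spend

print(f"analyst time displaced: ${analyst_time / 1e6:.1f}M/yr")  # $6.2M/yr
print(f"model spend:            ${model_spend / 1e6:.1f}M/yr")   # $0.5M/yr
print(f"net:                    ${net / 1e6:.1f}M/yr")           # $5.7M/yr
```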

Architecturally, scale is bounded by:

  • Bedrock concurrent-invoke quotas for Pegasus (default 10 concurrent on us-east-1; bump via support ticket).
  • Marengo async-job throughput (effectively unbounded for batch ingest; ~3–5 min per feed wall-clock per job).
  • S3 storage costs for ingested videos at $0.023/GB/mo — at department scale, the videos themselves dwarf the model spend.
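The Bedrock concurrency bound above can be respected with a worker-pool cap on in-flight Pegasus calls. A minimal sketch, where `invoke_pegasus` is a placeholder for the real bedrock-runtime invocation:

```python
# Sketch: cap in-flight Pegasus invocations at the Bedrock account quota
# (default 10 concurrent on us-east-1). invoke_pegasus is a stub standing
# in for the real boto3 bedrock-runtime call.
from concurrent.futures import ThreadPoolExecutor

BEDROCK_CONCURRENCY_QUOTA = 10

def invoke_pegasus(feed_uri: str) -> dict:
    # Placeholder: production code would call bedrock-runtime here and
    # parse the structured timeline/transcript response.
    return {"feed": feed_uri, "events": [], "utterances": []}

def ingest_case(feed_uris: list[str]) -> list[dict]:
    # max_workers caps concurrent invocations, so a large batch never
    # exceeds the account quota; extra feeds simply queue.
    with ThreadPoolExecutor(max_workers=BEDROCK_CONCURRENCY_QUOTA) as pool:
        return list(pool.map(invoke_pegasus, feed_uris))

results = ingest_case([f"s3://case-001/bwc_{i}.mp4" for i in range(7)])
# 7 feeds processed, at most 10 in flight at once
```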

Honest limitations

  • Pegasus 1.2 (current Bedrock GA), not 1.5. Our reasoning quality is gated by what Bedrock ships. Validation report §1 documents this honestly.
  • Pegasus mistranscribes proper nouns under stress audio (e.g. "Cessna" for "He's now"). Verbatim accuracy on quiet speech is high; the structured keyStatements extraction is robust to ASR noise — categorization (e.g. weapon_mention) is correct even when individual words are mistranscribed.
  • Live policy regrade is more conservative than human grading (§4 of validation report) — Pegasus defaults to "review" without prior context. We treat the live regrade as a sanity-check second opinion, not a replacement for human judgment.
  • Single-tenant case store today. All cases share one S3 JSON blob. Production deploy needs per-tenant partitioning + per-user active-case state.
  • No Marengo retrieval in the agent's hot path. Asset-level vectors are computed but text embeddings on Bedrock are async-only; the agent uses transcript-hit + keyword routing instead. Sync text-embed (when available) would be a strict upgrade.
  • Hand-calibrated incidentStartSec offsets for multi-angle synchronization. Pegasus timestamps drift 5–15 s on noisy BWC audio; trusting them directly produces a misaligned reconstruction. Production deploy needs GPS-metadata extraction from BWC files.
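The hand-calibrated offset alignment in the last bullet amounts to shifting each feed's event timestamps onto a shared incident clock before merging. A sketch with illustrative field names (the real schema may differ):

```python
# Sketch of multi-angle alignment: each feed carries a hand-calibrated
# incidentStartSec offset; events are mapped onto a shared incident clock
# and merged into one sorted timeline. Field names are illustrative.
def align_timeline(feeds: list[dict]) -> list[dict]:
    merged = []
    for feed in feeds:
        offset = feed["incidentStartSec"]
        for event in feed["events"]:
            merged.append({
                "feed": feed["id"],
                "incident_t": event["t"] - offset,  # seconds since incident start
                "label": event["label"],
            })
    return sorted(merged, key=lambda e: e["incident_t"])

feeds = [
    {"id": "bwc_1", "incidentStartSec": 12.0,
     "events": [{"t": 20.0, "label": "foot pursuit begins"}]},
    {"id": "bwc_4", "incidentStartSec": 3.0,
     "events": [{"t": 8.0, "label": "shots fired callout"}]},
]
timeline = align_timeline(feeds)
# bwc_4's event lands at t=5.0, bwc_1's at t=8.0 on the shared clock
```

Replacing the hand-calibrated offsets with GPS-metadata timestamps (as the production note suggests) changes only where `incidentStartSec` comes from, not the merge itself.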

Founder credibility

Built by Christian Johnson, founder of Metis Analytics, drawing on 10+ years of hands-on law enforcement experience — including personally spending days reviewing body-cam footage frame-by-frame searching for suspects. Helion exists because the founder lived the problem.


What ships

  • Live deploy: https://helion.metisos.co
  • Source: https://github.com/metisos/helion
  • Architecture docs: ROSETTA.md + .rosetta/modules/{agent,viewer,policy,bedrock,process-pipeline}.md
  • Validation metrics: docs/validation-report.md
  • Push-to-deploy via GitHub webhook; ~30-second turnaround on every commit