imaged Vision SDK

See it. Track it. Match it.

A vision SDK for the physical world: factory lines, delivery yards, perimeters, control rooms. Detection, tracking, and visual search running on hardware right next to your cameras. Millisecond latency. No cloud round-trips. No per-frame fees.

Request a trial
Browse all models

What it does

Three jobs, done well.

Built for operations, not for demos. Each capability runs at camera frame rate on modest hardware and composes with the others.

Object detection

Find every item of interest in every frame. Ships with general-purpose models. Drop in your own for products, packaging, vehicles, or anything specific to your line.

Typical use

A milk bottle on the line missing its cap. A label that did not print. A damaged pallet at the dock.

Object tracking

Keep a consistent ID on every moving object across frames and, with a little wiring, across cameras. Vehicles, forklifts, shipments, people in PPE.

Typical use

A forklift entering a no-go zone. A worker who has not left a restricted area in ten minutes. A truck moving across eight cameras in a yard.
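The ten-minute dwell alert above reduces to a small amount of bookkeeping once each object carries a stable track ID. The following is a minimal sketch, not SDK code: `DwellMonitor`, its fields, and the way zone membership is supplied are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DwellMonitor:
    """Tracks how long each track ID has stayed continuously inside one zone.

    Hypothetical helper for illustration; not part of the SDK.
    """
    limit_s: float
    entered_at: dict = field(default_factory=dict)  # track_id -> entry timestamp (s)

    def update(self, now_s: float, ids_in_zone: set) -> list:
        """Feed the IDs currently inside the zone; return the IDs over the limit."""
        # Forget tracks that left the zone, so a re-entry restarts the clock.
        for track_id in list(self.entered_at):
            if track_id not in ids_in_zone:
                del self.entered_at[track_id]
        # Record the first-seen time for new arrivals only.
        for track_id in ids_in_zone:
            self.entered_at.setdefault(track_id, now_s)
        return sorted(
            t for t, t0 in self.entered_at.items() if now_s - t0 >= self.limit_s
        )


monitor = DwellMonitor(limit_s=600.0)      # ten minutes
monitor.update(0.0, {7, 9})                # tracks 7 and 9 enter the zone
monitor.update(300.0, {7})                 # track 9 leaves
alerts = monitor.update(600.0, {7, 9})     # track 9 re-enters; only 7 has dwelt 600 s
print(alerts)
```

The sketch depends on track IDs staying consistent across frames: if the ID changed mid-visit, the clock would silently restart.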

Image similarity search

Compute an image embedding in a few milliseconds, index it in any vector store, and search your archive by example. Text queries work too, in the same embedding space.

Typical use

Does this defect look like one we have flagged before? Find every frame in the last shift that resembles this incident.
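Search by example comes down to ranking stored embeddings by their similarity to a query embedding. A minimal sketch in plain Python, assuming embeddings are lists of floats and using cosine similarity as the metric; the three-dimensional vectors and frame names are made up, and a real deployment would delegate the ranking to a vector store.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, archive, k=3):
    """Return the k archive entries most similar to the query, best first."""
    scored = [(cosine(query, vec), frame_id) for frame_id, vec in archive.items()]
    scored.sort(reverse=True)
    return [(frame_id, score) for score, frame_id in scored[:k]]


# Toy archive: frame ID -> embedding (illustrative values only).
archive = {
    "frame_0012": [0.9, 0.1, 0.0],
    "frame_0450": [0.1, 0.9, 0.2],
    "frame_0981": [0.8, 0.2, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], archive, k=2)
for frame_id, score in hits:
    print(frame_id, round(score, 3))
```

Because image and text embeddings share one space, the same ranking would apply to a query vector computed from text.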

Where it runs today

Factory floors, yards, and control rooms.

The same SDK powers quality, logistics, safety, and security workloads. A few representative scenarios, drawn from real deployments.

Production & packaging

Every bottle, can, and carton checked before it leaves the line.

Detection catches the missing cap, the misaligned label, the empty package, the blank expiry print, at 60 fps per camera on a small industrial PC. Each catch raises an alert on the PLC or a reject signal on the actuator, with the frame stored for review.

→ Missing closures, caps, seals
→ Label presence, orientation, barcode read
→ Expiry date OCR and print-quality check
→ Foreign object or contamination detection

Fleet, gate, and perimeter

License plate, vehicle, and shipment tracking, without a cloud dependency.

Read plates at the gate. Follow a truck across every camera in a logistics yard with one persistent ID. Flag a container that did not leave on its scheduled slot. Works when the internet does not.

→ License plate recognition (LPR) at gates and checkpoints
→ Multi-camera vehicle tracking in yards and depots
→ Container and shipment identification
→ Dwell-time alerts on loading bays

Worker safety

Fires, falls, and restricted zones, watched without a human in the loop.

Tracking plus detection turns a wall of CCTV screens into a system that only speaks up when it matters. A fall, a flame, a person where they should not be. Alerts land in your control room in under a second.

→ Fall detection on shop floors and warehouses
→ Fire and smoke detection before smoke reaches a sensor
→ Restricted-zone entry and PPE compliance checks
→ Worker headcount per area
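A restricted-zone check can be sketched as a point-in-polygon test on the bottom centre of each person's bounding box. The ray-casting test below is a standard technique; the box format `(x1, y1, x2, y2)` in pixels, the zone coordinates, and the function names are assumptions for illustration, not the SDK's API.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def foot_point(box):
    """Bottom centre of an (x1, y1, x2, y2) box: roughly where a person stands."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, y2)


# Hypothetical restricted zone in image coordinates (pixels).
zone = [(100, 100), (400, 100), (400, 300), (100, 300)]

person_inside = (200, 150, 260, 280)
person_outside = (420, 150, 480, 280)
print(point_in_polygon(*foot_point(person_inside), zone))
print(point_in_polygon(*foot_point(person_outside), zone))
```

Testing the foot point rather than the box centre avoids false alerts when a tall box overlaps the zone but the person stands outside it.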

Incident review

Search your camera archive by example.

Similarity search turns every stored frame into a vector you can query. A QA engineer uploads a defect photo and gets every matching frame across three weeks of video. A supervisor searches for vehicles that looked like the one in yesterday’s incident.

→ "Show me everything that looks like this" over a shift or a week
→ Text queries against an archive (e.g. "forklift tipped over")
→ Index sits in your own vector store, not ours

Your model, our runtime

Drop in any ONNX model. No C++ required.

Your defect classifier, your domain-trained detector, your internal quality model: the SDK runs it like any built-in capability. Describe inputs and outputs in a short JSON block, call one function, get a typed result.

Four shapes are supported out of the box: image-to-image (segmentation, enhancement), image-to-class (classifier), image-to-embedding (search, similarity), image-to-detection (your own YOLO-style model).

Generic runner/options.json

Register a custom cap-check detector:

{
  "models": {
    "cap_check": {
      "model": "cap_detector.onnx",
      "type":  "image_to_detection",
      "input": {
        "resize":    640,
        "normalize": "0.5"
      },
      "output": {
        "labels": "bottle, cap_ok, cap_missing"
      }
    }
  }
}

Call: runGenericModel("cap_check", frame)

Need a model? Lab →
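What happens after the call depends on your line logic. The sketch below shows one way detections from the `cap_check` model could drive a reject decision. It does not call the SDK: the `Detection` shape, the `should_reject` helper, and the score threshold are hypothetical. Only the label string comes from the options block above.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical result shape; the SDK's typed result may differ."""
    class_id: int
    score: float
    box: tuple  # (x1, y1, x2, y2) in pixels


# Labels as written in options.json: one comma-separated string, in class-ID order.
LABELS = [name.strip() for name in "bottle, cap_ok, cap_missing".split(",")]


def should_reject(detections, min_score=0.5):
    """Reject when a bottle is in view and its cap is missing or not confirmed."""
    seen = {LABELS[d.class_id] for d in detections if d.score >= min_score}
    if "bottle" not in seen:
        return False  # nothing on the line to judge
    return "cap_missing" in seen or "cap_ok" not in seen


good = [Detection(0, 0.93, (10, 20, 110, 320)), Detection(1, 0.88, (40, 20, 80, 50))]
bad = [Detection(0, 0.91, (10, 20, 110, 320)), Detection(2, 0.79, (40, 20, 80, 50))]
print(should_reject(good), should_reject(bad))
```

Treating a bottle with no confident `cap_ok` as a reject is a deliberately conservative choice here; a line with high throughput might prefer to flag such frames for review instead.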

Models we run

The catalog we ship and maintain.

Point the SDK at a supported model and it runs. Use the open weights we ship, swap in a fine-tuned version, or bring your own ONNX export in the same family.

Detection, segmentation, pose, OBB

YOLO family

Detection, segmentation, pose, classification, and oriented bounding boxes in one model family.

→ v8, v11, v26

Visual search

CLIP, SigLIP

Image and text embeddings in a shared space. Similarity search over an archive, plus text queries.

→ CLIP, open-weight checkpoints
→ SigLIP, stronger on some workloads

OCR

PaddleOCR

Full stack: text detection, recognition, angle classifier. Expiry dates, lot codes, container IDs, labels.

→ Detection + recognition + angle

Tracking

ByteTrack, Deep SORT

Multi-object tracking. Composes with any detector in the catalog.

→ ByteTrack
→ Deep SORT
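Trackers compose with any detector because they only need boxes per frame. As a rough illustration of the association step that IoU-based trackers build on, here is a greedy matcher in plain Python. It is a simplified sketch, not ByteTrack or Deep SORT themselves (which add motion prediction, score handling, and optimal assignment), and all names in it are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily pair existing tracks with new detections, best overlap first.

    tracks: {track_id: last_box}; detections: list of boxes.
    Returns {track_id: detection_index} for the pairs that overlap enough.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    assigned, used = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if tid in assigned or j in used:
            continue
        assigned[tid] = j
        used.add(j)
    return assigned


tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches = associate(tracks, detections)
print(matches)
```

Detections left unmatched, like the third box here, would normally start a new track.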

Re-identification

Person re-ID

Embeddings for matching the same subject across cameras. Yards, depots, multi-camera perimeters.

→ OSNet, trained on MSMT17

Face (open-weight)

SCRFD, ArcFace, InsightFace

Open-weight face detection and embedding models the SDK loads and runs.

→ SCRFD (detection)
→ ArcFace, InsightFace buffalo packs (embeddings)

Need a vendor-supported full-stack FR system with imaged-trained weights, antispoof, and an SLA? See "By invitation" below.

License plate recognition

Plate detect + PaddleOCR

Open-weight plate detectors paired with PaddleOCR for plate text. Works out of the box at gates, checkpoints, and yards.

→ Open-weight plate detectors
→ PaddleOCR for plate text

Depth

Depth Anything, DPT

Monocular depth estimation for AR, parallax, and 3D measurement.

→ Depth Anything V2
→ Depth Anything V3
→ DPT

Segmentation specialists

SegFormer, DeepLabV3+

Semantic segmentation families commonly fine-tuned in-house by industrial customers.

→ SegFormer
→ DeepLabV3+

Promptable segmentation

FastSAM, MobileSAM, SAM ViT-B

Promptable segmentation for operator-assisted QA and annotation bootstrapping.

→ FastSAM
→ MobileSAM
→ SAM ViT-B

Classification backbones

MobileNet, EfficientNet, ResNet, ConvNeXt

Standard classifier backbones for BYO-classifier workflows.

→ MobileNetV3
→ EfficientNet
→ ResNet
→ ConvNeXt-Tiny

Image restoration

Real-ESRGAN

4x super-resolution for low-res archives and incident frames.

→ Real-ESRGAN, 4x upscaler

Driving perception

YOLOP

Vehicles, drivable surface, and lane lines from one pass. Fits fleet, depot, and yard cameras.

→ Vehicle detection
→ Drivable area
→ Lane lines

Document layout

PP-Structure

Find titles, paragraphs, tables, and figures in scanned pages. Pairs with PaddleOCR for full document parsing.

→ PP-DocLayout V3

By invitation

Beyond the catalog.

A small number of imaged capabilities are licensed only to vetted customers, under separate NDAs and separate terms. Those capabilities are not described publicly, and the list is not long.

If your project requires something specific and sensitive that is not listed on this page, say so in a call. We will tell you directly what we can and cannot do, and on what terms.

Start the conversation

Why it runs next to the camera

Your line does not wait for the cloud.

A conveyor runs at one meter per second. A forklift does not pause because a data center is slow. A fire does not wait for a round-trip to a remote API. Inference that matters at the edge has to happen at the edge.

imaged runs on the industrial PC, the Jetson, the factory gateway, or the server rack you already own. A reject signal lands in under a frame. When the internet drops, your line keeps running. When you add a camera, you do not add a bill.
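The latency argument can be made concrete with a little arithmetic: how far does a product travel while a decision is pending? The sketch below uses the one meter per second belt speed and the 60 fps frame rate quoted on this page; the 150 ms cloud round trip is an assumed figure for illustration, not a measurement.

```python
def travel_mm(speed_m_per_s, latency_ms):
    """Distance travelled during a delay. (m/s) * ms works out to millimetres."""
    return speed_m_per_s * latency_ms


BELT_SPEED_M_PER_S = 1.0          # conveyor speed quoted on this page
FRAME_INTERVAL_MS = 1000 / 60     # one frame at 60 fps
ASSUMED_CLOUD_RTT_MS = 150        # assumption: illustrative cloud round trip

per_frame = travel_mm(BELT_SPEED_M_PER_S, FRAME_INTERVAL_MS)
per_round_trip = travel_mm(BELT_SPEED_M_PER_S, ASSUMED_CLOUD_RTT_MS)

print(f"Travel per frame interval: {per_frame:.1f} mm")
print(f"Travel per assumed cloud round trip: {per_round_trip:.0f} mm")
```

Under these assumptions, a decision made within one frame leaves the item under 2 cm from where it was seen, while the assumed round trip lets it move 15 cm, which may already be past the reject actuator.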

Where it runs

On the hardware you already own.

One SDK, five platforms, five language bindings. Core ML on Apple hardware, CUDA on Linux boxes with GPUs, CPU on everything else.

Linux

x86_64, ARM64

Windows

Industrial PCs

macOS

Apple Silicon, Intel

iOS

Handhelds, kiosks

Android

Rugged devices

Your stack

C, C++, Python, Swift, ObjC

Commercial license

Predictable terms. No meter running.

One fee per year, per deployment, per platform. No per-frame, per-camera, or per-user charge. Every capability is on, every version upgrade is included.

Trials are free for 30 days with a generated token. Paid tokens are issued after a short call. If your setup is unusual (many cameras on one box, or one token across a fleet), tell us on the call and we will price it up front.

Request a trial
Need a custom model?

Ready?

See it, track it, match it. On your floor, in weeks.

Request a trial
See Hugind next