Service · Physical AI
The intelligence layer for machines — we build judgment, not robots.
We build the software intelligence layer of physical AI: LLM planning over your machine fleets, telemetry pipelines, predictive maintenance, and the evaluation gates that keep autonomous behavior inside limits you set. Your hardware and integrators stay.
The physical AI stack — our boundary, in writing
What is physical AI
In software, a bad output is a ticket. In the physical world, it has a blast radius.
Physical AI is AI whose output is an action, not an answer — software that senses, decides, and executes through machines. Because its mistakes land in the physical world, it's as much an evaluation and safety problem as a modeling one.
The machines are already deployed, and new models are being pointed at them. What hasn't caught up is the discipline. Carry the 80% failure rate of digital AI projects onto a production line and the economics invert: the evaluation gate becomes the difference between a deployment and an incident report. That layer is our entire scope.
Industrial robots in operation worldwide — already deployed, now being pointed at new models.
IFR World Robotics · verified Jun 2026
AI project failure rate in the purely digital world — where a mistake is only a ticket.
RAND Corporation · verified Jun 2026
What we build
The software intelligence layer, end to end.
Each offering carries its stack-layer tag — so you see exactly where our work stops and your hardware vendors' begins.
LLM & agent planning
The judgment layer over your machines — instruction parsing, task decomposition, and human approval gates on every consequential action. Agentic AI with its hands bound until evals say otherwise.
Perception-data & telemetry pipelines
Sensor streams, fleet telemetry, and machine logs made model-ready — the unglamorous plumbing every planning layer stands on.
Predictive maintenance
Failure prediction built from the telemetry your equipment already emits. We work from your data as it is, not from a sensor retrofit.
Evaluation harnesses & safety gates
Evals before actuation: golden scenario sets, regression suites, and staged-autonomy gates — the discipline that held a 200+ location chain at zero critical defects across four years.
Digital-twin software & simulation plumbing
State models, twin dashboards, and the data interfaces that feed simulation — we build the mirror and its plumbing, not the physics engine.
Governance for systems that act
The Responsible AI layer where blast radius makes it non-optional — acceptable-action policy, named owners, audit trails, institutionalized through an AI Center of Excellence when you're ready.
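To make "evals before actuation" concrete, here is a minimal sketch of a golden-scenario gate. Everything in it is an illustrative assumption rather than our production harness: the `GoldenScenario` type, the `run_gate` function, the toy planner, and the bearing-temperature thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenScenario:
    """One reviewed case: an input snapshot plus the actions judged acceptable."""
    name: str
    telemetry: dict
    allowed_actions: frozenset


def run_gate(planner: Callable[[dict], str],
             scenarios: list) -> list:
    """Return the names of scenarios where the planner proposed a disallowed action."""
    failures = []
    for scenario in scenarios:
        proposed = planner(scenario.telemetry)
        if proposed not in scenario.allowed_actions:
            failures.append(scenario.name)
    return failures


def toy_planner(telemetry: dict) -> str:
    # Hypothetical stand-in for a planning layer; thresholds are made up.
    if telemetry["bearing_temp_c"] > 90:
        return "stop_line"
    if telemetry["bearing_temp_c"] > 75:
        return "schedule_inspection"
    return "continue"


SCENARIOS = [
    GoldenScenario("overheat", {"bearing_temp_c": 97},
                   frozenset({"stop_line"})),
    GoldenScenario("warm", {"bearing_temp_c": 80},
                   frozenset({"schedule_inspection", "stop_line"})),
    GoldenScenario("nominal", {"bearing_temp_c": 60},
                   frozenset({"continue"})),
]

failures = run_gate(toy_planner, SCENARIOS)
gate_open = not failures  # the gate opens only when every golden scenario passes
```

Run as a regression suite, the same scenario set is replayed on every model or prompt change; any non-empty `failures` list keeps the change away from actuation.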
The autonomy gate
How an LLM is allowed to touch a machine.
Autonomy is climbed, never granted — each promotion earned with evaluation evidence and a named human sign-off.
RUNG 01
Shadow
Runs against live telemetry and logs what it would have done, compared daily against your operators. Disagreements become eval cases, not incidents.
Output: the model watches · actuates nothing
RUNG 02
Suggest
The system proposes; your people dispose. Acceptance rate, overrides, and near-miss flags become the evidence file that justifies (or blocks) the next rung.
Output: recommendations to a human operator
RUNG 03
Approve
The model plans multi-step work; every consequential action waits on a human-in-the-loop approval. Latency drops, control doesn't.
Output: the model plans · a human approves each action
RUNG 04
Act, with audit
Autonomous action inside hard limits — bounded scopes, rate caps, kill-switches, an audit trail with a named owner. The rung most vendors start at; the rung we finish at.
Output: bounded autonomy · kill-switch · named owner
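The hard limits on Rung 04 can be pictured as a guard that sits between the planner and the machine. The sketch below is one possible shape under stated assumptions: the `ActionGuard` class, its field names, and the verdict strings are hypothetical, and a real deployment would enforce the kill-switch in hardware-adjacent systems as well as in software.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActionGuard:
    owner: str                    # named human accountable for this scope
    allowed_actions: frozenset    # bounded scope: the only actions permitted
    max_actions: int              # rate cap: allowed actions per window
    window_s: float = 60.0
    killed: bool = False          # kill-switch state
    audit: list = field(default_factory=list)
    _recent: deque = field(default_factory=deque)

    def kill(self) -> None:
        """Trip the kill-switch; every later request is denied."""
        self.killed = True

    def request(self, action: str, now: Optional[float] = None) -> bool:
        """Decide whether an action may reach the machine, and log the decision."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the rate-cap window.
        while self._recent and now - self._recent[0] >= self.window_s:
            self._recent.popleft()
        if self.killed:
            verdict = "denied:kill_switch"
        elif action not in self.allowed_actions:
            verdict = "denied:out_of_scope"
        elif len(self._recent) >= self.max_actions:
            verdict = "denied:rate_cap"
        else:
            verdict = "allowed"
            self._recent.append(now)
        # Allowed and denied requests are both written to the audit trail.
        self.audit.append({"t": now, "action": action,
                           "verdict": verdict, "owner": self.owner})
        return verdict == "allowed"


guard = ActionGuard(owner="ops-lead",
                    allowed_actions=frozenset({"slow_conveyor"}),
                    max_actions=2)
results = [
    guard.request("slow_conveyor", now=0.0),   # allowed
    guard.request("slow_conveyor", now=1.0),   # allowed
    guard.request("slow_conveyor", now=2.0),   # denied: rate cap reached
    guard.request("open_valve", now=3.0),      # denied: outside the scope
]
```

Denials are logged alongside approvals on purpose: the audit trail is only useful as evidence if it also shows what the system tried and was refused.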
Our controls map to SOC 2 Type II, ISO/IEC 27001:2022, and ISO/IEC 42001:2023, with NIST's AI RMF and the EU AI Act as targets where they apply. Functional-safety standards for the machines — IEC 61508, ISO 10218 — stay with your safety engineers; our layer produces the evaluation evidence their certification consumes.
Why a software lab
Why a software lab belongs in the machine world.
Responsible AI is the founding charter. Governance for blast-radius systems is the reason the lab exists. When the output is an action, Responsible AI stops being philosophy and becomes scope.
We claim only the layer we own. A boundary drawn in writing. A vendor who claims the robot, the AI, and the cloud is overclaiming at least one.
Aegis AI delivery discipline. Evals before actuation, staged rollouts, production monitoring: the Aegis AI process, applied where a regression costs downtime, not tickets.
The people stay in the loop. Operators and engineers get trained and kept in the decision path — our Human-Led AI practice, not an afterthought.
Track record
Our physical-AI record, stated precisely.
We haven't shipped a robot, and this page won't pretend otherwise. What we have is the record that transfers — production software the physical-equipment economy trusted with real money.
A Stanford-rooted Responsible AI lab, founded 2011, run by founder Kelvin Tran. Bring the machines; we'll bring the judgment layer. When a use case shouldn't be autonomous, we'll tell you — which a vendor paid to ship robots won't.
Heavy equipment marketplace · acquired 2017
YardClub — the software layer under heavy machinery. A contractor-to-contractor marketplace for excavators and loaders. We built the listings, payments, and transaction infrastructure end to end; it processed $120M+ before Caterpillar acquired it. Exactly the layer every physical AI deployment needs underneath it.
Software running physical operations
BJ's Restaurants — 200+ locations. We kept a software-critical multi-site operation shipping twice a week for four years with zero critical defects — the eval-and-staged-rollout discipline the autonomy gate is built on.
How it runs
How a physical AI engagement works.
Five stages, the boundary contract first, because two vendors each assuming the other owns the gap is how this domain fails.
STAGE 01
Scope the layer
What hardware exists, what software we own, where the interfaces sit — written down before anything is built, including the use cases we'd advise against.
Output: boundary contract + ranked use cases
STAGE 02
Ground in telemetry
An audit of what your machines already emit versus what the planning layer needs — most fleets log more ground truth than anyone has structured.
Output: data baseline + gap map
STAGE 03
Build behind the gate
The intelligence layer and its evaluation harness ship as one deliverable — golden scenarios, regression suites, gate logic. Nothing reaches actuation ungated.
Output: planning layer + eval harness, together
STAGE 04
Stage the autonomy
The system climbs the autonomy gate one rung at a time, each promotion backed by eval evidence and signed by the operations owner who lives with the consequences.
Output: rung-by-rung promotion, signed
STAGE 05
Operate & hand over
Drift detection, incident playbooks, and operator training — your team learns to read the evals and run the gates. Stay on retainer or take the keys.
Output: monitoring + a team that runs it
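The promotion rule in Stage 04 can be sketched in a few lines. This is an illustration under assumptions, not a specification: the `Evidence` fields, the `MIN_PASS_RATE` threshold, and the `promote` function are hypothetical, and the actual criteria would be written into the boundary contract for each engagement.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Rung(IntEnum):
    SHADOW = 1
    SUGGEST = 2
    APPROVE = 3
    ACT_WITH_AUDIT = 4


@dataclass(frozen=True)
class Evidence:
    eval_pass_rate: float       # share of golden scenarios passed, 0.0 to 1.0
    open_near_misses: int       # near-miss flags not yet resolved
    signed_by: Optional[str]    # operations owner, or None if unsigned


# Hypothetical threshold: every golden scenario must pass before promotion.
MIN_PASS_RATE = 1.0


def promote(current: Rung, evidence: Evidence) -> Rung:
    """Move up exactly one rung when the evidence and sign-off are both present."""
    if current is Rung.ACT_WITH_AUDIT:
        return current  # already at the top rung
    ready = (
        evidence.eval_pass_rate >= MIN_PASS_RATE
        and evidence.open_near_misses == 0
        and bool(evidence.signed_by)
    )
    return Rung(current + 1) if ready else current


signed = Evidence(eval_pass_rate=1.0, open_near_misses=0, signed_by="operations owner")
unsigned = Evidence(eval_pass_rate=1.0, open_near_misses=0, signed_by=None)

after_signed = promote(Rung.SHADOW, signed)
after_unsigned = promote(Rung.SHADOW, unsigned)
```

The function never skips a rung and never promotes on evidence alone; a perfect eval score without a named signature leaves the system where it is.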
Where it pays first
Five places the intelligence layer pays first.
Industrial manufacturing & process plants
Agent planning over production data and predictive maintenance from the telemetry your lines already emit — every recommendation gated.
Logistics, warehousing & fulfillment
The physical backbone of ecommerce: AMR fleet coordination, exception triage, and slotting intelligence over robots your integrator installed.
Energy & field operations
Inspection-data intelligence, field-service copilots, and maintenance prioritization where a missed signal is measured in outages.
Construction & heavy equipment
Fleet telemetry, utilization intelligence, and equipment-marketplace systems — where our field record above was earned.
Food-service & multi-site operations
Software running physical operations at scale is home turf — we kept BJ's Restaurants (200+ locations) shipping twice a week for four years with zero critical defects.
Questions buyers ask before they build.
Thirty minutes · no pitch deck
Have machines that need better judgment?
Bring your fleet, your telemetry, and the workflow you wish ran itself — we'll tell you honestly which layer is ours, which belongs to your hardware vendors, and what the first gated deployment looks like.