BUILDING BLOCK

On-prem LLM

Models that run on your hardware.

The agent's brain runs on a machine you own, in a place you control. Customer data does not leave the building to be processed.

01 / DEFINITION

The model is on the same machine as the data.

An on-prem language model runs on hardware at the customer's location and processes data that never leaves the local network. The model file lives on disk. The inference runs on local compute. The data flows in and out of the model without traversing the public internet.

CORTX deployments use on-prem models for the agent's reasoning by default. Cloud-hosted models can be used for specific tasks where the latency or capability of a frontier model justifies it — but the architectural posture is local-first.

02 / DEPLOYMENT

Where the work happens.

[Figure: On-prem deployment posture. A customer-site box contains the Mac Mini (the vault), a local model file, and the Flow application surface. A dimmed, optional cloud-assist line points off-site.]

The data does not leave the box.

03 / RUNTIME

What happens during inference.

When the agent needs to reason about a task, it constructs the request — including the relevant MCP context — and sends it to the local model. The model runs inference on local compute. The response returns to the agent. The agent acts.
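The loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CORTX's implementation: `LocalModel`, `Request`, `EchoModel`, and `reason_and_act` are hypothetical names, and the stand-in model simply echoes the task so the example runs without any model file.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class LocalModel(Protocol):
    """Hypothetical interface for a model served on the same machine."""

    def generate(self, prompt: str) -> str: ...


@dataclass
class Request:
    """What the agent sends to the model: the task plus relevant MCP context."""

    task: str
    context: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        context_block = "\n".join(f"- {item}" for item in self.context)
        return f"Context:\n{context_block}\n\nTask: {self.task}"


class EchoModel:
    """Stand-in for a real local model. Runs in-process, with no network I/O."""

    def generate(self, prompt: str) -> str:
        # Echo the last line of the prompt (the task) as a fake plan.
        return f"PLAN for: {prompt.splitlines()[-1]}"


def reason_and_act(
    task: str,
    context: list[str],
    model: LocalModel,
    act: Callable[[str], None],
) -> str:
    """Construct the request, run local inference, then act on the response."""
    request = Request(task=task, context=context)
    response = model.generate(request.to_prompt())
    act(response)
    return response
```

In this sketch the model is passed in as a parameter, so the same loop works with any local runtime that exposes a `generate`-style call. Nothing in the function opens a network connection.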

On this local path, the customer's data (patient records, financial data, supplier information, partner details) is never transmitted to a remote service. It is read from the local disk, passed to the local model, and the result is written back to the local disk.

When a specific task benefits from a frontier model — typically a one-off generation task with no PII, like drafting a piece of marketing copy — the agent can route that specific call to a cloud model. These cases are rare, configurable, and logged.
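One way to picture "rare, configurable, and logged" is a routing function that defaults to the local model and allows the cloud only when every condition holds. The sketch below is hypothetical: `RoutingConfig`, `choose_route`, the task-type names, and the two PII patterns are assumptions made for illustration, and a real PII check would need to be far more thorough.

```python
import logging
import re
from dataclasses import dataclass

logger = logging.getLogger("routing")

# Illustrative patterns only; real PII detection needs much broader coverage.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-like number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email address
]


@dataclass(frozen=True)
class RoutingConfig:
    """Hypothetical per-deployment settings. Cloud routing is off by default."""

    cloud_enabled: bool = False
    cloud_task_types: frozenset = frozenset({"marketing_copy"})


def contains_pii(text: str) -> bool:
    return any(pattern.search(text) for pattern in PII_PATTERNS)


def choose_route(task_type: str, prompt: str, config: RoutingConfig) -> str:
    """Return "cloud" only when the task is allowed and the prompt has no PII."""
    if not config.cloud_enabled or task_type not in config.cloud_task_types:
        return "local"
    if contains_pii(prompt):
        logger.warning("PII detected, keeping task local: task_type=%s", task_type)
        return "local"
    # Log every cloud call so the exception stays auditable.
    logger.info("routing to cloud: task_type=%s", task_type)
    return "cloud"
```

With cloud routing enabled, `choose_route("marketing_copy", "Draft a spring promo tagline", config)` returns `"cloud"`, while any prompt containing an email address falls back to `"local"`. Every failed check resolves to local, which keeps the local-first posture as the default.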

04 / ARCHITECTURAL DECISION

Data sovereignty as default.

Cloud AI is convenient. It is also a category of dependency that small businesses do not always understand they're entering. The data leaves the building. The vendor may log requests. The model can change without notice. The pricing can change without notice. The terms of service can change without notice.

On-prem inverts each of those defaults. The data does not leave. The vendor does not log. The model file is fixed unless the customer chooses to update it. The cost is predictable. The dependency is bounded.

The trade-off is real. On-prem models are smaller than the largest cloud models. For most operational reasoning — workflow execution, validation, exception handling — they are sufficient. For frontier capability, the agent can route specific tasks to cloud models, with the customer's awareness and consent.

05 / HARDWARE

What it actually takes.

A current-generation Mac Mini, configured once, sealed, placed in the customer's office. That is the hardware specification for a typical small-business deployment.

The Mini runs the agent, the local model, the workflow engine, and the encrypted database. It is connected to the customer's network. Remote access for authorized staff goes through Tailscale. There is no public endpoint.
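A "no public endpoint" posture can be checked mechanically. The sketch below assumes a hypothetical helper, `is_non_public_bind`, that a setup script might run against each service's listen address. It accepts loopback, private LAN addresses, and the 100.64.0.0/10 range that Tailscale assigns addresses from, and it rejects wildcard binds because they listen on every interface.

```python
import ipaddress

# Shared address space (CGNAT); Tailscale assigns node addresses from this range.
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_non_public_bind(address: str) -> bool:
    """Return True if a service bound to `address` is not publicly reachable.

    Hypothetical helper for illustration. It inspects only the address itself,
    not firewall rules or router port forwarding.
    """
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:
        # 0.0.0.0 or :: would listen on every interface, so treat it as unsafe.
        return False
    in_cgnat = ip.version == 4 and ip in CGNAT_RANGE
    return ip.is_loopback or ip.is_private or in_cgnat
```

For example, `is_non_public_bind("127.0.0.1")` and `is_non_public_bind("100.101.102.103")` both pass, while `is_non_public_bind("0.0.0.0")` fails.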

The customer owns the machine. When the deployment ends, the machine stays.

06 / POSITION

The substrate.

The on-prem LLM is the substrate everything else runs on. The agent runs on it. MCP context is passed to it. Tool calls start from its output. It is the local computational foundation of the entire deployment.