cuQuantum Backend’s Data Layers Guide

What the Backend Provides

The backend introduces two new data layers — the internal representation a Qobj uses to store its numerical data:

  • CuOperator: for (square) operators. Stores the operator symbolically as a sum of tensor product terms; the full matrix is never formed. The data is used to form cuQuantum’s Operator object lazily when needed.

  • CuState: for state vectors and density matrices. Stores data via cuQuantum’s DensePureState / DenseMixedState.

A Qobj backed by CuOperator or CuState largely follows the same QuTiP API as any other Qobj. In practice, however, not all operations are supported for every combination of data layers, and combining Qobjs of different layers may trigger automatic conversions whose cost and correctness depend on what layers are involved. This guide covers those nuances.

Setup: WorkStream and Activation

The WorkStream context

All CuStates construction and GPU computation require a cuQuantum WorkStream context:

from cuquantum.densitymat import WorkStream
ctx = WorkStream()

CuOperator is CPU-side and symbolic, so constructing one does not require the context. CuState, however, reads settings.cuDensity["ctx"] immediately in its constructor to allocate GPU memory. GPU computation (applying operators to states, running solvers) also requires the ctx. This ctx is registered when the backend is activated.

Activating the backend

Both set_as_default(ctx) and the CuQuantumBackend(ctx) context manager register the ctx and change the default data type used when constructing new Qobjs without an explicit dtype. Both also switch the default solver integrator to CuVern7.

import qutip_cuquantum

# Method 1: permanent change
qutip_cuquantum.set_as_default(ctx)

# Method 2: scoped to a block; reverts on exit
with qutip_cuquantum.CuQuantumBackend(ctx):
    ...

What activation does NOT do: it does not convert any existing Qobj. Qobjs do not change their data layer upon entering or leaving the context — only Qobjs constructed after activation, without an explicit dtype, are affected.

Getting Operators and States into the Right Data Layer

Explicit construction

The dtype argument sets the data layer at construction, regardless of whether the backend is active:

sz  = qutip.sigmaz(dtype='CuOperator')
n   = qutip.num(20, dtype='CuOperator')
psi = qutip.basis(2, 0, dtype='CuState')   # requires ctx to be registered

When no dtype is given, each factory function resolves the type from the global default using its own sparcity hint — a hint that describes the structural character of the operator it produces:

  • "sparse" hint (e.g., sigmax, sigmay, sigmaz) → CSR normally, CuOperator with plugin active

  • "diagonal" hint (e.g., num, create, destroy, qeye) → Dia normally, CuOperator with plugin active

  • "dense" hint → Dense always, regardless of whether the plugin is active

So activating the plugin changes the default for sparse and diagonal operators, but dense operators always stay Dense.

Explicit conversion

Any Qobj can be converted to a different data layer at any time with .to():

sz_cu  = qutip.sigmaz().to('CuOperator')
n_cu   = qutip.num(20).to('CuOperator')
psi_cu = qutip.basis(2, 0).to('CuState')   # requires ctx to be registered

Implicit conversion during operations

When two Qobjs with different data layers are combined (+, @, *, qutip.tensor, etc.), QuTiP automatically converts one operand toward the other using the lowest-cost path:

  • Dense/CSR/DiaCuOperator: cost 1

  • CuOperatorDense: cost 10

This only applies when at least one operand is already CuOperator. Implicit conversion cannot produce CuOperator from scratch — if neither operand is CuOperator, the result will not be CuOperator either.

sx   = qutip.sigmax()                    # CSR
n_cu = qutip.num(20).to('CuOperator')   # explicit CuOperator

H = sx & n_cu   # sx is implicitly converted to CuOperator; result is CuOperator

Pitfall: CuOperator only supports square matrices. Implicit conversion of a rectangular object (e.g., a ket) fails with ValueError("Rectangular CuOperator are not supported").

Building composite systems

Use qutip.tensor() or the & shorthand (they are equivalent) to build operators on composite Hilbert spaces. For the result to be CuOperator, at least one operand must already be CuOperator.

With plugin active — standard quantum operators (Pauli matrices, number/creation/annihilation operators, identity) use sparse or diagonal sparcity hints and therefore become CuOperator by default, so composite Hamiltonians are naturally CuOperator without any explicit conversion:

qutip_cuquantum.set_as_default(ctx)

H_xy = qutip.sigmax() & qutip.sigmay()   # all CuOperator

Without plugin active — ensure at least one operand is CuOperator; the others are pulled in by implicit conversion:

# Convert the subsystem operator to CuOperator explicitly
sx_cu = qutip.sigmax().to('CuOperator')

# sigmay() is Dense here; it gets implicitly converted to CuOperator
# because sx_cu is CuOperator
H_xy = sx_cu & qutip.sigmay()

Combining terms

Addition (+) appends terms to the CuOperator term list:

H  = qutip.sigmaz() & qutip.qeye(2)   # 1 term
H += qutip.qeye(2) & qutip.sigmaz()   # 2 terms

Multiplication (@ or * between two Qobjs — equivalent for operators) takes the Cartesian product of the two term lists, naturally producing multi-mode coupling terms:

a1 = qutip.destroy(20) & qutip.qeye(20)   # a on mode 0
a2 = qutip.qeye(20) & qutip.destroy(20)   # a on mode 1

coupling  = a1.dag() @ a2    # one term with operators on both modes
coupling += a1 @ a2.dag()

@/* of an M-term operator by an N-term operator produces up to M×N terms; + produces M+N terms.

Scalar multiplication scales every term’s coefficient without changing the term list:

H = w1 * (a1.dag() @ a1) + w2 * (a2.dag() @ a2) + g * coupling

Diagonal Operators: the Dia Format

Why it matters

When a CuOperator is applied to a CuState, each per-mode operator stored inside its terms is converted to a cuQuantum type at GPU execution time:

  • Stored _data.Dia → cuQuantum MultidiagonalOperator (efficient for diagonal/band-diagonal structure)

  • Stored _data.Dense or other → cuQuantum DenseOperator

This conversion is lazy — it happens at GPU execution time, not when constructing or converting the CuOperator. So the dtype of the object stored inside the term determines GPU efficiency.

Default dtypes of common operators

Operator

Default dtype

Stored in CuOperator as

GPU type at execution

num(N)

Dia

_data.Dia

MultidiagonalOperator (efficient)

create(N)

Dia

_data.Dia

MultidiagonalOperator (efficient)

destroy(N)

Dia

_data.Dia

MultidiagonalOperator (efficient)

qeye(N)

Dia

_data.Dia

MultidiagonalOperator (efficient)

sigmaz()

CSR

_data.Dense

DenseOperator (see note below)

sigmax()

CSR

_data.Dense

DenseOperator (tiny off-diagonal, appropriate)

sigmay()

CSR

_data.Dense

DenseOperator (tiny off-diagonal, appropriate)

num(), create(), destroy(), qeye() default to Dia and produce MultidiagonalOperator without any extra steps. sigmaz() is structurally diagonal but defaults to CSR, so it needs explicit handling. sigmax() and sigmay() are tiny off-diagonal matrices; DenseOperator is appropriate for them.

Handling sigmaz and custom diagonal operators

Without plugin activesigmaz() is CSR; convert via Dia before converting to CuOperator:

# stores _data.Dia -> MultidiagonalOperator
sz_cu = qutip.sigmaz().to('Dia').to('CuOperator')

With plugin activesigmaz() already returns CuOperator (with Dense stored inside). Calling .to('Dia') on it would first materialize the full matrix (CuOperatorDense). Use the dtype argument to bypass the default instead:

# stores _data.Dia -> MultidiagonalOperator
sz_cu = qutip.sigmaz(dtype='Dia').to('CuOperator')

Custom diagonal operators constructed from arrays default to Dense; convert via Dia:

H_diag = qutip.Qobj(np.diag(energies)).to('Dia').to('CuOperator')

or explicitly construct it as Dia via dtype argument:

H_diag = qutip.Qobj(np.diag(energies), dtype='Dia').to('CuOperator')

or construct with the qdiags operation:

H_diag = qutip.qdiags(energies, dtype='CuOperator')

CuOperator: Tensor Structure and Performance

The symbolic representation

CuOperator stores operators as a sum of tensor product terms:

op = sum_i  factor_i * (A_i0 x A_i1 x ... x A_iN)

Each term records which per-mode operator acts on which subsystem. The full d^N × d^N matrix is never formed. cuQuantum exploits this factored structure to apply operators to states efficiently on the GPU — this is a fundamental source of the performance benefit.

The CuOperator container structure (term list, mode indices, transforms) is CPU-side. The per-mode operators stored inside each term can be either CPU-side QuTiP Data objects (Dense, Dia) — the common case when operators are built from QuTiP functions — or GPU-resident cuQuantum objects (DenseOperator, MultidiagonalOperator backed by CuPy arrays) if explicitly constructed that way. In the typical workflow, the per-mode operators are CPU-side, and data is transferred to the GPU when the CuOperator is applied to a CuState.

Preserving tensor structure

The symbolic structure is preserved by:

  • CuOperator + CuOperator — appends terms

  • CuOperator @ CuOperator / CuOperator * CuOperator — Cartesian product of terms

  • kron(CuOperator, CuOperator) / A & B — extends mode indices

  • adjoint/transpose/conj — applies transform flags per mode

  • scalar * CuOperator — scales coefficients

The tensor structure is destroyed by converting to Dense — whether via .to('Dense'), .full(), or .to_array(). The reverse conversion (materialised Dense back to CuOperator) wraps the full d^N × d^N matrix as a single dense term acting on the entire Hilbert space, with no knowledge of the tensor product decomposition. Using such a dense term will generally result in a considerable performance penalty.

# AVOID: destroys tensor structure
H_bad = H.to('Dense').to('CuOperator')
# for large composite systems, this creates an exponentially large matrix
H_arr = H.full()

Conversion Reference

Registered conversions

CuOperator

From

To

Weight

Notes

Dense

CuOperator

1

Square matrices only

Dia

CuOperator

1

Square matrices only

CuOperator

Dense

10

Expensive: materializes full matrix on CPU

CuPyDense

CuOperator

1

If qutip-cupy installed

CuState

From

To

Weight

Notes

Dense

CuState

1

Host2Device transfer; requires ctx to be registered

CuState

Dense

1

Device2Host transfer

CuPyDense

CuState

1

If qutip-cupy installed

CuState

CuPyDense

1

If qutip-cupy installed

No CSR conversion exists for either type. Paths through CSR go via Dense, chained automatically.

Registered CuOperator operations

Operation

Result

Notes

mul(CuOperator, scalar)

CuOperator

Symbolic — scales all term coefficients

neg(CuOperator)

CuOperator

Symbolic — scales all coefficients by −1

add(CuOperator, CuOperator)

CuOperator

Symbolic — appends term lists

sub(CuOperator, CuOperator)

CuOperator

Symbolic — appends terms with negated factor

adjoint(CuOperator)

CuOperator

Symbolic — sets transform flags per mode

transpose(CuOperator)

CuOperator

Symbolic — sets transform flags per mode

conj(CuOperator)

CuOperator

Symbolic — sets transform flags per mode

kron(CuOperator, CuOperator)

CuOperator

Symbolic — extends mode index list; underlying operation for qutip.tensor / &

matmul(CuOperator, CuOperator)

CuOperator

Symbolic — Cartesian product of term lists; no GPU computation

matmul(CuOperator, CuState)

CuState

GPU execution

matmul(CuState, CuOperator)

CuState

GPU execution

isequal(CuOperator, CuOperator)

bool

Materializes both operators to dense to compare; expensive

isherm(CuOperator)

bool

Materializes operator to dense to check; expensive