Introduction to Tensors (Quantum Circuit Simulation)

In today’s scenario most of the work in Quantum Computing is being done in the software layer ranging from Optimization, Machine Learning, Hybrid Quantum Classical Algorithms like QAOA, VQE which seem to be the potential game changers in the NISQ era (next 5–10 years). In order to run these Quantum circuits we either need a functional Quantum Hardware which is not feasible for most of the players or a Quantum Circuit Simulator running on a conventional computer for us to observe the properties of Quantum Mechanics like Superposition and Entanglement. We are trying to compete through these with the conventional computers and communications and in order to achieve this we seek new algorithms and circuits with revolutionary behaviour like the Shor’s algorithm on prime factorization. Most of the time success of these algorithms and their potential speed ups are limited due to the cost of simulation on the conventional systems. Thus, along with working on new designs for Quantum Information processing hardware, more efficient simulations can help in getting sharper bounds on the Quantum algorithms.

What we are trying to achieve on a classical computer is a randomized simulation of a Quantum Circuit as the outcome of a quantum computation is probabilistic. So in order to work this out we create a classical randomized algorithm whose output distribution is similar to a simulated Quantum Computation. What happens in a simulated Quantum Circuit is that we start with a predefined input state which generally is a |0> state whose vector representation being ( 1 , 0 ) and the circuit has certain width depending on the number of initial qubits. The depth of the circuit is defined by the number of gates or the unitary matrices being applied to the given system. The gates can be a 1-qubit gate, 2-qubit to 3-qubit gates. And finally the qubits are measured and we get the classical outputs depending on the amplitudes.

So what is the cost of simulation on a conventional system?

In order to simulate a Quantum Circuit, one may use a naive brute force calculation of the quantum amplitudes which have an exponential overhead. Classes of Circuits which have an efficient simulation are often restricted by “gate library”, but do not impose restrictions on how the gates are interconnected or sequenced. Another way to impose restriction on the class of quantum circuits is to limit the amount of entanglement in the intermediate states which is performed by the Controlled gates. The goal here is to study the complexity of running a classical simulation of the Quantum Circuit.

Complexity of Quantum Circuit Simulators :

A common approach for applying gates (matrices) is to do in the form of matrix multiplications where a gate A followed by a gate B on a given qubit can be done in a naive way using A X B. So if we have a qubit state with m number of qubits, the amplitudes may be represented by a 2ᵐ X 1 vector and the corresponding gate matrices being 2ᵐ X 2ᵐ matrix. So what is the complexity of multiplying two such matrices? We see that the compexity of such an operation being O(2^(3m)) where m is the number of qubits. We can see the same from the given code example below.

#Checking matrix multiplication complexity.import numpy as np
A = np.random.random(size=(10,10))
B = np.random.random(size=(10,10))
C = np.zeros(10,10)
for i in range(10):
    for j in range(10):
        for k in range(10):
            C[i,j]= A[i,k] * B[k,j]

Here we see that the number of iterations happening are 10*10*10. So if we take matrices of size 2ᵐ X 2ᵐ for a given qubit state with 2ᵐ amplitudes. The time complexity for one such operation with two matrices comes out to be O(2^(3m)). Our conventional computer can carry out 2 billion instructions per second that is 10⁹ instructions/sec. This will very easily cross the billion limit by increasing the qubits.

One improved way on this is to start by multiplying the gate matrix to the qubit vector. Here we see that when a matrix is multiplied to a vector, we see that the time complexity becomes O(2^(2m)). And similarly when a new operator/matrix is multiplied to the modified state vector the time complexity remains the same. Thus we succeeded in improving the time complexity from O(2^(3m)) to O(2^(2m)). This is because of the fact that matrix multiplication is not constant with respect to associativity.

Now what we see that if we want to apply a 1-qubit or a 2-qubit gate to a given m qubit vector state which has an amplitude length of 2ᵐ, we see that we have to do padding of the given matrix by multiplying it with I matrices to fill up the sparse matrix with values, for our conventional computer to do the matrix multiplication. So let’s say we have two Unitary matrices Uᵏ and Uʲ which apply on a and b number of qubits respectively. Uᵏ being 2ᵃ x 2ᵃ and Uʲ being 2ᵇ x 2ᵇ. If a and b are < m, we need to fill the given sparse matrix by values and we multiply with I matrices for the same.

But we succeeded in reducing the complexity from O(2^(3m)) to O(2^(2m)). But is it enough? Let’s see. Let’s take a qubit vector state of 20 qubits. 20 qubits look really small right? Our amplitude vector has 2²⁰ (1048576) elements. Applying a single qubit gate which is mere 2² x 2² will be really slow. Let’s see how. We need to scale this 1 qubit gate to 2²⁰ x 2²⁰ by filling the sparse matrix with values which the simulator automatically does by multiplying by I matrices again and again. Finally we reach an equal enough matrix. Now the time complexity of this matrix operation on the state being 2^(2*20) which is whooping 1.0995116e+12. This will surely take more than 1000 seconds for just one operation. As quantum simulator follows the probabilistic nature of Quantum Computation, if we run it large number of times, the result will surely take days which is way way slower than what we wanted to achieve.

But we should be able to reduce the calculation of the 2^m amplitudes to O(2^m) in some cases. But How? Due to the fact that the matrix multiplication is not constant with associativity, we can think a bit more and extend it to autodifferentiation (backprop in NN’s), tensor networks (tensor contractions) and QC simulators. So let’s bring tensors in order to improve our performance.

What are Tensors?

A tensor is a generalization of vectors and matrices and is easily understood as a multidimensional array. A scalar being a 0 dim tensor, a vector being 1 dimensional tensor, a matrix 2 order tensor and and 3 dimension and more array is referred to as Tensor.

The indices here in the diagrammatic notation represent a dimension Link. A tensor has an order and a shape. The order being the number of dimensions and the shape being the number of entities in the dimensions. So intuitively, we can say that a 3D tensor like a book where the matrix is represented by the page and the number of pages in the book being the third dimension just like a cuboid.

We can further form networks comprised of multiple tensors, where an index shared by two tensors denotes a contraction (or summation) over this index:

The left is equivalent to a matrix multiplication between matrices A and B, while the example on the right produces a rank-3 tensor D via the contraction of a network with three tensors Image Link.

This is called Tensor Contraction. We can achieve this contraction by permutation and reshaping over the indices which we will be seeing in a later article. The goal is to to approximate a single high order tensor as a tensor network composed of many low order tensors. This latter representation is very efficient seeing the high complexity of a single high dimensional tensor.

So how is a Tensor helpful in a Quantum Circuit Simulation. Let’s take a N dimensional tensor with every dimension being size 2.

We can represent a 1 qubit state as a 1 - dimensional tensor, it being an array.

A 2-qubit state can be written as a 2 D tensor which would be denoted by a matrix with two indices (i,j) ∈ [0,1] .

A 3 qubit state being written as a 3D Tensor which has 3 indices where all the indices have a size 2.

It can be seen as nested matrices, where matrices are one above the other and i denoting the height and j,k denoting a matrix dimensions. So if we want to find out the amplitude of 𝛹011 we go to the 0th row of the cube matrix and then go to the 1st row of the obtained matrix followed by the 1st column.

Let’s see how we can find the amplitude of |011> using both normal array and a tensor object.

Let’s use Blueqat SDK for the same.

from blueqat import Circuit
# b represents a length 8 vector.

In order to find the amplitude of |011>.


Now we try to reshape this to a 3D Tensor.


We can just access the amplitude by running:


So how using tensors is a boon in reducing time complexity? Let say we want to compute U|𝛹> = |𝛷> where 𝛹 being the initial qubit state and U is applied to 2 qubits (being a 2 qubit gate). Let the initial state is a 3 qubit state and the U is a 2 qubit transformation. A 2 qubit gate is a 4 x 4 shape matrix and thus for us to create a tensor out of it, we need a 4 order size-2 tensor as we have 4 indices for each number in the matrix i,j,k,l. So the U = Uɪᴊᴋʟ and the 3 qubit state can be represented with three indices as |𝛹>ᴀʙᴄ. Now if the 2 qubit matrix U acts on the state 0 and 2, the following is the multiplication of the tensors corresponding to the subspaces :

𝛷ᴊᴀʟ = Uɪᴊᴋʟ 𝛹ɪᴀᴋ

So what happens here is that we multiply 𝛹 and U is the subspaces we wanted to apply that is at 0 and 2. What we did was only involve the states in the indices 0 and 2 and multiplying it by the submatrices at the same indices. The ove tensor operation has the complexity O(2^(m+2)) which is what we were trying to achieve. This all happens due to Tensor Contractions i.e manipulating the tensors by permutation and reshaping the indices following by further binary Tensor contractions.

So the Tensor contractions helped us in going around the initial overhead where we had to use the whole 2ᵐ x 2ᵐ matrix for a 2 qubit operation. This is just for us to get the amplitude updates. “Tensor networks” are a larger chain of these tensor contractions we did above and are helpful in improving the Quantum Circuit simulation complexity.

Next we will discuss how different gates act to these tensor operations and how they improve the multiplication efficiently and how the time complexity changes with scaling the qubits. Also we will come across TensorFlowQuantum which is a library for hybrid quantum-classical machine learning and further get deeper into ML and Quantum Computing.


Gaurav Singh
Gaurav Singh
Related posts

blueqat Inc.

Shibuya Scramble Square 39F 2-24-12, Shibuya, Shibuya-ku, Tokyo