Optical components for matrix processing

Published in Lightmatter · 9 min read · Oct 9, 2019

Introduction

In our previous blog posts, we introduced the idea of using silicon photonics to perform large-scale linear arithmetic operations faster and more efficiently than can be done with digital electronics. As you may recall, our systems make use of many identical devices called Mach-Zehnder interferometers (MZIs) [1] to implement a basic linear operation, a 2x2 matrix multiplication, from which much larger matrix multiplications can be built up. In this blog post, we aim to follow up with a more detailed description of how MZIs work, and hopefully elucidate some of the advantages of silicon photonics over digital electronics for performing arithmetic.

Scalar MZI

We first consider an MZI that can be used for scalar multiplication. A basic layout is shown below. A single input waveguide is split symmetrically to become two arms of the interferometer. As the light travels down each arm, the electromagnetic waves pick up a phase, which is (in general) path-dependent, before being recombined into a single output waveguide. We’ll refer to this configuration as a “scalar MZI.”

A scalar MZI can be divided into 3 sub-sections: two Y-junctions and a differential phase-shift. Before we go through the analysis of the full system, let’s first get acquainted with these sub-systems.

Y-Junctions

A Y-junction is a 3-port integrated optical device with a single port (A), coupled equally to a pair of ports (B and C). A simple version can be implemented by connecting a straight waveguide to both left- and right-turning waveguide bends as shown below.

Simulated electric field intensity (|E|²) in rudimentary Y-junction, with optical input at port A

When sending light into port A, the light signal undergoes a transformation that can be described by the following transfer matrix:

In order to obtain the optical field amplitudes in the output waveguide modes (B and C), we can multiply the input field amplitude by the transfer matrix. Assuming the input amplitude to be 1, we obtain identical output field amplitudes:

Much as one might expect, the input light at port A is simply split in two, with half of the optical power in each of the output waveguides (optical power is proportional to the magnitude squared of the field amplitude). In this situation, the light behaves much like water in a system of pipes.
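As a sketch of the algebra described above, one transfer matrix consistent with an equal split can be written as follows. The symbols T_Y, E_A, E_B and E_C are names assumed here for illustration, with the output amplitudes ordered as (B, C):

```latex
T_Y = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
\begin{pmatrix} E_B \\ E_C \end{pmatrix} = T_Y \, E_A
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\quad \text{for } E_A = 1,
\qquad
|E_B|^2 = |E_C|^2 = \tfrac{1}{2}.
```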

Let us now examine what happens when we input light at port B. By optical reciprocity (a principle in optics that says light can travel in both directions along the same path; think about the lenses of your glasses), the transfer matrix for passing backwards through the Y-junction is simply the transpose of the forward transfer matrix:

Using the same matrix multiplication formalism, we obtain:

Note that only half of the input power made it to the output, and the remainder seems to have magically disappeared (in reality it was radiated out of the system). In this case, the wave nature of light causes it not to behave like a classical fluid (water in a pipe), as can be seen in the simulation below.
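The half-power result can be checked by hand. Assuming a forward matrix of (1/√2)(1, 1)ᵀ (a form chosen here because it gives the equal split described earlier), and using T_Y and E_A as illustrative names, the reverse pass with unit input at port B looks like:

```latex
T_Y^{\mathsf{T}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \end{pmatrix},
\qquad
E_A = T_Y^{\mathsf{T}} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}},
\qquad
|E_A|^2 = \tfrac{1}{2}.
```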

Simulated electric field intensity (|E|²) in rudimentary Y-junction, with optical input at port B

Let’s now consider one final situation, wherein light is launched into both port B and port C. In this case, we can arbitrarily set the input field amplitudes to b and c respectively. Once again, we multiply the input amplitude vector by the transfer matrix to obtain:
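Under the same assumed reverse matrix, (1/√2)(1 1), the product for inputs b and c works out to a scaled sum of the two amplitudes (E_A is again an illustrative name for the amplitude at port A):

```latex
E_A = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \end{pmatrix}
\begin{pmatrix} b \\ c \end{pmatrix}
= \frac{b + c}{\sqrt{2}}.
```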

This final Y-junction result illustrates a fundamental concept in optics known as superposition: a system’s response to multiple excitation sources is simply the sum (or superposition) of the responses to each individual source. This is one of the main advantages of optics for performing linear arithmetic: as a direct result of the inherent linearity of the system, addition is as simple as combining (e.g. interfering) two optical modes. Further, the addition operation is effectively free: no power consumption is required! Note that generation of the input light does require power consumption but this should not be included in the cost of the addition operation, as is discussed toward the end of this post.
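The three Y-junction cases discussed in this section can be reproduced numerically. The following is a minimal sketch, assuming an idealized junction whose forward transfer matrix is (1/√2)(1, 1)ᵀ; the names `Y_FORWARD`, `Y_REVERSE`, `apply` and `power` are hypothetical helpers, not part of any library.

```python
import math

S = 1 / math.sqrt(2)

# ASSUMPTION: idealized Y-junction, forward direction (port A -> ports B, C).
Y_FORWARD = [[S], [S]]   # 2x1 column
# By reciprocity, the reverse direction (ports B, C -> port A) is the transpose.
Y_REVERSE = [[S, S]]     # 1x2 row


def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector of field amplitudes."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def power(amplitude):
    """Optical power is proportional to the magnitude squared of the amplitude."""
    return abs(amplitude) ** 2


# Case 1: unit amplitude into port A splits equally between B and C.
out_b, out_c = apply(Y_FORWARD, [1.0])

# Case 2: unit amplitude into port B only; half the power reaches port A.
(out_a_single,) = apply(Y_REVERSE, [1.0, 0.0])

# Case 3: amplitudes b and c into ports B and C; the output is their scaled sum.
b, c = 0.3, 0.5
(out_a_sum,) = apply(Y_REVERSE, [b, c])

print("power at B, C:", power(out_b), power(out_c))
print("power at A (input at B only):", power(out_a_single))
print("amplitude at A (inputs b, c):", out_a_sum)
```

The third case is the superposition principle in action: the output for (b, c) equals the sum of the outputs for (b, 0) and (0, c).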

Phase Shift

The scalar MZI without a differential phase shift section is a useless device that transmits all input light to the output. In terms of arithmetic operations, it can be thought of as dividing a number into equal halves and subsequently adding those two halves together again. However, as we will see, the introduction of a phase shift between the two arms results in a change in the optical power transmitted to the output mode. The transfer matrix for the phase shift section can be written as:
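One convention, assumed here because it reproduces the cos(φ) multiplier derived later in this post, applies equal and opposite phases to the two arms, so that the relative phase between them is 2φ:

```latex
P(\varphi) = \begin{pmatrix} e^{i\varphi} & 0 \\ 0 & e^{-i\varphi} \end{pmatrix}
```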

There are many ways of introducing a phase difference between the two arms, but they are all essentially based on delaying the wave-fronts in one arm versus the other.

Perhaps the simplest method is to make the arms of different length. Because the waves in each arm travel at the same speed, those in the longer arm are delayed with respect to those in the shorter arm, resulting in a relative phase difference:

where n is the refractive index of the waveguide, ΔL is the length difference between the two arms, and λ is the free-space wavelength of the light.
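As a rough numerical illustration, the standard expression for the relative phase accumulated over an extra length ΔL is 2πnΔL/λ. The sketch below evaluates it with placeholder values; `n_eff`, `wavelength` and `delta_l` are hypothetical numbers chosen only for illustration, and the halving at the end assumes the symmetric convention in which the arms carry phases +φ and −φ.

```python
import math


def path_length_phase_difference(n, delta_l, wavelength):
    """Total relative phase (radians) between the arms: 2*pi*n*delta_l / wavelength."""
    return 2 * math.pi * n * delta_l / wavelength


# Placeholder values, for illustration only (not taken from any real device).
n_eff = 2.5                          # assumed refractive index of the waveguide
wavelength = 1.55e-6                 # free-space wavelength in metres
delta_l = wavelength / (2 * n_eff)   # length difference chosen to give a pi phase

delta_phi = path_length_phase_difference(n_eff, delta_l, wavelength)

# ASSUMPTION: symmetric convention, arms carry +phi and -phi, so phi = delta_phi / 2.
phi = delta_phi / 2

print("relative phase between arms (rad):", delta_phi)
print("per-arm phase phi (rad):", phi)
```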

Let’s Put the Pieces Together…

The system’s transfer matrix is obtained by multiplying the transfer matrices of the component parts. Somewhat confusingly, the matrices are written in the reverse of the spatial order of the elements (assuming the light travels from left to right), because the first element the light encounters acts on the input first and therefore sits rightmost in the product. The resultant transfer matrix for the scalar MZI can thus be calculated as:

The output of the scalar MZI, then, is simply its input multiplied by a scalar value (cos(φ)). This illustrates the second key functionality required for linear arithmetic: scalar multiplication. And once again, just like addition, scalar multiplication using optics does not require the dissipation of energy!
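The cos(φ) result can be worked through explicitly. The derivation below is a sketch under two assumptions: the Y-junction is the idealized equal splitter with forward matrix T_Y = (1/√2)(1, 1)ᵀ, and the phase section applies +φ to one arm and −φ to the other (the symbols T_Y, P and T_MZI are illustrative names):

```latex
T_{\text{MZI}}
= T_Y^{\mathsf{T}} \, P(\varphi) \, T_Y
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \end{pmatrix}
  \begin{pmatrix} e^{i\varphi} & 0 \\ 0 & e^{-i\varphi} \end{pmatrix}
  \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \frac{e^{i\varphi} + e^{-i\varphi}}{2}
= \cos\varphi .
```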

Simulations of MZIs with path-length induced phase shifts of: (a) φ = 0; (b) φ = π/4; (c) φ = π/2. Note that there is some excess loss in these devices due to the rudimentary Y-junctions.
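The three phase settings in the figure caption can also be evaluated with the idealized model. This is a minimal sketch, assuming lossless equal-split Y-junctions and arm phases of +φ and −φ; `scalar_mzi` is a hypothetical helper, and the simulated devices will show lower transmission than this model because of the excess loss noted in the caption.

```python
import cmath
import math


def scalar_mzi(phi):
    """Transfer coefficient of an idealized scalar MZI with arm phases +phi and -phi."""
    s = 1 / math.sqrt(2)
    upper = s * cmath.exp(1j * phi) * s    # split, phase-shift, recombine (one arm)
    lower = s * cmath.exp(-1j * phi) * s   # same for the other arm, opposite phase
    return upper + lower                   # recombination adds the two amplitudes


for phi in (0.0, math.pi / 4, math.pi / 2):
    t = scalar_mzi(phi)
    print(f"phi={phi:.4f}  amplitude={t.real:+.4f}  power={abs(t) ** 2:.4f}")
```

In this model the transmitted amplitude is cos(φ), so the transmitted power falls as cos²(φ), from full transmission at φ = 0 to none at φ = π/2.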

Electro-Optic Scalar MZI

Rather than relying on optical path-length, programmable phase shifts typically rely on changing the speed of the waves in the guiding material, i.e. its refractive index. Depending on the material in question, there are several common ways of accomplishing this. These include the thermo-optic effect, in which the refractive index of a material changes with temperature; the plasma dispersion effect, in which changing the concentration of free electrons and holes in a semiconductor (i.e. shifting its quasi-Fermi levels) changes its refractive index; and the electro-optic effect, wherein application of an electric field to a material induces a proportional change in its refractive index. Using any one of the above effects enables the implementation of a programmable phase difference.

Let’s now consider a specific example: let the waveguides in the phase-difference section be made of an electro-optic material. In this case, opposite electric fields applied across each arm can be used to induce a differential phase between the arms. These electric fields can be obtained with a single voltage, as illustrated below.

Schematic representation of an electro-optic scalar MZI, showing electro-optic phase shift sections and applied electric potential (V)

In this case the phase difference can be written as:

where c is the electro-optic coefficient of the material, d is the distance between the electrodes, L is the length of the electro-optic section, and λ is the free-space wavelength of the light. For an input field amplitude A, the output field amplitude of the MZI is given by:
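As a hedged sketch of how these quantities could fit together: assume the index change in each arm is c times the applied field V/d, with opposite sign in the two arms, and that the arms carry phases +φ and −φ. The exact prefactors depend on the electrode geometry and on how the coefficient c is defined, so the expressions below are an illustration rather than a definitive formula:

```latex
\Delta n = \pm \frac{c\,V}{d},
\qquad
\varphi = \frac{2\pi L}{\lambda} \cdot \frac{c\,V}{d},
\qquad
E_{\text{out}} = A \cos\varphi
= A \cos\!\left( \frac{2\pi c L V}{\lambda d} \right).
```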

From this example we can see that the scalar multiplier implemented by an MZI can be encoded in a voltage. This is an important point as it allows the value of electro-optic multipliers to be programmed using high-speed electronics, enabling a fully integrated reconfigurable solution. Though changing the multiplier value requires the transport of charges through an electric potential and thus energy dissipation, maintaining a static multiplier value does not, provided in this case that the electro-optic material is non-conductive.
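To make the voltage-to-multiplier mapping concrete, here is a minimal sketch. It assumes that each arm's index change is c·V/d with opposite signs, so that φ = 2πcLV/(λd) and the multiplier is cos(φ). The function names and all numerical values are hypothetical placeholders, not properties of any real material or device.

```python
import math


def eo_phase(voltage, c_eo, d, length, wavelength):
    """Per-arm phase phi (radians).

    ASSUMPTION: index change = c_eo * voltage / d, opposite sign in each arm.
    """
    return 2 * math.pi * c_eo * voltage * length / (wavelength * d)


def multiplier_from_voltage(voltage, c_eo, d, length, wavelength):
    """Scalar multiplier cos(phi) implemented by the MZI at a given voltage."""
    return math.cos(eo_phase(voltage, c_eo, d, length, wavelength))


def voltage_for_multiplier(m, c_eo, d, length, wavelength):
    """Smallest non-negative voltage that yields multiplier m, with -1 <= m <= 1."""
    if not -1.0 <= m <= 1.0:
        raise ValueError("a passive MZI can only implement multipliers in [-1, 1]")
    return math.acos(m) * wavelength * d / (2 * math.pi * c_eo * length)


# Placeholder parameters (hypothetical, for illustration only).
params = dict(c_eo=1e-10, d=5e-6, length=5e-3, wavelength=1.55e-6)

for target in (1.0, 0.5, 0.0, -0.25):
    v = voltage_for_multiplier(target, **params)
    achieved = multiplier_from_voltage(v, **params)
    print(f"target={target:+.2f}  voltage={v:.3f} V  achieved={achieved:+.4f}")
```

Because cos(φ) is periodic, many voltages give the same multiplier; the sketch returns only the smallest non-negative one.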

Vector MZI

Having now seen how the “scalar MZI” can be used for scalar multiplication, we will now modify it into a “vector MZI” to perform matrix-vector multiplication. A schematic version is shown below:

Once again, we can divide the system into 3 sub-sections, but this time we replace the Y-junctions with 50:50 directional couplers.

Directional Couplers

Directional couplers are 4-port integrated optical devices with two pairs of ports (A/B and C/D). Each port in a pair is coupled to both ports in the other pair, but not to its partner. So if, for example, light is input into ports A and B, the output amplitudes at ports C and D will each be linear combinations of the input amplitudes. Such devices are realized by bringing two waveguides into close proximity so that they are evanescently coupled to one another, as shown below:

Simulated electric field intensity (|E|²) in directional coupler, with optical input at port A

A version of a lossless directional coupler can be represented by the 2x2 unitary transfer matrix:
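A commonly used form for an ideal 50:50 coupler, given here as an assumption (other phase conventions are equally valid), is shown below together with its unitarity condition; T_DC is an illustrative name:

```latex
T_{\text{DC}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix},
\qquad
T_{\text{DC}}^{\dagger} \, T_{\text{DC}} = I .
```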

2x2 Unitary Operation

As in the case of the scalar MZI, we can obtain the transfer matrix of the system by multiplying the transfer matrices of the sub-components in the correct order:
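As a sketch of that product, assume the common 50:50 coupler form T_DC = (1/√2)[[1, i], [i, 1]] and a phase section that applies +φ and −φ to the two arms (both are conventions assumed here, and U is an illustrative name). Multiplying coupler, phase section and coupler then gives:

```latex
U(\varphi)
= T_{\text{DC}} \, P(\varphi) \, T_{\text{DC}}
= \frac{1}{2}
  \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}
  \begin{pmatrix} e^{i\varphi} & 0 \\ 0 & e^{-i\varphi} \end{pmatrix}
  \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}
= i \begin{pmatrix} \sin\varphi & \cos\varphi \\ \cos\varphi & -\sin\varphi \end{pmatrix}.
```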

Note that in this case the resulting matrix is 2x2, rather than 1x1 as in the scalar version. Much like its component directional couplers, the vector MZI produces output field amplitudes that are linear combinations of the input field amplitudes; in fact, it is correct to consider the vector MZI to be a programmable directional coupler. The arithmetic operation performed by the device is exactly multiplication of the input amplitude vector by its transfer matrix. As many of those reading will recall from introductory linear algebra courses, matrix multiplication is simply an organized series of scalar multiplications and additions: the two basic operations we have already identified as energy-free in electro-optic circuits. It should come as no surprise that this compound operation can also be performed with zero power consumption.
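The vector MZI's behaviour as a programmable coupler can be checked numerically. The following is a minimal sketch, assuming ideal 50:50 couplers of the form (1/√2)[[1, i], [i, 1]] and arm phases of +φ and −φ; `matmul`, `vector_mzi` and `is_unitary` are hypothetical helpers written for this example.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def vector_mzi(phi):
    """Transfer matrix of an idealized vector MZI: coupler, phase section, coupler."""
    s = 1 / math.sqrt(2)
    coupler = [[s, 1j * s], [1j * s, s]]   # ASSUMPTION: ideal 50:50 coupler
    phase = [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]
    # The first coupler the light meets sits rightmost in the product.
    return matmul(coupler, matmul(phase, coupler))


def is_unitary(u, tol=1e-12):
    """Check that U^dagger U equals the identity, i.e. the device is lossless."""
    dagger = [[u[j][i].conjugate() for j in range(2)] for i in range(2)]
    prod = matmul(dagger, u)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))


phi = math.pi / 3
u = vector_mzi(phi)
inputs = [1.0, 0.0]   # unit amplitude into the first input port only
outputs = [sum(u[i][j] * inputs[j] for j in range(2)) for i in range(2)]

print("unitary:", is_unitary(u))
print("output powers:", [abs(x) ** 2 for x in outputs])
```

With light in one input only, the output powers are sin²(φ) and cos²(φ), so the phase setting tunes the splitting ratio continuously.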

By stringing vector and scalar MZIs into a suitable network along with simple phase shifters, arbitrary matrix multiplication can be implemented, and again the multiplication can be performed without consuming any power. It should be noted that it is necessary to consume power at the inputs and outputs of the network to encode the input vectors (this includes light generation) and to decode the output vectors. However, multiplying an N x N matrix by an N x 1 vector requires only O(N) encode/decode operations, while requiring O(N²) scalar multiplications and additions. Digital electronics require the dissipation of energy to perform each of those scalar multiplications and additions. Because the number of energy-consuming operations scales so much more slowly in optics, it should be clear that there is some value α such that optics is more efficient whenever N > α.
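The scaling argument can be illustrated with a toy count. This is a sketch under simplifying assumptions: it counts only the operations named above (N encodes and N decodes for optics, N² multiplications and N(N−1) additions for digital), and the per-operation energies `e_io` and `e_op` are hypothetical placeholders in arbitrary units, not measured values.

```python
def energy_consuming_ops(n):
    """Count energy-consuming operations for an N x N matrix times an N x 1 vector."""
    optical = 2 * n                  # N encode operations + N decode operations
    digital = n * n + n * (n - 1)    # N^2 multiplications + N(N-1) additions
    return optical, digital


def break_even_size(e_io, e_op):
    """Smallest N at which the optical energy total drops below the digital one.

    ASSUMPTION: e_io is the energy per optical encode/decode and e_op the energy
    per digital multiply or add, both hypothetical and in the same arbitrary units.
    """
    n = 1
    while True:
        optical, digital = energy_consuming_ops(n)
        if optical * e_io < digital * e_op:
            return n
        n += 1


for n in (4, 64, 1024):
    optical, digital = energy_consuming_ops(n)
    print(f"N={n}: optical ops={optical}, digital ops={digital}")

# Hypothetical cost ratio: each encode/decode costs 100x one digital operation.
print("break-even N:", break_even_size(e_io=100, e_op=1))
```

Even when each encode/decode is assumed to cost far more than a digital operation, the quadratic growth on the digital side means a finite break-even size exists in this toy model.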

A challenge here at Lightmatter is engineering our systems to both reduce that value of α, and increase the value of N for which we can build practical systems. By reducing the value of α, we will broaden the range of applications for which optical computing is attractive and increase our impact on reducing the energy costs associated with computing. By increasing the attainable value of N, we will do our part to extend the reach of human capability, enabling computation at scales unachievable with conventional digital machines!