Three dimensional waveguide-interconnects for scalable integration of photonic neural networks
Photonic waveguides are prime candidates for integrated and parallel photonic interconnects. Such interconnects correspond to large-scale vector matrix products, which are at the heart of neural network computation. However, parallel interconnect circuits realized in two dimensions, for example by lithography, are strongly limited in size due to disadvantageous scaling. We use three dimensional (3D) printed photonic waveguides to overcome this limitation. 3D optical-couplers with fractal topology efficiently connect large numbers of input and output channels, and we show that the substrate’s footprint area scales linearly. Going beyond simple couplers, we introduce functional circuits for discrete spatial filters identical to those used in deep convolutional neural networks.
Corresponding author: email@example.com
The interconnection of numerous input and output channels (IO-channels) is the basic operation behind many applications. A parallel and energy-efficient interconnect has therefore been a desired technology for decades Shamir et al. (1989); Lee et al. (1989), finding use in diverse fields such as telecommunication, inter- and intra-chip data buses and potentially endoscopy Choudhury et al. (2019). Most timely, it is also highly desired for connecting the layers of deep neural networks, where it would efficiently provide the typically large-scale vector-matrix products LeCun et al. (2015).
The integration of such an apparatus is challenging. To achieve parallelism, serial routing is naturally not an option, and a large number of direct physical links connecting the IO-channels is required. Such channel multiplexing can be created in different dimensions like wavelength or space, and here we address spatial multiplexing. If a direct connection architecture is realized electronically, the strong capacitive interactions between long connection wires will result in prohibitive energy dissipation and bandwidth limitations Miller (2017); Esmaeilzadeh et al. (2012). There are additional, more practical challenges. Lithographic fabrication typically integrates circuits in two dimensions (2D), and a 2D interconnect’s footprint grows quadratically with the number of IO-channels. The cross-bar interconnect illustrates this fundamental relationship.
Optical routing removes the energy dissipation associated with charging the capacitance of electronic signaling wires Miller (2017), and free-space interconnects with many IO-channels have long been explored Shamir et al. (1989); Lee et al. (1989). Integrated photonic interconnects, however, remain size-limited by the unfavourable scaling between area and the number of IO-channels in 2D Miller (2015); Shen et al. (2017); Hughes et al. (2018); Peng et al. (2018). Crucially, the same scaling is found for wavelength division multiplexing.
We demonstrate the integration of such photonic interconnects in 3D for the first time. Complex 3D-routed waveguides are created by two-photon polymerization Deubel et al. (2004); Yang et al. (2019). We introduce a fractal architecture which efficiently connects many IO-channels, and we demonstrate an integrated photonic interconnect of previously unreported size hosting 225 input and 529 output channels within a footprint area of only 0.46 × 0.46 mm². Crucially, this footprint area scales linearly. Such a printed photonic circuit can fully and in parallel connect the layers of deep neural networks of a commercially relevant size LeCun et al. (2015); Jouppi et al. (2017). Going beyond, we demonstrate a 3D-waveguide architecture implementing 9 spatial filters with a Haar convolution kernel Huang and Aviyente (2007) of stride and width 3. Such convolutional filters represent a fundamental operation of deep convolutional neural networks LeCun et al. (2015).
II Scaling of interconnects
A strategy to overcome many of the bottlenecks currently experienced in neural network computation is to realize integrated circuits adhering to a neural network’s complex topology Appeltant et al. (2011); Nahmias and Shastri (2013); Shen et al. (2017); Van der Sande et al. (2017); Brunner et al. (2019); Neckar et al. (2019). As schematically illustrated in Fig. 1(a), a neural network is formed by linking large numbers of nonlinear neurons, which often are grouped in layers. It is particularly this inter-neuron interconnect which, despite recent progress Neckar et al. (2019), still eludes a fully parallel and scalable hardware integration. Most of today’s integrated circuits are created via lithography, and are hence restricted mostly to 2D. In cross-bar interconnects, see Fig. 1(b), routing occurs via point-like contacts between two layers hosting input and output wires. The input and output ports are arranged along a column or row, and hence their number scales with $\sqrt{A}$ for an area $A$. This is the general behavior in 2D.
Three dimensional, additive manufacturing has significantly matured and allows complex structures with nanometric feature sizes Moughames et al. (2016); Deubel et al. (2004); Bückmann et al. (2012); von Freymann et al. (2010). Crucially, the additional third dimension facilitates simple wiring topologies which are scalable, as schematically illustrated in Fig. 1(c). Input and output ports occupy a dedicated plane each (not rows or columns as in 2D), while the third dimension unlocks a circuit’s volume for wiring: for each of the inputs, a dedicated plane hosts all connections to the outputs. Even in such a simple routing scenario the system’s scaling of area and height becomes linear. The strong impact of 2D versus 3D integration on the scalability of a parallel interconnect is schematically illustrated in Fig. 1(d). Interestingly, the 3D routing strategy has been confirmed by evolution: the most reduced topological property of the human neocortex leverages the same effect. Neurons are mostly located on its surface, while long range connections mostly traverse the volume.
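The scaling argument of this section can be made concrete with a short numerical sketch. It assumes a uniform port pitch (`pitch_um`, a hypothetical value that is not taken from the text), a 2D cross-bar whose ports sit along one row and one column, and a 3D layout whose ports fill a plane with one pitch-sized cell per port. The function names are illustrative only.

```python
def footprint_2d(n_channels, pitch_um):
    """Area (um^2) of a cross-bar whose n ports are lined up along a row/column."""
    side = n_channels * pitch_um          # side length grows linearly with n
    return side * side                    # area therefore grows quadratically


def footprint_3d(n_channels, pitch_um):
    """Area (um^2) when the n ports fill a plane, one pitch x pitch cell each."""
    return n_channels * pitch_um ** 2     # area grows linearly with n


PITCH_UM = 20.0  # hypothetical pitch, chosen only for illustration

for n in (9, 81, 225, 6561):
    ratio = footprint_2d(n, PITCH_UM) / footprint_3d(n, PITCH_UM)
    print(f"n={n:5d}  2D/3D footprint ratio = {ratio:.0f}")
```

Under these assumptions the 2D footprint exceeds the 3D footprint by a factor equal to the number of channels, independent of the chosen pitch.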
However, 3D routing in electronics is challenging. Lithographic fabrication requires a number of signaling layers of the order of the number of input channels, which makes such fabrication prohibitive for the kind of dimensionality demanded by neural networks. Heat creation and heat dissipation from such a volumetric circuit’s centre have additionally been identified as problematic Venkatadri et al. (2011). Disposing of this dissipated energy is a major bottleneck already for the mostly serial von Neumann processors Esmaeilzadeh et al. (2012), and parallel interconnects for NNs require significantly more such layers and connections. Photonics can overcome this challenge Lohmann (1990); Miller (2017), which motivates the interest in photonic interconnects Lee et al. (1989); Shamir et al. (1989) and ultimately in photonic neural networks Farhat et al. (1985); Larger et al. (2012); Duport et al. (2012); Brunner et al. (2013); Vandoorne et al. (2014); Shen et al. (2017); Pierangeli et al. (2019); Khoram et al. (2019).
III 3D interconnects of photonic waveguides
Low-loss 3D printed photonic waveguides have been demonstrated at telecommunication wavelengths Lindenmann et al. (2012); Koos et al. (2013); Pyo et al. (2016); Nesic et al. (2019). Our waveguides were fabricated using a commercial 3D direct laser writing system from Nanoscribe GmbH (Germany). A negative tone photoresist "IP-Dip" dropped on a glass substrate (25 × 25 × 0.7 mm³) was photopolymerized via two-photon absorption with a nm femtosecond pulsed laser, focused by a 63X (NA = 1.4) objective lens. After the writing process, the sample was immersed in a PGMEA (1-methoxy-2-propanol acetate) solution for 20 minutes to remove the unexposed photoresist. Samples were written using the scanning mode based on a goniometric mirror, and the scanning speed on the sample’s surface was kept constant at 10 mm/s. The writing laser’s power served as the optimization parameter. The diameter of individual waveguides is m, and they are spaced by m. Samples were structurally inspected using a scanning electron microscope (SEM, FEI 450W). For optical characterization, we focused a 635 nm laser onto an input waveguide’s top surface using a 50X microscope objective with NA = 0.8. The mode field diameter of the focused beam is m, hence larger than the input waveguide’s diameter. The emission at the couplers’ output ports was collected by a 10X, NA = 0.30 microscope objective and imaged onto a CMOS camera (iDS U3-3482LE, pixel size 2.2 µm) using a 100 mm achromatic lens, resulting in an optical magnification of 5.6.
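The quoted magnification of 5.6 can be reproduced with a small calculation. This is only a plausibility sketch: it assumes that the 10X objective is specified for a 180 mm reference tube lens, a value that is not stated in the text and is chosen here because it matches the reported number. The function names are illustrative.

```python
def objective_focal_length_mm(nominal_magnification, reference_tube_length_mm):
    """Focal length of an infinity-corrected objective from its nominal rating."""
    return reference_tube_length_mm / nominal_magnification


def system_magnification(imaging_lens_mm, nominal_magnification,
                         reference_tube_length_mm):
    """Magnification when the objective is paired with a different imaging lens."""
    f_objective = objective_focal_length_mm(nominal_magnification,
                                            reference_tube_length_mm)
    return imaging_lens_mm / f_objective


# 100 mm achromat and 10X objective are from the text; 180 mm is an assumption.
magnification = system_magnification(100.0, 10.0, 180.0)

# Size of one 2.2 um camera pixel when projected back onto the output plane.
pixel_in_object_plane_um = 2.2 / magnification

print(f"magnification ~ {magnification:.2f}")
print(f"pixel size in the object plane ~ {pixel_in_object_plane_um:.2f} um")
```

With a different reference tube length (for example 200 mm) the same lens pair would give a magnification of 5.0, so the agreement with 5.6 supports, but does not prove, the assumed value.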
III.1 Fractal topology for fully connected layers
Fully or densely connected layers are a principal topology in NNs LeCun et al. (2015); Jouppi et al. (2017). We adopt a routing strategy based on fractal (self-similar) branching, where each signal 'wire' splits into a fixed number of branches at the branching points. Figure 2(a) schematically illustrates such a fractal tree’s 2D projection onto the -plane for 9 branches per branching point and 2 branching layers. An input (top red arrow) is therefore distributed to a number of output channels (bottom red arrows) given by the number of branches raised to the power of the number of branching layers, here resulting in 81. The number of outputs per input therefore scales exponentially with the number of branching layers, and 6561 connections are created for each input channel with 9 branches and only 4 branching layers. The tree’s architecture is recursively defined according to the spacing between the output channels and height . The dimensions inside the bifurcation layers are and . Horizontal and vertical distances there scale identically, resulting in constant branching angles throughout the entire circuit.
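The exponential growth of the fan-out can be checked with a few lines of code. The function and argument names below are illustrative and not the paper's notation; the numbers 9, 2, 4, 81 and 6561 are those quoted in the text.

```python
def fan_out(branches, layers):
    """Number of outputs reached by one input of a self-similar tree."""
    return branches ** layers


def outputs_per_layer(branches, layers):
    """Number of waveguide ends after 0, 1, ..., `layers` branching layers."""
    return [branches ** k for k in range(layers + 1)]


print(fan_out(9, 2))            # two branching layers
print(fan_out(9, 4))            # four branching layers
print(outputs_per_layer(9, 2))  # growth layer by layer
```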
This translation invariance aids the development of strategies to avoid the intersection of waveguides before layer , where they merge into their respective outputs. These details are illustrated for four neighbouring couplers with and in Fig. 2(b). We incorporated chirality into the fractal couplers: the connections from a point in layer to layer have a negative curvature in the -plane, which avoids intersections for vertical and horizontal connections. Furthermore, avoiding intersections for diagonal links additionally requires curvatures in the -direction.
Figure 3(a) shows an SEM image of a 3D fractal coupler array hosting input and outputs, each with and . We can see that chirality successfully avoids unintended intersections. In Fig. 3(b) we show fractal trees for two bifurcations resulting in coupling, with a circuit of inputs and outputs. As for the single bifurcation layer 3D coupler, the two bifurcation layer couplers are mechanically sound, even though they feature waveguide sections with an aspect ratio exceeding 50. This excellent result motivated us to continue and integrate a full-scale interconnect with over 200 inputs, each of which is connected to 81 outputs, see Fig. 3(c).
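A simple counting sketch relates the channel numbers quoted for the full-scale interconnect (225 inputs, 81 outputs per input, 529 outputs in total). It assumes, without confirmation from the text, that inputs and outputs lie on square grids of equal pitch and that each input addresses a square neighbourhood of outputs centred on it. The helper name is hypothetical.

```python
import math


def output_grid_side(input_side, neighbourhood_side):
    """Side length of the output grid covered by overlapping neighbourhoods."""
    return input_side + neighbourhood_side - 1


input_side = math.isqrt(225)          # 15 x 15 input grid
neighbourhood_side = math.isqrt(81)   # each input reaches a 9 x 9 patch
side = output_grid_side(input_side, neighbourhood_side)

total_outputs = side * side           # shared output ports
total_links = 225 * 81                # individual waveguide connections

print(side, total_outputs, total_links)
```

Under these assumptions the output count agrees with the 529 channels reported for the demonstrated circuit, which suggests that neighbouring couplers share output ports.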
Figure 4 depicts the optical transmission through a fractal coupler. We used the camera images to characterize the optical losses and splitting ratios, where the injection spot focused onto the glass-substrate’s top surface was used as reference. The average optical losses for couplers are 5.5 dB, which rise to 10.6 dB for couplers. Crucially, this includes optical injection losses , propagation losses and losses induced at the coupling or bifurcation points . The fractal design principle allows us to determine each of these contributions. As previously discussed, the angles of the different bifurcation layers remain constant due to the fractal design. This results in identical bifurcation points for the entire topology, and hence we assume uniform coupling losses . Furthermore, we have characterized a coupler with three times larger height and distance . This leaves us with three loss measurements (in dB), the standard (), the three times larger () and the () couplers, and we obtain dB, dB and dB.
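Separating three loss contributions from three measurements amounts to solving a small linear system in dB. The sketch below illustrates the idea on synthetic numbers only; it does not use or reproduce the measured values. The coefficient matrix is an assumption: propagation loss is taken as proportional to path length, the enlarged coupler is given three times the path length at unchanged bifurcation loss, and the path-length factor of 2 for the two-layer coupler is hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("loss model is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Columns: injection, propagation (per unit length), bifurcation. All assumed.
MODEL = [
    [1, 1, 1],  # standard single-layer coupler
    [1, 3, 1],  # three times larger single-layer coupler
    [1, 2, 2],  # two-layer coupler (path factor 2 is hypothetical)
]

synthetic_losses_db = [2.0, 1.0, 3.0]   # arbitrary ground truth, not measured
measured_db = [sum(c * v for c, v in zip(row, synthetic_losses_db))
               for row in MODEL]
recovered_db = solve_linear(MODEL, measured_db)
print(measured_db, recovered_db)
```

The round trip recovers the synthetic contributions, showing that three independent coupler geometries suffice to separate the three loss terms as long as the model matrix is non-singular.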
According to Fig. 4(a) some of the output ports’ optical modes include second order Gauss-Laguerre contributions. As our polymer waveguides are freestanding in air, they have an exceptionally high refractive index contrast of . According to the commonly employed approximation for the number of modes supported by a cylindrical waveguide, our waveguides support up to optical modes. However, early stage numerical simulations confirm that only the first and second optical modes are excited, which agrees with our experimental results. We would like to point out that the high refractive index contrast allows (i) single mode waveguides with a diameter of only 0.3 µm and (ii) exceptionally narrow bending radii; the combination of (i) and (ii) facilitates compact integrated photonic circuits.
We analysed three and three couplers with respect to the relative power distribution at their output ports, and statistical information is given in Fig. 4(b,c). For the couplers we find that of the total optical output power is provided by the central waveguide, with the remaining quite evenly distributed among the off-center ports, see Fig. 4(b). For couplers, only of the light is contained in the central waveguide, Fig. 4(c). Interestingly, the ’s ratio is not quite the square of the ’s ratio, indicating that cascading our bifurcating waveguides cannot be fully approximated simply by linearly multiplying the coupling ratios of the individual components. Higher order modes therefore appear to have an impact upon the splitting ratios. Overall, the asymmetric splitting ratio is most likely caused by the geometry, and in particular by the branching angles of our waveguide couplers.
III.2 Haar filters
The highly connected couplers discussed above are typically required close to the output layer of deep neural networks. However, their first layers often highlight structural aspects of input information by tailored, local connection topologies. Examples are convolutional neural networks commonly employed in object recognition LeCun et al. (2015). Prominent convolution kernels are so-called Haar filters. These feature 2D Boolean entries, and this simplification creates a sparse representation of information contained in images, which is a crucial operation for neural networks to be able to generalize to unseen test data Huang and Aviyente (2007). We schematically illustrate input and output properties of nine exemplary Haar filters (F1-F9) in Fig. 5. There, each filter kernel’s Boolean weights (0: dark, 1: light) are illustrated as input, while each filter’s dedicated output port is indicated as the output.
We developed a 3D routing topology, schematically illustrated on the right in Fig. 5, to realize the 9 Haar filters. Even in 3D this is challenging, which can be appreciated from the intricate network of connections. Furthermore, the number of configurations scales factorially with the number of filters, and for the required 37 connections of the 9 filters there exist 362880 possibilities. This already large number still ignores all geometrical aspects such as waveguide curvatures along the different dimensions. In order to better illustrate the operating principle, we have highlighted the connection topology of filter F2 in orange. For each filter, the input ports weighted by 1 are directly wired to the filter’s output. For incoherent injection into the input waveguides, the intensity at the filter’s output should therefore be proportional to the overlap between its Boolean weights and the input.
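The operating principle of a single filter can be sketched as a Boolean overlap. The kernel and the input patch below are hypothetical examples and do not correspond to any of the filters F1-F9 of Fig. 5; the only numbers taken from the text are the 9 filters and the 362880 configurations, which equals 9!.

```python
import math


def filter_response(kernel, patch):
    """Overlap between Boolean weights and input intensities (incoherent sum)."""
    return sum(k * p
               for kernel_row, patch_row in zip(kernel, patch)
               for k, p in zip(kernel_row, patch_row))


def connection_count(kernel):
    """Number of input ports wired to the filter output (entries equal to 1)."""
    return sum(sum(row) for row in kernel)


# Hypothetical 3 x 3 kernel: the left two columns are connected to the output.
KERNEL = ((1, 1, 0),
          (1, 1, 0),
          (1, 1, 0))

# Hypothetical input intensities in arbitrary units.
PATCH = ((5, 2, 9),
         (1, 4, 3),
         (0, 7, 8))

print(filter_response(KERNEL, PATCH))   # sum over the connected ports only
print(connection_count(KERNEL))         # waveguides needed for this kernel
print(math.factorial(9))                # orderings of nine filters
```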
In Fig. 6(a) we show the SEM picture of the 3D printed spatial filtering interconnect realizing 9 Haar filters. Waveguides feature smooth surfaces and the overall structure is stable. However, one can identify a tendency for output waveguides with few connections to lean outwards. Figure 6(b) shows a densely multiplexed array of Haar filter units. Such an interconnect would implement the convolution of a -pixel input image simultaneously with filters F1-F9 fully in parallel. As the individual filter units do not overlap in space, the implemented convolution has a stride of 3.
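The action of such a multiplexed array can be sketched as a filter bank applied to non-overlapping 3 x 3 tiles, which is what a stride of 3 means for kernels of width 3. The two kernels and the test image below are hypothetical and serve only to illustrate the indexing; they are not the filters of Fig. 5.

```python
def strided_filter_bank(image, kernels, size=3):
    """Apply every kernel to each non-overlapping size x size tile of image."""
    rows, cols = len(image), len(image[0])
    if rows % size or cols % size:
        raise ValueError("image sides must be multiples of the kernel width")
    output = []
    for top in range(0, rows, size):          # stride equals the kernel width
        output_row = []
        for left in range(0, cols, size):
            responses = [
                sum(kernel[i][j] * image[top + i][left + j]
                    for i in range(size) for j in range(size))
                for kernel in kernels
            ]
            output_row.append(responses)      # one value per filter and tile
        output.append(output_row)
    return output


ALL_ON = ((1, 1, 1), (1, 1, 1), (1, 1, 1))    # hypothetical kernel
TOP_ROW = ((1, 1, 1), (0, 0, 0), (0, 0, 0))   # hypothetical kernel

# 6 x 6 test image whose pixel value encodes its position.
image = [[6 * r + c for c in range(6)] for r in range(6)]
result = strided_filter_bank(image, [ALL_ON, TOP_ROW])
print(result)
```

Each tile yields one response per filter, so all filter outputs are available simultaneously, mirroring the parallel operation of the printed array.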
Figure 6(c) depicts the optical characterization of the filters’ connectivity using the same procedure as for the fractal optical couplers. Each sub-panel corresponds to the transmission through a different filter (F1 to F9) when injecting light into the output port. The optical characterization was therefore carried out in the backward direction. We opted for this procedure since the output intensities of an individual filter correspond to the filter’s kernel only in the backward direction. In the forward direction one would have to iteratively inject into the individual input ports and then sum the output intensities of the different injections, which is possible in principle yet less systematic. Generally, we find an excellent agreement between the designed filter kernels and the intensities recorded in the reverse propagation direction. The different loss mechanisms obtained for the fractal couplers are consistently reproduced for the Haar filters, with the peculiarity that each coupler exhibits distinct coupling losses . This, however, is to be expected: different filters rely on specific connection degrees and topologies as well as different branching angles.
There is some cross-talk from the optically injected port onto the image of the output plane. One cause might be the smaller height of the overall 3D circuitry. Light not collected and guided by the injected waveguide therefore illuminates a smaller area on the circuit’s output plane, which in turn results in a higher intensity when imaged onto the camera. The outwards-leaning input connections, see Fig. 6(a), might additionally contribute. The resulting non-orthogonal illumination of the waveguide’s tip will most likely reduce the injection efficiency and therefore increase the cross-talk of uncollected light to the output plane. For a fully integrated system this cross-talk will potentially be reduced significantly. Inputs will in most cases be provided by optical fibers or waveguides arriving from an earlier stage of the optical system, for example when using a fiber bundle for collecting an input image. This will also be true for the filter’s output, which will be connected to some fiber or waveguide for further processing downstream.
IV Discussion and conclusion
We successfully demonstrated complex and large scale 3D photonic interconnects. Waveguides with a diameter of m were created by direct laser writing based on two photon polymerization. Using this novel integration strategy we demonstrated intricate 3D routing topologies for large scale, highly connected as well as convoluting optical interconnects. These example architectures were mostly oriented towards application in neural networks, where such interconnects can realize the large scale vector matrix products fully in parallel, with picosecond latency and potentially low energetic cost Peng et al. (2018). It is the first time that such complex and large scale integrated optical interconnects have been created in 3D.
As our concept scales linearly in size it allows for novel routing topologies, which in turn will create new opportunities for integrated special purpose neural network chips. Here, either complete implementations of neural networks or the use of the photonic interconnects purely as neural network accelerators are possible Peng et al. (2018). However, there is a wider relevance for computing. The end of Moore’s and in particular Dennard’s scaling is arguably induced by energy penalties of a processor’s electronic signaling wires. Photonic routing could prolong the scaling of classical electronic (or now: opto-electronic) von Neumann processors, and these ideas can be expanded to intra- or inter-chip connections. Finally, non-computing related applications such as miniature remote sensing are further possibilities. Ultimately we have demonstrated the first large scale 3D printed photonic circuit board.
The findings reported here are based on the first demonstrations of several complex 3D photonic circuits, and both performance and topologies offer significant potential for further improvement. Beyond losses, it is in particular the asymmetric splitting ratios that deserve further attention, even though such imbalance can, to a certain degree, be compensated for by using phase-tunable topologies Miller (2015). Couplers with an even splitting ratio (such as 1x4) promise potentially better homogeneity.
Most importantly, we have addressed the non-scalability of parallel and integrated interconnects for the first time. In order to fully benefit from this new substrate, its functionalization is essential. External control over a waveguide section’s phase delay would enable unitary optical transformations on a scalable substrate Miller (2015). An extension by active or nonlinear photonic elements will establish a new type of photonic device. Crucially, small scale low bandwidth 3D printed polymer circuits are actively considered in electronics, for example for wearables Park et al. (2019).
The authors acknowledge the support of the Region Bourgogne Franche-Comté. This work was supported by the EUR EIPHI program (Contract No. ANR-17-EURE-0002), by the Volkswagen Foundation (NeuroQNet I&II), by the French Investissements d’Avenir program, project ISITE-BFC (contract ANR-15-IDEX-03) and partly by the French RENATECH network and its FEMTO-ST technological facility. X.P. has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 713694 (MULTIPLY).
The authors thank Marina Raschetti for technical support.
M.T. works at the company Nanoscribe. The other authors declare no conflicts of interest.
- J. Shamir, H. J. Caulfield, and R. B. Johnson, Applied Optics 28, 311 (1989).
- H. Lee, X. Gu, and D. Psaltis, Journal of Applied Physics 65, 2191 (1989).
- D. Choudhury, D. K. McNicholl, A. Repetti, I. Gris-Sanchez, T. A. Birks, Y. Wiaux, and R. R. Thomson, arXiv:1903.01288 (2019).
- Y. LeCun, Y. Bengio, and G. Hinton, Nature 521, 436 (2015).
- D. A. Miller, Journal of Lightwave Technology 35, 346 (2017).
- H. Esmaeilzadeh, E. Blem, R. R. St. Amant, K. Sankaralingam, and D. Burger, IEEE Micro 32, 122 (2012).
- D. A. B. Miller, Optica 2, 747 (2015).
- Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, and M. Soljacic, Nature Photonics 11, 441 (2017).
- T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, Optica 5, 864 (2018).
- H. Peng, M. A. Nahmias, T. F. de Lima, A. N. Tait, and B. J. Shastri, IEEE Journal of Selected Topics in Quantum Electronics 24, 1 (2018).
- M. Deubel, G. von Freymann, M. Wegener, S. Pereira, K. Busch, and C. M. Soukoulis, Nature Materials 3, 444 (2004).
- L. Yang, A. Münchinger, M. Kadic, V. Hahn, F. Mayer, E. Blasco, C. Barner-Kowollik, and M. Wegener, Advanced Optical Materials 7, 1901040 (2019).
- N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, R. Boyle, P.-l. Cantin, C. Chao, C. Clark, J. Coriell, M. Daley, M. Dau, J. Dean, B. Gelb, T. V. Ghaemmaghami, R. Gottipati, W. Gulland, R. Hagmann, C. R. Ho, D. Hogberg, J. Hu, R. Hundt, D. Hurt, J. Ibarz, A. Jaffey, A. Jaworski, A. Kaplan, H. Khaitan, A. Koch, N. Kumar, S. Lacy, J. Laudon, J. Law, D. Le, C. Leary, Z. Liu, K. Lucke, A. Lundin, G. MacKean, A. Maggiore, M. Mahony, K. Miller, R. Nagarajan, R. Narayanaswami, R. Ni, K. Nix, T. Norrie, M. Omernick, N. Penukonda, A. Phelps, J. Ross, M. Ross, A. Salek, E. Samadiani, C. Severn, G. Sizikov, M. Snelham, J. Souter, D. Steinberg, A. Swing, M. Tan, G. Thorson, B. Tian, H. Toma, E. Tuttle, V. Vasudevan, R. Walter, W. Wang, E. Wilcox, and D. H. Yoon, arXiv:1704.04760 , 1 (2017).
- K. Huang and S. Aviyente, in Advances in Neural Information Processing Systems 19, edited by B. Schölkopf, J. C. Platt, and T. Hoffman (MIT Press, 2007) pp. 609–616.
- N. Lindenmann, G. Balthasar, D. Hillerkuss, R. Schmogrow, M. Jordan, J. Leuthold, W. Freude, and C. Koos, Optics Express 20, 17667 (2012).
- C. Koos, J. Leuthold, W. Freude, N. Lindenmann, S. Koeber, G. Balthasar, J. Hoffmann, T. Hoose, P. Huebner, D. Hillerkuss, and R. Schmogrow, in Proc. SPIE 8613, Advanced Fabrication Technologies for Micro/Nano Optics and Photonics VI, Vol. 8613 (2013) p. 86130W.
- L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, Nature Communications 2, 468 (2011).
- M. Nahmias and B. Shastri, IEEE Journal of Selected Topics in Quantum Electronics 19 (2013).
- G. Van der Sande, D. Brunner, and M. C. Soriano, Nanophotonics 6, 561 (2017).
- D. Brunner, P. Antonik, and X. Porte, in Photonic Reservoir Computing, Optical Recurrent Neural Networks, edited by B. Daniel, M. C. Soriano, and G. Van der Sande (De Gruyter, Berlin, Boston, 2019) pp. 1–32.
- A. Neckar, S. Fok, B. V. Benjamin, T. C. Stewart, N. N. Oza, A. R. Voelker, C. Eliasmith, R. Manohar, and K. Boahen, Proceedings of the IEEE 107, 144 (2019).
- J. Moughames, S. Jradi, T. M. Chan, S. Akil, Y. Battie, A. E. Naciri, Z. Herro, S. Guenneau, S. Enoch, L. Joly, J. Cousin, and A. Bruyant, Scientific Reports 6 (2016).
- T. Bückmann, N. Stenger, M. Kadic, J. Kaschke, A. Frölich, T. Kennerknecht, C. Eberl, M. Thiel, and M. Wegener, Advanced Materials 24, 2710 (2012).
- G. von Freymann, A. Ledermann, M. Thiel, I. Staude, S. Essig, K. Busch, and M. Wegener, Advanced Functional Materials 20, 1038 (2010).
- V. Venkatadri, B. Sammakia, K. Srihari, and D. Santos, Journal of Electronic Packaging 133, 041011 (2011).
- A. W. Lohmann, in Nonlinear Optics and Optical Computing (Springer US, Boston, MA, 1990) pp. 151–157.
- N. H. Farhat, D. Psaltis, A. Prata, and E. Paek, Applied Optics 24, 1469 (1985).
- L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, Optics Express 20, 3241 (2012).
- F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, Optics Express 20, 22783 (2012).
- D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, Nature Communications 4, 1364 (2013).
- K. Vandoorne, P. Mechet, T. Van Vaerenbergh, M. Fiers, G. Morthier, D. Verstraeten, B. Schrauwen, J. Dambre, and P. Bienstman, Nature Communications 5, 1 (2014).
- D. Pierangeli, G. Marcucci, and C. Conti, Physical Review Letters 122, 213902 (2019).
- E. Khoram, A. Chen, D. Liu, L. Ying, Q. Wang, M. Yuan, and Z. Yu, Photonics Research 7, 823 (2019).
- J. Pyo, J. T. Kim, J. Lee, J. Yoo, and J. H. Je, Advanced Optical Materials 4, 1190 (2016).
- A. Nesic, M. Blaicher, T. Hoose, A. Hofmann, M. Lauermann, Y. Kutuvantavida, M. Nöllenburg, S. Randel, W. Freude, and C. Koos, Optics Express 27, 17402 (2019).
- Y.-G. Park, H. Min, H. Kim, A. Zhexembekova, C. Y. Lee, and J.-U. Park, Nano Letters 19, 4866 (2019).