Monday, June 5

Short Course 1 :: Machine Learning for Circuit Designers

Organizers / Chairs: Masanao Yamaoka, Hitachi, Ltd. and Vivienne Sze, MIT

Statistical machine learning is expanding its application areas to almost all domains of society. At the same time, this new technology is changing the landscape of information technology across the full range, from hardware architecture, software frameworks, and system development processes to business models. This talk first provides the basics of statistical machine learning, focusing on the similarities and differences with traditional IT. Then, we will explore how statistical machine learning can be applied to various domains. In particular, we will focus on connected embedded systems, also known as the Internet of Things, because it will be the area most affected by the introduction of statistical machine learning.

About Hiroshi Maruyama

Hiroshi Maruyama received his M.S. in Computer Science from Tokyo Tech in 1983 and his Ph.D. in Engineering from Kyoto University in 1995. He spent 26 years at IBM Research, Tokyo Research Laboratory, working on various computer science fields such as natural language processing, information retrieval, handwriting recognition, Web Services, and computer security. From 2006 to 2009 he served as the Director of the lab. He was a professor at the Institute of Statistical Mathematics between 2011 and 2016, where he worked on various “big data” related projects. He joined Preferred Networks, Inc. in April 2016 and since then has been helping this startup company to transform many aspects of information technology.

With the increasing amount, diversity, and significance of data in our lives, machine-learning algorithms are giving us unprecedented capabilities to extract high-value inferences, based on models constructed from the data itself. This has been particularly enabling in sensor applications, where the physical processes deriving embedded signals are often not well understood. But, to address the severe energy constraints in these applications, we are also finding that machine-learning algorithms give rise to substantial opportunities to relax implementation requirements, enabling aggressive energy-saving and throughput-enhancing mixed-signal architectures. This presentation starts by reviewing the key attributes of machine-learning algorithms and the relaxations that arise, specifically from data-driven model adaptation and statistical optimization. Then, considering silicon prototypes, we look at how these can be exploited in architectures to address system-design challenges, such as the energy of sensor acquisition as well as the energy and latency of memory access.

About Naveen Verma

Naveen Verma received the B.A.Sc. degree in Electrical and Computer Engineering from the University of British Columbia, Vancouver, Canada in 2003, and the M.S. and Ph.D. degrees in Electrical Engineering from Massachusetts Institute of Technology in 2005 and 2009 respectively. Since July 2009 he has been with the Department of Electrical Engineering at Princeton University, where he is currently an Associate Professor. His research focuses on advanced sensing systems, including low-voltage digital logic and SRAMs, low-noise analog instrumentation and data-conversion, large-area sensing systems based on flexible electronics, and low-energy algorithms for embedded inference, especially for medical applications. Prof. Verma is a Distinguished Lecturer of the IEEE Solid-State Circuits Society, and serves on the technical program committees for ISSCC, VLSI Symp., DATE, and IEEE Signal-Processing Society (DISPS).

Deep convolutional neural networks (CNNs) are widely used in modern AI systems for their superior accuracy, but at the cost of high computational complexity. The complexity comes from the need to simultaneously process hundreds of filters and channels in the high-dimensional convolutions. This computation not only requires a large number of arithmetic operations, but also a correspondingly large amount of data, creating significant data movement both on-chip and off-chip, which is even more energy consuming than the computation. Furthermore, deep learning is not a single static algorithm but a set of algorithms that not only change with application but are evolving rapidly. Combined, these factors present not only challenges on throughput and energy efficiency to the underlying processing hardware, but opportunities for architectural and implementation innovation. In this talk I will review some of the sources of these challenges and outline a number of the avenues available for addressing them, including optimizing the pattern of use of data, exploiting the data’s statistics, and hardware/algorithm co-design. Along the way, I will present a framework that allows for an intuitive characterization and energy analysis of alternative ways to orchestrate data movement.
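As a rough illustration of the scale described in this abstract, the following Python sketch counts the multiply-accumulate (MAC) operations and data elements for one convolutional layer. The function name, the parameter names, and the no-reuse assumption of four memory accesses per MAC (read weight, read input, read partial sum, write partial sum) are illustrative assumptions, not taken from the talk.

```python
def conv_layer_cost(height, width, channels, filters, kernel_h, kernel_w, stride=1):
    """Count MACs and data elements for one conv layer (no padding, batch of 1).

    All names here are hypothetical; this is a back-of-the-envelope model.
    """
    out_h = (height - kernel_h) // stride + 1
    out_w = (width - kernel_w) // stride + 1
    # Every output element needs channels * kernel_h * kernel_w MACs.
    macs = out_h * out_w * filters * channels * kernel_h * kernel_w
    return {
        "macs": macs,
        "weights": filters * channels * kernel_h * kernel_w,
        "inputs": height * width * channels,
        "outputs": out_h * out_w * filters,
        # Assumption: with no data reuse, each MAC reads a weight, an input,
        # and a partial sum, then writes the partial sum back.
        "naive_accesses": 4 * macs,
    }


if __name__ == "__main__":
    cost = conv_layer_cost(56, 56, 64, 128, 3, 3)
    for name, value in cost.items():
        print(f"{name}: {value:,}")
```

For a 56x56 input with 64 channels and 128 3x3 filters, this gives roughly 215 million MACs but fewer than one million distinct data elements, which suggests why the pattern of data reuse matters so much for energy.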

About Joel Emer

For nearly 40 years, Joel Emer has held various research and advanced development positions investigating processor microarchitecture and developing performance modeling and evaluation techniques. He has made architectural contributions to a number of VAX, Alpha and X86 processors and is recognized as one of the developers of the widely employed quantitative approach to processor performance evaluation. More recently, he has been recognized for his contributions in the advancement of simultaneous multithreading, processor reliability analysis, cache organization and spatial architectures. Currently he is a Senior Distinguished Research Scientist in Nvidia’s Architecture Research group. In his spare time, he is a Professor of the Practice at MIT. Prior to joining Nvidia he worked at Intel where he was an Intel Fellow and Director of Microarchitecture Research. Even earlier, he worked at Compaq and Digital Equipment Corporation. He earned a doctorate in electrical engineering from the University of Illinois in 1979. He received a bachelor’s degree with highest honors in electrical engineering in 1974, and his master’s degree in 1975 — both from Purdue University. Among his honors, he is a Fellow of both the ACM and IEEE, and he was the 2009 recipient of the Eckert-Mauchly award for lifetime contributions in computer architecture.

Deep neural networks have been achieving great and promising performance in image recognition and speech recognition. However, training deep neural networks is a highly complex and time-consuming computation that requires acceleration using GPU or ASIC clusters in the datacenter. In this presentation, we will present advanced techniques for high-speed deep learning on large-scale neural networks in the cloud. First, we will introduce a data parallelization technique that uses an accelerator cluster for high-speed deep learning, and then we will talk about a memory-efficient technique that enables the implementation of a large-scale neural network on a single GPU. Finally, we will conclude with a demonstration of the system developed using our techniques, implemented in the deep learning framework Caffe.

About Yasumoto Tomita

Yasumoto Tomita received the B.S., M.S., and Ph.D. degrees in electrical engineering from Keio University, Yokohama, Japan, in 2002, 2004, and 2007, respectively. After his studies, he joined Fujitsu Laboratories, Ltd., Kawasaki, Japan, where he has been the manager of research and design for high-speed CMOS I/O and the artificial intelligence computing group. He has served as a technical program committee member for A-SSCC and the VLSI Symposium on Circuits.

Deep learning techniques are increasingly popular in applications such as image understanding, speech recognition, and natural language processing, and for processing sensor data in IoT devices. Deep learning, however, comes with significant computational complexity, which has prevented its broad adoption in mobile and battery-powered devices. Recent advances in low-power algorithmic techniques and in specialized architectures for deep learning reduce the energy cost of deep learning techniques, thereby extending the range of applications where such techniques are feasible. After an introduction to deep learning (both training networks and inference), this tutorial will present an overview of processing architectures for mobile and embedded devices. We will also review algorithmic innovations that enable energy-efficient accelerators offering 100x lower power consumption without any noticeable degradation of performance. The tutorial will also preview some of the relevant advances in deep learning for embedded devices.

About Raheel Khan

Raheel Khan completed his B.S. and M.S. degrees in Electrical Engineering at the Georgia Institute of Technology in 1991 and 1993, respectively. He has over twenty-five years of experience in the development of high-performance signal processing systems at companies like Cisco Systems, Broadcom Corporation, and Qualcomm, Incorporated. He most recently served as Vice President of Engineering at Qualcomm, where he oversaw machine learning architecture for Qualcomm’s mobile products. He holds twenty-five patents in computer architecture, system design, and low-power design.

Always-on vision processing in embedded applications is very challenging from the point of view of top-line operations/sec/W. Equally, even where power consumption is not the overriding concern, heat from power dissipation is an important consideration. In order to meet these constraints, Movidius Myriad Vision Processing Units (VPUs) have used a balanced approach that combines an array of SHAVE VLIW processors with vector and SIMD capabilities for kernels requiring up to 10s of operations per pixel, and hardware acceleration for kernels requiring 10s-100s of operations per pixel, all fed by a multicore memory subsystem designed for high sustained rather than peak performance. The design methodology used throughout focused on parallelism and modest clock rates, combined with standard cell libraries and generated single-ported SRAM instances rather than custom macros. The exception was the register files, where there was a significant advantage to designing a macro owing to the high number of ports on the 128-bit Vector Register File (VRF). The design has evolved significantly over 3 generations of process technology, from 65nm LP to 28nm HPC and now 16nm FF. The power management strategy and use of power islands will also be detailed with a view to the application requirements in VR/AR/MR, robotics/drones, security cameras, and wearables. These use cases will be illustrated with relevant system diagrams, key computational kernels, and metrics across computer vision and CNNs.

About David Moloney

David Moloney is Director of Machine Vision Technology, NTG, at Intel Corporation and was formerly the Chief Technology Officer of Movidius until its acquisition by Intel in November 2016. David has a BEng in Electronic Engineering from Dublin City University (1985) and a PhD from Trinity College Dublin (2010) in the area of FPGA-based HPC for computational fluid dynamics. David has worked in the semiconductor industry internationally for the past 28 years, with Infineon in Germany, ST Microelectronics in Italy, and Parthus-Ceva (Ceva-DSP) and FrontierSilicon in Ireland, before founding Movidius in 2005 with Sean Mitchell. David has 31 granted patents and numerous publications. He acts as a reviewer for IEEE Communications Magazine and for the EU Commission on programs such as ARTEMIS. David is a member of the EU FP7 HiPEAC NoE and collaborates on the FP7 PEPPHER and EXCESS projects as well as the Eyes of Things (EoT) Horizon 2020 project. His interests include processor architecture, computer vision algorithms and hardware acceleration, and systems, hardware, and multiprocessor design for DSP communications, HPC, and multimedia applications.

Deep neural networks (DNNs) have recently become one of the fastest growing fields in artificial intelligence due to their simple learning mechanisms and overwhelming performance. This course focuses on SoC implementations for mobile/embedded DNNs. The course first discusses mobile/embedded DNNs from an edge-oriented, hardware-based perspective. Then, the challenges and issues associated with implementing mobile/embedded DNN SoCs will be explained. After that, algorithm-, architecture-, and circuit-level techniques for DNN SoCs are discussed. Real implementation results of state-of-the-art DNN SoCs and their applications in AI will also be introduced. In addition, the future of AI SoCs and their architectures will be explored, including processor-in-memory and non-volatile memory architectures.

About Hoi-Jun Yoo

Hoi-Jun Yoo graduated from the Electronics Department of Seoul National University and received the M.S. and Ph.D. degrees in electrical engineering from KAIST. He is now a full professor in the Department of Electrical Engineering at KAIST and the director of SDIA (System Design Innovation and Application Research Center). Since 2010, he has served as the general chair of the Korean Institute of Next Generation Computing. His current interests are intelligent SoCs, computer vision SoCs, body area networks, and biomedical devices and circuits. He has published more than 300 papers and is the author or co-author of 12 books. He is an IEEE Fellow and has served as a member of the executive committees of ISSCC, the Symposium on VLSI, and A-SSCC; TPC chair of A-SSCC 2008 and ISWC 2010; IEEE Distinguished Lecturer (’10-’11); Far East Chair of ISSCC (’11-’12); Technology Direction Sub-Committee Chair of ISSCC (’13); TPC Vice Chair of ISSCC (’14); and TPC Chair of ISSCC (’15).

Short Course 2 :: Integrated Circuits for Smart Connected Cars and Automated Driving

Organizers / Chairs: Kouichi Kanda, Fujitsu Laboratories Ltd. and John Wuu, AMD

The automotive industry is currently going through significant changes. Developments towards automated, electrified and connected vehicles are enabled by automotive electronics. In this presentation I will give an overview of leading applications, challenges and new developments in the area of automotive electronics. I will describe different levels of automated driving, including active safety systems, and give application examples. Then, I will give an overview of key components of such systems, such as RADAR, LIDAR, computer vision, and inertial sensors with their respective challenges and solution approaches. Those sensor signals are being processed by very powerful control units. I will explain signal processing architectures and trends. I will then focus on the topic of connectivity and conclude with a description of various testing and qualification approaches.

About Christoph Lang

Christoph Lang is the Director of Integrated Circuits and Wireless Connectivity at the Bosch Research & Technology Center in Palo Alto, CA, where he has worked since 2004. From 2000 to 2004 he worked for Bosch in Germany as Integrated Circuits Architect for a Bosch MEMS gyroscope that has been in mass production since early 2005 (now more than 50 million sensors in the field). From 1996 to 2000 he worked on his PhD degree in electrical engineering at the Technical University of Kaiserslautern, Germany, focusing on integrated circuit design for inertial sensors. He is an inventor on more than 45 patents and patent applications. His current interests include solutions for automotive sensors, the Internet of Things, and medical diagnostics.

The automotive industry is undergoing a sea change in in-vehicle networking technology. Cars are increasingly becoming more like commercial technology products as consumers look more to software applications and the digital media experience than to the traditional parameters of engine performance and vehicle comfort. Also, new applications like self-driving and advanced driver assist systems bring a whole new level of complexity to vehicle control systems. These changes are forcing the automotive industry to adopt new technologies that provide more flexible data exchange and meet the stringent environmental requirements of an automotive system. To meet this challenge, car manufacturers have turned to Ethernet as a technology that has a proven ability to support complexity and is flexible enough to meet various performance requirements. This presentation will go into detail on the different developments in place to bring this technology to the automotive space.

About Alexander Tan

Alexander E. Tan is the Director of the Automotive Solutions Group at Marvell Semiconductor. At Marvell Semiconductor he oversaw the launch of the 1000BASE-T1 and 100BASE-T1 Ethernet single-pair PHY product families and the development of industry-leading automotive switch and SoC solutions. He has been involved in automotive semiconductor development for more than a decade and started working on Ethernet semiconductor devices in 2000 as a digital design engineer. Prior to Marvell Semiconductor, he worked at Texas Instruments and National Semiconductor to introduce the FPD-Link 2 & 3 automotive LVDS technology. He has a B.S. in Physics from the University of Florida, an M.S. in Electrical and Computer Engineering from the Georgia Institute of Technology with a concentration in DSP, and an MBA from the Emory University Goizueta School of Business.

CMOS image sensors are becoming a key device for driving safety, driving assistance, and driving comfort in automotive applications. In making autonomous vehicles possible, image sensors will also play an important role as the eyes of the vehicles themselves. This talk gives an overview of advanced image sensor technologies that will be useful for automotive applications and autonomous vehicles in the near future. Fundamentals of CMOS image sensors will be given as an introduction to the talk. We will then focus in detail on recent progress in device and circuit technologies for global shutter, high-speed image acquisition, wide dynamic range, high sensitivity (low noise), and functional pixels such as lock-in pixels for 3D time-of-flight (TOF) range imaging. This will be followed by a discussion of the potential of CMOS-based TOF range image sensors for wide application to LiDARs and to vehicle surrounding and interior monitoring systems.

About Shoji Kawahito

Dr. Shoji Kawahito received the doctor of engineering degree from Tohoku University, Sendai, Japan, in 1988. In 1988, he joined Tohoku University as a Research Associate. From 1989 to 1999, he was with Toyohashi University of Technology. From 1996 to 1997, he was a Visiting Professor at ETH, Zurich. Since 1999, he has been a professor with the Research Institute of Electronics, Shizuoka University. Since 2006, he has been a CTO of Brookman Technology Inc., a university spin-off company. He has been engaged in research on pixel devices, circuits, architecture and signal processing for CMOS imaging devices. He has contributed to several important fundamental technologies for CMOS imaging devices: series of column-parallel cyclic analog-to-digital converters for high-speed wide dynamic range imaging, low-noise global shutter pixels, image sensors for super high vision systems, and lock-in pixels for highly time-resolved imaging. Dr. Kawahito has recently received several awards for his contribution to CMOS imaging devices including IEEE Fellow Award in 2009, Walter Kosonocky Award from International Image Sensor Society in 2013, and Image Sensors Europe Awards in 2017.

Over the last two decades, sensing technologies like RADAR and cameras have become common in cars, assisting drivers and increasing automotive safety. Nevertheless, the achievement of fully autonomous self-driving vehicles hinges on the capability of scientists and engineers to turn LIDAR, which traditionally has been a costly and bulky instrument, into something that can affordably be installed in every car. In this talk we will review the basic physics principles behind LIDAR, then move into a detailed description of current-generation systems, and go over some of the published ideas that the industry is pursuing to achieve this very challenging goal.
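The most basic relation behind pulsed time-of-flight ranging, one common LIDAR principle, can be sketched in a few lines of Python: distance is the speed of light times the round-trip time, divided by two. This is an illustrative sketch only; the function names are hypothetical, and the talk itself may cover other ranging schemes.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def range_from_round_trip(round_trip_s):
    """Target distance in meters for a measured round-trip time in seconds."""
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def timing_resolution_for(range_resolution_m):
    """Round-trip timing resolution in seconds needed for a given range resolution."""
    return 2.0 * range_resolution_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    print(f"1 us round trip -> {range_from_round_trip(1e-6):.1f} m")
    print(f"1 cm resolution -> {timing_resolution_for(0.01) * 1e12:.1f} ps")
```

A 1 µs round trip corresponds to about 150 m, and resolving 1 cm requires timing on the order of 67 ps, which hints at why the receive electronics are demanding.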

About Eduardo Bartolome

Eduardo Bartolome received his M.S. in Electronic Engineering from the Polytechnic University of Catalonia, Spain, in 1994. After his studies, he worked in the IFAE (High Energy Physics Institute) designing particle detection systems, often involving the development of photon counting electronics. In 1999 he moved to Texas Instruments in Dallas, where he held positions in characterization, design and systems engineering, in areas like wireless, medical and automotive. Among other roles, currently he is the technologist in the Imaging group, defining a new generation of mixed signal ICs for automotive LIDAR.

Sensor interface design traditionally focuses on achieving sufficient signal-to-noise ratio but automotive applications demand more. Automotive customers require exceptional quality and reliability using proven technologies in applications deployed in abusive environments. Recent progress in Advanced Driver Assistance Systems (ADAS) that augment driver inputs to braking, steering, and the power train have heightened safety implications for the sensors involved. This trend is accelerating with the emergence of autonomous vehicles increasing the demand for sensors, their precision, and their inherent safety. This presentation will cover an engineer’s perspective on sensor technologies and interface design challenges that are unique to automotive and highlight how that uniqueness has changed and continues to change over time.

About Bill Clark

Bill Clark received his doctorate in electrical engineering from the University of California, Berkeley in 1997 completing a dissertation on MEMS devices with an emphasis on vibratory rate gyroscopes. After graduation, Bill was among the founders of a small start-up company working on precision MEMS inertial sensors that was acquired by Analog Devices. Since acquisition in 2001, Bill has continued to work with Analog Devices designing MEMS optical and inertial devices. Bill’s innovations over the years have resulted in over 30 patents in the fields of MEMS fabrication, packaging, power electronics, inertial sensor design and architecture.

With increasingly strict fuel efficiency requirements in recent years, electric vehicles (EVs) have come to account for an increasing share of the total number of vehicles produced. To increase the range of such motor-driven vehicles, it is necessary to boost the energy efficiency of motor control. In this presentation, we will introduce key technologies of the electronic control units (ECUs) that control the motors. Model-based development (MBD), which allows designers to handle complicated large-scale system design, has become widely used in the automobile industry. We will also discuss how the semiconductor field contributes to MBD efforts.

About Sugako Otani

Sugako Otani is a system and processor architect at Renesas Electronics Corporation. She is a chief architect in the CPU System Solution Department, a research and development group that investigates hardware architectures, including memory and multicore processors, as well as system software and operating systems. Her current research focuses on application-specific architectures ranging from IoT devices to automobiles. She joined Mitsubishi Electric Corporation, Japan, in 1995 after receiving an M.S. in physics from Waseda University, Tokyo. She received a Ph.D. in Electrical Engineering and Computer Science from Kanazawa University in 2015. From 2005 to 2006, she was a Visiting Scholar at Stanford University. She is currently serving on the COOL Chips Program Committee.

Autonomous vehicles are highly complex, high-performance compute systems with unique heterogeneous architectures of different processing and compute elements. In this in-depth presentation, we will give an overview of the major processing stages of an autonomous vehicle and review the platform and functional requirements for automotive processors, with emphasis on the key differences in processors for the automotive industry. Then we will review the software architecture of an autonomous vehicle, including machine and deep learning, a key technology for human-brain-like decision making in an autonomous vehicle. Next, we will present the security architecture for autonomous vehicles, which also has requirements unique to the automotive industry. Finally, it will all be brought together in a comprehensive system architecture view of an autonomous vehicle.

About Jack Weast

Jack Weast is a Principal Engineer and the Chief Systems Engineer for Autonomous Driving Solutions at Intel. In his 18-year career at Intel, Jack has built a reputation as a change agent in new industries, with significant technical and architectural contributions to a wide range of industry-first products and standards that range from one of the world’s first Digital Media Adapters to complex heterogeneous high-performance compute solutions in markets that are embracing high-performance computing for the first time. With an end-to-end systems perspective, Jack combines a unique blend of embedded product experience with a knack for elegant software and systems design that will accelerate the adoption of autonomous driving. Jack is the co-author of “UPnP: Design By Example”, is an Associate Professor at Portland State University, and is the holder of 18 patents with dozens pending.