Qualcomm Hexagon
Hexagon is the brand name for a family of 32-bit multi-threaded microarchitectures implementing the same instruction set for a digital signal processor (DSP) developed by Qualcomm. According to a 2012 estimate, Qualcomm shipped 1.2 billion DSP cores inside its systems on a chip in 2011, and 1.5 billion cores were planned for 2012, making the QDSP6 the most-shipped DSP architecture.
The Hexagon architecture is designed to deliver performance at low power over a variety of applications. Its features include hardware-assisted multithreading, privilege levels, Very Long Instruction Word (VLIW), Single Instruction, Multiple Data (SIMD), and instructions geared toward efficient signal processing. The core can dispatch up to four instructions, in order, to four execution units every clock cycle. Hardware multithreading is implemented as barrel temporal multithreading: threads are switched in round-robin fashion each cycle, so before V5 a 600 MHz physical core is presented as three logical 200 MHz cores. Hexagon V5 switched to dynamic multithreading (DMT), with a thread switch on L2 misses, on interrupt waiting, or on special instructions.
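The difference between the two threading schemes can be illustrated with a small simulation. This is a deliberately simplified sketch, not a model of the real hardware: the function names and the `waiting` parameter are hypothetical, and the dynamic scheduler assumes only that a waiting thread gives up its cycle to the next ready thread.

```python
from collections import Counter

def barrel_schedule(num_threads, cycles):
    """Strict round-robin: cycle c always belongs to thread c % num_threads."""
    return [c % num_threads for c in range(cycles)]

def dynamic_schedule(num_threads, cycles, waiting):
    """Simplified dynamic model (an assumption, not the real DMT logic):
    each cycle goes to the next thread in round-robin order that is not
    waiting. `waiting` maps a thread id to the set of cycles it is stalled in.
    A cycle is recorded as None if every thread is waiting."""
    schedule = []
    next_thread = 0
    for c in range(cycles):
        chosen = None
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if c not in waiting.get(t, set()):
                chosen = t
                next_thread = (t + 1) % num_threads
                break
        schedule.append(chosen)
    return schedule

def effective_mhz(schedule, core_mhz):
    """Share of the core clock each thread received, expressed in MHz."""
    counts = Counter(t for t in schedule if t is not None)
    return {t: core_mhz * n / len(schedule) for t, n in sorted(counts.items())}

CORE_MHZ = 600

# Barrel: three threads on a 600 MHz core each see 200 MHz.
print(effective_mhz(barrel_schedule(3, 600), CORE_MHZ))

# Dynamic: thread 2 waits for the whole window, so threads 0 and 1
# share the cycles it would otherwise have occupied.
stalled = {2: set(range(600))}
print(effective_mhz(dynamic_schedule(3, 600, stalled), CORE_MHZ))
```

Under these assumptions the barrel scheme gives every thread a fixed 200 MHz share regardless of what the other threads are doing, while the dynamic scheme lets the remaining threads run faster than 200 MHz whenever one thread is waiting.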
At Hot Chips 2013, Qualcomm announced details of its Hexagon 680 DSP. Qualcomm also announced Hexagon Vector Extensions (HVX). HVX is designed to allow significant compute workloads for advanced imaging and computer vision to be processed on the DSP instead of the CPU. In March 2015, Qualcomm announced its Snapdragon Neural Processing Engine SDK, which allows AI acceleration using the CPU, GPU and Hexagon DSP.
Qualcomm's Snapdragon 855 contains their 4th generation on-device AI engine, which includes the Hexagon 690 DSP and Hexagon Tensor Accelerator for AI acceleration.
Software support
Operating systems
The Linux port for Hexagon runs under a hypervisor layer and was merged in the 3.2 release of the kernel. The original hypervisor is closed-source; in April 2013 Qualcomm released a minimal open-source hypervisor implementation for QDSP6 V2 and V3, the "Hexagon MiniVM", under a BSD-style license.
Compilers
Support for Hexagon was added in the 3.1 release of LLVM by Tony Linthicum. Hexagon/HVX V66 ISA support was added in the 8.0.0 release of LLVM. There is also a non-FSF-maintained branch of GCC and binutils.
Adoption of the SIP block
Qualcomm Hexagon DSPs have been available in Qualcomm Snapdragon SoCs since 2006. In Snapdragon S4 there are three QDSP cores: two in the Modem subsystem and one Hexagon core in the Multimedia subsystem. The Modem cores are programmed by Qualcomm only; only the Multimedia core can be programmed by the user.
They are also used in some of Qualcomm's femtocell processors, including FSM98xx, FSM99xx and FSM90xx.
Third-party integration
In March 2016, it was announced that semiconductor company Conexant's AudioSmart audio processing software was being integrated into Qualcomm's Hexagon.
In May 2018, wolfSSL added support for running its crypto operations on the Qualcomm Hexagon DSP. A specialized operation load management library was later added alongside the crypto operations.
Versions
Six versions of the QDSP6 architecture have been released: V1, V2, V3, V4, V5 and V6. V4 delivers 20 DMIPS per milliwatt when operating at 500 MHz.
Hexagon clock speeds range from 400 to 2000 MHz for QDSP6 and from 256 to 350 MHz for the previous generation of the architecture, the QDSP5.
Versions of QDSP6 | Process node, nm | Date | Number of simultaneous threads | Per-thread clock, MHz | Total core clock, MHz |
QDSP6 V1 | 65 | Oct 2006 | | | |
QDSP6 V2 | 65 | Dec 2007 | 6 | 100 | 600 |
QDSP6 V3 | 45 | 2009 | 6 | 67 | 400 |
QDSP6 V3 | 45 | 2009 | 4 | 100 | 400 |
QDSP6 V4 | 28 | 2010–2011 | 3 | 167 | 500 |
QDSP6 V5 | 28 | 2013 | 3 | 200 or greater with DMT | 600 |
QDSP6 V6 68X | 14/10 | 2016–2018 | 4 | 500 | 2000 |
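The per-thread clock column follows from dividing the total core clock by the number of simultaneous threads. The short check below reproduces that arithmetic for the rows that list all three values; the helper name `per_thread_mhz` is hypothetical, rounding to the nearest MHz is an assumption about how the table was filled in, and the V5 row uses its 200 MHz baseline (it can be higher with DMT).

```python
# (version, simultaneous threads, listed per-thread MHz, total core MHz)
# Values are copied from the table above; V1 is omitted because the
# table gives no thread or clock figures for it.
rows = [
    ("QDSP6 V2", 6, 100, 600),
    ("QDSP6 V3", 6, 67, 400),
    ("QDSP6 V3", 4, 100, 400),
    ("QDSP6 V4", 3, 167, 500),
    ("QDSP6 V5", 3, 200, 600),   # baseline; "200 or greater with DMT"
    ("QDSP6 V6", 4, 500, 2000),
]

def per_thread_mhz(total_mhz, threads):
    """Per-thread clock when cycles are shared evenly, rounded to whole MHz."""
    return round(total_mhz / threads)

for name, threads, listed, total in rows:
    computed = per_thread_mhz(total, threads)
    status = "matches" if computed == listed else "differs"
    print(f"{name}: {total} MHz / {threads} threads = {computed} MHz ({status})")
```

Every listed row agrees with the even-split figure, which is what strict round-robin switching would predict.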
Availability in Snapdragon products
Both Hexagon (QDSP6) and pre-Hexagon (QDSP5) cores are used in modern Qualcomm SoCs, with QDSP5 mostly in low-end products. Modem QDSPs are not shown in the tables.
QDSP5 usage:
Snapdragon generation | Chipset ID | DSP Generation | DSP Frequency, MHz | Process node, nm |
S1 | MSM7627, MSM7227, MSM7625, MSM7225 | QDSP5 | 320 | 65 |
S1 | MSM7627A, MSM7227A, MSM7625A, MSM7225A | QDSP5 | 350 | 45 |
S2 | MSM8655, MSM8255, APQ8055, MSM7630, MSM7230 | QDSP5 | 256 | 45 |
S4 Play | MSM8625, MSM8225 | QDSP5 | 350 | 45 |
S200 | 8110, 8210, 8610, 8112, 8212, 8612, 8225Q, 8625Q | QDSP5 | 384 | 45 LP |
QDSP6 usage:
Snapdragon generation | Chipset ID | QDSP6 version | DSP Frequency, MHz | Process node, nm |
S1 | QSD8650, QSD8250 | QDSP6 | 600 | 65 |
S3 | MSM8660, MSM8260, APQ8060 | QDSP6 | 400 | 45 |
S4 Prime | MPQ8064 | QDSP6 | 500 | 28 |
S4 Pro | MSM8960 Pro, APQ8064 | QDSP6 | 500 | 28 |
S4 Plus | MSM8960, MSM8660A, MSM8260A, APQ8060A, MSM8930, MSM8630, MSM8230, APQ8030, MSM8627, MSM8227 | QDSP6 | 500 | 28 |
S400 | 8926, 8930, 8230, 8630, 8930AB, 8230AB, 8630AB, 8030AB, 8226, 8626 | QDSP6V4 | 500 | 28 LP |
S600 | 8064T, 8064M | QDSP6V4 | 500 | 28 LP |
S800 | 8974, 8274, 8674, 8074 | QDSP6V5A | 600 | 28 HPm |
S820 | 8996 | QDSP6V6 | 2000 | 14 FinFET LPP |
Code sample
This is a single instruction packet from the inner loop of an FFT, closed by the hardware-loop marker :endloop0.
Qualcomm claims that this packet is equivalent to 29 classic RISC operations; it includes a vector add, a complex multiply operation and hardware loop support. All instructions in the packet execute in the same cycle.